mbox series

[v2,0/4] Introduce security_create_user_ns()

Message ID 20220707223228.1940249-1-fred@cloudflare.com (mailing list archive)
Headers show
Series Introduce security_create_user_ns() | expand

Message

Frederick Lawler July 7, 2022, 10:32 p.m. UTC
While creating a LSM BPF MAC policy to block user namespace creation, we
used the LSM cred_prepare hook because that is the closest hook to prevent
a call to create_user_ns().

The calls look something like this:

    cred = prepare_creds()
        security_prepare_creds()
            call_int_hook(cred_prepare, ...
    if (cred)
        create_user_ns(cred)

We noticed that error codes were not propagated from this hook and
introduced a patch [1] to propagate those errors.

The discussion notes that security_prepare_creds()
is not appropriate for MAC policies, and instead the hook is
meant for LSM authors to prepare credentials for mutation. [2]

Ultimately, we concluded that a better course of action is to introduce
a new security hook for LSM authors. [3]

This patch set first introduces a new security_create_user_ns() function
and create_user_ns LSM hook, then marks the hook as sleepable in BPF.

Links:
1. https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
2. https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
3. https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/

Changes since v1:
- Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook patch
- Add selinux: Implement create_user_ns hook patch
- Change function signature of security_create_user_ns() to only take
  struct cred
- Move security_create_user_ns() call after id mapping check in
  create_user_ns()
- Update documentation to reflect changes

Frederick Lawler (4):
  security, lsm: Introduce security_create_user_ns()
  bpf-lsm: Make bpf_lsm_create_user_ns() sleepable
  selftests/bpf: Add tests verifying bpf lsm create_user_ns hook
  selinux: Implement create_user_ns hook

 include/linux/lsm_hook_defs.h                 |  1 +
 include/linux/lsm_hooks.h                     |  4 +
 include/linux/security.h                      |  6 ++
 kernel/bpf/bpf_lsm.c                          |  1 +
 kernel/user_namespace.c                       |  5 ++
 security/security.c                           |  5 ++
 security/selinux/hooks.c                      |  9 ++
 security/selinux/include/classmap.h           |  2 +
 .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
 .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
 10 files changed, 160 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c

Comments

Christian Göttsche July 8, 2022, 12:10 p.m. UTC | #1
,On Fri, 8 Jul 2022 at 00:32, Frederick Lawler <fred@cloudflare.com> wrote:
>
> While creating a LSM BPF MAC policy to block user namespace creation, we
> used the LSM cred_prepare hook because that is the closest hook to prevent
> a call to create_user_ns().
>
> The calls look something like this:
>
>     cred = prepare_creds()
>         security_prepare_creds()
>             call_int_hook(cred_prepare, ...
>     if (cred)
>         create_user_ns(cred)
>
> We noticed that error codes were not propagated from this hook and
> introduced a patch [1] to propagate those errors.
>
> The discussion notes that security_prepare_creds()
> is not appropriate for MAC policies, and instead the hook is
> meant for LSM authors to prepare credentials for mutation. [2]
>
> Ultimately, we concluded that a better course of action is to introduce
> a new security hook for LSM authors. [3]
>
> This patch set first introduces a new security_create_user_ns() function
> and create_user_ns LSM hook, then marks the hook as sleepable in BPF.

Some thoughts:

I.

Why not make the hook more generic, e.g. support all other existing
and potential future namespaces?
Also I think the naming scheme is <object>_<verb>.

    LSM_HOOK(int, 0, namespace_create, const struct cred *cred,
unsigned int flags)

where flags is a bitmap of CLONE flags from include/uapi/linux/sched.h
(like CLONE_NEWUSER).

II.

While adding policing for namespaces maybe also add a new hook for setns(2)

    LSM_HOOK(int, 0, namespace_join, const struct cred *subj,  const
struct cred *obj, unsigned int flags)

III.

Maybe even attach a security context to namespaces so they can be
further governed?
SELinux example:

    type domainA_userns_t;
    type_transition domainA_t domainA_t : namespace domainA_userns_t "user";
    allow domainA_t domainA_userns_t:namespace create;

    # domainB calling setns(2) with domainA as target
    allow domainB_t domainA_userns_t:namespace join;

>
> Links:
> 1. https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
> 2. https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
> 3. https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/
>
> Changes since v1:
> - Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook patch
> - Add selinux: Implement create_user_ns hook patch
> - Change function signature of security_create_user_ns() to only take
>   struct cred
> - Move security_create_user_ns() call after id mapping check in
>   create_user_ns()
> - Update documentation to reflect changes
>
> Frederick Lawler (4):
>   security, lsm: Introduce security_create_user_ns()
>   bpf-lsm: Make bpf_lsm_create_user_ns() sleepable
>   selftests/bpf: Add tests verifying bpf lsm create_user_ns hook
>   selinux: Implement create_user_ns hook
>
>  include/linux/lsm_hook_defs.h                 |  1 +
>  include/linux/lsm_hooks.h                     |  4 +
>  include/linux/security.h                      |  6 ++
>  kernel/bpf/bpf_lsm.c                          |  1 +
>  kernel/user_namespace.c                       |  5 ++
>  security/security.c                           |  5 ++
>  security/selinux/hooks.c                      |  9 ++
>  security/selinux/include/classmap.h           |  2 +
>  .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
>  .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
>  10 files changed, 160 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c
>
> --
> 2.30.2
>
Frederick Lawler July 8, 2022, 2:01 p.m. UTC | #2
On 7/8/22 7:10 AM, Christian Göttsche wrote:
> ,On Fri, 8 Jul 2022 at 00:32, Frederick Lawler <fred@cloudflare.com> wrote:
>>
>> While creating a LSM BPF MAC policy to block user namespace creation, we
>> used the LSM cred_prepare hook because that is the closest hook to prevent
>> a call to create_user_ns().
>>
>> The calls look something like this:
>>
>>      cred = prepare_creds()
>>          security_prepare_creds()
>>              call_int_hook(cred_prepare, ...
>>      if (cred)
>>          create_user_ns(cred)
>>
>> We noticed that error codes were not propagated from this hook and
>> introduced a patch [1] to propagate those errors.
>>
>> The discussion notes that security_prepare_creds()
>> is not appropriate for MAC policies, and instead the hook is
>> meant for LSM authors to prepare credentials for mutation. [2]
>>
>> Ultimately, we concluded that a better course of action is to introduce
>> a new security hook for LSM authors. [3]
>>
>> This patch set first introduces a new security_create_user_ns() function
>> and create_user_ns LSM hook, then marks the hook as sleepable in BPF.
> 
> Some thoughts:
> 
> I.
> 
> Why not make the hook more generic, e.g. support all other existing
> and potential future namespaces?

The main issue with a generic hook is that different namespaces have 
different calling contexts. We decided in a previous discussion to 
opt-out of a generic hook for this reason. [1]

> Also I think the naming scheme is <object>_<verb>.

That's a good call out. I was originally hoping to keep the security_*() 
match with the hook name matched with the caller function to keep things 
all aligned. If no one objects to renaming the hook, I can rename the 
hook for v3.

> 
>      LSM_HOOK(int, 0, namespace_create, const struct cred *cred,
> unsigned int flags)
> 
> where flags is a bitmap of CLONE flags from include/uapi/linux/sched.h
> (like CLONE_NEWUSER).
> 
> II.
> 
> While adding policing for namespaces maybe also add a new hook for setns(2)
> 
>      LSM_HOOK(int, 0, namespace_join, const struct cred *subj,  const
> struct cred *obj, unsigned int flags)
> 

IIUC, setns() will create a new namespace for the other namespaces 
except for user namespace. If we add a security hook for the other 
create_*_ns() functions, then we can catch setns() at that point.

> III.
> 
> Maybe even attach a security context to namespaces so they can be
> further governed?
> SELinux example:
> 
>      type domainA_userns_t;
>      type_transition domainA_t domainA_t : namespace domainA_userns_t "user";
>      allow domainA_t domainA_userns_t:namespace create;
> 
>      # domainB calling setns(2) with domainA as target
>      allow domainB_t domainA_userns_t:namespace join;
> 

Links:
1. 
https://lore.kernel.org/all/CAHC9VhSTkEMT90Tk+=iTyp3npWEm+3imrkFVX2qb=XsOPp9F=A@mail.gmail.com/

>>
>> Links:
>> 1. https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
>> 2. https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
>> 3. https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/
>>
>> Changes since v1:
>> - Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook patch
>> - Add selinux: Implement create_user_ns hook patch
>> - Change function signature of security_create_user_ns() to only take
>>    struct cred
>> - Move security_create_user_ns() call after id mapping check in
>>    create_user_ns()
>> - Update documentation to reflect changes
>>
>> Frederick Lawler (4):
>>    security, lsm: Introduce security_create_user_ns()
>>    bpf-lsm: Make bpf_lsm_create_user_ns() sleepable
>>    selftests/bpf: Add tests verifying bpf lsm create_user_ns hook
>>    selinux: Implement create_user_ns hook
>>
>>   include/linux/lsm_hook_defs.h                 |  1 +
>>   include/linux/lsm_hooks.h                     |  4 +
>>   include/linux/security.h                      |  6 ++
>>   kernel/bpf/bpf_lsm.c                          |  1 +
>>   kernel/user_namespace.c                       |  5 ++
>>   security/security.c                           |  5 ++
>>   security/selinux/hooks.c                      |  9 ++
>>   security/selinux/include/classmap.h           |  2 +
>>   .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
>>   .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
>>   10 files changed, 160 insertions(+)
>>   create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
>>   create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c
>>
>> --
>> 2.30.2
>>
Christian Brauner July 8, 2022, 2:35 p.m. UTC | #3
On Fri, Jul 08, 2022 at 09:01:32AM -0500, Frederick Lawler wrote:
> On 7/8/22 7:10 AM, Christian Göttsche wrote:
> > ,On Fri, 8 Jul 2022 at 00:32, Frederick Lawler <fred@cloudflare.com> wrote:
> > > 
> > > While creating a LSM BPF MAC policy to block user namespace creation, we
> > > used the LSM cred_prepare hook because that is the closest hook to prevent
> > > a call to create_user_ns().
> > > 
> > > The calls look something like this:
> > > 
> > >      cred = prepare_creds()
> > >          security_prepare_creds()
> > >              call_int_hook(cred_prepare, ...
> > >      if (cred)
> > >          create_user_ns(cred)
> > > 
> > > We noticed that error codes were not propagated from this hook and
> > > introduced a patch [1] to propagate those errors.
> > > 
> > > The discussion notes that security_prepare_creds()
> > > is not appropriate for MAC policies, and instead the hook is
> > > meant for LSM authors to prepare credentials for mutation. [2]
> > > 
> > > Ultimately, we concluded that a better course of action is to introduce
> > > a new security hook for LSM authors. [3]
> > > 
> > > This patch set first introduces a new security_create_user_ns() function
> > > and create_user_ns LSM hook, then marks the hook as sleepable in BPF.
> > 
> > Some thoughts:
> > 
> > I.
> > 
> > Why not make the hook more generic, e.g. support all other existing
> > and potential future namespaces?
> 
> The main issue with a generic hook is that different namespaces have
> different calling contexts. We decided in a previous discussion to opt-out
> of a generic hook for this reason. [1]

Agreed.

> 
> > Also I think the naming scheme is <object>_<verb>.
> 
> That's a good call out. I was originally hoping to keep the security_*()
> match with the hook name matched with the caller function to keep things all
> aligned. If no one objects to renaming the hook, I can rename the hook for
> v3.
> 
> > 
> >      LSM_HOOK(int, 0, namespace_create, const struct cred *cred,
> > unsigned int flags)
> > 
> > where flags is a bitmap of CLONE flags from include/uapi/linux/sched.h
> > (like CLONE_NEWUSER).
> > 
> > II.
> > 
> > While adding policing for namespaces maybe also add a new hook for setns(2)
> > 
> >      LSM_HOOK(int, 0, namespace_join, const struct cred *subj,  const
> > struct cred *obj, unsigned int flags)
> > 
> 
> IIUC, setns() will create a new namespace for the other namespaces except
> for user namespace. If we add a security hook for the other create_*_ns()

setns() doesn't create new namespaces. It just switches to already
existing ones:

setns(<pidfd>, <flags>)
-> prepare_nsset()
      /* 
       * Notice the 0 passed as flags which means all namespaces will
       * just take a reference.
       */
   -> create_new_namespaces(0, ...)

you're thinking about unshare() and unshare() will be caught in
create_user_ns().

> functions, then we can catch setns() at that point.

If you block the creation of user namespaces by unprivileged users in
create_user_ns() you can only create user namespaces as a privileged
user. Consequently only a privileged users can setns() to a user
namespace. So either the caller has CAP_SYS_ADMIN in the parent userns
or they are located in the parent userns and are the owner of the userns
they are attaching to. So if you lock create_user_ns() to
capable(CAP_SYS_ADMIN) you should be done.
Casey Schaufler July 8, 2022, 4:11 p.m. UTC | #4
On 7/8/2022 7:01 AM, Frederick Lawler wrote:
> On 7/8/22 7:10 AM, Christian Göttsche wrote:
>> ,On Fri, 8 Jul 2022 at 00:32, Frederick Lawler <fred@cloudflare.com>
>> wrote:
>>>
>>> While creating a LSM BPF MAC policy to block user namespace
>>> creation, we
>>> used the LSM cred_prepare hook because that is the closest hook to
>>> prevent
>>> a call to create_user_ns().
>>>
>>> The calls look something like this:
>>>
>>>      cred = prepare_creds()
>>>          security_prepare_creds()
>>>              call_int_hook(cred_prepare, ...
>>>      if (cred)
>>>          create_user_ns(cred)
>>>
>>> We noticed that error codes were not propagated from this hook and
>>> introduced a patch [1] to propagate those errors.
>>>
>>> The discussion notes that security_prepare_creds()
>>> is not appropriate for MAC policies, and instead the hook is
>>> meant for LSM authors to prepare credentials for mutation. [2]
>>>
>>> Ultimately, we concluded that a better course of action is to introduce
>>> a new security hook for LSM authors. [3]
>>>
>>> This patch set first introduces a new security_create_user_ns()
>>> function
>>> and create_user_ns LSM hook, then marks the hook as sleepable in BPF.
>>
>> Some thoughts:
>>
>> I.
>>
>> Why not make the hook more generic, e.g. support all other existing
>> and potential future namespaces?
>
> The main issue with a generic hook is that different namespaces have
> different calling contexts. We decided in a previous discussion to
> opt-out of a generic hook for this reason. [1]
>
>> Also I think the naming scheme is <object>_<verb>.
>
> That's a good call out. I was originally hoping to keep the
> security_*() match with the hook name matched with the caller function
> to keep things all aligned. If no one objects to renaming the hook, I
> can rename the hook for v3.
>
>>
>>      LSM_HOOK(int, 0, namespace_create, const struct cred *cred,
>> unsigned int flags)
>>
>> where flags is a bitmap of CLONE flags from include/uapi/linux/sched.h
>> (like CLONE_NEWUSER).
>>
>> II.
>>
>> While adding policing for namespaces maybe also add a new hook for
>> setns(2)
>>
>>      LSM_HOOK(int, 0, namespace_join, const struct cred *subj,  const
>> struct cred *obj, unsigned int flags)
>>
>
> IIUC, setns() will create a new namespace for the other namespaces
> except for user namespace. If we add a security hook for the other
> create_*_ns() functions, then we can catch setns() at that point.
>
>> III.
>>
>> Maybe even attach a security context to namespaces so they can be
>> further governed?

That would likely add confusion to the existing security module namespace
efforts. SELinux, Smack and AppArmor have all developed namespace models.
That, or it could replace the various independent efforts with a single,
unified security module namespace effort. There's more work to that than
adding a context to a namespace. Treating namespaces as objects is almost,
but not quite, solidifying containers as a kernel construct. We know we
can't do that.

>> SELinux example:
>>
>>      type domainA_userns_t;
>>      type_transition domainA_t domainA_t : namespace domainA_userns_t
>> "user";
>>      allow domainA_t domainA_userns_t:namespace create;
>>
>>      # domainB calling setns(2) with domainA as target
>>      allow domainB_t domainA_userns_t:namespace join;

While I'm not an expert on SELinux policy, I'd bet a refreshing beverage
that there's already a way to achieve this with existing constructs.
Smack, which is subject+object MAC couldn't care less about the user
namespace configuration. User namespaces are DAC constructs.

>>
>
> Links:
> 1.
> https://lore.kernel.org/all/CAHC9VhSTkEMT90Tk+=iTyp3npWEm+3imrkFVX2qb=XsOPp9F=A@mail.gmail.com/
>
>>>
>>> Links:
>>> 1.
>>> https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
>>>
>>> 2.
>>> https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
>>> 3.
>>> https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/
>>>
>>> Changes since v1:
>>> - Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook
>>> patch
>>> - Add selinux: Implement create_user_ns hook patch
>>> - Change function signature of security_create_user_ns() to only take
>>>    struct cred
>>> - Move security_create_user_ns() call after id mapping check in
>>>    create_user_ns()
>>> - Update documentation to reflect changes
>>>
>>> Frederick Lawler (4):
>>>    security, lsm: Introduce security_create_user_ns()
>>>    bpf-lsm: Make bpf_lsm_create_user_ns() sleepable
>>>    selftests/bpf: Add tests verifying bpf lsm create_user_ns hook
>>>    selinux: Implement create_user_ns hook
>>>
>>>   include/linux/lsm_hook_defs.h                 |  1 +
>>>   include/linux/lsm_hooks.h                     |  4 +
>>>   include/linux/security.h                      |  6 ++
>>>   kernel/bpf/bpf_lsm.c                          |  1 +
>>>   kernel/user_namespace.c                       |  5 ++
>>>   security/security.c                           |  5 ++
>>>   security/selinux/hooks.c                      |  9 ++
>>>   security/selinux/include/classmap.h           |  2 +
>>>   .../selftests/bpf/prog_tests/deny_namespace.c | 88
>>> +++++++++++++++++++
>>>   .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
>>>   10 files changed, 160 insertions(+)
>>>   create mode 100644
>>> tools/testing/selftests/bpf/prog_tests/deny_namespace.c
>>>   create mode 100644
>>> tools/testing/selftests/bpf/progs/test_deny_namespace.c
>>>
>>> -- 
>>> 2.30.2
>>>
>
Serge Hallyn July 14, 2022, 2:27 p.m. UTC | #5
On Fri, Jul 08, 2022 at 09:11:15AM -0700, Casey Schaufler wrote:
> On 7/8/2022 7:01 AM, Frederick Lawler wrote:
> > On 7/8/22 7:10 AM, Christian Göttsche wrote:
> >> ,On Fri, 8 Jul 2022 at 00:32, Frederick Lawler <fred@cloudflare.com>
> >> wrote:
> >>>
> >>> While creating a LSM BPF MAC policy to block user namespace
> >>> creation, we
> >>> used the LSM cred_prepare hook because that is the closest hook to
> >>> prevent
> >>> a call to create_user_ns().
> >>>
> >>> The calls look something like this:
> >>>
> >>>      cred = prepare_creds()
> >>>          security_prepare_creds()
> >>>              call_int_hook(cred_prepare, ...
> >>>      if (cred)
> >>>          create_user_ns(cred)
> >>>
> >>> We noticed that error codes were not propagated from this hook and
> >>> introduced a patch [1] to propagate those errors.
> >>>
> >>> The discussion notes that security_prepare_creds()
> >>> is not appropriate for MAC policies, and instead the hook is
> >>> meant for LSM authors to prepare credentials for mutation. [2]
> >>>
> >>> Ultimately, we concluded that a better course of action is to introduce
> >>> a new security hook for LSM authors. [3]
> >>>
> >>> This patch set first introduces a new security_create_user_ns()
> >>> function
> >>> and create_user_ns LSM hook, then marks the hook as sleepable in BPF.
> >>
> >> Some thoughts:
> >>
> >> I.
> >>
> >> Why not make the hook more generic, e.g. support all other existing
> >> and potential future namespaces?
> >
> > The main issue with a generic hook is that different namespaces have
> > different calling contexts. We decided in a previous discussion to
> > opt-out of a generic hook for this reason. [1]
> >
> >> Also I think the naming scheme is <object>_<verb>.
> >
> > That's a good call out. I was originally hoping to keep the
> > security_*() match with the hook name matched with the caller function
> > to keep things all aligned. If no one objects to renaming the hook, I
> > can rename the hook for v3.
> >
> >>
> >>      LSM_HOOK(int, 0, namespace_create, const struct cred *cred,
> >> unsigned int flags)
> >>
> >> where flags is a bitmap of CLONE flags from include/uapi/linux/sched.h
> >> (like CLONE_NEWUSER).
> >>
> >> II.
> >>
> >> While adding policing for namespaces maybe also add a new hook for
> >> setns(2)
> >>
> >>      LSM_HOOK(int, 0, namespace_join, const struct cred *subj,  const
> >> struct cred *obj, unsigned int flags)
> >>
> >
> > IIUC, setns() will create a new namespace for the other namespaces
> > except for user namespace. If we add a security hook for the other
> > create_*_ns() functions, then we can catch setns() at that point.
> >
> >> III.
> >>
> >> Maybe even attach a security context to namespaces so they can be
> >> further governed?
> 
> That would likely add confusion to the existing security module namespace
> efforts. SELinux, Smack and AppArmor have all developed namespace models.
> That, or it could replace the various independent efforts with a single,

I feel like you're attaching more meaning to this than there needs to be.
I *think* he's just talking about a user_namespace->u_security void*.
So that for instance while deciding whether to allow some transition,
selinux could check whether the caller's user namespace was created by
a task in an selinux context authorized to create user namespaces.

The "user namespaces are DAC and orthogonal to MAC" is of course true
(where the LSM does not itself tie them together), except that we all
know that a process running as root in a user namespace gains access to
often-less-trustworthy code gated under CAP_SYS_ADMIN.

> unified security module namespace effort. There's more work to that than
> adding a context to a namespace. Treating namespaces as objects is almost,
> but not quite, solidifying containers as a kernel construct. We know we
> can't do that.

What we "can't do" (imo) is to create a "full container" construct which
ties together the various namespaces and other concepts in a restrictive
way.

> >> SELinux example:
> >>
> >>      type domainA_userns_t;
> >>      type_transition domainA_t domainA_t : namespace domainA_userns_t
> >> "user";
> >>      allow domainA_t domainA_userns_t:namespace create;
> >>
> >>      # domainB calling setns(2) with domainA as target
> >>      allow domainB_t domainA_userns_t:namespace join;
> 
> While I'm not an expert on SELinux policy, I'd bet a refreshing beverage
> that there's already a way to achieve this with existing constructs.
> Smack, which is subject+object MAC couldn't care less about the user
> namespace configuration. User namespaces are DAC constructs.
> 
> >>
> >
> > Links:
> > 1.
> > https://lore.kernel.org/all/CAHC9VhSTkEMT90Tk+=iTyp3npWEm+3imrkFVX2qb=XsOPp9F=A@mail.gmail.com/
> >
> >>>
> >>> Links:
> >>> 1.
> >>> https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
> >>>
> >>> 2.
> >>> https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
> >>> 3.
> >>> https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/
> >>>
> >>> Changes since v1:
> >>> - Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook
> >>> patch
> >>> - Add selinux: Implement create_user_ns hook patch
> >>> - Change function signature of security_create_user_ns() to only take
> >>>    struct cred
> >>> - Move security_create_user_ns() call after id mapping check in
> >>>    create_user_ns()
> >>> - Update documentation to reflect changes
> >>>
> >>> Frederick Lawler (4):
> >>>    security, lsm: Introduce security_create_user_ns()
> >>>    bpf-lsm: Make bpf_lsm_create_user_ns() sleepable
> >>>    selftests/bpf: Add tests verifying bpf lsm create_user_ns hook
> >>>    selinux: Implement create_user_ns hook
> >>>
> >>>   include/linux/lsm_hook_defs.h                 |  1 +
> >>>   include/linux/lsm_hooks.h                     |  4 +
> >>>   include/linux/security.h                      |  6 ++
> >>>   kernel/bpf/bpf_lsm.c                          |  1 +
> >>>   kernel/user_namespace.c                       |  5 ++
> >>>   security/security.c                           |  5 ++
> >>>   security/selinux/hooks.c                      |  9 ++
> >>>   security/selinux/include/classmap.h           |  2 +
> >>>   .../selftests/bpf/prog_tests/deny_namespace.c | 88
> >>> +++++++++++++++++++
> >>>   .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
> >>>   10 files changed, 160 insertions(+)
> >>>   create mode 100644
> >>> tools/testing/selftests/bpf/prog_tests/deny_namespace.c
> >>>   create mode 100644
> >>> tools/testing/selftests/bpf/progs/test_deny_namespace.c
> >>>
> >>> -- 
> >>> 2.30.2
> >>>
> >
Frederick Lawler July 19, 2022, 7:59 p.m. UTC | #6
On 7/14/22 9:27 AM, Serge E. Hallyn wrote:
> On Fri, Jul 08, 2022 at 09:11:15AM -0700, Casey Schaufler wrote:
>> On 7/8/2022 7:01 AM, Frederick Lawler wrote:
>>> On 7/8/22 7:10 AM, Christian Göttsche wrote:
>>>> ,On Fri, 8 Jul 2022 at 00:32, Frederick Lawler <fred@cloudflare.com>
>>>> wrote:
>>>>>
>>>>> While creating a LSM BPF MAC policy to block user namespace
>>>>> creation, we
>>>>> used the LSM cred_prepare hook because that is the closest hook to
>>>>> prevent
>>>>> a call to create_user_ns().
>>>>>
>>>>> The calls look something like this:
>>>>>
>>>>>       cred = prepare_creds()
>>>>>           security_prepare_creds()
>>>>>               call_int_hook(cred_prepare, ...
>>>>>       if (cred)
>>>>>           create_user_ns(cred)
>>>>>
>>>>> We noticed that error codes were not propagated from this hook and
>>>>> introduced a patch [1] to propagate those errors.
>>>>>
>>>>> The discussion notes that security_prepare_creds()
>>>>> is not appropriate for MAC policies, and instead the hook is
>>>>> meant for LSM authors to prepare credentials for mutation. [2]
>>>>>
>>>>> Ultimately, we concluded that a better course of action is to introduce
>>>>> a new security hook for LSM authors. [3]
>>>>>
>>>>> This patch set first introduces a new security_create_user_ns()
>>>>> function
>>>>> and create_user_ns LSM hook, then marks the hook as sleepable in BPF.
>>>>
>>>> Some thoughts:
>>>>
>>>> I.
>>>>
>>>> Why not make the hook more generic, e.g. support all other existing
>>>> and potential future namespaces?
>>>
>>> The main issue with a generic hook is that different namespaces have
>>> different calling contexts. We decided in a previous discussion to
>>> opt-out of a generic hook for this reason. [1]
>>>
>>>> Also I think the naming scheme is <object>_<verb>.
>>>
>>> That's a good call out. I was originally hoping to keep the
>>> security_*() match with the hook name matched with the caller function
>>> to keep things all aligned. If no one objects to renaming the hook, I
>>> can rename the hook for v3.
>>>
>>>>
>>>>       LSM_HOOK(int, 0, namespace_create, const struct cred *cred,
>>>> unsigned int flags)
>>>>
>>>> where flags is a bitmap of CLONE flags from include/uapi/linux/sched.h
>>>> (like CLONE_NEWUSER).
>>>>
>>>> II.
>>>>
>>>> While adding policing for namespaces maybe also add a new hook for
>>>> setns(2)
>>>>
>>>>       LSM_HOOK(int, 0, namespace_join, const struct cred *subj,  const
>>>> struct cred *obj, unsigned int flags)
>>>>
>>>
>>> IIUC, setns() will create a new namespace for the other namespaces
>>> except for user namespace. If we add a security hook for the other
>>> create_*_ns() functions, then we can catch setns() at that point.
>>>
>>>> III.
>>>>
>>>> Maybe even attach a security context to namespaces so they can be
>>>> further governed?
>>
>> That would likely add confusion to the existing security module namespace
>> efforts. SELinux, Smack and AppArmor have all developed namespace models.
>> That, or it could replace the various independent efforts with a single,
> 
> I feel like you're attaching more meaning to this than there needs to be.
> I *think* he's just talking about a user_namespace->u_security void*.
> So that for instance while deciding whether to allow some transition,
> selinux could check whether the caller's user namespace was created by
> a task in an selinux context authorized to create user namespaces.
> 
> The "user namespaces are DAC and orthogonal to MAC" is of course true
> (where the LSM does not itself tie them together), except that we all
> know that a process running as root in a user namespace gains access to
> often-less-trustworthy code gated under CAP_SYS_ADMIN.
> 
>> unified security module namespace effort. There's more work to that than
>> adding a context to a namespace. Treating namespaces as objects is almost,
>> but not quite, solidifying containers as a kernel construct. We know we
>> can't do that.
> 
> What we "can't do" (imo) is to create a "full container" construct which
> ties together the various namespaces and other concepts in a restrictive
> way.
> 

Is this the direction we want to go with the SELinux implementation? If 
so, where can I find a similar implementation to make the userns_create 
work with this? If not, I have a v3 with the hook name change ready to post.

>>>> SELinux example:
>>>>
>>>>       type domainA_userns_t;
>>>>       type_transition domainA_t domainA_t : namespace domainA_userns_t
>>>> "user";
>>>>       allow domainA_t domainA_userns_t:namespace create;
>>>>
>>>>       # domainB calling setns(2) with domainA as target
>>>>       allow domainB_t domainA_userns_t:namespace join;
>>
>> While I'm not an expert on SELinux policy, I'd bet a refreshing beverage
>> that there's already a way to achieve this with existing constructs.
>> Smack, which is subject+object MAC couldn't care less about the user
>> namespace configuration. User namespaces are DAC constructs.
>>
>>>>
>>>
>>> Links:
>>> 1.
>>> https://lore.kernel.org/all/CAHC9VhSTkEMT90Tk+=iTyp3npWEm+3imrkFVX2qb=XsOPp9F=A@mail.gmail.com/
>>>
>>>>>
>>>>> Links:
>>>>> 1.
>>>>> https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
>>>>>
>>>>> 2.
>>>>> https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
>>>>> 3.
>>>>> https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/
>>>>>
>>>>> Changes since v1:
>>>>> - Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook
>>>>> patch
>>>>> - Add selinux: Implement create_user_ns hook patch
>>>>> - Change function signature of security_create_user_ns() to only take
>>>>>     struct cred
>>>>> - Move security_create_user_ns() call after id mapping check in
>>>>>     create_user_ns()
>>>>> - Update documentation to reflect changes
>>>>>
>>>>> Frederick Lawler (4):
>>>>>     security, lsm: Introduce security_create_user_ns()
>>>>>     bpf-lsm: Make bpf_lsm_create_user_ns() sleepable
>>>>>     selftests/bpf: Add tests verifying bpf lsm create_user_ns hook
>>>>>     selinux: Implement create_user_ns hook
>>>>>
>>>>>    include/linux/lsm_hook_defs.h                 |  1 +
>>>>>    include/linux/lsm_hooks.h                     |  4 +
>>>>>    include/linux/security.h                      |  6 ++
>>>>>    kernel/bpf/bpf_lsm.c                          |  1 +
>>>>>    kernel/user_namespace.c                       |  5 ++
>>>>>    security/security.c                           |  5 ++
>>>>>    security/selinux/hooks.c                      |  9 ++
>>>>>    security/selinux/include/classmap.h           |  2 +
>>>>>    .../selftests/bpf/prog_tests/deny_namespace.c | 88
>>>>> +++++++++++++++++++
>>>>>    .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
>>>>>    10 files changed, 160 insertions(+)
>>>>>    create mode 100644
>>>>> tools/testing/selftests/bpf/prog_tests/deny_namespace.c
>>>>>    create mode 100644
>>>>> tools/testing/selftests/bpf/progs/test_deny_namespace.c
>>>>>
>>>>> -- 
>>>>> 2.30.2
>>>>>
>>>
Paul Moore July 20, 2022, 1:32 a.m. UTC | #7
On Fri, Jul 8, 2022 at 12:11 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 7/8/2022 7:01 AM, Frederick Lawler wrote:
> > On 7/8/22 7:10 AM, Christian Göttsche wrote:
> >> ,On Fri, 8 Jul 2022 at 00:32, Frederick Lawler <fred@cloudflare.com>
> >> wrote:

...

> >> Also I think the naming scheme is <object>_<verb>.
> >
> > That's a good call out. I was originally hoping to keep the
> > security_*() match with the hook name matched with the caller function
> > to keep things all aligned. If no one objects to renaming the hook, I
> > can rename the hook for v3.

No objection from me.

[Sorry for the delay, the last week or two has been pretty busy.]

> >> III.
> >>
> >> Maybe even attach a security context to namespaces so they can be
> >> further governed?
>
> That would likely add confusion to the existing security module namespace
> efforts. SELinux, Smack and AppArmor have all developed namespace models.

I'm not sure I fully understand what Casey is saying here as SELinux
does not yet have an established namespace model to the best of my
understanding, but perhaps we are talking about different concepts for
the word "namespace"?

From a SELinux perspective, if we are going to control access to a
namespace beyond simple creation, we would need to assign the
namespace a label (inherited from the creating process).  Although
that would need some discussion among the SELinux folks as this would
mean treating a userns as a proper system entity from a policy
perspective which is ... interesting.

> That, or it could replace the various independent efforts with a single,
> unified security module namespace effort.

We've talked about this before and I just don't see how that could
ever work, the LSM implementations are just too different to do
namespacing at the LSM layer.  If a LSM is going to namespace
themselves, they need the ability to define what that means without
having to worry about what other LSMs want to do.


--
paul-moore.com
Casey Schaufler July 20, 2022, 9:42 p.m. UTC | #8
On 7/19/2022 6:32 PM, Paul Moore wrote:
> On Fri, Jul 8, 2022 at 12:11 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>> On 7/8/2022 7:01 AM, Frederick Lawler wrote:
>>> On 7/8/22 7:10 AM, Christian Göttsche wrote:
>>>> ,On Fri, 8 Jul 2022 at 00:32, Frederick Lawler <fred@cloudflare.com>
>>>> wrote:
> ..
>
>>>> Also I think the naming scheme is <object>_<verb>.
>>> That's a good call out. I was originally hoping to keep the
>>> security_*() match with the hook name matched with the caller function
>>> to keep things all aligned. If no one objects to renaming the hook, I
>>> can rename the hook for v3.
> No objection from me.
>
> [Sorry for the delay, the last week or two has been pretty busy.]
>
>>>> III.
>>>>
>>>> Maybe even attach a security context to namespaces so they can be
>>>> further governed?
>> That would likely add confusion to the existing security module namespace
>> efforts. SELinux, Smack and AppArmor have all developed namespace models.
> I'm not sure I fully understand what Casey is saying here as SELinux
> does not yet have an established namespace model to the best of my
> understanding, but perhaps we are talking about different concepts for
> the word "namespace"?

Stephen Smalley proposed a SELinux namespace model, with patches,
some time back. It hasn't been adopted, but I've seen at least one
attempt to revive it. You're right that there isn't an established
model. The model proposed for Smack wasn't adopted either. My point
is that models have been developed and refinements and/or alternatives
are likely to be suggested.

>
> >From a SELinux perspective, if we are going to control access to a
> namespace beyond simple creation, we would need to assign the
> namespace a label (inherited from the creating process).  Although
> that would need some discussion among the SELinux folks as this would
> mean treating a userns as a proper system entity from a policy
> perspective which is ... interesting.
>
>> That, or it could replace the various independent efforts with a single,
>> unified security module namespace effort.
> We've talked about this before and I just don't see how that could
> ever work, the LSM implementations are just too different to do
> namespacing at the LSM layer.

It's possible that fresh eyes might see options that those who have
been staring at the current state and historical proposals may have
missed.

>   If a LSM is going to namespace
> themselves, they need the ability to define what that means without
> having to worry about what other LSMs want to do.

Possibly. On the other hand, if someone came up with a rational scheme
for general xattr namespacing I don't see that anyone would pass it up.
Paul Moore July 20, 2022, 10:39 p.m. UTC | #9
On Wed, Jul 20, 2022 at 5:42 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 7/19/2022 6:32 PM, Paul Moore wrote:
> > On Fri, Jul 8, 2022 at 12:11 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
> >> On 7/8/2022 7:01 AM, Frederick Lawler wrote:
> >>> On 7/8/22 7:10 AM, Christian Göttsche wrote:
> >>>> ,On Fri, 8 Jul 2022 at 00:32, Frederick Lawler <fred@cloudflare.com>
> >>>> wrote:

...

> >>>> III.
> >>>>
> >>>> Maybe even attach a security context to namespaces so they can be
> >>>> further governed?
> >> That would likely add confusion to the existing security module namespace
> >> efforts. SELinux, Smack and AppArmor have all developed namespace models.
> >
> > I'm not sure I fully understand what Casey is saying here as SELinux
> > does not yet have an established namespace model to the best of my
> > understanding, but perhaps we are talking about different concepts for
> > the word "namespace"?
>
> Stephen Smalley proposed a SELinux namespace model, with patches,
> some time back. It hasn't been adopted, but I've seen at least one
> attempt to revive it. You're right that there isn't an established
> model.

If it isn't in the mainline kernel, it isn't an established namespace model.

I ported Stephen's initial namespace patches to new kernels for quite
some time, look at the working-selinuxns branch in the main SELinux
repository, but that doesn't mean they are ready for upstreaming.
Aside from some pretty critical implementation holes, there is the
much larger conceptual issue of how to deal with persistent filesystem
objects.  We've discussed that quite a bit among the SELinux
developers but have yet to arrive at a good-enough solution.  I have
some thoughts on how we might be able to make forward progress on
that, but it's wildly off-topic for this patchset discussion.  I
mostly wanted to make sure I was understanding what you were
referencing when you talked about a "SELinux namespace model", and it
is what I suspected ... which I believe is unrelated to the patches
being discussed here.

> >> That, or it could replace the various independent efforts with a single,
> >> unified security module namespace effort.
> >
> > We've talked about this before and I just don't see how that could
> > ever work, the LSM implementations are just too different to do
> > namespacing at the LSM layer.
>
> It's possible that fresh eyes might see options that those who have
> been staring at the current state and historical proposals may have
> missed.

That's always a possibility, and I'm definitely open to a clever
approach that would resolve all the current issues and not paint us
into a corner in the future, but I haven't seen anything close (or any
serious effort for that matter).

... and this still remains way off-topic for a discussion around
adding a hook to allow LSMs to enforce access controls on user
namespace creation.

> >   If a LSM is going to namespace
> > themselves, they need the ability to define what that means without
> > having to worry about what other LSMs want to do.
>
> Possibly. On the other hand, if someone came up with a rational scheme
> for general xattr namespacing I don't see that anyone would pass it up.

Oh geez ...

Namespacing xattrs is not the same thing as namespacing LSMs.  LSMs
may make use of xattrs, and namespacing xattrs may make it easier to
namespace a given LSM, but I'm not aware of an in-tree LSM that would
be magically namespaced if xattrs were namespaced.

This patchset has nothing to do with xattrs, it deals with adding a
LSM hook to implement LSM-based access controls for user namespace
creation.