mbox series

[v7,00/14] ima: Namespace IMA with audit support in IMA-ns

Message ID 20211216054323.1707384-1-stefanb@linux.vnet.ibm.com (mailing list archive)
Headers show
Series ima: Namespace IMA with audit support in IMA-ns | expand

Message

Stefan Berger Dec. 16, 2021, 5:43 a.m. UTC
From: Stefan Berger <stefanb@linux.ibm.com>

The goal of this series of patches is to start with the namespacing of
IMA and support auditing within an IMA namespace (IMA-ns) as the first
step.

In this series the IMA namespace is piggy backing on the user namespace
and therefore an IMA namespace gets created when a user namespace is
created. The advantage of this is that the user namespace can provide
the keys infrastructure that IMA appraisal support will need later on.

We chose the goal of supporting auditing within an IMA namespace since it
requires the least changes to IMA. Following this series, auditing within
an IMA namespace can be activated by a user running the following lines
that rely on a statically linked busybox to be installed on the host for
execution within the minimal container environment:

mkdir -p rootfs/{bin,mnt,proc}
cp /sbin/busybox rootfs/bin
cp /sbin/busybox rootfs/bin/busybox2
echo >> rootfs/bin/busybox2
PATH=/bin unshare --user --map-root-user --mount-proc --pid --fork \
  --root rootfs busybox sh -c \
 "busybox mount -t securityfs /mnt /mnt; \
  busybox echo 'audit func=BPRM_CHECK mask=MAY_EXEC' > /mnt/ima/policy; \
  busybox2 cat /mnt/ima/policy"

[busybox2 is used to demonstrate 2 measurements; see below]

Following the audit log on the host the last line cat'ing the IMA policy
inside the namespace would have been audited. Unfortunately the auditing
line is not distinguishable from one stemming from actions on the host.
The hope here is that Richard Brigg's container id support for auditing
would help resolve the problem.

The following lines added to a suitable IMA policy on the host would
cause the execution of the commands inside the container (by uid 1000)
to be measured and audited as well on the host, thus leading to two
auditing messages for the 'busybox2 cat' above and log entries in IMA's
system log.

echo -e "measure func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
        "audit func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
    > /sys/kernel/security/ima/policy

The goal of supporting measurement and auditing by the host, of actions
occurring within IMA namespaces, is that users, particularly root,
should not be able to evade the host's IMA policy just by spawning
new IMA namespaces, running programs there, and discarding the namespaces
again. This is achieved through 'hierarchical processing' of file
accesses that are evaluated against the policy of the namespace where
the action occurred and against all namespaces' and their policies leading
back to the root IMA namespace (init_ima_ns).

The patch series adds support for a virtualized SecurityFS with a few
new API calls that are used by IMA namespacing. Only the data relevant
to the IMA namespace are shown. The files and directories of other
security subsystems (TPM, evm, Tomoyo, safesetid) are not showing
up when secruityfs is mounted inside a user namespace.

Much of the code leading up to the virtualization of SecurityFS deals
with moving IMA's variables from various files into the IMA namespace
structure called 'ima_namespace'. When it comes to determining the
current IMA namespace I took the approach to get the current IMA
namespace (get_current_ns()) on the top level and pass the pointer all
the way down to those functions that now need access to the ima_namespace
to get to their variables. This later on comes in handy once hierarchical
processing is implemented in this series where we walk the list of
namespaces backwards and again need to pass the pointer into functions.

This patch also introduces usage of CAP_MAC_ADMIN to allow access to the
IMA policy via reduced capabilities. We would again later on use this
capability to allow users to set file extended attributes for IMA appraisal
support.

My tree with these patches is here:

git fetch https://github.com/stefanberger/linux-ima-namespaces v5.15+imans.v7.posted 

Regards,
   Stefan

v7:
 - Dropped 2 patches related to key queues; using &init_ima_ns for all calls
   from functions related to key queues where calls need ima_namespace
 - Moved ima_namespace to security/integrity/ima/ima.h
 - Extended API descriptions with ns parameter where needed
 - Using init_ima_ns in functions related to appraisal and xattrs
 - SecurityFS: Using ima_ns_from_file() to get ns pointer 
 - Reformatted to 80 columns per line

v6:
 - Removed kref and pointer to user_ns in ima_namespace (patch 1)
 - Moved only the policy file dentry into ima_namespace; other dentries are on
   stack now and can be discarded
 - Merged James's patch simplifying securityfs_remove and dropping dget()
 - Added patch with Christian's suggestion to tie opened SecurityFS file to
   the user/IMA namespace it belongs to
 - Passing missing ima_namespace parameter in functions in ima_kexec.c (ppc64)
 - Reverted v5's change to patch 4 related to protection of ima_namespace

v5:
 - Followed Christian's suggestions on patch 1. Also, reverted increased reference
   counter on init_user_ns since ima_ns doesn't take reference to its user_ns.
 - No addtional reference is taken on securityfs dentries for user_ns != init_user_ns.
   Updated documentation and removed cleanup of dentries on superblock kill.
   (patches 12 & 16)
 - Moved else branch to earlier patch (patch 11)
 - Protect ima_namespace by taking reference on user namespace for delayed work queue.
   (patch 4)

v4:
 - For consistency moved 'ns = get_current_ns()' to top of functions
 - Merge in James's latest SecurityFS patch

v3:
 - Further modifications to virtualized SecurityFS following James's posted patch
 - Dropping of early teardown for user_namespaces since not needed anymore

v2:
 - Folllwed Christian's suggestion to virtualize securitytfs; no more securityfs_ns
 - Followed James's advice for late 'population' of securityfs for IMA namespaces
 - Squashed 2 patches dealing with capabilities
 - Added missing 'depends on USER_NS' to Kconfig
 - Added missing 'static' to several functions


Mehmet Kayaalp (2):
  ima: Define ns_status for storing namespaced iint data
  ima: Namespace audit status flags

Stefan Berger (12):
  ima: Add IMA namespace support
  ima: Move policy related variables into ima_namespace
  ima: Move ima_htable into ima_namespace
  ima: Move measurement list related variables into ima_namespace
  ima: Only accept AUDIT rules for IMA non-init_ima_ns namespaces for
    now
  ima: Implement hierarchical processing of file accesses
  securityfs: Only use simple_pin_fs/simple_release_fs for init_user_ns
  securityfs: Extend securityfs with namespacing support
  ima: Move some IMA policy and filesystem related variables into
    ima_namespace
  ima: Use mac_admin_ns_capable() to check corresponding capability
  ima: Move dentry into ima_namespace and others onto stack
  ima: Setup securityfs for IMA namespace

 include/linux/capability.h                   |   6 +
 include/linux/ima.h                          |  56 ++++++
 include/linux/user_namespace.h               |   4 +
 init/Kconfig                                 |  13 ++
 kernel/user.c                                |   7 +
 kernel/user_namespace.c                      |   8 +
 security/inode.c                             |  77 ++++++--
 security/integrity/ima/Makefile              |   4 +-
 security/integrity/ima/ima.h                 | 178 ++++++++++++++-----
 security/integrity/ima/ima_api.c             |  34 ++--
 security/integrity/ima/ima_appraise.c        |  28 +--
 security/integrity/ima/ima_asymmetric_keys.c |   4 +-
 security/integrity/ima/ima_fs.c              | 126 ++++++++-----
 security/integrity/ima/ima_init.c            |  19 +-
 security/integrity/ima/ima_init_ima_ns.c     |  53 ++++++
 security/integrity/ima/ima_kexec.c           |  15 +-
 security/integrity/ima/ima_main.c            | 147 ++++++++++-----
 security/integrity/ima/ima_ns.c              |  88 +++++++++
 security/integrity/ima/ima_ns_status.c       | 127 +++++++++++++
 security/integrity/ima/ima_policy.c          | 168 ++++++++++-------
 security/integrity/ima/ima_queue.c           |  80 +++++----
 security/integrity/ima/ima_queue_keys.c      |  11 +-
 security/integrity/ima/ima_template.c        |   5 +-
 23 files changed, 967 insertions(+), 291 deletions(-)
 create mode 100644 security/integrity/ima/ima_init_ima_ns.c
 create mode 100644 security/integrity/ima/ima_ns.c
 create mode 100644 security/integrity/ima/ima_ns_status.c

Comments

Christian Brauner Dec. 16, 2021, 12:50 p.m. UTC | #1
On Thu, Dec 16, 2021 at 12:43:09AM -0500, Stefan Berger wrote:
> From: Stefan Berger <stefanb@linux.ibm.com>
> 
> The goal of this series of patches is to start with the namespacing of
> IMA and support auditing within an IMA namespace (IMA-ns) as the first
> step.
> 
> In this series the IMA namespace is piggy backing on the user namespace
> and therefore an IMA namespace gets created when a user namespace is
> created. The advantage of this is that the user namespace can provide
> the keys infrastructure that IMA appraisal support will need later on.
> 
> We chose the goal of supporting auditing within an IMA namespace since it
> requires the least changes to IMA. Following this series, auditing within
> an IMA namespace can be activated by a user running the following lines
> that rely on a statically linked busybox to be installed on the host for
> execution within the minimal container environment:
> 
> mkdir -p rootfs/{bin,mnt,proc}
> cp /sbin/busybox rootfs/bin
> cp /sbin/busybox rootfs/bin/busybox2
> echo >> rootfs/bin/busybox2
> PATH=/bin unshare --user --map-root-user --mount-proc --pid --fork \
>   --root rootfs busybox sh -c \
>  "busybox mount -t securityfs /mnt /mnt; \
>   busybox echo 'audit func=BPRM_CHECK mask=MAY_EXEC' > /mnt/ima/policy; \
>   busybox2 cat /mnt/ima/policy"
> 
> [busybox2 is used to demonstrate 2 measurements; see below]
> 
> Following the audit log on the host the last line cat'ing the IMA policy
> inside the namespace would have been audited. Unfortunately the auditing
> line is not distinguishable from one stemming from actions on the host.
> The hope here is that Richard Brigg's container id support for auditing
> would help resolve the problem.
> 
> The following lines added to a suitable IMA policy on the host would
> cause the execution of the commands inside the container (by uid 1000)
> to be measured and audited as well on the host, thus leading to two
> auditing messages for the 'busybox2 cat' above and log entries in IMA's
> system log.
> 
> echo -e "measure func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
>         "audit func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
>     > /sys/kernel/security/ima/policy
> 
> The goal of supporting measurement and auditing by the host, of actions
> occurring within IMA namespaces, is that users, particularly root,
> should not be able to evade the host's IMA policy just by spawning
> new IMA namespaces, running programs there, and discarding the namespaces
> again. This is achieved through 'hierarchical processing' of file
> accesses that are evaluated against the policy of the namespace where
> the action occurred and against all namespaces' and their policies leading
> back to the root IMA namespace (init_ima_ns).

Note that your worst-case is 32 levels (maximum supported userns
nesting) where each ima namespace defines a separate policy.

So make sure you don't run into locking issues when hierarchically
processing rules. So far I think it's fine since the locks aren't held
across the hierarchial walk but are dropped and reaqcuired for each
level.

But that could still mean a lot of contention on iint->mutex since this
lock is global, i.e. in this context: for all ima namespaces. You might
want to consider coming up with some rough ideas for how to solve this
_if_ this becomes a problem in the future.

> 
> The patch series adds support for a virtualized SecurityFS with a few
> new API calls that are used by IMA namespacing. Only the data relevant
> to the IMA namespace are shown. The files and directories of other
> security subsystems (TPM, evm, Tomoyo, safesetid) are not showing
> up when secruityfs is mounted inside a user namespace.
> 
> Much of the code leading up to the virtualization of SecurityFS deals
> with moving IMA's variables from various files into the IMA namespace
> structure called 'ima_namespace'. When it comes to determining the
> current IMA namespace I took the approach to get the current IMA
> namespace (get_current_ns()) on the top level and pass the pointer all
> the way down to those functions that now need access to the ima_namespace
> to get to their variables. This later on comes in handy once hierarchical
> processing is implemented in this series where we walk the list of
> namespaces backwards and again need to pass the pointer into functions.

Just to repeat the point from earlier reviews, all those functions need
to be guaranteed to call from syscall context. Functions that operate on
files have different semantics.

> 
> This patch also introduces usage of CAP_MAC_ADMIN to allow access to the
> IMA policy via reduced capabilities. We would again later on use this
> capability to allow users to set file extended attributes for IMA appraisal
> support.
> 
> My tree with these patches is here:
> 
> git fetch https://github.com/stefanberger/linux-ima-namespaces v5.15+imans.v7.posted 
> 
> Regards,
>    Stefan
> 
> v7:
>  - Dropped 2 patches related to key queues; using &init_ima_ns for all calls
>    from functions related to key queues where calls need ima_namespace
>  - Moved ima_namespace to security/integrity/ima/ima.h
>  - Extended API descriptions with ns parameter where needed
>  - Using init_ima_ns in functions related to appraisal and xattrs
>  - SecurityFS: Using ima_ns_from_file() to get ns pointer 
>  - Reformatted to 80 columns per line

Since we're starting to be fairly along I would ask you to please write
detailed commit messages for the next revision.

I would also like to see all links for prior versions of this patchset
in the commit message since the discussion has been fairly extensive so
for this series it makes a lot of sense. So something like:

Link: https://lore.kernel.org/r/$MSGID (v1)
Link: https://lore.kernel.org/r/$MSGID (v2)
Link: https://lore.kernel.org/r/$MSGID (v3)
Link: https://lore.kernel.org/r/$MSGID (v4)
Link: https://lore.kernel.org/r/$MSGID (v5)
Link: https://lore.kernel.org/r/$MSGID (v6)
Link: https://lore.kernel.org/r/$MSGID (v7)
Signed-off-by: meh
Signed-off-by: mih
Signed-off-by: muh

I find that extremely pleasant in case we need to revisit things later.
(Technically you can get the same by searching lore via the final link
but I find it be pretty pleasing to just copy+paste directly from the
commit message to the discussion for the earlier patch.)
Christian Brauner Dec. 16, 2021, 1:31 p.m. UTC | #2
On Thu, Dec 16, 2021 at 01:50:27PM +0100, Christian Brauner wrote:
> On Thu, Dec 16, 2021 at 12:43:09AM -0500, Stefan Berger wrote:
> > From: Stefan Berger <stefanb@linux.ibm.com>
> > 
> > The goal of this series of patches is to start with the namespacing of
> > IMA and support auditing within an IMA namespace (IMA-ns) as the first
> > step.
> > 
> > In this series the IMA namespace is piggy backing on the user namespace
> > and therefore an IMA namespace gets created when a user namespace is
> > created. The advantage of this is that the user namespace can provide
> > the keys infrastructure that IMA appraisal support will need later on.
> > 
> > We chose the goal of supporting auditing within an IMA namespace since it
> > requires the least changes to IMA. Following this series, auditing within
> > an IMA namespace can be activated by a user running the following lines
> > that rely on a statically linked busybox to be installed on the host for
> > execution within the minimal container environment:
> > 
> > mkdir -p rootfs/{bin,mnt,proc}
> > cp /sbin/busybox rootfs/bin
> > cp /sbin/busybox rootfs/bin/busybox2
> > echo >> rootfs/bin/busybox2
> > PATH=/bin unshare --user --map-root-user --mount-proc --pid --fork \
> >   --root rootfs busybox sh -c \
> >  "busybox mount -t securityfs /mnt /mnt; \
> >   busybox echo 'audit func=BPRM_CHECK mask=MAY_EXEC' > /mnt/ima/policy; \
> >   busybox2 cat /mnt/ima/policy"
> > 
> > [busybox2 is used to demonstrate 2 measurements; see below]
> > 
> > Following the audit log on the host the last line cat'ing the IMA policy
> > inside the namespace would have been audited. Unfortunately the auditing
> > line is not distinguishable from one stemming from actions on the host.
> > The hope here is that Richard Brigg's container id support for auditing
> > would help resolve the problem.
> > 
> > The following lines added to a suitable IMA policy on the host would
> > cause the execution of the commands inside the container (by uid 1000)
> > to be measured and audited as well on the host, thus leading to two
> > auditing messages for the 'busybox2 cat' above and log entries in IMA's
> > system log.
> > 
> > echo -e "measure func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
> >         "audit func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
> >     > /sys/kernel/security/ima/policy
> > 
> > The goal of supporting measurement and auditing by the host, of actions
> > occurring within IMA namespaces, is that users, particularly root,
> > should not be able to evade the host's IMA policy just by spawning
> > new IMA namespaces, running programs there, and discarding the namespaces
> > again. This is achieved through 'hierarchical processing' of file
> > accesses that are evaluated against the policy of the namespace where
> > the action occurred and against all namespaces' and their policies leading
> > back to the root IMA namespace (init_ima_ns).
> 
> Note that your worst-case is 32 levels (maximum supported userns
> nesting) where each ima namespace defines a separate policy.
> 
> So make sure you don't run into locking issues when hierarchically
> processing rules. So far I think it's fine since the locks aren't held
> across the hierarchial walk but are dropped and reaqcuired for each
> level.
> 
> But that could still mean a lot of contention on iint->mutex since this
> lock is global, i.e. in this context: for all ima namespaces. You might
> want to consider coming up with some rough ideas for how to solve this
> _if_ this becomes a problem in the future.
> 
> > 
> > The patch series adds support for a virtualized SecurityFS with a few
> > new API calls that are used by IMA namespacing. Only the data relevant
> > to the IMA namespace are shown. The files and directories of other
> > security subsystems (TPM, evm, Tomoyo, safesetid) are not showing
> > up when secruityfs is mounted inside a user namespace.
> > 
> > Much of the code leading up to the virtualization of SecurityFS deals
> > with moving IMA's variables from various files into the IMA namespace
> > structure called 'ima_namespace'. When it comes to determining the
> > current IMA namespace I took the approach to get the current IMA
> > namespace (get_current_ns()) on the top level and pass the pointer all
> > the way down to those functions that now need access to the ima_namespace
> > to get to their variables. This later on comes in handy once hierarchical
> > processing is implemented in this series where we walk the list of
> > namespaces backwards and again need to pass the pointer into functions.
> 
> Just to repeat the point from earlier reviews, all those functions need
> to be guaranteed to call from syscall context. Functions that operate on
> files have different semantics.
> 
> > 
> > This patch also introduces usage of CAP_MAC_ADMIN to allow access to the
> > IMA policy via reduced capabilities. We would again later on use this
> > capability to allow users to set file extended attributes for IMA appraisal
> > support.
> > 
> > My tree with these patches is here:
> > 
> > git fetch https://github.com/stefanberger/linux-ima-namespaces v5.15+imans.v7.posted 
> > 
> > Regards,
> >    Stefan
> > 
> > v7:
> >  - Dropped 2 patches related to key queues; using &init_ima_ns for all calls
> >    from functions related to key queues where calls need ima_namespace
> >  - Moved ima_namespace to security/integrity/ima/ima.h
> >  - Extended API descriptions with ns parameter where needed
> >  - Using init_ima_ns in functions related to appraisal and xattrs
> >  - SecurityFS: Using ima_ns_from_file() to get ns pointer 
> >  - Reformatted to 80 columns per line
> 
> Since we're starting to be fairly along I would ask you to please write
> detailed commit messages for the next revision.
> 
> I would also like to see all links for prior versions of this patchset
> in the commit message since the discussion has been fairly extensive so
> for this series it makes a lot of sense. So something like:
> 
> Link: https://lore.kernel.org/r/$MSGID (v1)
> Link: https://lore.kernel.org/r/$MSGID (v2)
> Link: https://lore.kernel.org/r/$MSGID (v3)
> Link: https://lore.kernel.org/r/$MSGID (v4)
> Link: https://lore.kernel.org/r/$MSGID (v5)
> Link: https://lore.kernel.org/r/$MSGID (v6)
> Link: https://lore.kernel.org/r/$MSGID (v7)
> Signed-off-by: meh
> Signed-off-by: mih
> Signed-off-by: muh
> 
> I find that extremely pleasant in case we need to revisit things later.
> (Technically you can get the same by searching lore via the final link
> but I find it be pretty pleasing to just copy+paste directly from the
> commit message to the discussion for the earlier patch.)

So I looked through the series from a high-level view for once and I
would like to change how it is currently structured.

Currently, it looks a lot like you end up with a half-namespaced ima if
you compile and run a kernel in the middle of this patch series. Not
just is this asking for semantic chaos if we need to debug something it
also makes bisection a giant pain later.

In addition, the fact that you need a hack like

> +struct ima_namespace {
> +	int avoid_zero_size;

in the first patch is another good sign that this should be restructured.

Here's how I would prefer to see this done. I think we should organize
this in three big chunks (bullet points are not meant to signify
individual patches):

1. namespace securityfs
   This patch is thematically standalone and should move to the
   beginning of the series.
   I would strongly recommend to fold patch 9 and 10 into a single patch
   and add a lengthy explanation. You should be able to recycle a lof of
   stuff I wrote in earlier reviews.

2. Introduce struct ima_namespace and pass it through to all callers:
   - introduce struct ima_namespace
   - move all the relevant things into this structure (this also avoids
     the "avoid_zero_size" hack).
   - define, setup, and expose init_ima_ns 
   - introduce get_current_ns() and always have it return &init_ima_ns for now
   - replace all accesses to global variables to go through &init_ima_ns
   - add new infrastructure you'll need later on
   Bonus is that you can extend all the functions that later need access
   to a specific ima namespace to take a struct ima_namespace * argument
   and pass down &init_ima_ns down (retrieved via get_current_ns()). This
   will make the actual namespace patch very easy to follow.

3. namespace ima
   - add a new entry for struct ima_namespace to struct user_namespace
   - add creation helpers, kmem cache etc.
   - create files in securityfs per ns

This way at all points in the series we have clearly defined semantics
where ima namespacing is either fully working or fully not working and
the switch is atomic in the patch(es) part of 3.
Stefan Berger Dec. 16, 2021, 9 p.m. UTC | #3
On 12/16/21 07:50, Christian Brauner wrote:
> On Thu, Dec 16, 2021 at 12:43:09AM -0500, Stefan Berger wrote:
>> From: Stefan Berger <stefanb@linux.ibm.com>
>>
>> The goal of this series of patches is to start with the namespacing of
>> IMA and support auditing within an IMA namespace (IMA-ns) as the first
>> step.
>>
>> In this series the IMA namespace is piggy backing on the user namespace
>> and therefore an IMA namespace gets created when a user namespace is
>> created. The advantage of this is that the user namespace can provide
>> the keys infrastructure that IMA appraisal support will need later on.
>>
>> We chose the goal of supporting auditing within an IMA namespace since it
>> requires the least changes to IMA. Following this series, auditing within
>> an IMA namespace can be activated by a user running the following lines
>> that rely on a statically linked busybox to be installed on the host for
>> execution within the minimal container environment:
>>
>> mkdir -p rootfs/{bin,mnt,proc}
>> cp /sbin/busybox rootfs/bin
>> cp /sbin/busybox rootfs/bin/busybox2
>> echo >> rootfs/bin/busybox2
>> PATH=/bin unshare --user --map-root-user --mount-proc --pid --fork \
>>    --root rootfs busybox sh -c \
>>   "busybox mount -t securityfs /mnt /mnt; \
>>    busybox echo 'audit func=BPRM_CHECK mask=MAY_EXEC' > /mnt/ima/policy; \
>>    busybox2 cat /mnt/ima/policy"
>>
>> [busybox2 is used to demonstrate 2 measurements; see below]
>>
>> Following the audit log on the host the last line cat'ing the IMA policy
>> inside the namespace would have been audited. Unfortunately the auditing
>> line is not distinguishable from one stemming from actions on the host.
>> The hope here is that Richard Brigg's container id support for auditing
>> would help resolve the problem.
>>
>> The following lines added to a suitable IMA policy on the host would
>> cause the execution of the commands inside the container (by uid 1000)
>> to be measured and audited as well on the host, thus leading to two
>> auditing messages for the 'busybox2 cat' above and log entries in IMA's
>> system log.
>>
>> echo -e "measure func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
>>          "audit func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
>>      > /sys/kernel/security/ima/policy
>>
>> The goal of supporting measurement and auditing by the host, of actions
>> occurring within IMA namespaces, is that users, particularly root,
>> should not be able to evade the host's IMA policy just by spawning
>> new IMA namespaces, running programs there, and discarding the namespaces
>> again. This is achieved through 'hierarchical processing' of file
>> accesses that are evaluated against the policy of the namespace where
>> the action occurred and against all namespaces' and their policies leading
>> back to the root IMA namespace (init_ima_ns).
> Note that your worst-case is 32 levels (maximum supported userns
> nesting) where each ima namespace defines a separate policy.
>
> So make sure you don't run into locking issues when hierarchically
> processing rules. So far I think it's fine since the locks aren't held
> across the hierarchial walk but are dropped and reaqcuired for each
> level.
>
> But that could still mean a lot of contention on iint->mutex since this
> lock is global, i.e. in this context: for all ima namespaces. You might
> want to consider coming up with some rough ideas for how to solve this
> _if_ this becomes a problem in the future.


The plan is that each IMA namespace will have its own rbtree with its 
own set of iints. We cannot do it all at the same time, so this will 
take while until things can be completely moved over into a per-IMA 
namespace rbtree and each IMA namespace becomes fully independent.


>
>> The patch series adds support for a virtualized SecurityFS with a few
>> new API calls that are used by IMA namespacing. Only the data relevant
>> to the IMA namespace are shown. The files and directories of other
>> security subsystems (TPM, evm, Tomoyo, safesetid) are not showing
>> up when secruityfs is mounted inside a user namespace.
>>
>> Much of the code leading up to the virtualization of SecurityFS deals
>> with moving IMA's variables from various files into the IMA namespace
>> structure called 'ima_namespace'. When it comes to determining the
>> current IMA namespace I took the approach to get the current IMA
>> namespace (get_current_ns()) on the top level and pass the pointer all
>> the way down to those functions that now need access to the ima_namespace
>> to get to their variables. This later on comes in handy once hierarchical
>> processing is implemented in this series where we walk the list of
>> namespaces backwards and again need to pass the pointer into functions.
> Just to repeat the point from earlier reviews, all those functions need
> to be guaranteed to call from syscall context. Functions that operate on
> files have different semantics.


You mean files in general or SecurityFS files in particular?


>
>> This patch also introduces usage of CAP_MAC_ADMIN to allow access to the
>> IMA policy via reduced capabilities. We would again later on use this
>> capability to allow users to set file extended attributes for IMA appraisal
>> support.
>>
>> My tree with these patches is here:
>>
>> git fetch https://github.com/stefanberger/linux-ima-namespaces v5.15+imans.v7.posted
>>
>> Regards,
>>     Stefan
>>
>> v7:
>>   - Dropped 2 patches related to key queues; using &init_ima_ns for all calls
>>     from functions related to key queues where calls need ima_namespace
>>   - Moved ima_namespace to security/integrity/ima/ima.h
>>   - Extended API descriptions with ns parameter where needed
>>   - Using init_ima_ns in functions related to appraisal and xattrs
>>   - SecurityFS: Using ima_ns_from_file() to get ns pointer
>>   - Reformatted to 80 columns per line
> Since we're starting to be fairly along I would ask you to please write
> detailed commit messages for the next revision.

Expand the existing commit texts, is that what you suggest that I do?


>
> I would also like to see all links for prior versions of this patchset
> in the commit message since the discussion has been fairly extensive so
> for this series it makes a lot of sense. So something like:
>
> Link: https://lore.kernel.org/r/$MSGID (v1)
> Link: https://lore.kernel.org/r/$MSGID (v2)
> Link: https://lore.kernel.org/r/$MSGID (v3)
> Link: https://lore.kernel.org/r/$MSGID (v4)
> Link: https://lore.kernel.org/r/$MSGID (v5)
> Link: https://lore.kernel.org/r/$MSGID (v6)
> Link: https://lore.kernel.org/r/$MSGID (v7)
> Signed-off-by: meh
> Signed-off-by: mih
> Signed-off-by: muh

So that's a link per patch to all its previous versions?


> I find that extremely pleasant in case we need to revisit things later.
> (Technically you can get the same by searching lore via the final link
> but I find it be pretty pleasing to just copy+paste directly from the
> commit message to the discussion for the earlier patch.)
Stefan Berger Dec. 16, 2021, 9:27 p.m. UTC | #4
On 12/16/21 08:31, Christian Brauner wrote:
> On Thu, Dec 16, 2021 at 01:50:27PM +0100, Christian Brauner wrote:
>> On Thu, Dec 16, 2021 at 12:43:09AM -0500, Stefan Berger wrote:
>>
>>> This patch also introduces usage of CAP_MAC_ADMIN to allow access to the
>>> IMA policy via reduced capabilities. We would again later on use this
>>> capability to allow users to set file extended attributes for IMA appraisal
>>> support.
>>>
>>> My tree with these patches is here:
>>>
>>> git fetch https://github.com/stefanberger/linux-ima-namespaces v5.15+imans.v7.posted
>>>
>>> Regards,
>>>     Stefan
>>>
>>> v7:
>>>   - Dropped 2 patches related to key queues; using &init_ima_ns for all calls
>>>     from functions related to key queues where calls need ima_namespace
>>>   - Moved ima_namespace to security/integrity/ima/ima.h
>>>   - Extended API descriptions with ns parameter where needed
>>>   - Using init_ima_ns in functions related to appraisal and xattrs
>>>   - SecurityFS: Using ima_ns_from_file() to get ns pointer
>>>   - Reformatted to 80 columns per line
>> Since we're starting to be fairly along I would ask you to please write
>> detailed commit messages for the next revision.
>>
>> I would also like to see all links for prior versions of this patchset
>> in the commit message since the discussion has been fairly extensive so
>> for this series it makes a lot of sense. So something like:
>>
>> Link: https://lore.kernel.org/r/$MSGID (v1)
>> Link: https://lore.kernel.org/r/$MSGID (v2)
>> Link: https://lore.kernel.org/r/$MSGID (v3)
>> Link: https://lore.kernel.org/r/$MSGID (v4)
>> Link: https://lore.kernel.org/r/$MSGID (v5)
>> Link: https://lore.kernel.org/r/$MSGID (v6)
>> Link: https://lore.kernel.org/r/$MSGID (v7)
>> Signed-off-by: meh
>> Signed-off-by: mih
>> Signed-off-by: muh
>>
>> I find that extremely pleasant in case we need to revisit things later.
>> (Technically you can get the same by searching lore via the final link
>> but I find it be pretty pleasing to just copy+paste directly from the
>> commit message to the discussion for the earlier patch.)
> So I looked through the series from a high-level view for once and I
> would like to change how it is currently structured.
>
> Currently, it looks a lot like you end up with a half-namespaced ima if
> you compile and run a kernel in the middle of this patch series. Not
> just is this asking for semantic chaos if we need to debug something it
> also makes bisection a giant pain later.
>
> In addition, the fact that you need a hack like
>
>> +struct ima_namespace {
>> +	int avoid_zero_size;
> in the first patch is another good sign that this should be restructured.
>
> Here's how I would prefer to see this done. I think we should organize
> this in three big chunks (bullet points are not meant to signify
> individual patches):
>
> 1. namespace securityfs
>     This patch is thematically standalone and should move to the
>     beginning of the series.
>     I would strongly recommend to fold patch 9 and 10 into a single patch
>     and add a lengthy explanation. You should be able to recycle a lof of
>     stuff I wrote in earlier reviews.
>
> 2. Introduce struct ima_namespace and pass it through to all callers:
>     - introduce struct ima_namespace
>     - move all the relevant things into this structure (this also avoids
>       the "avoid_zero_size" hack).

Before I start any move and don't get it right:

Is this to be done like in the current set of patches in those steps 
where one thing is moved after another?


>     - define, setup, and expose init_ima_ns


We do this alongside the move of the individual pieces into 
ima_namesapce as is done across the patches now? Most of those 'move' 
patches haven't received much feedback so far.


>     - introduce get_current_ns() and always have it return &init_ima_ns for now
>     - replace all accesses to global variables to go through &init_ima_ns

And not pass get_current_ns() down from the top level and access the 
right away via ns->foobar but use init_ima_ns.foobar instead?


>     - add new infrastructure you'll need later on
>     Bonus is that you can extend all the functions that later need access
>     to a specific ima namespace to take a struct ima_namespace * argument
>     and pass down &init_ima_ns down (retrieved via get_current_ns()). This
>     will make the actual namespace patch very easy to follow.


I am wondering how this new series is going to relate to the existing 
series and the links you suggest be added on a per patch basis? How much 
of the existing patches can be preserved?


>
> 3. namespace ima
>     - add a new entry for struct ima_namespace to struct user_namespace
>     - add creation helpers, kmem cache etc.
>     - create files in securityfs per ns
>
> This way at all points in the series we have clearly defined semantics
> where ima namespacing is either fully working or fully not working and
> the switch is atomic in the patch(es) part of 3.


The existing series tried this by enabling IMA namespacing support when 
SecurityFS is enabled... Is that also the last step then in what you 
suggest?
Christian Brauner Dec. 17, 2021, 10:06 a.m. UTC | #5
On Thu, Dec 16, 2021 at 04:00:40PM -0500, Stefan Berger wrote:
> 
> On 12/16/21 07:50, Christian Brauner wrote:
> > On Thu, Dec 16, 2021 at 12:43:09AM -0500, Stefan Berger wrote:
> > > From: Stefan Berger <stefanb@linux.ibm.com>
> > > 
> > > The goal of this series of patches is to start with the namespacing of
> > > IMA and support auditing within an IMA namespace (IMA-ns) as the first
> > > step.
> > > 
> > > In this series the IMA namespace is piggy backing on the user namespace
> > > and therefore an IMA namespace gets created when a user namespace is
> > > created. The advantage of this is that the user namespace can provide
> > > the keys infrastructure that IMA appraisal support will need later on.
> > > 
> > > We chose the goal of supporting auditing within an IMA namespace since it
> > > requires the least changes to IMA. Following this series, auditing within
> > > an IMA namespace can be activated by a user running the following lines
> > > that rely on a statically linked busybox to be installed on the host for
> > > execution within the minimal container environment:
> > > 
> > > mkdir -p rootfs/{bin,mnt,proc}
> > > cp /sbin/busybox rootfs/bin
> > > cp /sbin/busybox rootfs/bin/busybox2
> > > echo >> rootfs/bin/busybox2
> > > PATH=/bin unshare --user --map-root-user --mount-proc --pid --fork \
> > >    --root rootfs busybox sh -c \
> > >   "busybox mount -t securityfs /mnt /mnt; \
> > >    busybox echo 'audit func=BPRM_CHECK mask=MAY_EXEC' > /mnt/ima/policy; \
> > >    busybox2 cat /mnt/ima/policy"
> > > 
> > > [busybox2 is used to demonstrate 2 measurements; see below]
> > > 
> > > Following the audit log on the host the last line cat'ing the IMA policy
> > > inside the namespace would have been audited. Unfortunately the auditing
> > > line is not distinguishable from one stemming from actions on the host.
> > > The hope here is that Richard Brigg's container id support for auditing
> > > would help resolve the problem.
> > > 
> > > The following lines added to a suitable IMA policy on the host would
> > > cause the execution of the commands inside the container (by uid 1000)
> > > to be measured and audited as well on the host, thus leading to two
> > > auditing messages for the 'busybox2 cat' above and log entries in IMA's
> > > system log.
> > > 
> > > echo -e "measure func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
> > >          "audit func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
> > >      > /sys/kernel/security/ima/policy
> > > 
> > > The goal of supporting measurement and auditing by the host, of actions
> > > occurring within IMA namespaces, is that users, particularly root,
> > > should not be able to evade the host's IMA policy just by spawning
> > > new IMA namespaces, running programs there, and discarding the namespaces
> > > again. This is achieved through 'hierarchical processing' of file
> > > accesses that are evaluated against the policy of the namespace where
> > > the action occurred and against all namespaces' and their policies leading
> > > back to the root IMA namespace (init_ima_ns).
> > Note that your worst-case is 32 levels (maximum supported userns
> > nesting) where each ima namespace defines a separate policy.
> > 
> > So make sure you don't run into locking issues when hierarchically
> > processing rules. So far I think it's fine since the locks aren't held
> > across the hierarchial walk but are dropped and reaqcuired for each
> > level.
> > 
> > But that could still mean a lot of contention on iint->mutex since this
> > lock is global, i.e. in this context: for all ima namespaces. You might
> > want to consider coming up with some rough ideas for how to solve this
> > _if_ this becomes a problem in the future.
> 
> 
> The plan is that each IMA namespace will have its own rbtree with its own
> set of iints. We cannot do it all at the same time, so this will take while
> until things can be completely moved over into a per-IMA namespace rbtree
> and each IMA namespace becomes fully independent.

Ok, good to hear that you have already thought about that.

> 
> 
> > 
> > > The patch series adds support for a virtualized SecurityFS with a few
> > > new API calls that are used by IMA namespacing. Only the data relevant
> > > to the IMA namespace are shown. The files and directories of other
> > > security subsystems (TPM, evm, Tomoyo, safesetid) are not showing
> > > up when secruityfs is mounted inside a user namespace.
> > > 
> > > Much of the code leading up to the virtualization of SecurityFS deals
> > > with moving IMA's variables from various files into the IMA namespace
> > > structure called 'ima_namespace'. When it comes to determining the
> > > current IMA namespace I took the approach to get the current IMA
> > > namespace (get_current_ns()) on the top level and pass the pointer all
> > > the way down to those functions that now need access to the ima_namespace
> > > to get to their variables. This later on comes in handy once hierarchical
> > > processing is implemented in this series where we walk the list of
> > > namespaces backwards and again need to pass the pointer into functions.
> > Just to repeat the point from earlier reviews, all those functions need
> > to be guaranteed to call from syscall context. Functions that operate on
> > files have different semantics.
> 
> 
> You mean files in general or SecurityFS files in particular?

Files in general but I was specifically referring to securityfs here as
that's the relevant bit here.

> 
> 
> > 
> > > This patch also introduces usage of CAP_MAC_ADMIN to allow access to the
> > > IMA policy via reduced capabilities. We would again later on use this
> > > capability to allow users to set file extended attributes for IMA appraisal
> > > support.
> > > 
> > > My tree with these patches is here:
> > > 
> > > git fetch https://github.com/stefanberger/linux-ima-namespaces v5.15+imans.v7.posted
> > > 
> > > Regards,
> > >     Stefan
> > > 
> > > v7:
> > >   - Dropped 2 patches related to key queues; using &init_ima_ns for all calls
> > >     from functions related to key queues where calls need ima_namespace
> > >   - Moved ima_namespace to security/integrity/ima/ima.h
> > >   - Extended API descriptions with ns parameter where needed
> > >   - Using init_ima_ns in functions related to appraisal and xattrs
> > >   - SecurityFS: Using ima_ns_from_file() to get ns pointer
> > >   - Reformatted to 80 columns per line
> > Since we're starting to be fairly along I would ask you to please write
> > detailed commit messages for the next revision.
> 
> Expand the existing commit texts, is that what you suggest that I do?

Yeah, right now they don't really explain the how and why. What I'm
saying is that the commit messages - at least for the namespace part and
the securityfs part - aren't detailed enough.

> 
> 
> > 
> > I would also like to see all links for prior versions of this patchset
> > in the commit message since the discussion has been fairly extensive so
> > for this series it makes a lot of sense. So something like:
> > 
> > Link: https://lore.kernel.org/r/$MSGID (v1)
> > Link: https://lore.kernel.org/r/$MSGID (v2)
> > Link: https://lore.kernel.org/r/$MSGID (v3)
> > Link: https://lore.kernel.org/r/$MSGID (v4)
> > Link: https://lore.kernel.org/r/$MSGID (v5)
> > Link: https://lore.kernel.org/r/$MSGID (v6)
> > Link: https://lore.kernel.org/r/$MSGID (v7)
> > Signed-off-by: meh
> > Signed-off-by: mih
> > Signed-off-by: muh
> 
> So that's a link per patch to all its previous versions?

When Mimi ends up sending the PR for this she can then put links to all
previous versions in the pull request mail or in the tag. I'm only
suggesting you do it that way. Since this will be Mimi's PR it'll be her
decision.
So for now it would just be nice to put links to previous versions at
the end of your cover letter. Because if you rework the series as I
suggested you won't have a 1:1 correspondence between patches anymore as
you did vor v1 to v7.
Christian Brauner Dec. 17, 2021, 10:25 a.m. UTC | #6
On Thu, Dec 16, 2021 at 04:27:24PM -0500, Stefan Berger wrote:
> 
> On 12/16/21 08:31, Christian Brauner wrote:
> > On Thu, Dec 16, 2021 at 01:50:27PM +0100, Christian Brauner wrote:
> > > On Thu, Dec 16, 2021 at 12:43:09AM -0500, Stefan Berger wrote:
> > > 
> > > > This patch also introduces usage of CAP_MAC_ADMIN to allow access to the
> > > > IMA policy via reduced capabilities. We would again later on use this
> > > > capability to allow users to set file extended attributes for IMA appraisal
> > > > support.
> > > > 
> > > > My tree with these patches is here:
> > > > 
> > > > git fetch https://github.com/stefanberger/linux-ima-namespaces v5.15+imans.v7.posted
> > > > 
> > > > Regards,
> > > >     Stefan
> > > > 
> > > > v7:
> > > >   - Dropped 2 patches related to key queues; using &init_ima_ns for all calls
> > > >     from functions related to key queues where calls need ima_namespace
> > > >   - Moved ima_namespace to security/integrity/ima/ima.h
> > > >   - Extended API descriptions with ns parameter where needed
> > > >   - Using init_ima_ns in functions related to appraisal and xattrs
> > > >   - SecurityFS: Using ima_ns_from_file() to get ns pointer
> > > >   - Reformatted to 80 columns per line
> > > Since we're starting to be fairly along I would ask you to please write
> > > detailed commit messages for the next revision.
> > > 
> > > I would also like to see all links for prior versions of this patchset
> > > in the commit message since the discussion has been fairly extensive so
> > > for this series it makes a lot of sense. So something like:
> > > 
> > > Link: https://lore.kernel.org/r/$MSGID (v1)
> > > Link: https://lore.kernel.org/r/$MSGID (v2)
> > > Link: https://lore.kernel.org/r/$MSGID (v3)
> > > Link: https://lore.kernel.org/r/$MSGID (v4)
> > > Link: https://lore.kernel.org/r/$MSGID (v5)
> > > Link: https://lore.kernel.org/r/$MSGID (v6)
> > > Link: https://lore.kernel.org/r/$MSGID (v7)
> > > Signed-off-by: meh
> > > Signed-off-by: mih
> > > Signed-off-by: muh
> > > 
> > > I find that extremely pleasant in case we need to revisit things later.
> > > (Technically you can get the same by searching lore via the final link
> > > but I find it be pretty pleasing to just copy+paste directly from the
> > > commit message to the discussion for the earlier patch.)
> > So I looked through the series from a high-level view for once and I
> > would like to change how it is currently structured.
> > 
> > Currently, it looks a lot like you end up with a half-namespaced ima if
> > you compile and run a kernel in the middle of this patch series. Not
> > just is this asking for semantic chaos if we need to debug something it
> > also makes bisection a giant pain later.
> > 
> > In addition, the fact that you need a hack like
> > 
> > > +struct ima_namespace {
> > > +	int avoid_zero_size;
> > in the first patch is another good sign that this should be restructured.
> > 
> > Here's how I would prefer to see this done. I think we should organize
> > this in three big chunks (bullet points are not meant to signify
> > individual patches):
> > 
> > 1. namespace securityfs
> >     This patch is thematically standalone and should move to the
> >     beginning of the series.
> >     I would strongly recommend to fold patch 9 and 10 into a single patch
> >     and add a lengthy explanation. You should be able to recycle a lof of
> >     stuff I wrote in earlier reviews.
> > 
> > 2. Introduce struct ima_namespace and pass it through to all callers:
> >     - introduce struct ima_namespace
> >     - move all the relevant things into this structure (this also avoids
> >       the "avoid_zero_size" hack).
> 
> Before I start any move and don't get it right:
> 
> Is this to be done like in the current set of patches in those steps where
> one thing is moved after another?

Yeah, I think that's perfectly fine and I liked that part of your
series.
I would introduce struct ima_namespace together with a first or
multiple easy member(s) to move into as one patch. And then individual
patches for other, larger members as you did.

> 
> 
> >     - define, setup, and expose init_ima_ns
> 
> 
> We do this alongside the move of the individual pieces into ima_namesapce as
> is done across the patches now? Most of those 'move' patches haven't

- One patch to introduce struct ima_namespace including init_ima_ns and
  a single member that can be moved into it converting all accesses to
  that member to access it via struct ima_namespace.
- Individual patches for other members as you did now.

> received much feedback so far.
> 
> 
> >     - introduce get_current_ns() and always have it return &init_ima_ns for now
> >     - replace all accesses to global variables to go through &init_ima_ns
> 
> And not pass get_current_ns() down from the top level and access the right
> away via ns->foobar but use init_ima_ns.foobar instead?

I would recommend starting to pass down the ima_namespace down right
away so you don't need to do it later when you actually introduce the
ima namespace proper. This could probably be done as part of the first
patch that introduces struct ima_namespace.

If I'm right this will make the actual semantic changes to enable ima
namespaces proper easier to review as they don't contain the jitter that
comes with introducing a new argument into functions.

> 
> 
> >     - add new infrastructure you'll need later on
> >     Bonus is that you can extend all the functions that later need access
> >     to a specific ima namespace to take a struct ima_namespace * argument
> >     and pass down &init_ima_ns down (retrieved via get_current_ns()). This
> >     will make the actual namespace patch very easy to follow.
> 
> 
> I am wondering how this new series is going to relate to the existing series
> and the links you suggest be added on a per patch basis? How much of the
> existing patches can be preserved?

I answered that in the other mail. Just put them in the cover letter. :)

> 
> 
> > 
> > 3. namespace ima
> >     - add a new entry for struct ima_namespace to struct user_namespace
> >     - add creation helpers, kmem cache etc.
> >     - create files in securityfs per ns
> > 
> > This way at all points in the series we have clearly defined semantics
> > where ima namespacing is either fully working or fully not working and
> > the switch is atomic in the patch(es) part of 3.
> 
> 
> The existing series tried this by enabling IMA namespacing support when
> SecurityFS is enabled... Is that also the last step then in what you

It did enable ima namespace support before that. It only wasn't
reachable from userspace since securityfs couldn't be mounted and a
custom policy applied.
But from a pure implementation/development perspective namespacing had
already taken place in the first patch.
Say, if I wanted to debug something in the patch series and compiled in
the middle of the previous patch series then I would have already had to
deal with per-userns ima namespaces before userspace could do anything
with them. So the regression and bug potential that comes with that had
already been fully realized so to speak.

What I want is to have this be one step: namespacing is fully
implemented and turned on at the same time.

> suggest?

No, securityfs can be mounted and just stay empty until ima starts
creating files in there in the last patch. I see no harm in that at all.
Stefan Berger Dec. 18, 2021, 2:38 a.m. UTC | #7
On 12/16/21 08:31, Christian Brauner wrote:
>
> 1. namespace securityfs
>     This patch is thematically standalone and should move to the
>     beginning of the series.
>     I would strongly recommend to fold patch 9 and 10 into a single patch
>     and add a lengthy explanation. You should be able to recycle a lof of
>     stuff I wrote in earlier reviews.
>
> 2. Introduce struct ima_namespace and pass it through to all callers:
>     - introduce struct ima_namespace
>     - move all the relevant things into this structure (this also avoids
>       the "avoid_zero_size" hack).


We could defer the kmalloc() that doesn't work on a zero-sized request. 
I would say this  is minor.


>     - define, setup, and expose init_ima_ns
>     - introduce get_current_ns() and always have it return &init_ima_ns for now
>     - replace all accesses to global variables to go through &init_ima_ns
>     - add new infrastructure you'll need later on
>     Bonus is that you can extend all the functions that later need access
>     to a specific ima namespace to take a struct ima_namespace * argument
>     and pass down &init_ima_ns down (retrieved via get_current_ns()). This
>     will make the actual namespace patch very easy to follow.
>
> 3. namespace ima
>     - add a new entry for struct ima_namespace to struct user_namespace
>     - add creation helpers, kmem cache etc.
>     - create files in securityfs per ns

I have tried this now and I am looking at 4 remaining patches that need 
to somehow find its way into v8 without causing too many disturbances. 
At what point (over how many patches) can I introduce CONFIG_IMA_NS 
without anything related to IMA namespacing happening? I need it early 
in 'your 3rd part' since it is also used for conditional compilation 
(Makefile) and #ifdef's where Makefile content and what the #ifdefs are 
doing probably shouldn't be squeezed into a single patch just so it's 
all enabled in one patch, but it should probably still remain logically 
separated into different patches. Enablement of IMA namespace would be 
in the very last patch. But there may be several patches between the 
very last one and CONFIG_IMA_NS is introduced...

v7 at least, before the requirement to do late/lazy initialization, 
enabled CONFIG_IMA_NS right away and built ever step on top of it, even 
if the IMA namespace only became **configurable** in the last patch when 
securityfs was enbled and one could set a policy. From that perspective 
it would be easier to switch to late initialization in a patch on top of 
v7 but .. ok, we cannot do that.


> This way at all points in the series we have clearly defined semantics
> where ima namespacing is either fully working or fully not working and
> the switch is atomic in the patch(es) part of 3.
Atomic over multiple patches? So introducing CONFIG_IMA_NS that doesn't 
do anything for several patches is still considered 'atomic' then ?
Stefan Berger Dec. 27, 2021, 5:29 p.m. UTC | #8
On 12/17/21 05:06, Christian Brauner wrote:
> On Thu, Dec 16, 2021 at 04:00:40PM -0500, Stefan Berger wrote:
>>
>> But that could still mean a lot of contention on iint->mutex since this
>> lock is global, i.e. in this context: for all ima namespaces. You might
>> want to consider coming up with some rough ideas for how to solve this
>> _if_ this becomes a problem in the future.
>>
>> The plan is that each IMA namespace will have its own rbtree with its own
>> set of iints. We cannot do it all at the same time, so this will take while
>> until things can be completely moved over into a per-IMA namespace rbtree
>> and each IMA namespace becomes fully independent.
> Ok, good to hear that you have already thought about that.


Well, yes, we thought about it. However, as far as I can look ahead we 
cannot get rid of the iint->mutex:

Obviously we have to organize the data structures where IMA is recording 
what it has done with a file/inode in such a way that each namespace can 
efficiently determine whether it needs to audit/measure/appraise a file 
or re-audit/re-measure/re-appraise it after file modification. The 
organization of these data structures also has to reflect the fact that 
files can be shared between IMA namespaces via setns() on mount 
namespaces or shared files or shared mount namespaces between containers 
etc.. So, the first thing we do already is move audit-related flags into 
what is called the ns_status (namespace status) structure that are kept 
in a per-IMA namespace rbtree. This allows IMA to remember that a file 
was already audited and it doesn't need to audit it again. The lookup 
via rbtree is quick: O(log(n).

Unfortunately the previous series had a bug so that files were not 
re-audited after they were modified. I fixed this now in the new series 
(upcoming v8) by connecting each ns_status also to a list. This list 
starts in the global inode integrity cache (the iint rbtree) where each 
inode that any IMA namespace accessed has an iint entry today. The lists 
start on the iint entries representing inodes.  When files are deleted 
or modified or xattrs are modified then all IMA namespaces need to 
re-audit/re-measure/re-appraise the file (depending on policy) and for 
this we have to reset flags across all the IMA namespaces by walking the 
list of ns_status entries. The organization via iint rbtree and 
ns_status list allows for quick lookup of the inode where the 
modification happened and quick reset of the flags: O(log(n)) + O(n). 
This is better than having to search all namespaces to reset the flags 
(O(log(n) * n) if there was no list.


     Stefan