diff mbox series

[RFC,v14,01/10] landlock: Add object and rule management

Message ID 20200224160215.4136-2-mic@digikod.net (mailing list archive)
State New, archived
Headers show
Series Landlock LSM | expand

Commit Message

Mickaël Salaün Feb. 24, 2020, 4:02 p.m. UTC
A Landlock object enables to identify a kernel object (e.g. an inode).
A Landlock rule is a set of access rights allowed on an object.  Rules
are grouped in rulesets that may be tied to a set of processes (i.e.
subjects) to enforce a scoped access-control (i.e. a domain).

Because Landlock's goal is to empower any process (especially
unprivileged ones) to sandbox themselves, we can't rely on a system-wide
object identification such as file extended attributes.  Indeed, we need
innocuous, composable and modular access-controls.

The main challenge with this constraints is to identify kernel objects
while this identification is useful (i.e. when a security policy makes
use of this object).  But this identification data should be freed once
no policy is using it.  This ephemeral tagging should not and may not be
written in the filesystem.  We then need to manage the lifetime of a
rule according to the lifetime of its object.  To avoid a global lock,
this implementation make use of RCU and counters to safely reference
objects.

A following commit uses this generic object management for inodes.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: James Morris <jmorris@namei.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
---

Changes since v13:
* New dedicated implementation, removing the need for eBPF.

Previous version:
https://lore.kernel.org/lkml/20190721213116.23476-6-mic@digikod.net/
---
 MAINTAINERS                |  10 ++
 security/Kconfig           |   1 +
 security/Makefile          |   2 +
 security/landlock/Kconfig  |  15 ++
 security/landlock/Makefile |   3 +
 security/landlock/object.c | 339 +++++++++++++++++++++++++++++++++++++
 security/landlock/object.h | 134 +++++++++++++++
 7 files changed, 504 insertions(+)
 create mode 100644 security/landlock/Kconfig
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/object.c
 create mode 100644 security/landlock/object.h

Comments

Jann Horn Feb. 25, 2020, 8:49 p.m. UTC | #1
On Mon, Feb 24, 2020 at 5:05 PM Mickaël Salaün <mic@digikod.net> wrote:
> A Landlock object enables to identify a kernel object (e.g. an inode).
> A Landlock rule is a set of access rights allowed on an object.  Rules
> are grouped in rulesets that may be tied to a set of processes (i.e.
> subjects) to enforce a scoped access-control (i.e. a domain).
>
> Because Landlock's goal is to empower any process (especially
> unprivileged ones) to sandbox themselves, we can't rely on a system-wide
> object identification such as file extended attributes.  Indeed, we need
> innocuous, composable and modular access-controls.
>
> The main challenge with this constraints is to identify kernel objects
> while this identification is useful (i.e. when a security policy makes
> use of this object).  But this identification data should be freed once
> no policy is using it.  This ephemeral tagging should not and may not be
> written in the filesystem.  We then need to manage the lifetime of a
> rule according to the lifetime of its object.  To avoid a global lock,
> this implementation make use of RCU and counters to safely reference
> objects.
>
> A following commit uses this generic object management for inodes.
[...]
> diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
> new file mode 100644
> index 000000000000..4a321d5b3f67
> --- /dev/null
> +++ b/security/landlock/Kconfig
> @@ -0,0 +1,15 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config SECURITY_LANDLOCK
> +       bool "Landlock support"
> +       depends on SECURITY
> +       default n

(I think "default n" is implicit?)

> +       help
> +         This selects Landlock, a safe sandboxing mechanism.  It enables to
> +         restrict processes on the fly (i.e. enforce an access control policy),
> +         which can complement seccomp-bpf.  The security policy is a set of access
> +         rights tied to an object, which could be a file, a socket or a process.
> +
> +         See Documentation/security/landlock/ for further information.
> +
> +         If you are unsure how to answer this question, answer N.
[...]
> diff --git a/security/landlock/object.c b/security/landlock/object.c
> new file mode 100644
> index 000000000000..38fbbb108120
> --- /dev/null
> +++ b/security/landlock/object.c
> @@ -0,0 +1,339 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Landlock LSM - Object and rule management
> + *
> + * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net>
> + * Copyright © 2018-2020 ANSSI
> + *
> + * Principles and constraints of the object and rule management:
> + * - Do not leak memory.
> + * - Try as much as possible to free a memory allocation as soon as it is
> + *   unused.
> + * - Do not use global lock.
> + * - Do not charge processes other than the one requesting a Landlock
> + *   operation.
> + */
> +
> +#include <linux/bug.h>
> +#include <linux/compiler.h>
> +#include <linux/compiler_types.h>
> +#include <linux/err.h>
> +#include <linux/errno.h>
> +#include <linux/fs.h>
> +#include <linux/kernel.h>
> +#include <linux/list.h>
> +#include <linux/rbtree.h>
> +#include <linux/rcupdate.h>
> +#include <linux/refcount.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +#include <linux/workqueue.h>
> +
> +#include "object.h"
> +
> +struct landlock_object *landlock_create_object(
> +               const enum landlock_object_type type, void *underlying_object)
> +{
> +       struct landlock_object *object;
> +
> +       if (WARN_ON_ONCE(!underlying_object))
> +               return NULL;
> +       object = kzalloc(sizeof(*object), GFP_KERNEL);
> +       if (!object)
> +               return NULL;
> +       refcount_set(&object->usage, 1);
> +       refcount_set(&object->cleaners, 1);
> +       spin_lock_init(&object->lock);
> +       INIT_LIST_HEAD(&object->rules);
> +       object->type = type;
> +       WRITE_ONCE(object->underlying_object, underlying_object);

`object` is not globally visible at this point, so WRITE_ONCE() is unnecessary.

> +       return object;
> +}
> +
> +struct landlock_object *landlock_get_object(struct landlock_object *object)
> +       __acquires(object->usage)
> +{
> +       __acquire(object->usage);
> +       /*
> +        * If @object->usage equal 0, then it will be ignored by writers, and
> +        * underlying_object->object may be replaced, but this is not an issue
> +        * for release_object().
> +        */
> +       if (object && refcount_inc_not_zero(&object->usage)) {
> +               /*
> +                * It should not be possible to get a reference to an object if
> +                * its underlying object is being terminated (e.g. with
> +                * landlock_release_object()), because an object is only
> +                * modifiable through such underlying object.  This is not the
> +                * case with landlock_get_object_cleaner().
> +                */
> +               WARN_ON_ONCE(!READ_ONCE(object->underlying_object));
> +               return object;
> +       }
> +       return NULL;
> +}
> +
> +static struct landlock_object *get_object_cleaner(
> +               struct landlock_object *object)
> +       __acquires(object->cleaners)
> +{
> +       __acquire(object->cleaners);
> +       if (object && refcount_inc_not_zero(&object->cleaners))
> +               return object;
> +       return NULL;
> +}

I don't get this whole "cleaners" thing. Can you give a quick
description of why this is necessary, and what benefits it has over a
standard refcounting+RCU scheme? I don't immediately see anything that
requires this.

> +/*
> + * There is two cases when an object should be free and the reference to the
> + * underlying object should be put:
> + * - when the last rule tied to this object is removed, which is handled by
> + *   landlock_put_rule() and then release_object();
> + * - when the object is being terminated (e.g. no more reference to an inode),
> + *   which is handled by landlock_put_object().
> + */
> +static void put_object_free(struct landlock_object *object)
> +       __releases(object->cleaners)
> +{
> +       __release(object->cleaners);
> +       if (!refcount_dec_and_test(&object->cleaners))
> +               return;
> +       WARN_ON_ONCE(refcount_read(&object->usage));
> +       /*
> +        * Ensures a safe use of @object in the RCU block from
> +        * landlock_put_rule().
> +        */
> +       kfree_rcu(object, rcu_free);
> +}
> +
> +/*
> + * Destroys a newly created and useless object.
> + */
> +void landlock_drop_object(struct landlock_object *object)
> +{
> +       if (WARN_ON_ONCE(!refcount_dec_and_test(&object->usage)))
> +               return;
> +       __acquire(object->cleaners);
> +       put_object_free(object);
> +}
> +
> +/*
> + * Puts the underlying object (e.g. inode) if it is the first request to
> + * release @object, without calling landlock_put_object().
> + *
> + * Return true if this call effectively marks @object as released, false
> + * otherwise.
> + */
> +static bool release_object(struct landlock_object *object)
> +       __releases(&object->lock)
> +{
> +       void *underlying_object;
> +
> +       lockdep_assert_held(&object->lock);
> +
> +       underlying_object = xchg(&object->underlying_object, NULL);
> +       spin_unlock(&object->lock);
> +       might_sleep();
> +       if (!underlying_object)
> +               return false;
> +
> +       switch (object->type) {
> +       case LANDLOCK_OBJECT_INODE:
> +               break;
> +       default:
> +               WARN_ON_ONCE(1);
> +       }
> +       return true;
> +}
> +
> +static void put_object_cleaner(struct landlock_object *object)
> +       __releases(object->cleaners)
> +{
> +       /* Let's try an early lockless check. */
> +       if (list_empty(&object->rules) &&
> +                       READ_ONCE(object->underlying_object)) {
> +               /*
> +                * Puts @object if there is no rule tied to it and the
> +                * remaining user is the underlying object.  This check is
> +                * atomic because @object->rules and @object->underlying_object
> +                * are protected by @object->lock.
> +                */
> +               spin_lock(&object->lock);
> +               if (list_empty(&object->rules) &&
> +                               READ_ONCE(object->underlying_object) &&
> +                               refcount_dec_if_one(&object->usage)) {
> +                       /*
> +                        * Releases @object, in place of
> +                        * landlock_release_object().
> +                        *
> +                        * @object is already empty, implying that all its
> +                        * previous rules are already disabled.
> +                        *
> +                        * Unbalance the @object->cleaners counter to reflect
> +                        * the underlying object release.
> +                        */
> +                       if (!WARN_ON_ONCE(!release_object(object))) {
> +                               __acquire(object->cleaners);
> +                               put_object_free(object);
> +                       }
> +               } else {
> +                       spin_unlock(&object->lock);
> +               }
> +       }
> +       put_object_free(object);
> +}
> +
> +/*
> + * Putting an object is easy when the object is being terminated, but it is
> + * much more tricky when the reason is that there is no more rule tied to this
> + * object.  Indeed, new rules could be added at the same time.
> + */
> +void landlock_put_object(struct landlock_object *object)
> +       __releases(object->usage)
> +{
> +       struct landlock_object *object_cleaner;
> +
> +       __release(object->usage);
> +       might_sleep();
> +       if (!object)
> +               return;
> +       /*
> +        * Guards against concurrent termination to be able to terminate
> +        * @object if it is empty and not referenced by another rule-appender
> +        * other than the underlying object.
> +        */
> +       object_cleaner = get_object_cleaner(object);
> +       if (WARN_ON_ONCE(!object_cleaner)) {
> +               __release(object->cleaners);
> +               return;
> +       }
> +       /*
> +        * Decrements @object->usage and if it reach zero, also decrement
> +        * @object->cleaners.  If both reach zero, then release and free
> +        * @object.
> +        */
> +       if (refcount_dec_and_test(&object->usage)) {
> +               struct landlock_rule *rule_walker, *rule_walker2;
> +
> +               spin_lock(&object->lock);
> +               /*
> +                * Disables all the rules tied to @object when it is forbidden
> +                * to add new rule but still allowed to remove them with
> +                * landlock_put_rule().  This is crucial to be able to safely
> +                * free a rule according to landlock_rule_is_disabled().
> +                */
> +               list_for_each_entry_safe(rule_walker, rule_walker2,
> +                               &object->rules, list)
> +                       list_del_rcu(&rule_walker->list);
> +
> +               /*
> +                * Releases @object if it is not already released (e.g. with
> +                * landlock_release_object()).
> +                */
> +               release_object(object);
> +               /*
> +                * Unbalances the @object->cleaners counter to reflect the
> +                * underlying object release.
> +                */
> +               __acquire(object->cleaners);
> +               put_object_free(object);
> +       }
> +       put_object_cleaner(object_cleaner);
> +}
> +
> +void landlock_put_rule(struct landlock_object *object,
> +               struct landlock_rule *rule)
> +{
> +       if (!rule)
> +               return;
> +       WARN_ON_ONCE(!object);
> +       /*
> +        * Guards against a concurrent @object self-destruction with
> +        * landlock_put_object() or put_object_cleaner().
> +        */
> +       rcu_read_lock();
> +       if (landlock_rule_is_disabled(rule)) {
> +               rcu_read_unlock();
> +               if (refcount_dec_and_test(&rule->usage))
> +                       kfree_rcu(rule, rcu_free);
> +               return;
> +       }
> +       if (refcount_dec_and_test(&rule->usage)) {
> +               struct landlock_object *safe_object;
> +
> +               /*
> +                * Now, @rule may still be enabled, or in the process of being
> +                * untied to @object by put_object_cleaner().  However, we know
> +                * that @object will not be freed until rcu_read_unlock() and
> +                * until @object->cleaners reach zero.  Furthermore, we may not
> +                * be the only one willing to free a @rule linked with @object.
> +                * If we succeed to hold @object with get_object_cleaner(), we
> +                * know that until put_object_cleaner(), we can safely use
> +                * @object to remove @rule.
> +                */
> +               safe_object = get_object_cleaner(object);
> +               rcu_read_unlock();
> +               if (!safe_object) {
> +                       __release(safe_object->cleaners);
> +                       /*
> +                        * We can safely free @rule because it is already
> +                        * removed from @object's list.
> +                        */
> +                       WARN_ON_ONCE(!landlock_rule_is_disabled(rule));
> +                       kfree_rcu(rule, rcu_free);
> +               } else {
> +                       spin_lock(&safe_object->lock);
> +                       if (!landlock_rule_is_disabled(rule))
> +                               list_del(&rule->list);
> +                       spin_unlock(&safe_object->lock);
> +                       kfree_rcu(rule, rcu_free);
> +                       put_object_cleaner(safe_object);
> +               }
> +       } else {
> +               rcu_read_unlock();
> +       }
> +       /*
> +        * put_object_cleaner() might sleep, but it is only reachable if
> +        * !landlock_rule_is_disabled().  Therefore, clean_ref() can not sleep.
> +        */
> +       might_sleep();
> +}
> +
> +void landlock_release_object(struct landlock_object __rcu *rcu_object)
> +{
> +       struct landlock_object *object;
> +
> +       if (!rcu_object)
> +               return;
> +       rcu_read_lock();
> +       object = get_object_cleaner(rcu_dereference(rcu_object));

This is not how RCU works. You need the rcu annotation on the access
to the data structure member (or global variable) that's actually
being accessed. A "struct foo __rcu *foo" argument is essentially
always wrong.

> +struct landlock_rule {
> +       struct landlock_access access;
> +       /*
> +        * @list: Linked list with other rules tied to the same object, which
> +        * enable to manage their lifetimes.  This is also used to identify if
> +        * a rule is still valid, thanks to landlock_rule_is_disabled(), which
> +        * is important in the matching process because the original object
> +        * address might have been recycled.
> +        */
> +       struct list_head list;
> +       union {
> +               /*
> +                * @usage: Number of rulesets pointing to this rule.  This
> +                * field is never used by RCU readers.
> +                */
> +               refcount_t usage;
> +               struct rcu_head rcu_free;
> +       };
> +};

An object that is subject to RCU but whose refcount must not be
accessed from RCU context? That seems a weird.

> +enum landlock_object_type {
> +       LANDLOCK_OBJECT_INODE = 1,
> +};
> +
> +struct landlock_object {
> +       /*
> +        * @usage: Main usage counter, used to tie an object to it's underlying
> +        * object (i.e. create a lifetime) and potentially add new rules.

I can't really follow this by reading this patch on its own. As one
suggestion to make things at least a bit better, how about documenting
here that `usage` always reaches zero before `cleaners` does?

> +        */
> +       refcount_t usage;
> +       /*
> +        * @cleaners: Usage counter used to free a rule from @rules (thanks to
> +        * put_rule()).  Enables to get a reference to this object until it
> +        * really become freed.  Cf. put_object().

Maybe add: @usage being non-zero counts as one reference to @cleaners.
Once @cleaners has become zero, the object is freed after an RCU grace
period.

> +        */
> +       refcount_t cleaners;
> +       union {
> +               /*
> +                * The use of this struct is controlled by @usage and
> +                * @cleaners, which makes it safe to union it with @rcu_free.
> +                */
[...]
> +               struct rcu_head rcu_free;
> +       };
> +};
[...]
> +static inline bool landlock_rule_is_disabled(
> +               struct landlock_rule *rule)
> +{
> +       /*
> +        * Disabling (i.e. unlinking) a landlock_rule is a one-way operation.
> +        * It is not possible to re-enable such a rule, then there is no need
> +        * for smp_load_acquire().
> +        *
> +        * LIST_POISON2 is set by list_del() and list_del_rcu().
> +        */
> +       return !rule || READ_ONCE(rule->list.prev) == LIST_POISON2;

You're not allowed to do this, the comment above list_del() states:

 * Note: list_empty() on entry does not return true after this, the entry is
 * in an undefined state.

If you want to be able to test whether the element is on a list
afterwards, use stuff like list_del_init().

> +}
Mickaël Salaün Feb. 26, 2020, 3:31 p.m. UTC | #2
On 25/02/2020 21:49, Jann Horn wrote:
> On Mon, Feb 24, 2020 at 5:05 PM Mickaël Salaün <mic@digikod.net> wrote:
>> A Landlock object enables to identify a kernel object (e.g. an inode).
>> A Landlock rule is a set of access rights allowed on an object.  Rules
>> are grouped in rulesets that may be tied to a set of processes (i.e.
>> subjects) to enforce a scoped access-control (i.e. a domain).
>>
>> Because Landlock's goal is to empower any process (especially
>> unprivileged ones) to sandbox themselves, we can't rely on a system-wide
>> object identification such as file extended attributes.  Indeed, we need
>> innocuous, composable and modular access-controls.
>>
>> The main challenge with this constraints is to identify kernel objects
>> while this identification is useful (i.e. when a security policy makes
>> use of this object).  But this identification data should be freed once
>> no policy is using it.  This ephemeral tagging should not and may not be
>> written in the filesystem.  We then need to manage the lifetime of a
>> rule according to the lifetime of its object.  To avoid a global lock,
>> this implementation make use of RCU and counters to safely reference
>> objects.
>>
>> A following commit uses this generic object management for inodes.
> [...]
>> diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
>> new file mode 100644
>> index 000000000000..4a321d5b3f67
>> --- /dev/null
>> +++ b/security/landlock/Kconfig
>> @@ -0,0 +1,15 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +
>> +config SECURITY_LANDLOCK
>> +       bool "Landlock support"
>> +       depends on SECURITY
>> +       default n
> 
> (I think "default n" is implicit?)

It seems that most (all?) Kconfig are written like this.

> 
>> +       help
>> +         This selects Landlock, a safe sandboxing mechanism.  It enables to
>> +         restrict processes on the fly (i.e. enforce an access control policy),
>> +         which can complement seccomp-bpf.  The security policy is a set of access
>> +         rights tied to an object, which could be a file, a socket or a process.
>> +
>> +         See Documentation/security/landlock/ for further information.
>> +
>> +         If you are unsure how to answer this question, answer N.
> [...]
>> diff --git a/security/landlock/object.c b/security/landlock/object.c
>> new file mode 100644
>> index 000000000000..38fbbb108120
>> --- /dev/null
>> +++ b/security/landlock/object.c
>> @@ -0,0 +1,339 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Landlock LSM - Object and rule management
>> + *
>> + * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net>
>> + * Copyright © 2018-2020 ANSSI
>> + *
>> + * Principles and constraints of the object and rule management:
>> + * - Do not leak memory.
>> + * - Try as much as possible to free a memory allocation as soon as it is
>> + *   unused.
>> + * - Do not use global lock.
>> + * - Do not charge processes other than the one requesting a Landlock
>> + *   operation.
>> + */
>> +
>> +#include <linux/bug.h>
>> +#include <linux/compiler.h>
>> +#include <linux/compiler_types.h>
>> +#include <linux/err.h>
>> +#include <linux/errno.h>
>> +#include <linux/fs.h>
>> +#include <linux/kernel.h>
>> +#include <linux/list.h>
>> +#include <linux/rbtree.h>
>> +#include <linux/rcupdate.h>
>> +#include <linux/refcount.h>
>> +#include <linux/slab.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/workqueue.h>
>> +
>> +#include "object.h"
>> +
>> +struct landlock_object *landlock_create_object(
>> +               const enum landlock_object_type type, void *underlying_object)
>> +{
>> +       struct landlock_object *object;
>> +
>> +       if (WARN_ON_ONCE(!underlying_object))
>> +               return NULL;
>> +       object = kzalloc(sizeof(*object), GFP_KERNEL);
>> +       if (!object)
>> +               return NULL;
>> +       refcount_set(&object->usage, 1);
>> +       refcount_set(&object->cleaners, 1);
>> +       spin_lock_init(&object->lock);
>> +       INIT_LIST_HEAD(&object->rules);
>> +       object->type = type;
>> +       WRITE_ONCE(object->underlying_object, underlying_object);
> 
> `object` is not globally visible at this point, so WRITE_ONCE() is unnecessary.

Right. It was written like this to have a uniform use of this pointer,
but I'll remove it.

> 
>> +       return object;
>> +}
>> +
>> +struct landlock_object *landlock_get_object(struct landlock_object *object)
>> +       __acquires(object->usage)
>> +{
>> +       __acquire(object->usage);
>> +       /*
>> +        * If @object->usage equal 0, then it will be ignored by writers, and
>> +        * underlying_object->object may be replaced, but this is not an issue
>> +        * for release_object().
>> +        */
>> +       if (object && refcount_inc_not_zero(&object->usage)) {
>> +               /*
>> +                * It should not be possible to get a reference to an object if
>> +                * its underlying object is being terminated (e.g. with
>> +                * landlock_release_object()), because an object is only
>> +                * modifiable through such underlying object.  This is not the
>> +                * case with landlock_get_object_cleaner().
>> +                */
>> +               WARN_ON_ONCE(!READ_ONCE(object->underlying_object));
>> +               return object;
>> +       }
>> +       return NULL;
>> +}
>> +
>> +static struct landlock_object *get_object_cleaner(
>> +               struct landlock_object *object)
>> +       __acquires(object->cleaners)
>> +{
>> +       __acquire(object->cleaners);
>> +       if (object && refcount_inc_not_zero(&object->cleaners))
>> +               return object;
>> +       return NULL;
>> +}
> 
> I don't get this whole "cleaners" thing. Can you give a quick
> description of why this is necessary, and what benefits it has over a
> standard refcounting+RCU scheme? I don't immediately see anything that
> requires this.

This indeed needs more documentation here. Here is a comment I'll add to
get_object_cleaner():

This enables to safely get a reference to an object to potentially free
it if it is not already being freed by a concurrent thread. Indeed, the
object's address may still be read and dereferenced while a concurrent
thread is attempting to clean the object. Cf. &struct
landlock_object->usage and &struct landlock_object->cleaners.

See below the explanation about "usage" and "cleaners".

> 
>> +/*
>> + * There is two cases when an object should be free and the reference to the
>> + * underlying object should be put:
>> + * - when the last rule tied to this object is removed, which is handled by
>> + *   landlock_put_rule() and then release_object();
>> + * - when the object is being terminated (e.g. no more reference to an inode),
>> + *   which is handled by landlock_put_object().
>> + */
>> +static void put_object_free(struct landlock_object *object)
>> +       __releases(object->cleaners)
>> +{
>> +       __release(object->cleaners);
>> +       if (!refcount_dec_and_test(&object->cleaners))
>> +               return;
>> +       WARN_ON_ONCE(refcount_read(&object->usage));
>> +       /*
>> +        * Ensures a safe use of @object in the RCU block from
>> +        * landlock_put_rule().
>> +        */
>> +       kfree_rcu(object, rcu_free);
>> +}
>> +
>> +/*
>> + * Destroys a newly created and useless object.
>> + */
>> +void landlock_drop_object(struct landlock_object *object)
>> +{
>> +       if (WARN_ON_ONCE(!refcount_dec_and_test(&object->usage)))
>> +               return;
>> +       __acquire(object->cleaners);
>> +       put_object_free(object);
>> +}
>> +
>> +/*
>> + * Puts the underlying object (e.g. inode) if it is the first request to
>> + * release @object, without calling landlock_put_object().
>> + *
>> + * Return true if this call effectively marks @object as released, false
>> + * otherwise.
>> + */
>> +static bool release_object(struct landlock_object *object)
>> +       __releases(&object->lock)
>> +{
>> +       void *underlying_object;
>> +
>> +       lockdep_assert_held(&object->lock);
>> +
>> +       underlying_object = xchg(&object->underlying_object, NULL);
>> +       spin_unlock(&object->lock);
>> +       might_sleep();
>> +       if (!underlying_object)
>> +               return false;
>> +
>> +       switch (object->type) {
>> +       case LANDLOCK_OBJECT_INODE:
>> +               break;
>> +       default:
>> +               WARN_ON_ONCE(1);
>> +       }
>> +       return true;
>> +}
>> +
>> +static void put_object_cleaner(struct landlock_object *object)
>> +       __releases(object->cleaners)
>> +{
>> +       /* Let's try an early lockless check. */
>> +       if (list_empty(&object->rules) &&
>> +                       READ_ONCE(object->underlying_object)) {
>> +               /*
>> +                * Puts @object if there is no rule tied to it and the
>> +                * remaining user is the underlying object.  This check is
>> +                * atomic because @object->rules and @object->underlying_object
>> +                * are protected by @object->lock.
>> +                */
>> +               spin_lock(&object->lock);
>> +               if (list_empty(&object->rules) &&
>> +                               READ_ONCE(object->underlying_object) &&
>> +                               refcount_dec_if_one(&object->usage)) {
>> +                       /*
>> +                        * Releases @object, in place of
>> +                        * landlock_release_object().
>> +                        *
>> +                        * @object is already empty, implying that all its
>> +                        * previous rules are already disabled.
>> +                        *
>> +                        * Unbalance the @object->cleaners counter to reflect
>> +                        * the underlying object release.
>> +                        */
>> +                       if (!WARN_ON_ONCE(!release_object(object))) {
>> +                               __acquire(object->cleaners);
>> +                               put_object_free(object);
>> +                       }
>> +               } else {
>> +                       spin_unlock(&object->lock);
>> +               }
>> +       }
>> +       put_object_free(object);
>> +}
>> +
>> +/*
>> + * Putting an object is easy when the object is being terminated, but it is
>> + * much more tricky when the reason is that there is no more rule tied to this
>> + * object.  Indeed, new rules could be added at the same time.
>> + */
>> +void landlock_put_object(struct landlock_object *object)
>> +       __releases(object->usage)
>> +{
>> +       struct landlock_object *object_cleaner;
>> +
>> +       __release(object->usage);
>> +       might_sleep();
>> +       if (!object)
>> +               return;
>> +       /*
>> +        * Guards against concurrent termination to be able to terminate
>> +        * @object if it is empty and not referenced by another rule-appender
>> +        * other than the underlying object.
>> +        */
>> +       object_cleaner = get_object_cleaner(object);
>> +       if (WARN_ON_ONCE(!object_cleaner)) {
>> +               __release(object->cleaners);
>> +               return;
>> +       }
>> +       /*
>> +        * Decrements @object->usage and if it reach zero, also decrement
>> +        * @object->cleaners.  If both reach zero, then release and free
>> +        * @object.
>> +        */
>> +       if (refcount_dec_and_test(&object->usage)) {
>> +               struct landlock_rule *rule_walker, *rule_walker2;
>> +
>> +               spin_lock(&object->lock);
>> +               /*
>> +                * Disables all the rules tied to @object when it is forbidden
>> +                * to add new rule but still allowed to remove them with
>> +                * landlock_put_rule().  This is crucial to be able to safely
>> +                * free a rule according to landlock_rule_is_disabled().
>> +                */
>> +               list_for_each_entry_safe(rule_walker, rule_walker2,
>> +                               &object->rules, list)
>> +                       list_del_rcu(&rule_walker->list);
>> +
>> +               /*
>> +                * Releases @object if it is not already released (e.g. with
>> +                * landlock_release_object()).
>> +                */
>> +               release_object(object);
>> +               /*
>> +                * Unbalances the @object->cleaners counter to reflect the
>> +                * underlying object release.
>> +                */
>> +               __acquire(object->cleaners);
>> +               put_object_free(object);
>> +       }
>> +       put_object_cleaner(object_cleaner);
>> +}
>> +
>> +void landlock_put_rule(struct landlock_object *object,
>> +               struct landlock_rule *rule)
>> +{
>> +       if (!rule)
>> +               return;
>> +       WARN_ON_ONCE(!object);
>> +       /*
>> +        * Guards against a concurrent @object self-destruction with
>> +        * landlock_put_object() or put_object_cleaner().
>> +        */
>> +       rcu_read_lock();
>> +       if (landlock_rule_is_disabled(rule)) {
>> +               rcu_read_unlock();
>> +               if (refcount_dec_and_test(&rule->usage))
>> +                       kfree_rcu(rule, rcu_free);
>> +               return;
>> +       }
>> +       if (refcount_dec_and_test(&rule->usage)) {
>> +               struct landlock_object *safe_object;
>> +
>> +               /*
>> +                * Now, @rule may still be enabled, or in the process of being
>> +                * untied to @object by put_object_cleaner().  However, we know
>> +                * that @object will not be freed until rcu_read_unlock() and
>> +                * until @object->cleaners reach zero.  Furthermore, we may not
>> +                * be the only one willing to free a @rule linked with @object.
>> +                * If we succeed to hold @object with get_object_cleaner(), we
>> +                * know that until put_object_cleaner(), we can safely use
>> +                * @object to remove @rule.
>> +                */
>> +               safe_object = get_object_cleaner(object);
>> +               rcu_read_unlock();
>> +               if (!safe_object) {
>> +                       __release(safe_object->cleaners);
>> +                       /*
>> +                        * We can safely free @rule because it is already
>> +                        * removed from @object's list.
>> +                        */
>> +                       WARN_ON_ONCE(!landlock_rule_is_disabled(rule));
>> +                       kfree_rcu(rule, rcu_free);
>> +               } else {
>> +                       spin_lock(&safe_object->lock);
>> +                       if (!landlock_rule_is_disabled(rule))
>> +                               list_del(&rule->list);
>> +                       spin_unlock(&safe_object->lock);
>> +                       kfree_rcu(rule, rcu_free);
>> +                       put_object_cleaner(safe_object);
>> +               }
>> +       } else {
>> +               rcu_read_unlock();
>> +       }
>> +       /*
>> +        * put_object_cleaner() might sleep, but it is only reachable if
>> +        * !landlock_rule_is_disabled().  Therefore, clean_ref() can not sleep.
>> +        */
>> +       might_sleep();
>> +}
>> +
>> +void landlock_release_object(struct landlock_object __rcu *rcu_object)
>> +{
>> +       struct landlock_object *object;
>> +
>> +       if (!rcu_object)
>> +               return;
>> +       rcu_read_lock();
>> +       object = get_object_cleaner(rcu_dereference(rcu_object));
> 
> This is not how RCU works. You need the rcu annotation on the access
> to the data structure member (or global variable) that's actually
> being accessed. A "struct foo __rcu *foo" argument is essentially
> always wrong.

Absolutely! I fixed this with the following patch:

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 7f3bd4fd04bb..01a48c75f210 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -98,7 +98,9 @@ void landlock_release_inodes(struct super_block *sb)
 		if (iput_inode)
 			iput(iput_inode);

-		landlock_release_object(inode_landlock(inode)->object);
+		rcu_read_lock();
+		landlock_release_object(rcu_dereference(
+					inode_landlock(inode)->object));

 		iput_inode = inode;
 		spin_lock(&sb->s_inode_list_lock);
diff --git a/security/landlock/object.c b/security/landlock/object.c
index 2d373f224989..a0e65a78068d 100644
--- a/security/landlock/object.c
+++ b/security/landlock/object.c
@@ -300,14 +300,16 @@ void landlock_put_rule(struct landlock_object *object,
 	might_sleep();
 }

-void landlock_release_object(struct landlock_object __rcu *rcu_object)
+void landlock_release_object(struct landlock_object *rcu_object)
+	__releases(RCU)
 {
 	struct landlock_object *object;

-	if (!rcu_object)
+	if (!rcu_object) {
+		rcu_read_unlock();
 		return;
-	rcu_read_lock();
-	object = get_object_cleaner(rcu_dereference(rcu_object));
+	}
+	object = get_object_cleaner(rcu_object);
 	rcu_read_unlock();
 	if (unlikely(!object)) {
 		__release(object->cleaners);
diff --git a/security/landlock/object.h b/security/landlock/object.h
index 15dfc9a75a82..78bfb25d4bcc 100644
--- a/security/landlock/object.h
+++ b/security/landlock/object.h
@@ -12,9 +12,9 @@
 #include <linux/compiler_types.h>
 #include <linux/list.h>
 #include <linux/poison.h>
-#include <linux/rcupdate.h>
 #include <linux/refcount.h>
 #include <linux/spinlock.h>
+#include <linux/types.h>

 struct landlock_access {
 	/*
@@ -105,7 +105,8 @@ struct landlock_object {
 void landlock_put_rule(struct landlock_object *object,
 		struct landlock_rule *rule);

-void landlock_release_object(struct landlock_object __rcu *rcu_object);
+void landlock_release_object(struct landlock_object *object)
+	__releases(RCU);

 struct landlock_object *landlock_create_object(
 		const enum landlock_object_type type, void *underlying_object);


> 
>> +struct landlock_rule {
>> +       struct landlock_access access;
>> +       /*
>> +        * @list: Linked list with other rules tied to the same object, which
>> +        * enable to manage their lifetimes.  This is also used to identify if
>> +        * a rule is still valid, thanks to landlock_rule_is_disabled(), which
>> +        * is important in the matching process because the original object
>> +        * address might have been recycled.
>> +        */
>> +       struct list_head list;
>> +       union {
>> +               /*
>> +                * @usage: Number of rulesets pointing to this rule.  This
>> +                * field is never used by RCU readers.
>> +                */
>> +               refcount_t usage;
>> +               struct rcu_head rcu_free;
>> +       };
>> +};
> 
> An object that is subject to RCU but whose refcount must not be
> accessed from RCU context? That seems a weird.

The fields "access" and "list" are read (in a RCU-read block) by
ruleset.c:landlock_find_access() (cf. patch 2). The use of the "usage"
counter is in landlock_insert_ruleset_rule() and landlock_put_rule(),
but in these cases the rule is always owned/held by the caller. I should
say something like "This field must only be used when already holding
the rule."

> 
>> +enum landlock_object_type {
>> +       LANDLOCK_OBJECT_INODE = 1,
>> +};
>> +
>> +struct landlock_object {
>> +       /*
>> +        * @usage: Main usage counter, used to tie an object to it's underlying
>> +        * object (i.e. create a lifetime) and potentially add new rules.
> 
> I can't really follow this by reading this patch on its own. As one
> suggestion to make things at least a bit better, how about documenting
> here that `usage` always reaches zero before `cleaners` does?

What about this?

This counter is used to tie an object to its underlying object (e.g. an
inode) and to modify it (e.g. add or remove a rule). If this counter
reaches zero, the object must not be modified, but it may still be used
from within an RCU-read block. When adding a new rule to an object with
a usage counter of zero, the underlying object must be locked and its
object pointer can then be replaced with a new empty object (while
ignoring the disabled object which is being handled by another thread).
This counter always reaches zero before @cleaners does.


> 
>> +        */
>> +       refcount_t usage;
>> +       /*
>> +        * @cleaners: Usage counter used to free a rule from @rules (thanks to
>> +        * put_rule()).  Enables to get a reference to this object until it
>> +        * really become freed.  Cf. put_object().
> 
> Maybe add: @usage being non-zero counts as one reference to @cleaners.
> Once @cleaners has become zero, the object is freed after an RCU grace
> period.

What about this?

This counter can only reach zero if the @usage counter already reached
zero. Indeed, @usage being non-zero counts as one reference to
@cleaners. Once @cleaners has become zero, the object is freed after an
RCU grace period. This enables concurrent threads to safely get an
object reference to terminate it if there is no more concurrent cleaners
for this object. This mechanism is required to enable concurrent threads
to safely dereference an object from potentially different pointers
(e.g. the underlying object, or a rule tied to this object), to
potentially terminate and free it (i.e. if there is no more rules tied
to it, or if the underlying object is being terminated).

> 
>> +        */
>> +       refcount_t cleaners;
>> +       union {
>> +               /*
>> +                * The use of this struct is controlled by @usage and
>> +                * @cleaners, which makes it safe to union it with @rcu_free.
>> +                */
> [...]
>> +               struct rcu_head rcu_free;
>> +       };
>> +};
> [...]
>> +static inline bool landlock_rule_is_disabled(
>> +               struct landlock_rule *rule)
>> +{
>> +       /*
>> +        * Disabling (i.e. unlinking) a landlock_rule is a one-way operation.
>> +        * It is not possible to re-enable such a rule, then there is no need
>> +        * for smp_load_acquire().
>> +        *
>> +        * LIST_POISON2 is set by list_del() and list_del_rcu().
>> +        */
>> +       return !rule || READ_ONCE(rule->list.prev) == LIST_POISON2;
> 
> You're not allowed to do this, the comment above list_del() states:
> 
>  * Note: list_empty() on entry does not return true after this, the entry is
>  * in an undefined state.

list_del() checks READ_ONCE(head->next) == head, but
landlock_rule_is_disabled() checks READ_ONCE(rule->list.prev) ==
LIST_POISON2.
The comment about LIST_POISON2 is right but may be misleading. There is
no use of list_empty() with a landlock_rule->list, only
landlock_object->rules. The only list_del() is in landlock_put_rule()
when there is a guarantee that there is no other reference to it, hence
no possible use of landlock_rule_is_disabled() with this rule. I could
replace it with a call to list_del_rcu() to make it more consistent.

> 
> If you want to be able to test whether the element is on a list
> afterwards, use stuff like list_del_init().

There is no need to re-initialize the list but using list_del_init() and
list_empty() could work too. However, there is no list_del_init_rcu()
helper. Moreover, resetting the list's pointer with LIST_POISON2 might
help to detect bugs.


Thanks for this review!
Jann Horn Feb. 26, 2020, 8:24 p.m. UTC | #3
On Wed, Feb 26, 2020 at 4:32 PM Mickaël Salaün <mic@digikod.net> wrote:
> On 25/02/2020 21:49, Jann Horn wrote:
> > On Mon, Feb 24, 2020 at 5:05 PM Mickaël Salaün <mic@digikod.net> wrote:
> >> A Landlock object enables to identify a kernel object (e.g. an inode).
> >> A Landlock rule is a set of access rights allowed on an object.  Rules
> >> are grouped in rulesets that may be tied to a set of processes (i.e.
> >> subjects) to enforce a scoped access-control (i.e. a domain).
> >>
> >> Because Landlock's goal is to empower any process (especially
> >> unprivileged ones) to sandbox themselves, we can't rely on a system-wide
> >> object identification such as file extended attributes.  Indeed, we need
> >> innocuous, composable and modular access-controls.
> >>
> >> The main challenge with this constraints is to identify kernel objects
> >> while this identification is useful (i.e. when a security policy makes
> >> use of this object).  But this identification data should be freed once
> >> no policy is using it.  This ephemeral tagging should not and may not be
> >> written in the filesystem.  We then need to manage the lifetime of a
> >> rule according to the lifetime of its object.  To avoid a global lock,
> >> this implementation make use of RCU and counters to safely reference
> >> objects.
> >>
> >> A following commit uses this generic object management for inodes.
[...]
> >> +config SECURITY_LANDLOCK
> >> +       bool "Landlock support"
> >> +       depends on SECURITY
> >> +       default n
> >
> > (I think "default n" is implicit?)
>
> It seems that most (all?) Kconfig are written like this.

See e.g. <https://lore.kernel.org/lkml/c187bb77-e804-93bd-64db-9418be58f191@infradead.org/>.

[...]
> >> +       return object;
> >> +}
> >> +
> >> +struct landlock_object *landlock_get_object(struct landlock_object *object)
> >> +       __acquires(object->usage)
> >> +{
> >> +       __acquire(object->usage);
> >> +       /*
> >> +        * If @object->usage equal 0, then it will be ignored by writers, and
> >> +        * underlying_object->object may be replaced, but this is not an issue
> >> +        * for release_object().
> >> +        */
> >> +       if (object && refcount_inc_not_zero(&object->usage)) {
> >> +               /*
> >> +                * It should not be possible to get a reference to an object if
> >> +                * its underlying object is being terminated (e.g. with
> >> +                * landlock_release_object()), because an object is only
> >> +                * modifiable through such underlying object.  This is not the
> >> +                * case with landlock_get_object_cleaner().
> >> +                */
> >> +               WARN_ON_ONCE(!READ_ONCE(object->underlying_object));
> >> +               return object;
> >> +       }
> >> +       return NULL;
> >> +}
> >> +
> >> +static struct landlock_object *get_object_cleaner(
> >> +               struct landlock_object *object)
> >> +       __acquires(object->cleaners)
> >> +{
> >> +       __acquire(object->cleaners);
> >> +       if (object && refcount_inc_not_zero(&object->cleaners))
> >> +               return object;
> >> +       return NULL;
> >> +}
> >
> > I don't get this whole "cleaners" thing. Can you give a quick
> > description of why this is necessary, and what benefits it has over a
> > standard refcounting+RCU scheme? I don't immediately see anything that
> > requires this.
>
> This indeed needs more documentation here. Here is a comment I'll add to
> get_object_cleaner():
>
> This enables to safely get a reference to an object to potentially free
> it if it is not already being freed by a concurrent thread.

"get a reference to an object to potentially free it" just sounds all
wrong to me. You free an object when you're *dropping* a reference to
it. Your refcounting scheme doesn't fit my mental models of how normal
refcounting works at all...

[...]
> >> +/*
> >> + * Putting an object is easy when the object is being terminated, but it is
> >> + * much more tricky when the reason is that there is no more rule tied to this
> >> + * object.  Indeed, new rules could be added at the same time.
> >> + */
> >> +void landlock_put_object(struct landlock_object *object)
> >> +       __releases(object->usage)
> >> +{
> >> +       struct landlock_object *object_cleaner;
> >> +
> >> +       __release(object->usage);
> >> +       might_sleep();
> >> +       if (!object)
> >> +               return;
> >> +       /*
> >> +        * Guards against concurrent termination to be able to terminate
> >> +        * @object if it is empty and not referenced by another rule-appender
> >> +        * other than the underlying object.
> >> +        */
> >> +       object_cleaner = get_object_cleaner(object);
[...]
> >> +       /*
> >> +        * Decrements @object->usage and if it reach zero, also decrement
> >> +        * @object->cleaners.  If both reach zero, then release and free
> >> +        * @object.
> >> +        */
> >> +       if (refcount_dec_and_test(&object->usage)) {
> >> +               struct landlock_rule *rule_walker, *rule_walker2;
> >> +
> >> +               spin_lock(&object->lock);
> >> +               /*
> >> +                * Disables all the rules tied to @object when it is forbidden
> >> +                * to add new rule but still allowed to remove them with
> >> +                * landlock_put_rule().  This is crucial to be able to safely
> >> +                * free a rule according to landlock_rule_is_disabled().
> >> +                */
> >> +               list_for_each_entry_safe(rule_walker, rule_walker2,
> >> +                               &object->rules, list)
> >> +                       list_del_rcu(&rule_walker->list);

So... rules don't take references on the landlock_objects they use?
Instead, the landlock_object knows which rules use it, and when the
landlock_object goes away, it nukes all the rules associated with
itself?

That seems terrible to me - AFAICS it means that if some random
process decides to install a landlock rule that uses inode X, and then
that process dies together with all its landlock rules, the inode
still stays pinned in kernel memory as long as the superblock is
mounted. In other words, it's a resource leak. (And if I'm not missing
something in patch 5, that applies even if the inode has been
unlinked?)

Can you please refactor your refcounting as follows?

 - A rule takes a reference on each landlock_object it uses.
 - A landlock_object takes a reference on the underlying object (just like now).
 - The underlying object *DOES NOT* take a reference on the
landlock_object (unlike now); the reference from the underlying object
to the landlock_object has weak pointer semantics.
 - When a landlock_object's refcount drops to zero (iow no rules use
it anymore), it is freed.

That might also help get rid of the awkward ->cleaners thing?

> >> +               /*
> >> +                * Releases @object if it is not already released (e.g. with
> >> +                * landlock_release_object()).
> >> +                */
> >> +               release_object(object);
> >> +               /*
> >> +                * Unbalances the @object->cleaners counter to reflect the
> >> +                * underlying object release.
> >> +                */
> >> +               __acquire(object->cleaners);
> >> +               put_object_free(object);
> >> +       }
> >> +       put_object_cleaner(object_cleaner);
> >> +}
[...]
> >> +static inline bool landlock_rule_is_disabled(
> >> +               struct landlock_rule *rule)
> >> +{
> >> +       /*
> >> +        * Disabling (i.e. unlinking) a landlock_rule is a one-way operation.
> >> +        * It is not possible to re-enable such a rule, then there is no need
> >> +        * for smp_load_acquire().
> >> +        *
> >> +        * LIST_POISON2 is set by list_del() and list_del_rcu().
> >> +        */
> >> +       return !rule || READ_ONCE(rule->list.prev) == LIST_POISON2;
> >
> > You're not allowed to do this, the comment above list_del() states:
> >
> >  * Note: list_empty() on entry does not return true after this, the entry is
> >  * in an undefined state.
>
> list_del() checks READ_ONCE(head->next) == head, but
> landlock_rule_is_disabled() checks READ_ONCE(rule->list.prev) ==
> LIST_POISON2.
> The comment about LIST_POISON2 is right but may be misleading. There is
> no use of list_empty() with a landlock_rule->list, only
> landlock_object->rules. The only list_del() is in landlock_put_rule()
> when there is a guarantee that there is no other reference to it, hence
> no possible use of landlock_rule_is_disabled() with this rule. I could
> replace it with a call to list_del_rcu() to make it more consistent.
>
> >
> > If you want to be able to test whether the element is on a list
> > afterwards, use stuff like list_del_init().
>
> There is no need to re-initialize the list but using list_del_init() and
> list_empty() could work too. However, there is no list_del_init_rcu()
> helper. Moreover, resetting the list's pointer with LIST_POISON2 might
> help to detect bugs.

Either way, you are currently using the list_head API in a way that
goes against what the header documents. If you want to rely on
list_del() bringing the object into a specific state, then you can't
leave the comment above list_del() as-is that says that it puts the
object in an undefined state; and this kind of check should probably
be done in a helper in list.h instead of open-coding the check for
LIST_POISON2.
Mickaël Salaün Feb. 27, 2020, 4:46 p.m. UTC | #4
On 26/02/2020 21:24, Jann Horn wrote:
> On Wed, Feb 26, 2020 at 4:32 PM Mickaël Salaün <mic@digikod.net> wrote:
>> On 25/02/2020 21:49, Jann Horn wrote:
>>> On Mon, Feb 24, 2020 at 5:05 PM Mickaël Salaün <mic@digikod.net> wrote:
>>>> A Landlock object enables to identify a kernel object (e.g. an inode).
>>>> A Landlock rule is a set of access rights allowed on an object.  Rules
>>>> are grouped in rulesets that may be tied to a set of processes (i.e.
>>>> subjects) to enforce a scoped access-control (i.e. a domain).
>>>>
>>>> Because Landlock's goal is to empower any process (especially
>>>> unprivileged ones) to sandbox themselves, we can't rely on a system-wide
>>>> object identification such as file extended attributes.  Indeed, we need
>>>> innocuous, composable and modular access-controls.
>>>>
>>>> The main challenge with this constraints is to identify kernel objects
>>>> while this identification is useful (i.e. when a security policy makes
>>>> use of this object).  But this identification data should be freed once
>>>> no policy is using it.  This ephemeral tagging should not and may not be
>>>> written in the filesystem.  We then need to manage the lifetime of a
>>>> rule according to the lifetime of its object.  To avoid a global lock,
>>>> this implementation make use of RCU and counters to safely reference
>>>> objects.
>>>>
>>>> A following commit uses this generic object management for inodes.
> [...]
>>>> +config SECURITY_LANDLOCK
>>>> +       bool "Landlock support"
>>>> +       depends on SECURITY
>>>> +       default n
>>>
>>> (I think "default n" is implicit?)
>>
>> It seems that most (all?) Kconfig are written like this.
> 
> See e.g. <https://lore.kernel.org/lkml/c187bb77-e804-93bd-64db-9418be58f191@infradead.org/>.

Ok, done.

> 
> [...]
>>>> +       return object;
>>>> +}
>>>> +
>>>> +struct landlock_object *landlock_get_object(struct landlock_object *object)
>>>> +       __acquires(object->usage)
>>>> +{
>>>> +       __acquire(object->usage);
>>>> +       /*
>>>> +        * If @object->usage equal 0, then it will be ignored by writers, and
>>>> +        * underlying_object->object may be replaced, but this is not an issue
>>>> +        * for release_object().
>>>> +        */
>>>> +       if (object && refcount_inc_not_zero(&object->usage)) {
>>>> +               /*
>>>> +                * It should not be possible to get a reference to an object if
>>>> +                * its underlying object is being terminated (e.g. with
>>>> +                * landlock_release_object()), because an object is only
>>>> +                * modifiable through such underlying object.  This is not the
>>>> +                * case with landlock_get_object_cleaner().
>>>> +                */
>>>> +               WARN_ON_ONCE(!READ_ONCE(object->underlying_object));
>>>> +               return object;
>>>> +       }
>>>> +       return NULL;
>>>> +}
>>>> +
>>>> +static struct landlock_object *get_object_cleaner(
>>>> +               struct landlock_object *object)
>>>> +       __acquires(object->cleaners)
>>>> +{
>>>> +       __acquire(object->cleaners);
>>>> +       if (object && refcount_inc_not_zero(&object->cleaners))
>>>> +               return object;
>>>> +       return NULL;
>>>> +}
>>>
>>> I don't get this whole "cleaners" thing. Can you give a quick
>>> description of why this is necessary, and what benefits it has over a
>>> standard refcounting+RCU scheme? I don't immediately see anything that
>>> requires this.
>>
>> This indeed needs more documentation here. Here is a comment I'll add to
>> get_object_cleaner():
>>
>> This enables to safely get a reference to an object to potentially free
>> it if it is not already being freed by a concurrent thread.
> 
> "get a reference to an object to potentially free it" just sounds all
> wrong to me. You free an object when you're *dropping* a reference to
> it. Your refcounting scheme doesn't fit my mental models of how normal
> refcounting works at all...

Unfortunately, as I explain below, it is a bit tricky.

> 
> [...]
>>>> +/*
>>>> + * Putting an object is easy when the object is being terminated, but it is
>>>> + * much more tricky when the reason is that there is no more rule tied to this
>>>> + * object.  Indeed, new rules could be added at the same time.
>>>> + */
>>>> +void landlock_put_object(struct landlock_object *object)
>>>> +       __releases(object->usage)
>>>> +{
>>>> +       struct landlock_object *object_cleaner;
>>>> +
>>>> +       __release(object->usage);
>>>> +       might_sleep();
>>>> +       if (!object)
>>>> +               return;
>>>> +       /*
>>>> +        * Guards against concurrent termination to be able to terminate
>>>> +        * @object if it is empty and not referenced by another rule-appender
>>>> +        * other than the underlying object.
>>>> +        */
>>>> +       object_cleaner = get_object_cleaner(object);
> [...]
>>>> +       /*
>>>> +        * Decrements @object->usage and if it reach zero, also decrement
>>>> +        * @object->cleaners.  If both reach zero, then release and free
>>>> +        * @object.
>>>> +        */
>>>> +       if (refcount_dec_and_test(&object->usage)) {
>>>> +               struct landlock_rule *rule_walker, *rule_walker2;
>>>> +
>>>> +               spin_lock(&object->lock);
>>>> +               /*
>>>> +                * Disables all the rules tied to @object when it is forbidden
>>>> +                * to add new rule but still allowed to remove them with
>>>> +                * landlock_put_rule().  This is crucial to be able to safely
>>>> +                * free a rule according to landlock_rule_is_disabled().
>>>> +                */
>>>> +               list_for_each_entry_safe(rule_walker, rule_walker2,
>>>> +                               &object->rules, list)
>>>> +                       list_del_rcu(&rule_walker->list);
> 
> So... rules don't take references on the landlock_objects they use?
> Instead, the landlock_object knows which rules use it, and when the
> landlock_object goes away, it nukes all the rules associated with
> itself?

Right.

> 
> That seems terrible to me - AFAICS it means that if some random
> process decides to install a landlock rule that uses inode X, and then
> that process dies together with all its landlock rules, the inode
> still stays pinned in kernel memory as long as the superblock is
> mounted. In other words, it's a resource leak.

That is not correct. When there is no more process enforced by a
domain/ruleset, this domain is terminated, which means that every rules
linked to this domain are put away. When the usage counter of a rule
reaches zero, then the rule is terminated with landlock_put_rule() which
unlink the rule from its object and clean this object. The cleaning
involves to free the object if there is no rule tied to this object,
thanks to put_object_cleaner().

When the underlying object is terminated, landlock_release_object() also
decrement the usage counter. However, if there is a concurrent thread
adding a new rule, the usage counter still stay greater than zero while
the new rule is being added, but the counter then drops to zero at the
end of this addition, which can then unbalance the "cleaners" counter,
which will finally leads to the object freeing. This design enables to
add rules without locking (if the object already exists). While this
property is interesting for a performance point of view, the main reason
is to avoid unnecessary lock between processes (especially from
different domains).

> (And if I'm not missing
> something in patch 5, that applies even if the inode has been
> unlinked?)

That is true for now, but only because I didn't find yet the right spot
to call landlock_release_inode(). Indeed, unlinking a file may not
terminate an inode because it can still be open by a process, and
freeing an object when the underlying object is unlinked could be a way
to bypass a check on that object/inode.

Do you know where is the best spot to identify the last userspace
reference (through the filesystem or a file descriptor) to an inode?
Fnotify doesn't seem to check for that.


> 
> Can you please refactor your refcounting as follows?
> 
>  - A rule takes a reference on each landlock_object it uses.
>  - A landlock_object takes a reference on the underlying object (just like now).
>  - The underlying object *DOES NOT* take a reference on the
> landlock_object (unlike now); the reference from the underlying object
> to the landlock_object has weak pointer semantics.

We need to increment the reference counter of the underlying objects
(i.e. inodes) not to lose the link with their Landlock object and then
the related access-control. For instance, if a struct inode (e.g. a
directory) is first tied to a Landlock object/access-control, then
because the inode is not open nor used by any process and the kernel
decides to free it, when a process tries to access a file beneath this
directory, there will not have any Landlock object tied to it and the
requested access might then be forbidden (whereas the initial policy
allowed it).

>  - When a landlock_object's refcount drops to zero (iow no rules use
> it anymore), it is freed.

Before the current design, I used a similar pattern, but this is not
necessary because of the management of the underlying object lifetime.
The list_empty() check is enough, and because we need to handle
concurrent termination, the object's usage counter for the rules seems
unnecessary.

> 
> That might also help get rid of the awkward ->cleaners thing?
> 
>>>> +               /*
>>>> +                * Releases @object if it is not already released (e.g. with
>>>> +                * landlock_release_object()).
>>>> +                */
>>>> +               release_object(object);
>>>> +               /*
>>>> +                * Unbalances the @object->cleaners counter to reflect the
>>>> +                * underlying object release.
>>>> +                */
>>>> +               __acquire(object->cleaners);
>>>> +               put_object_free(object);
>>>> +       }
>>>> +       put_object_cleaner(object_cleaner);
>>>> +}
> [...]
>>>> +static inline bool landlock_rule_is_disabled(
>>>> +               struct landlock_rule *rule)
>>>> +{
>>>> +       /*
>>>> +        * Disabling (i.e. unlinking) a landlock_rule is a one-way operation.
>>>> +        * It is not possible to re-enable such a rule, then there is no need
>>>> +        * for smp_load_acquire().
>>>> +        *
>>>> +        * LIST_POISON2 is set by list_del() and list_del_rcu().
>>>> +        */
>>>> +       return !rule || READ_ONCE(rule->list.prev) == LIST_POISON2;
>>>
>>> You're not allowed to do this, the comment above list_del() states:
>>>
>>>  * Note: list_empty() on entry does not return true after this, the entry is
>>>  * in an undefined state.
>>
>> list_del() checks READ_ONCE(head->next) == head, but
>> landlock_rule_is_disabled() checks READ_ONCE(rule->list.prev) ==
>> LIST_POISON2.
>> The comment about LIST_POISON2 is right but may be misleading. There is
>> no use of list_empty() with a landlock_rule->list, only
>> landlock_object->rules. The only list_del() is in landlock_put_rule()
>> when there is a guarantee that there is no other reference to it, hence
>> no possible use of landlock_rule_is_disabled() with this rule. I could
>> replace it with a call to list_del_rcu() to make it more consistent.
>>
>>>
>>> If you want to be able to test whether the element is on a list
>>> afterwards, use stuff like list_del_init().
>>
>> There is no need to re-initialize the list but using list_del_init() and
>> list_empty() could work too. However, there is no list_del_init_rcu()
>> helper. Moreover, resetting the list's pointer with LIST_POISON2 might
>> help to detect bugs.
> 
> Either way, you are currently using the list_head API in a way that
> goes against what the header documents. If you want to rely on
> list_del() bringing the object into a specific state, then you can't
> leave the comment above list_del() as-is that says that it puts the
> object in an undefined state; and this kind of check should probably
> be done in a helper in list.h instead of open-coding the check for
> LIST_POISON2.

In the case of Landlock, it is illegal to use or recycle a rule which
was untied from its (initial) object. There is no use of
list_empty(&landlock_rule->list), only
landlock_rule_is_disabled(landlock_rule). The LIST_POISON2 might help to
identify such misuse.
diff mbox series

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index fcd79fc38928..206f85768cd9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9360,6 +9360,16 @@  F:	net/core/skmsg.c
 F:	net/core/sock_map.c
 F:	net/ipv4/tcp_bpf.c
 
+LANDLOCK SECURITY MODULE
+M:	Mickaël Salaün <mic@digikod.net>
+L:	linux-security-module@vger.kernel.org
+W:	https://landlock.io
+T:	git https://github.com/landlock-lsm/linux.git
+S:	Supported
+F:	security/landlock/
+K:	landlock
+K:	LANDLOCK
+
 LANTIQ / INTEL Ethernet drivers
 M:	Hauke Mehrtens <hauke@hauke-m.de>
 L:	netdev@vger.kernel.org
diff --git a/security/Kconfig b/security/Kconfig
index 2a1a2d396228..9d9981394fb0 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -238,6 +238,7 @@  source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
 source "security/safesetid/Kconfig"
 source "security/lockdown/Kconfig"
+source "security/landlock/Kconfig"
 
 source "security/integrity/Kconfig"
 
diff --git a/security/Makefile b/security/Makefile
index 746438499029..2472ef96d40a 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -12,6 +12,7 @@  subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
 subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
 subdir-$(CONFIG_SECURITY_LOCKDOWN_LSM)	+= lockdown
+subdir-$(CONFIG_SECURITY_LANDLOCK)		+= landlock
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -29,6 +30,7 @@  obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
 obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
 obj-$(CONFIG_SECURITY_LOCKDOWN_LSM)	+= lockdown/
+obj-$(CONFIG_SECURITY_LANDLOCK)	+= landlock/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
new file mode 100644
index 000000000000..4a321d5b3f67
--- /dev/null
+++ b/security/landlock/Kconfig
@@ -0,0 +1,15 @@ 
+# SPDX-License-Identifier: GPL-2.0-only
+
+config SECURITY_LANDLOCK
+	bool "Landlock support"
+	depends on SECURITY
+	default n
+	help
+	  This selects Landlock, a safe sandboxing mechanism.  It enables to
+	  restrict processes on the fly (i.e. enforce an access control policy),
+	  which can complement seccomp-bpf.  The security policy is a set of access
+	  rights tied to an object, which could be a file, a socket or a process.
+
+	  See Documentation/security/landlock/ for further information.
+
+	  If you are unsure how to answer this question, answer N.
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
new file mode 100644
index 000000000000..cb6deefbf4c0
--- /dev/null
+++ b/security/landlock/Makefile
@@ -0,0 +1,3 @@ 
+obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
+
+landlock-y := object.o
diff --git a/security/landlock/object.c b/security/landlock/object.c
new file mode 100644
index 000000000000..38fbbb108120
--- /dev/null
+++ b/security/landlock/object.c
@@ -0,0 +1,339 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Object and rule management
+ *
+ * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2020 ANSSI
+ *
+ * Principles and constraints of the object and rule management:
+ * - Do not leak memory.
+ * - Try as much as possible to free a memory allocation as soon as it is
+ *   unused.
+ * - Do not use global lock.
+ * - Do not charge processes other than the one requesting a Landlock
+ *   operation.
+ */
+
+#include <linux/bug.h>
+#include <linux/compiler.h>
+#include <linux/compiler_types.h>
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/rbtree.h>
+#include <linux/rcupdate.h>
+#include <linux/refcount.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/workqueue.h>
+
+#include "object.h"
+
+struct landlock_object *landlock_create_object(
+		const enum landlock_object_type type, void *underlying_object)
+{
+	struct landlock_object *object;
+
+	if (WARN_ON_ONCE(!underlying_object))
+		return NULL;
+	object = kzalloc(sizeof(*object), GFP_KERNEL);
+	if (!object)
+		return NULL;
+	refcount_set(&object->usage, 1);
+	refcount_set(&object->cleaners, 1);
+	spin_lock_init(&object->lock);
+	INIT_LIST_HEAD(&object->rules);
+	object->type = type;
+	WRITE_ONCE(object->underlying_object, underlying_object);
+	return object;
+}
+
+struct landlock_object *landlock_get_object(struct landlock_object *object)
+	__acquires(object->usage)
+{
+	__acquire(object->usage);
+	/*
+	 * If @object->usage equal 0, then it will be ignored by writers, and
+	 * underlying_object->object may be replaced, but this is not an issue
+	 * for release_object().
+	 */
+	if (object && refcount_inc_not_zero(&object->usage)) {
+		/*
+		 * It should not be possible to get a reference to an object if
+		 * its underlying object is being terminated (e.g. with
+		 * landlock_release_object()), because an object is only
+		 * modifiable through such underlying object.  This is not the
+		 * case with landlock_get_object_cleaner().
+		 */
+		WARN_ON_ONCE(!READ_ONCE(object->underlying_object));
+		return object;
+	}
+	return NULL;
+}
+
+static struct landlock_object *get_object_cleaner(
+		struct landlock_object *object)
+	__acquires(object->cleaners)
+{
+	__acquire(object->cleaners);
+	if (object && refcount_inc_not_zero(&object->cleaners))
+		return object;
+	return NULL;
+}
+
+/*
+ * There is two cases when an object should be free and the reference to the
+ * underlying object should be put:
+ * - when the last rule tied to this object is removed, which is handled by
+ *   landlock_put_rule() and then release_object();
+ * - when the object is being terminated (e.g. no more reference to an inode),
+ *   which is handled by landlock_put_object().
+ */
+static void put_object_free(struct landlock_object *object)
+	__releases(object->cleaners)
+{
+	__release(object->cleaners);
+	if (!refcount_dec_and_test(&object->cleaners))
+		return;
+	WARN_ON_ONCE(refcount_read(&object->usage));
+	/*
+	 * Ensures a safe use of @object in the RCU block from
+	 * landlock_put_rule().
+	 */
+	kfree_rcu(object, rcu_free);
+}
+
+/*
+ * Destroys a newly created and useless object.
+ */
+void landlock_drop_object(struct landlock_object *object)
+{
+	if (WARN_ON_ONCE(!refcount_dec_and_test(&object->usage)))
+		return;
+	__acquire(object->cleaners);
+	put_object_free(object);
+}
+
+/*
+ * Puts the underlying object (e.g. inode) if it is the first request to
+ * release @object, without calling landlock_put_object().
+ *
+ * Return true if this call effectively marks @object as released, false
+ * otherwise.
+ */
+static bool release_object(struct landlock_object *object)
+	__releases(&object->lock)
+{
+	void *underlying_object;
+
+	lockdep_assert_held(&object->lock);
+
+	underlying_object = xchg(&object->underlying_object, NULL);
+	spin_unlock(&object->lock);
+	might_sleep();
+	if (!underlying_object)
+		return false;
+
+	switch (object->type) {
+	case LANDLOCK_OBJECT_INODE:
+		break;
+	default:
+		WARN_ON_ONCE(1);
+	}
+	return true;
+}
+
+static void put_object_cleaner(struct landlock_object *object)
+	__releases(object->cleaners)
+{
+	/* Let's try an early lockless check. */
+	if (list_empty(&object->rules) &&
+			READ_ONCE(object->underlying_object)) {
+		/*
+		 * Puts @object if there is no rule tied to it and the
+		 * remaining user is the underlying object.  This check is
+		 * atomic because @object->rules and @object->underlying_object
+		 * are protected by @object->lock.
+		 */
+		spin_lock(&object->lock);
+		if (list_empty(&object->rules) &&
+				READ_ONCE(object->underlying_object) &&
+				refcount_dec_if_one(&object->usage)) {
+			/*
+			 * Releases @object, in place of
+			 * landlock_release_object().
+			 *
+			 * @object is already empty, implying that all its
+			 * previous rules are already disabled.
+			 *
+			 * Unbalance the @object->cleaners counter to reflect
+			 * the underlying object release.
+			 */
+			if (!WARN_ON_ONCE(!release_object(object))) {
+				__acquire(object->cleaners);
+				put_object_free(object);
+			}
+		} else {
+			spin_unlock(&object->lock);
+		}
+	}
+	put_object_free(object);
+}
+
+/*
+ * Putting an object is easy when the object is being terminated, but it is
+ * much more tricky when the reason is that there is no more rule tied to this
+ * object.  Indeed, new rules could be added at the same time.
+ */
+void landlock_put_object(struct landlock_object *object)
+	__releases(object->usage)
+{
+	struct landlock_object *object_cleaner;
+
+	__release(object->usage);
+	might_sleep();
+	if (!object)
+		return;
+	/*
+	 * Guards against concurrent termination to be able to terminate
+	 * @object if it is empty and not referenced by another rule-appender
+	 * other than the underlying object.
+	 */
+	object_cleaner = get_object_cleaner(object);
+	if (WARN_ON_ONCE(!object_cleaner)) {
+		__release(object->cleaners);
+		return;
+	}
+	/*
+	 * Decrements @object->usage and if it reach zero, also decrement
+	 * @object->cleaners.  If both reach zero, then release and free
+	 * @object.
+	 */
+	if (refcount_dec_and_test(&object->usage)) {
+		struct landlock_rule *rule_walker, *rule_walker2;
+
+		spin_lock(&object->lock);
+		/*
+		 * Disables all the rules tied to @object when it is forbidden
+		 * to add new rule but still allowed to remove them with
+		 * landlock_put_rule().  This is crucial to be able to safely
+		 * free a rule according to landlock_rule_is_disabled().
+		 */
+		list_for_each_entry_safe(rule_walker, rule_walker2,
+				&object->rules, list)
+			list_del_rcu(&rule_walker->list);
+
+		/*
+		 * Releases @object if it is not already released (e.g. with
+		 * landlock_release_object()).
+		 */
+		release_object(object);
+		/*
+		 * Unbalances the @object->cleaners counter to reflect the
+		 * underlying object release.
+		 */
+		__acquire(object->cleaners);
+		put_object_free(object);
+	}
+	put_object_cleaner(object_cleaner);
+}
+
+void landlock_put_rule(struct landlock_object *object,
+		struct landlock_rule *rule)
+{
+	if (!rule)
+		return;
+	WARN_ON_ONCE(!object);
+	/*
+	 * Guards against a concurrent @object self-destruction with
+	 * landlock_put_object() or put_object_cleaner().
+	 */
+	rcu_read_lock();
+	if (landlock_rule_is_disabled(rule)) {
+		rcu_read_unlock();
+		if (refcount_dec_and_test(&rule->usage))
+			kfree_rcu(rule, rcu_free);
+		return;
+	}
+	if (refcount_dec_and_test(&rule->usage)) {
+		struct landlock_object *safe_object;
+
+		/*
+		 * Now, @rule may still be enabled, or in the process of being
+		 * untied to @object by put_object_cleaner().  However, we know
+		 * that @object will not be freed until rcu_read_unlock() and
+		 * until @object->cleaners reach zero.  Furthermore, we may not
+		 * be the only one willing to free a @rule linked with @object.
+		 * If we succeed to hold @object with get_object_cleaner(), we
+		 * know that until put_object_cleaner(), we can safely use
+		 * @object to remove @rule.
+		 */
+		safe_object = get_object_cleaner(object);
+		rcu_read_unlock();
+		if (!safe_object) {
+			__release(safe_object->cleaners);
+			/*
+			 * We can safely free @rule because it is already
+			 * removed from @object's list.
+			 */
+			WARN_ON_ONCE(!landlock_rule_is_disabled(rule));
+			kfree_rcu(rule, rcu_free);
+		} else {
+			spin_lock(&safe_object->lock);
+			if (!landlock_rule_is_disabled(rule))
+				list_del(&rule->list);
+			spin_unlock(&safe_object->lock);
+			kfree_rcu(rule, rcu_free);
+			put_object_cleaner(safe_object);
+		}
+	} else {
+		rcu_read_unlock();
+	}
+	/*
+	 * put_object_cleaner() might sleep, but it is only reachable if
+	 * !landlock_rule_is_disabled().  Therefore, clean_ref() can not sleep.
+	 */
+	might_sleep();
+}
+
+void landlock_release_object(struct landlock_object __rcu *rcu_object)
+{
+	struct landlock_object *object;
+
+	if (!rcu_object)
+		return;
+	rcu_read_lock();
+	object = get_object_cleaner(rcu_dereference(rcu_object));
+	rcu_read_unlock();
+	if (unlikely(!object)) {
+		__release(object->cleaners);
+		return;
+	}
+	/*
+	 * Makes sure that the underlying object never point to a freed object
+	 * by firstly releasing the object (i.e. NULL the reference to it) to
+	 * be sure no one could get a new reference to it while it is being
+	 * terminated.  Secondly, put the object globally (e.g. for the
+	 * super-block).
+	 *
+	 * This can run concurrently with put_object_cleaner(), which may try
+	 * to release @object as well.
+	 */
+	spin_lock(&object->lock);
+	if (release_object(object)) {
+		/*
+		 * Unbalances the object to reflect the underlying object
+		 * release.
+		 */
+		__acquire(object->usage);
+		landlock_put_object(object);
+	}
+	/*
+	 * If a concurrent thread is adding a new rule, the object will be free
+	 * at the end of this rule addition, otherwise it will be free with the
+	 * following put_object_cleaner() or a remaining one.
+	 */
+	put_object_cleaner(object);
+}
diff --git a/security/landlock/object.h b/security/landlock/object.h
new file mode 100644
index 000000000000..15dfc9a75a82
--- /dev/null
+++ b/security/landlock/object.h
@@ -0,0 +1,134 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - Object and rule management
+ *
+ * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_OBJECT_H
+#define _SECURITY_LANDLOCK_OBJECT_H
+
+#include <linux/compiler_types.h>
+#include <linux/list.h>
+#include <linux/poison.h>
+#include <linux/rcupdate.h>
+#include <linux/refcount.h>
+#include <linux/spinlock.h>
+
+struct landlock_access {
+	/*
+	 * @self: Bitfield of allowed actions on the kernel object.  They are
+	 * relative to the object type (e.g. LANDLOCK_ACTION_FS_READ).
+	 */
+	u32 self;
+	/*
+	 * @beneath: Same as @self, but for the child objects (e.g. a file in a
+	 * directory).
+	 */
+	u32 beneath;
+};
+
+struct landlock_rule {
+	struct landlock_access access;
+	/*
+	 * @list: Linked list with other rules tied to the same object, which
+	 * enable to manage their lifetimes.  This is also used to identify if
+	 * a rule is still valid, thanks to landlock_rule_is_disabled(), which
+	 * is important in the matching process because the original object
+	 * address might have been recycled.
+	 */
+	struct list_head list;
+	union {
+		/*
+		 * @usage: Number of rulesets pointing to this rule.  This
+		 * field is never used by RCU readers.
+		 */
+		refcount_t usage;
+		struct rcu_head rcu_free;
+	};
+};
+
+enum landlock_object_type {
+	LANDLOCK_OBJECT_INODE = 1,
+};
+
+struct landlock_object {
+	/*
+	 * @usage: Main usage counter, used to tie an object to it's underlying
+	 * object (i.e. create a lifetime) and potentially add new rules.
+	 */
+	refcount_t usage;
+	/*
+	 * @cleaners: Usage counter used to free a rule from @rules (thanks to
+	 * put_rule()).  Enables to get a reference to this object until it
+	 * really become freed.  Cf. put_object().
+	 */
+	refcount_t cleaners;
+	union {
+		/*
+		 * The use of this struct is controlled by @usage and
+		 * @cleaners, which makes it safe to union it with @rcu_free.
+		 */
+		struct {
+			/*
+			 * @underlying_object: Used when cleaning up an object
+			 * and to mark an object as tied to its underlying
+			 * kernel structure.  It must then be atomically read
+			 * using READ_ONCE().
+			 *
+			 * The one who clear @underlying_object must:
+			 * 1. clear the object self-reference and
+			 * 2. decrement @usage (and potentially free the
+			 *    object).
+			 *
+			 * Cf. clean_object().
+			 */
+			void *underlying_object;
+			/*
+			 * @type: Only used when cleaning up an object.
+			 */
+			enum landlock_object_type type;
+			spinlock_t lock;
+			/*
+			 * @rules: List of struct landlock_rule linked with
+			 * their "list" field.  This list is only accessed when
+			 * updating the list (to be able to clean up later)
+			 * while holding @lock.
+			 */
+			struct list_head rules;
+		};
+		struct rcu_head rcu_free;
+	};
+};
+
+void landlock_put_rule(struct landlock_object *object,
+		struct landlock_rule *rule);
+
+void landlock_release_object(struct landlock_object __rcu *rcu_object);
+
+struct landlock_object *landlock_create_object(
+		const enum landlock_object_type type, void *underlying_object);
+
+struct landlock_object *landlock_get_object(struct landlock_object *object)
+	__acquires(object->usage);
+
+void landlock_put_object(struct landlock_object *object)
+	__releases(object->usage);
+
+void landlock_drop_object(struct landlock_object *object);
+
+static inline bool landlock_rule_is_disabled(
+		struct landlock_rule *rule)
+{
+	/*
+	 * Disabling (i.e. unlinking) a landlock_rule is a one-way operation.
+	 * It is not possible to re-enable such a rule, then there is no need
+	 * for smp_load_acquire().
+	 *
+	 * LIST_POISON2 is set by list_del() and list_del_rcu().
+	 */
+	return !rule || READ_ONCE(rule->list.prev) == LIST_POISON2;
+}
+
+#endif /* _SECURITY_LANDLOCK_OBJECT_H */