Message ID: 20210316204252.427806-8-mic@digikod.net (mailing list archive)
State: New
Series: Landlock LSM
> This commit adds a minimal set of supported filesystem access-control
> which doesn't enable to restrict all file-related actions.

It would be great to get some more review/acks on this patch, particularly from VFS/FS folk.
On Tue, Mar 16, 2021 at 09:42:47PM +0100, Mickaël Salaün wrote: > From: Mickaël Salaün <mic@linux.microsoft.com> > > Using Landlock objects and ruleset, it is possible to tag inodes > according to a process's domain. To enable an unprivileged process to > express a file hierarchy, it first needs to open a directory (or a file) > and pass this file descriptor to the kernel through > landlock_add_rule(2). When checking if a file access request is > allowed, we walk from the requested dentry to the real root, following > the different mount layers. The access to each "tagged" inodes are > collected according to their rule layer level, and ANDed to create > access to the requested file hierarchy. This makes possible to identify > a lot of files without tagging every inodes nor modifying the > filesystem, while still following the view and understanding the user > has from the filesystem. > > Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not > keep the same struct inodes for the same inodes whereas these inodes are > in use. > > This commit adds a minimal set of supported filesystem access-control > which doesn't enable to restrict all file-related actions. This is the > result of multiple discussions to minimize the code of Landlock to ease > review. Thanks to the Landlock design, extending this access-control > without breaking user space will not be a problem. Moreover, seccomp > filters can be used to restrict the use of syscall families which may > not be currently handled by Landlock. > > Cc: Al Viro <viro@zeniv.linux.org.uk> > Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> > Cc: James Morris <jmorris@namei.org> > Cc: Jann Horn <jannh@google.com> > Cc: Jeff Dike <jdike@addtoit.com> > Cc: Kees Cook <keescook@chromium.org> > Cc: Richard Weinberger <richard@nod.at> > Cc: Serge E. Hallyn <serge@hallyn.com> > Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> > Link: https://lore.kernel.org/r/20210316204252.427806-8-mic@digikod.net > [...] > + spin_lock(&sb->s_inode_list_lock); > + list_for_each_entry(inode, &sb->s_inodes, i_sb_list) { > + struct landlock_object *object; > + > + /* Only handles referenced inodes. */ > + if (!atomic_read(&inode->i_count)) > + continue; > + > + /* > + * Checks I_FREEING and I_WILL_FREE to protect against a race > + * condition when release_inode() just called iput(), which > + * could lead to a NULL dereference of inode->security or a > + * second call to iput() for the same Landlock object. Also > + * checks I_NEW because such inode cannot be tied to an object. > + */ > + spin_lock(&inode->i_lock); > + if (inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) { > + spin_unlock(&inode->i_lock); > + continue; > + } This (and elsewhere here) seems like a lot of inode internals getting exposed. Can any of this be repurposed into helpers? I see this test scattered around the kernel a fair bit: $ git grep I_FREEING | grep I_WILL_FREE | grep I_NEW | wc -l 9 > +static inline u32 get_mode_access(const umode_t mode) > +{ > + switch (mode & S_IFMT) { > + case S_IFLNK: > + return LANDLOCK_ACCESS_FS_MAKE_SYM; > + case 0: > + /* A zero mode translates to S_IFREG. 
*/ > + case S_IFREG: > + return LANDLOCK_ACCESS_FS_MAKE_REG; > + case S_IFDIR: > + return LANDLOCK_ACCESS_FS_MAKE_DIR; > + case S_IFCHR: > + return LANDLOCK_ACCESS_FS_MAKE_CHAR; > + case S_IFBLK: > + return LANDLOCK_ACCESS_FS_MAKE_BLOCK; > + case S_IFIFO: > + return LANDLOCK_ACCESS_FS_MAKE_FIFO; > + case S_IFSOCK: > + return LANDLOCK_ACCESS_FS_MAKE_SOCK; > + default: > + WARN_ON_ONCE(1); > + return 0; > + } I'm assuming this won't be reachable from userspace. > [...] > index a5d6ef334991..f8e8e980454c 100644 > --- a/security/landlock/setup.c > +++ b/security/landlock/setup.c > @@ -11,17 +11,24 @@ > > #include "common.h" > #include "cred.h" > +#include "fs.h" > #include "ptrace.h" > #include "setup.h" > > +bool landlock_initialized __lsm_ro_after_init = false; > + > struct lsm_blob_sizes landlock_blob_sizes __lsm_ro_after_init = { > .lbs_cred = sizeof(struct landlock_cred_security), > + .lbs_inode = sizeof(struct landlock_inode_security), > + .lbs_superblock = sizeof(struct landlock_superblock_security), > }; > > static int __init landlock_init(void) > { > landlock_add_cred_hooks(); > landlock_add_ptrace_hooks(); > + landlock_add_fs_hooks(); > + landlock_initialized = true; I think this landlock_initialized is logically separate from the optional DEFINE_LSM "enabled" variable, but I thought I'd double check. :) It seems like it's used here to avoid releasing superblocks before landlock_init() is called? What is the scenario where that happens? > pr_info("Up and running.\n"); > return 0; > } > diff --git a/security/landlock/setup.h b/security/landlock/setup.h > index 9fdbf33fcc33..1daffab1ab4b 100644 > --- a/security/landlock/setup.h > +++ b/security/landlock/setup.h > @@ -11,6 +11,8 @@ > > #include <linux/lsm_hooks.h> > > +extern bool landlock_initialized; > + > extern struct lsm_blob_sizes landlock_blob_sizes; > > #endif /* _SECURITY_LANDLOCK_SETUP_H */ > -- > 2.30.2 > The locking and inode semantics are pretty complex, but since, again, it's got significant test and syzkaller coverage, it looks good to me. With the inode helper cleanup: Reviewed-by: Kees Cook <keescook@chromium.org>
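For reference, the kind of helper Kees is hinting at might look like the sketch below. It is purely illustrative: inode_state_unusable() is a hypothetical name, not an existing fs/ API, and it only factors out the flag test that the quoted hunk (and the other call sites found by the grep above) open-code.

/*
 * Hypothetical helper (not in mainline at the time of this thread):
 * returns true if @inode is being torn down or not yet fully set up,
 * i.e. it must not be tied to a security object.  Callers hold
 * inode->i_lock, as the quoted Landlock hunk does.
 */
static inline bool inode_state_unusable(struct inode *inode)
{
	lockdep_assert_held(&inode->i_lock);
	return inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW);
}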
On 19/03/2021 19:57, Kees Cook wrote: > On Tue, Mar 16, 2021 at 09:42:47PM +0100, Mickaël Salaün wrote: >> From: Mickaël Salaün <mic@linux.microsoft.com> >> >> Using Landlock objects and ruleset, it is possible to tag inodes >> according to a process's domain. To enable an unprivileged process to >> express a file hierarchy, it first needs to open a directory (or a file) >> and pass this file descriptor to the kernel through >> landlock_add_rule(2). When checking if a file access request is >> allowed, we walk from the requested dentry to the real root, following >> the different mount layers. The access to each "tagged" inodes are >> collected according to their rule layer level, and ANDed to create >> access to the requested file hierarchy. This makes possible to identify >> a lot of files without tagging every inodes nor modifying the >> filesystem, while still following the view and understanding the user >> has from the filesystem. >> >> Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not >> keep the same struct inodes for the same inodes whereas these inodes are >> in use. >> >> This commit adds a minimal set of supported filesystem access-control >> which doesn't enable to restrict all file-related actions. This is the >> result of multiple discussions to minimize the code of Landlock to ease >> review. Thanks to the Landlock design, extending this access-control >> without breaking user space will not be a problem. Moreover, seccomp >> filters can be used to restrict the use of syscall families which may >> not be currently handled by Landlock. >> >> Cc: Al Viro <viro@zeniv.linux.org.uk> >> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> >> Cc: James Morris <jmorris@namei.org> >> Cc: Jann Horn <jannh@google.com> >> Cc: Jeff Dike <jdike@addtoit.com> >> Cc: Kees Cook <keescook@chromium.org> >> Cc: Richard Weinberger <richard@nod.at> >> Cc: Serge E. Hallyn <serge@hallyn.com> >> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> >> Link: https://lore.kernel.org/r/20210316204252.427806-8-mic@digikod.net >> [...] >> + spin_lock(&sb->s_inode_list_lock); >> + list_for_each_entry(inode, &sb->s_inodes, i_sb_list) { >> + struct landlock_object *object; >> + >> + /* Only handles referenced inodes. */ >> + if (!atomic_read(&inode->i_count)) >> + continue; >> + >> + /* >> + * Checks I_FREEING and I_WILL_FREE to protect against a race >> + * condition when release_inode() just called iput(), which >> + * could lead to a NULL dereference of inode->security or a >> + * second call to iput() for the same Landlock object. Also >> + * checks I_NEW because such inode cannot be tied to an object. >> + */ >> + spin_lock(&inode->i_lock); >> + if (inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) { >> + spin_unlock(&inode->i_lock); >> + continue; >> + } > > This (and elsewhere here) seems like a lot of inode internals getting > exposed. Can any of this be repurposed into helpers? I see this test > scattered around the kernel a fair bit: > > $ git grep I_FREEING | grep I_WILL_FREE | grep I_NEW | wc -l > 9 Dealing with the filesystem is complex. Some helpers could probably be added, but with a series dedicated to the filesystem. I can work on that once this series is merged. > >> +static inline u32 get_mode_access(const umode_t mode) >> +{ >> + switch (mode & S_IFMT) { >> + case S_IFLNK: >> + return LANDLOCK_ACCESS_FS_MAKE_SYM; >> + case 0: >> + /* A zero mode translates to S_IFREG. 
*/ >> + case S_IFREG: >> + return LANDLOCK_ACCESS_FS_MAKE_REG; >> + case S_IFDIR: >> + return LANDLOCK_ACCESS_FS_MAKE_DIR; >> + case S_IFCHR: >> + return LANDLOCK_ACCESS_FS_MAKE_CHAR; >> + case S_IFBLK: >> + return LANDLOCK_ACCESS_FS_MAKE_BLOCK; >> + case S_IFIFO: >> + return LANDLOCK_ACCESS_FS_MAKE_FIFO; >> + case S_IFSOCK: >> + return LANDLOCK_ACCESS_FS_MAKE_SOCK; >> + default: >> + WARN_ON_ONCE(1); >> + return 0; >> + } > > I'm assuming this won't be reachable from userspace. It should not, only a bogus kernel code could. > >> [...] >> index a5d6ef334991..f8e8e980454c 100644 >> --- a/security/landlock/setup.c >> +++ b/security/landlock/setup.c >> @@ -11,17 +11,24 @@ >> >> #include "common.h" >> #include "cred.h" >> +#include "fs.h" >> #include "ptrace.h" >> #include "setup.h" >> >> +bool landlock_initialized __lsm_ro_after_init = false; >> + >> struct lsm_blob_sizes landlock_blob_sizes __lsm_ro_after_init = { >> .lbs_cred = sizeof(struct landlock_cred_security), >> + .lbs_inode = sizeof(struct landlock_inode_security), >> + .lbs_superblock = sizeof(struct landlock_superblock_security), >> }; >> >> static int __init landlock_init(void) >> { >> landlock_add_cred_hooks(); >> landlock_add_ptrace_hooks(); >> + landlock_add_fs_hooks(); >> + landlock_initialized = true; > > I think this landlock_initialized is logically separate from the optional > DEFINE_LSM "enabled" variable, but I thought I'd double check. :) An LSM can be marked as enabled (at boot) but not yet initialized. > > It seems like it's used here to avoid releasing superblocks before > landlock_init() is called? What is the scenario where that happens? It is a condition for LSM hooks, syscalls and superblock management. > >> pr_info("Up and running.\n"); >> return 0; >> } >> diff --git a/security/landlock/setup.h b/security/landlock/setup.h >> index 9fdbf33fcc33..1daffab1ab4b 100644 >> --- a/security/landlock/setup.h >> +++ b/security/landlock/setup.h >> @@ -11,6 +11,8 @@ >> >> #include <linux/lsm_hooks.h> >> >> +extern bool landlock_initialized; >> + >> extern struct lsm_blob_sizes landlock_blob_sizes; >> >> #endif /* _SECURITY_LANDLOCK_SETUP_H */ >> -- >> 2.30.2 >> > > The locking and inode semantics are pretty complex, but since, again, > it's got significant test and syzkaller coverage, it looks good to me. > > With the inode helper cleanup: > > Reviewed-by: Kees Cook <keescook@chromium.org> >
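To make the answer above concrete, here is a sketch of the gating pattern being described (simplified: the hook body is elided, and the same flag is also tested by the syscall entry points added later in the series).

/*
 * Sketch: every Landlock entry point bails out until landlock_init()
 * has run, so e.g. a superblock released before the LSM is initialized
 * is simply ignored.
 */
static void hook_sb_delete(struct super_block *const sb)
{
	if (!landlock_initialized)
		return;
	/* ... walk sb->s_inodes and release tied Landlock objects ... */
}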
On Tue, Mar 16, 2021 at 9:43 PM Mickaël Salaün <mic@digikod.net> wrote: > Using Landlock objects and ruleset, it is possible to tag inodes > according to a process's domain. [...] > +static void release_inode(struct landlock_object *const object) > + __releases(object->lock) > +{ > + struct inode *const inode = object->underobj; > + struct super_block *sb; > + > + if (!inode) { > + spin_unlock(&object->lock); > + return; > + } > + > + /* > + * Protects against concurrent use by hook_sb_delete() of the reference > + * to the underlying inode. > + */ > + object->underobj = NULL; > + /* > + * Makes sure that if the filesystem is concurrently unmounted, > + * hook_sb_delete() will wait for us to finish iput(). > + */ > + sb = inode->i_sb; > + atomic_long_inc(&landlock_superblock(sb)->inode_refs); > + spin_unlock(&object->lock); > + /* > + * Because object->underobj was not NULL, hook_sb_delete() and > + * get_inode_object() guarantee that it is safe to reset > + * landlock_inode(inode)->object while it is not NULL. It is therefore > + * not necessary to lock inode->i_lock. > + */ > + rcu_assign_pointer(landlock_inode(inode)->object, NULL); > + /* > + * Now, new rules can safely be tied to @inode with get_inode_object(). > + */ > + > + iput(inode); > + if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_refs)) > + wake_up_var(&landlock_superblock(sb)->inode_refs); > +} [...] > +static struct landlock_object *get_inode_object(struct inode *const inode) > +{ > + struct landlock_object *object, *new_object; > + struct landlock_inode_security *inode_sec = landlock_inode(inode); > + > + rcu_read_lock(); > +retry: > + object = rcu_dereference(inode_sec->object); > + if (object) { > + if (likely(refcount_inc_not_zero(&object->usage))) { > + rcu_read_unlock(); > + return object; > + } > + /* > + * We are racing with release_inode(), the object is going > + * away. Wait for release_inode(), then retry. > + */ > + spin_lock(&object->lock); > + spin_unlock(&object->lock); > + goto retry; > + } > + rcu_read_unlock(); > + > + /* > + * If there is no object tied to @inode, then create a new one (without > + * holding any locks). > + */ > + new_object = landlock_create_object(&landlock_fs_underops, inode); > + if (IS_ERR(new_object)) > + return new_object; > + > + /* Protects against concurrent get_inode_object() calls. */ > + spin_lock(&inode->i_lock); > + object = rcu_dereference_protected(inode_sec->object, > + lockdep_is_held(&inode->i_lock)); rcu_dereference_protected() requires that inode_sec->object is not concurrently changed, but I think another thread could call get_inode_object() while we're in landlock_create_object(), and then we could race with the NULL write in release_inode() here? (It wouldn't actually be a UAF though because we're not actually accessing `object` here.) Or am I missing a lock that prevents this? In v28 this wasn't an issue because release_inode() was holding inode->i_lock (and object->lock) during the NULL store; but in v29 and this version the NULL store in release_inode() moved out of the locked region. I think you could just move the NULL store in release_inode() back up (and maybe add a comment explaining the locking rules for landlock_inode(...)->object)? (Or alternatively you could use rcu_dereference_raw() with a comment explaining that the read pointer is only used to check for NULL-ness, and that it is guaranteed that the pointer can't change if it is NULL and we're holding the lock. But that'd be needlessly complicated, I think.) 
> + if (unlikely(object)) { > + /* Someone else just created the object, bail out and retry. */ > + spin_unlock(&inode->i_lock); > + kfree(new_object); > + > + rcu_read_lock(); > + goto retry; > + } > + > + rcu_assign_pointer(inode_sec->object, new_object); > + /* > + * @inode will be released by hook_sb_delete() on its superblock > + * shutdown. > + */ > + ihold(inode); > + spin_unlock(&inode->i_lock); > + return new_object; > +}
On 23/03/2021 01:13, Jann Horn wrote: > On Tue, Mar 16, 2021 at 9:43 PM Mickaël Salaün <mic@digikod.net> wrote: >> Using Landlock objects and ruleset, it is possible to tag inodes >> according to a process's domain. > [...] >> +static void release_inode(struct landlock_object *const object) >> + __releases(object->lock) >> +{ >> + struct inode *const inode = object->underobj; >> + struct super_block *sb; >> + >> + if (!inode) { >> + spin_unlock(&object->lock); >> + return; >> + } >> + >> + /* >> + * Protects against concurrent use by hook_sb_delete() of the reference >> + * to the underlying inode. >> + */ >> + object->underobj = NULL; >> + /* >> + * Makes sure that if the filesystem is concurrently unmounted, >> + * hook_sb_delete() will wait for us to finish iput(). >> + */ >> + sb = inode->i_sb; >> + atomic_long_inc(&landlock_superblock(sb)->inode_refs); >> + spin_unlock(&object->lock); >> + /* >> + * Because object->underobj was not NULL, hook_sb_delete() and >> + * get_inode_object() guarantee that it is safe to reset >> + * landlock_inode(inode)->object while it is not NULL. It is therefore >> + * not necessary to lock inode->i_lock. >> + */ >> + rcu_assign_pointer(landlock_inode(inode)->object, NULL); >> + /* >> + * Now, new rules can safely be tied to @inode with get_inode_object(). >> + */ >> + >> + iput(inode); >> + if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_refs)) >> + wake_up_var(&landlock_superblock(sb)->inode_refs); >> +} > [...] >> +static struct landlock_object *get_inode_object(struct inode *const inode) >> +{ >> + struct landlock_object *object, *new_object; >> + struct landlock_inode_security *inode_sec = landlock_inode(inode); >> + >> + rcu_read_lock(); >> +retry: >> + object = rcu_dereference(inode_sec->object); >> + if (object) { >> + if (likely(refcount_inc_not_zero(&object->usage))) { >> + rcu_read_unlock(); >> + return object; >> + } >> + /* >> + * We are racing with release_inode(), the object is going >> + * away. Wait for release_inode(), then retry. >> + */ >> + spin_lock(&object->lock); >> + spin_unlock(&object->lock); >> + goto retry; >> + } >> + rcu_read_unlock(); >> + >> + /* >> + * If there is no object tied to @inode, then create a new one (without >> + * holding any locks). >> + */ >> + new_object = landlock_create_object(&landlock_fs_underops, inode); >> + if (IS_ERR(new_object)) >> + return new_object; >> + >> + /* Protects against concurrent get_inode_object() calls. */ >> + spin_lock(&inode->i_lock); >> + object = rcu_dereference_protected(inode_sec->object, >> + lockdep_is_held(&inode->i_lock)); > > rcu_dereference_protected() requires that inode_sec->object is not > concurrently changed, but I think another thread could call > get_inode_object() while we're in landlock_create_object(), and then > we could race with the NULL write in release_inode() here? (It > wouldn't actually be a UAF though because we're not actually accessing > `object` here.) Or am I missing a lock that prevents this? > > In v28 this wasn't an issue because release_inode() was holding > inode->i_lock (and object->lock) during the NULL store; but in v29 and > this version the NULL store in release_inode() moved out of the locked > region. I think you could just move the NULL store in release_inode() > back up (and maybe add a comment explaining the locking rules for > landlock_inode(...)->object)? 
> > (Or alternatively you could use rcu_dereference_raw() with a comment > explaining that the read pointer is only used to check for NULL-ness, > and that it is guaranteed that the pointer can't change if it is NULL > and we're holding the lock. But that'd be needlessly complicated, I > think.) To reach rcu_assign_pointer(landlock_inode(inode)->object, NULL) in release_inode() or in hook_sb_delete(), the landlock_inode(inode)->object need to be non-NULL, which implies that a call to get_inode_object(inode) either "retry" (because release_inode is only called by landlock_put_object, which set object->usage to 0) until it creates a new object, or reuses the existing referenced object (and increments object->usage). The worse case would be if get_inode_object(inode) is called just before the rcu_assign_pointer(landlock_inode(inode)->object, NULL) from hook_sb_delete(), which would result in an object with a NULL underobj, which is the expected behavior (and checked by release_inode). The line rcu_assign_pointer(inode_sec->object, new_object) from get_inode_object() can only be reached if the underlying inode doesn't reference an object, in which case hook_sb_delete() will not reach the rcu_assign_pointer(landlock_inode(inode)->object, NULL) line for this same inode. This works because get_inode_object(inode) is mutually exclusive to itself with the same inode (i.e. an inode can only point to an object that references this same inode). I tried to explain this with the comment "Protects against concurrent get_inode_object() calls" in get_inode_object(), and the comments just before both rcu_assign_pointer(landlock_inode(inode)->object, NULL). > > >> + if (unlikely(object)) { >> + /* Someone else just created the object, bail out and retry. */ >> + spin_unlock(&inode->i_lock); >> + kfree(new_object); >> + >> + rcu_read_lock(); >> + goto retry; >> + } >> + >> + rcu_assign_pointer(inode_sec->object, new_object); >> + /* >> + * @inode will be released by hook_sb_delete() on its superblock >> + * shutdown. >> + */ >> + ihold(inode); >> + spin_unlock(&inode->i_lock); >> + return new_object; >> +}
On Tue, Mar 23, 2021 at 4:54 PM Mickaël Salaün <mic@digikod.net> wrote: > On 23/03/2021 01:13, Jann Horn wrote: > > On Tue, Mar 16, 2021 at 9:43 PM Mickaël Salaün <mic@digikod.net> wrote: > >> Using Landlock objects and ruleset, it is possible to tag inodes > >> according to a process's domain. > > [...] > >> +static void release_inode(struct landlock_object *const object) > >> + __releases(object->lock) > >> +{ > >> + struct inode *const inode = object->underobj; > >> + struct super_block *sb; > >> + > >> + if (!inode) { > >> + spin_unlock(&object->lock); > >> + return; > >> + } > >> + > >> + /* > >> + * Protects against concurrent use by hook_sb_delete() of the reference > >> + * to the underlying inode. > >> + */ > >> + object->underobj = NULL; > >> + /* > >> + * Makes sure that if the filesystem is concurrently unmounted, > >> + * hook_sb_delete() will wait for us to finish iput(). > >> + */ > >> + sb = inode->i_sb; > >> + atomic_long_inc(&landlock_superblock(sb)->inode_refs); > >> + spin_unlock(&object->lock); > >> + /* > >> + * Because object->underobj was not NULL, hook_sb_delete() and > >> + * get_inode_object() guarantee that it is safe to reset > >> + * landlock_inode(inode)->object while it is not NULL. It is therefore > >> + * not necessary to lock inode->i_lock. > >> + */ > >> + rcu_assign_pointer(landlock_inode(inode)->object, NULL); > >> + /* > >> + * Now, new rules can safely be tied to @inode with get_inode_object(). > >> + */ > >> + > >> + iput(inode); > >> + if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_refs)) > >> + wake_up_var(&landlock_superblock(sb)->inode_refs); > >> +} > > [...] > >> +static struct landlock_object *get_inode_object(struct inode *const inode) > >> +{ > >> + struct landlock_object *object, *new_object; > >> + struct landlock_inode_security *inode_sec = landlock_inode(inode); > >> + > >> + rcu_read_lock(); > >> +retry: > >> + object = rcu_dereference(inode_sec->object); > >> + if (object) { > >> + if (likely(refcount_inc_not_zero(&object->usage))) { > >> + rcu_read_unlock(); > >> + return object; > >> + } > >> + /* > >> + * We are racing with release_inode(), the object is going > >> + * away. Wait for release_inode(), then retry. > >> + */ > >> + spin_lock(&object->lock); > >> + spin_unlock(&object->lock); > >> + goto retry; > >> + } > >> + rcu_read_unlock(); > >> + > >> + /* > >> + * If there is no object tied to @inode, then create a new one (without > >> + * holding any locks). > >> + */ > >> + new_object = landlock_create_object(&landlock_fs_underops, inode); > >> + if (IS_ERR(new_object)) > >> + return new_object; > >> + > >> + /* Protects against concurrent get_inode_object() calls. */ > >> + spin_lock(&inode->i_lock); > >> + object = rcu_dereference_protected(inode_sec->object, > >> + lockdep_is_held(&inode->i_lock)); > > > > rcu_dereference_protected() requires that inode_sec->object is not > > concurrently changed, but I think another thread could call > > get_inode_object() while we're in landlock_create_object(), and then > > we could race with the NULL write in release_inode() here? (It > > wouldn't actually be a UAF though because we're not actually accessing > > `object` here.) Or am I missing a lock that prevents this? > > > > In v28 this wasn't an issue because release_inode() was holding > > inode->i_lock (and object->lock) during the NULL store; but in v29 and > > this version the NULL store in release_inode() moved out of the locked > > region. 
I think you could just move the NULL store in release_inode() > > back up (and maybe add a comment explaining the locking rules for > > landlock_inode(...)->object)? > > > > (Or alternatively you could use rcu_dereference_raw() with a comment > > explaining that the read pointer is only used to check for NULL-ness, > > and that it is guaranteed that the pointer can't change if it is NULL > > and we're holding the lock. But that'd be needlessly complicated, I > > think.) > > To reach rcu_assign_pointer(landlock_inode(inode)->object, NULL) in > release_inode() or in hook_sb_delete(), the > landlock_inode(inode)->object need to be non-NULL, Yes. > which implies that a > call to get_inode_object(inode) either "retry" (because release_inode is > only called by landlock_put_object, which set object->usage to 0) until > it creates a new object, or reuses the existing referenced object (and > increments object->usage). But it can be that landlock_inode(inode)->object only becomes non-NULL after get_inode_object() has checked rcu_dereference(inode_sec->object). > The worse case would be if > get_inode_object(inode) is called just before the > rcu_assign_pointer(landlock_inode(inode)->object, NULL) from > hook_sb_delete(), which would result in an object with a NULL underobj, > which is the expected behavior (and checked by release_inode). The scenario I'm talking about doesn't involve hook_sb_delete(). > The line rcu_assign_pointer(inode_sec->object, new_object) from > get_inode_object() can only be reached if the underlying inode doesn't > reference an object, Yes. > in which case hook_sb_delete() will not reach the > rcu_assign_pointer(landlock_inode(inode)->object, NULL) line for this > same inode. > > This works because get_inode_object(inode) is mutually exclusive to > itself with the same inode (i.e. an inode can only point to an object > that references this same inode). To clarify: You can concurrently call get_inode_object() multiple times on the same inode, right? There are no locks held on entry to that function. > I tried to explain this with the comment "Protects against concurrent > get_inode_object() calls" in get_inode_object(), and the comments just > before both rcu_assign_pointer(landlock_inode(inode)->object, NULL). The scenario I'm talking about is: Initially the inode does not have an associated landlock_object. There are two threads A and B. Thread A is going to execute get_inode_object(). Thread B is going to execute get_inode_object() followed immediately by landlock_put_object(). thread A: enters get_inode_object() thread A: rcu_dereference(inode_sec->object) returns NULL thread A: enters landlock_create_object() thread B: enters get_inode_object() thread B: rcu_dereference(inode_sec->object) returns NULL thread B: calls landlock_create_object() thread B: sets inode_sec->object while holding inode->i_lock thread B: leaves get_inode_object() thread B: enters landlock_put_object() thread B: object->usage drops to 0, object->lock is taken thread B: calls release_inode() thread B: drops object->lock thread A: returns from landlock_create_object() thread A: takes inode->i_lock At this point, thread B will run: rcu_assign_pointer(landlock_inode(inode)->object, NULL); while thread A runs: rcu_dereference_protected(inode_sec->object, lockdep_is_held(&inode->i_lock)); meaning there is a (theoretical) data race, since rcu_dereference_protected() doesn't use READ_ONCE(). > >> + if (unlikely(object)) { > >> + /* Someone else just created the object, bail out and retry. 
*/ > >> + spin_unlock(&inode->i_lock); > >> + kfree(new_object); > >> + > >> + rcu_read_lock(); > >> + goto retry; > >> + } > >> + > >> + rcu_assign_pointer(inode_sec->object, new_object); > >> + /* > >> + * @inode will be released by hook_sb_delete() on its superblock > >> + * shutdown. > >> + */ > >> + ihold(inode); > >> + spin_unlock(&inode->i_lock); > >> + return new_object; > >> +}
On 23/03/2021 18:49, Jann Horn wrote: > On Tue, Mar 23, 2021 at 4:54 PM Mickaël Salaün <mic@digikod.net> wrote: >> On 23/03/2021 01:13, Jann Horn wrote: >>> On Tue, Mar 16, 2021 at 9:43 PM Mickaël Salaün <mic@digikod.net> wrote: >>>> Using Landlock objects and ruleset, it is possible to tag inodes >>>> according to a process's domain. >>> [...] >>>> +static void release_inode(struct landlock_object *const object) >>>> + __releases(object->lock) >>>> +{ >>>> + struct inode *const inode = object->underobj; >>>> + struct super_block *sb; >>>> + >>>> + if (!inode) { >>>> + spin_unlock(&object->lock); >>>> + return; >>>> + } >>>> + >>>> + /* >>>> + * Protects against concurrent use by hook_sb_delete() of the reference >>>> + * to the underlying inode. >>>> + */ >>>> + object->underobj = NULL; >>>> + /* >>>> + * Makes sure that if the filesystem is concurrently unmounted, >>>> + * hook_sb_delete() will wait for us to finish iput(). >>>> + */ >>>> + sb = inode->i_sb; >>>> + atomic_long_inc(&landlock_superblock(sb)->inode_refs); >>>> + spin_unlock(&object->lock); >>>> + /* >>>> + * Because object->underobj was not NULL, hook_sb_delete() and >>>> + * get_inode_object() guarantee that it is safe to reset >>>> + * landlock_inode(inode)->object while it is not NULL. It is therefore >>>> + * not necessary to lock inode->i_lock. >>>> + */ >>>> + rcu_assign_pointer(landlock_inode(inode)->object, NULL); >>>> + /* >>>> + * Now, new rules can safely be tied to @inode with get_inode_object(). >>>> + */ >>>> + >>>> + iput(inode); >>>> + if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_refs)) >>>> + wake_up_var(&landlock_superblock(sb)->inode_refs); >>>> +} >>> [...] >>>> +static struct landlock_object *get_inode_object(struct inode *const inode) >>>> +{ >>>> + struct landlock_object *object, *new_object; >>>> + struct landlock_inode_security *inode_sec = landlock_inode(inode); >>>> + >>>> + rcu_read_lock(); >>>> +retry: >>>> + object = rcu_dereference(inode_sec->object); >>>> + if (object) { >>>> + if (likely(refcount_inc_not_zero(&object->usage))) { >>>> + rcu_read_unlock(); >>>> + return object; >>>> + } >>>> + /* >>>> + * We are racing with release_inode(), the object is going >>>> + * away. Wait for release_inode(), then retry. >>>> + */ >>>> + spin_lock(&object->lock); >>>> + spin_unlock(&object->lock); >>>> + goto retry; >>>> + } >>>> + rcu_read_unlock(); >>>> + >>>> + /* >>>> + * If there is no object tied to @inode, then create a new one (without >>>> + * holding any locks). >>>> + */ >>>> + new_object = landlock_create_object(&landlock_fs_underops, inode); >>>> + if (IS_ERR(new_object)) >>>> + return new_object; >>>> + >>>> + /* Protects against concurrent get_inode_object() calls. */ >>>> + spin_lock(&inode->i_lock); >>>> + object = rcu_dereference_protected(inode_sec->object, >>>> + lockdep_is_held(&inode->i_lock)); >>> >>> rcu_dereference_protected() requires that inode_sec->object is not >>> concurrently changed, but I think another thread could call >>> get_inode_object() while we're in landlock_create_object(), and then >>> we could race with the NULL write in release_inode() here? (It >>> wouldn't actually be a UAF though because we're not actually accessing >>> `object` here.) Or am I missing a lock that prevents this? >>> >>> In v28 this wasn't an issue because release_inode() was holding >>> inode->i_lock (and object->lock) during the NULL store; but in v29 and >>> this version the NULL store in release_inode() moved out of the locked >>> region. 
I think you could just move the NULL store in release_inode() >>> back up (and maybe add a comment explaining the locking rules for >>> landlock_inode(...)->object)? >>> >>> (Or alternatively you could use rcu_dereference_raw() with a comment >>> explaining that the read pointer is only used to check for NULL-ness, >>> and that it is guaranteed that the pointer can't change if it is NULL >>> and we're holding the lock. But that'd be needlessly complicated, I >>> think.) >> >> To reach rcu_assign_pointer(landlock_inode(inode)->object, NULL) in >> release_inode() or in hook_sb_delete(), the >> landlock_inode(inode)->object need to be non-NULL, > > Yes. > >> which implies that a >> call to get_inode_object(inode) either "retry" (because release_inode is >> only called by landlock_put_object, which set object->usage to 0) until >> it creates a new object, or reuses the existing referenced object (and >> increments object->usage). > > But it can be that landlock_inode(inode)->object only becomes non-NULL > after get_inode_object() has checked > rcu_dereference(inode_sec->object). > >> The worse case would be if >> get_inode_object(inode) is called just before the >> rcu_assign_pointer(landlock_inode(inode)->object, NULL) from >> hook_sb_delete(), which would result in an object with a NULL underobj, >> which is the expected behavior (and checked by release_inode). > > The scenario I'm talking about doesn't involve hook_sb_delete(). > >> The line rcu_assign_pointer(inode_sec->object, new_object) from >> get_inode_object() can only be reached if the underlying inode doesn't >> reference an object, > > Yes. > >> in which case hook_sb_delete() will not reach the >> rcu_assign_pointer(landlock_inode(inode)->object, NULL) line for this >> same inode. >> >> This works because get_inode_object(inode) is mutually exclusive to >> itself with the same inode (i.e. an inode can only point to an object >> that references this same inode). > > To clarify: You can concurrently call get_inode_object() multiple > times on the same inode, right? There are no locks held on entry to > that function. > >> I tried to explain this with the comment "Protects against concurrent >> get_inode_object() calls" in get_inode_object(), and the comments just >> before both rcu_assign_pointer(landlock_inode(inode)->object, NULL). > > The scenario I'm talking about is: > > Initially the inode does not have an associated landlock_object. There > are two threads A and B. Thread A is going to execute > get_inode_object(). Thread B is going to execute get_inode_object() > followed immediately by landlock_put_object(). 
> > thread A: enters get_inode_object() > thread A: rcu_dereference(inode_sec->object) returns NULL > thread A: enters landlock_create_object() > thread B: enters get_inode_object() > thread B: rcu_dereference(inode_sec->object) returns NULL > thread B: calls landlock_create_object() > thread B: sets inode_sec->object while holding inode->i_lock > thread B: leaves get_inode_object() > thread B: enters landlock_put_object() > thread B: object->usage drops to 0, object->lock is taken > thread B: calls release_inode() > thread B: drops object->lock > thread A: returns from landlock_create_object() > thread A: takes inode->i_lock > > At this point, thread B will run: > > rcu_assign_pointer(landlock_inode(inode)->object, NULL); > > while thread A runs: > > rcu_dereference_protected(inode_sec->object, > lockdep_is_held(&inode->i_lock)); > > meaning there is a (theoretical) data race, since > rcu_dereference_protected() doesn't use READ_ONCE(). Hum, I see, that is what I was missing. And that explain why there is (in practice) no impact on winning the race. I would prefer to use rcu_access_pointer() instead of rcu_dereference_protected() to avoid pitfall, and it reflects what I was expecting: --- a/security/landlock/fs.c +++ b/security/landlock/fs.c @@ -117,9 +117,7 @@ static struct landlock_object *get_inode_object(struct inode *const inode) /* Protects against concurrent get_inode_object() calls. */ spin_lock(&inode->i_lock); - object = rcu_dereference_protected(inode_sec->object, - lockdep_is_held(&inode->i_lock)); - if (unlikely(object)) { + if (unlikely(rcu_access_pointer(inode_sec->object))) { /* Someone else just created the object, bail out and retry. */ spin_unlock(&inode->i_lock); kfree(new_object); But I'm not sure about your proposition to move the NULL store in release_inode() back up. Do you mean to add back the inode lock in release_inode() like this? --- a/security/landlock/fs.c +++ b/security/landlock/fs.c @@ -59,16 +59,12 @@ static void release_inode(struct landlock_object *const object) * Makes sure that if the filesystem is concurrently unmounted, * hook_sb_delete() will wait for us to finish iput(). */ + spin_lock(&inode->i_lock); sb = inode->i_sb; atomic_long_inc(&landlock_superblock(sb)->inode_refs); spin_unlock(&object->lock); - /* - * Because object->underobj was not NULL, hook_sb_delete() and - * get_inode_object() guarantee that it is safe to reset - * landlock_inode(inode)->object while it is not NULL. It is therefore - * not necessary to lock inode->i_lock. - */ rcu_assign_pointer(landlock_inode(inode)->object, NULL); + spin_unlock(&inode->i_lock); /* * Now, new rules can safely be tied to @inode with get_inode_object(). */ I would prefer to avoid nested locks if it is not necessary though. > >>>> + if (unlikely(object)) { >>>> + /* Someone else just created the object, bail out and retry. */ >>>> + spin_unlock(&inode->i_lock); >>>> + kfree(new_object); >>>> + >>>> + rcu_read_lock(); >>>> + goto retry; >>>> + } >>>> + >>>> + rcu_assign_pointer(inode_sec->object, new_object); >>>> + /* >>>> + * @inode will be released by hook_sb_delete() on its superblock >>>> + * shutdown. >>>> + */ >>>> + ihold(inode); >>>> + spin_unlock(&inode->i_lock); >>>> + return new_object; >>>> +}
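For context on the two accessors being weighed here: rcu_dereference_protected() asserts that its condition excludes all concurrent updaters and therefore compiles to a plain load, while rcu_access_pointer() wraps the load in READ_ONCE() and is meant exactly for cases where the pointer value is tested but never dereferenced. A sketch of the distinction, reusing names from the quoted code:

spin_lock(&inode->i_lock);
/*
 * Fine even against release_inode()'s lockless NULL store: the value
 * is only compared against NULL, never dereferenced, and READ_ONCE()
 * avoids the data race.
 */
if (unlikely(rcu_access_pointer(inode_sec->object)))
	goto out_retry;
/*
 * Not fine here: this claims that holding inode->i_lock excludes every
 * writer, which the lockless NULL store in release_inode() violates.
 */
object = rcu_dereference_protected(inode_sec->object,
				   lockdep_is_held(&inode->i_lock));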
On 19/03/2021 20:19, Mickaël Salaün wrote: > > On 19/03/2021 19:57, Kees Cook wrote: >> On Tue, Mar 16, 2021 at 09:42:47PM +0100, Mickaël Salaün wrote: >>> From: Mickaël Salaün <mic@linux.microsoft.com> >>> >>> Using Landlock objects and ruleset, it is possible to tag inodes >>> according to a process's domain. To enable an unprivileged process to >>> express a file hierarchy, it first needs to open a directory (or a file) >>> and pass this file descriptor to the kernel through >>> landlock_add_rule(2). When checking if a file access request is >>> allowed, we walk from the requested dentry to the real root, following >>> the different mount layers. The access to each "tagged" inodes are >>> collected according to their rule layer level, and ANDed to create >>> access to the requested file hierarchy. This makes possible to identify >>> a lot of files without tagging every inodes nor modifying the >>> filesystem, while still following the view and understanding the user >>> has from the filesystem. >>> >>> Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not >>> keep the same struct inodes for the same inodes whereas these inodes are >>> in use. >>> >>> This commit adds a minimal set of supported filesystem access-control >>> which doesn't enable to restrict all file-related actions. This is the >>> result of multiple discussions to minimize the code of Landlock to ease >>> review. Thanks to the Landlock design, extending this access-control >>> without breaking user space will not be a problem. Moreover, seccomp >>> filters can be used to restrict the use of syscall families which may >>> not be currently handled by Landlock. >>> >>> Cc: Al Viro <viro@zeniv.linux.org.uk> >>> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> >>> Cc: James Morris <jmorris@namei.org> >>> Cc: Jann Horn <jannh@google.com> >>> Cc: Jeff Dike <jdike@addtoit.com> >>> Cc: Kees Cook <keescook@chromium.org> >>> Cc: Richard Weinberger <richard@nod.at> >>> Cc: Serge E. Hallyn <serge@hallyn.com> >>> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> >>> Link: https://lore.kernel.org/r/20210316204252.427806-8-mic@digikod.net >>> [...] >>> + spin_lock(&sb->s_inode_list_lock); >>> + list_for_each_entry(inode, &sb->s_inodes, i_sb_list) { >>> + struct landlock_object *object; >>> + >>> + /* Only handles referenced inodes. */ >>> + if (!atomic_read(&inode->i_count)) >>> + continue; >>> + >>> + /* >>> + * Checks I_FREEING and I_WILL_FREE to protect against a race >>> + * condition when release_inode() just called iput(), which >>> + * could lead to a NULL dereference of inode->security or a >>> + * second call to iput() for the same Landlock object. Also >>> + * checks I_NEW because such inode cannot be tied to an object. >>> + */ >>> + spin_lock(&inode->i_lock); >>> + if (inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) { >>> + spin_unlock(&inode->i_lock); >>> + continue; >>> + } >> >> This (and elsewhere here) seems like a lot of inode internals getting >> exposed. Can any of this be repurposed into helpers? I see this test >> scattered around the kernel a fair bit: >> >> $ git grep I_FREEING | grep I_WILL_FREE | grep I_NEW | wc -l >> 9 > > Dealing with the filesystem is complex. Some helpers could probably be > added, but with a series dedicated to the filesystem. I can work on that > once this series is merged. 
> >> >>> +static inline u32 get_mode_access(const umode_t mode) >>> +{ >>> + switch (mode & S_IFMT) { >>> + case S_IFLNK: >>> + return LANDLOCK_ACCESS_FS_MAKE_SYM; >>> + case 0: >>> + /* A zero mode translates to S_IFREG. */ >>> + case S_IFREG: >>> + return LANDLOCK_ACCESS_FS_MAKE_REG; >>> + case S_IFDIR: >>> + return LANDLOCK_ACCESS_FS_MAKE_DIR; >>> + case S_IFCHR: >>> + return LANDLOCK_ACCESS_FS_MAKE_CHAR; >>> + case S_IFBLK: >>> + return LANDLOCK_ACCESS_FS_MAKE_BLOCK; >>> + case S_IFIFO: >>> + return LANDLOCK_ACCESS_FS_MAKE_FIFO; >>> + case S_IFSOCK: >>> + return LANDLOCK_ACCESS_FS_MAKE_SOCK; >>> + default: >>> + WARN_ON_ONCE(1); >>> + return 0; >>> + } >> >> I'm assuming this won't be reachable from userspace. > > It should not, only a bogus kernel code could. > >> >>> [...] >>> index a5d6ef334991..f8e8e980454c 100644 >>> --- a/security/landlock/setup.c >>> +++ b/security/landlock/setup.c >>> @@ -11,17 +11,24 @@ >>> >>> #include "common.h" >>> #include "cred.h" >>> +#include "fs.h" >>> #include "ptrace.h" >>> #include "setup.h" >>> >>> +bool landlock_initialized __lsm_ro_after_init = false; >>> + >>> struct lsm_blob_sizes landlock_blob_sizes __lsm_ro_after_init = { >>> .lbs_cred = sizeof(struct landlock_cred_security), >>> + .lbs_inode = sizeof(struct landlock_inode_security), >>> + .lbs_superblock = sizeof(struct landlock_superblock_security), >>> }; >>> >>> static int __init landlock_init(void) >>> { >>> landlock_add_cred_hooks(); >>> landlock_add_ptrace_hooks(); >>> + landlock_add_fs_hooks(); >>> + landlock_initialized = true; >> >> I think this landlock_initialized is logically separate from the optional >> DEFINE_LSM "enabled" variable, but I thought I'd double check. :) > > An LSM can be marked as enabled (at boot) but not yet initialized. > >> >> It seems like it's used here to avoid releasing superblocks before >> landlock_init() is called? What is the scenario where that happens? > > It is a condition for LSM hooks, syscalls and superblock management. > >> >>> pr_info("Up and running.\n"); >>> return 0; >>> } >>> diff --git a/security/landlock/setup.h b/security/landlock/setup.h >>> index 9fdbf33fcc33..1daffab1ab4b 100644 >>> --- a/security/landlock/setup.h >>> +++ b/security/landlock/setup.h >>> @@ -11,6 +11,8 @@ >>> >>> #include <linux/lsm_hooks.h> >>> >>> +extern bool landlock_initialized; >>> + >>> extern struct lsm_blob_sizes landlock_blob_sizes; >>> >>> #endif /* _SECURITY_LANDLOCK_SETUP_H */ >>> -- >>> 2.30.2 >>> >> >> The locking and inode semantics are pretty complex, but since, again, >> it's got significant test and syzkaller coverage, it looks good to me. >> >> With the inode helper cleanup: I think the inode helper would have to be in a separate patch focused on fs/ (like all matches of your greps, except Landlock). Are you OK if I send a patch for that once Landlock is merged? >> >> Reviewed-by: Kees Cook <keescook@chromium.org> >>
On Tue, Mar 23, 2021 at 8:22 PM Mickaël Salaün <mic@digikod.net> wrote: > On 23/03/2021 18:49, Jann Horn wrote: > > On Tue, Mar 23, 2021 at 4:54 PM Mickaël Salaün <mic@digikod.net> wrote: > >> On 23/03/2021 01:13, Jann Horn wrote: > >>> On Tue, Mar 16, 2021 at 9:43 PM Mickaël Salaün <mic@digikod.net> wrote: > >>>> Using Landlock objects and ruleset, it is possible to tag inodes > >>>> according to a process's domain. > >>> [...] > >>>> +static void release_inode(struct landlock_object *const object) > >>>> + __releases(object->lock) > >>>> +{ > >>>> + struct inode *const inode = object->underobj; > >>>> + struct super_block *sb; > >>>> + > >>>> + if (!inode) { > >>>> + spin_unlock(&object->lock); > >>>> + return; > >>>> + } > >>>> + > >>>> + /* > >>>> + * Protects against concurrent use by hook_sb_delete() of the reference > >>>> + * to the underlying inode. > >>>> + */ > >>>> + object->underobj = NULL; > >>>> + /* > >>>> + * Makes sure that if the filesystem is concurrently unmounted, > >>>> + * hook_sb_delete() will wait for us to finish iput(). > >>>> + */ > >>>> + sb = inode->i_sb; > >>>> + atomic_long_inc(&landlock_superblock(sb)->inode_refs); > >>>> + spin_unlock(&object->lock); > >>>> + /* > >>>> + * Because object->underobj was not NULL, hook_sb_delete() and > >>>> + * get_inode_object() guarantee that it is safe to reset > >>>> + * landlock_inode(inode)->object while it is not NULL. It is therefore > >>>> + * not necessary to lock inode->i_lock. > >>>> + */ > >>>> + rcu_assign_pointer(landlock_inode(inode)->object, NULL); > >>>> + /* > >>>> + * Now, new rules can safely be tied to @inode with get_inode_object(). > >>>> + */ > >>>> + > >>>> + iput(inode); > >>>> + if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_refs)) > >>>> + wake_up_var(&landlock_superblock(sb)->inode_refs); > >>>> +} > >>> [...] > >>>> +static struct landlock_object *get_inode_object(struct inode *const inode) > >>>> +{ > >>>> + struct landlock_object *object, *new_object; > >>>> + struct landlock_inode_security *inode_sec = landlock_inode(inode); > >>>> + > >>>> + rcu_read_lock(); > >>>> +retry: > >>>> + object = rcu_dereference(inode_sec->object); > >>>> + if (object) { > >>>> + if (likely(refcount_inc_not_zero(&object->usage))) { > >>>> + rcu_read_unlock(); > >>>> + return object; > >>>> + } > >>>> + /* > >>>> + * We are racing with release_inode(), the object is going > >>>> + * away. Wait for release_inode(), then retry. > >>>> + */ > >>>> + spin_lock(&object->lock); > >>>> + spin_unlock(&object->lock); > >>>> + goto retry; > >>>> + } > >>>> + rcu_read_unlock(); > >>>> + > >>>> + /* > >>>> + * If there is no object tied to @inode, then create a new one (without > >>>> + * holding any locks). > >>>> + */ > >>>> + new_object = landlock_create_object(&landlock_fs_underops, inode); > >>>> + if (IS_ERR(new_object)) > >>>> + return new_object; > >>>> + > >>>> + /* Protects against concurrent get_inode_object() calls. */ > >>>> + spin_lock(&inode->i_lock); > >>>> + object = rcu_dereference_protected(inode_sec->object, > >>>> + lockdep_is_held(&inode->i_lock)); > >>> > >>> rcu_dereference_protected() requires that inode_sec->object is not > >>> concurrently changed, but I think another thread could call > >>> get_inode_object() while we're in landlock_create_object(), and then > >>> we could race with the NULL write in release_inode() here? (It > >>> wouldn't actually be a UAF though because we're not actually accessing > >>> `object` here.) Or am I missing a lock that prevents this? 
> >>> > >>> In v28 this wasn't an issue because release_inode() was holding > >>> inode->i_lock (and object->lock) during the NULL store; but in v29 and > >>> this version the NULL store in release_inode() moved out of the locked > >>> region. I think you could just move the NULL store in release_inode() > >>> back up (and maybe add a comment explaining the locking rules for > >>> landlock_inode(...)->object)? > >>> > >>> (Or alternatively you could use rcu_dereference_raw() with a comment > >>> explaining that the read pointer is only used to check for NULL-ness, > >>> and that it is guaranteed that the pointer can't change if it is NULL > >>> and we're holding the lock. But that'd be needlessly complicated, I > >>> think.) > >> > >> To reach rcu_assign_pointer(landlock_inode(inode)->object, NULL) in > >> release_inode() or in hook_sb_delete(), the > >> landlock_inode(inode)->object need to be non-NULL, > > > > Yes. > > > >> which implies that a > >> call to get_inode_object(inode) either "retry" (because release_inode is > >> only called by landlock_put_object, which set object->usage to 0) until > >> it creates a new object, or reuses the existing referenced object (and > >> increments object->usage). > > > > But it can be that landlock_inode(inode)->object only becomes non-NULL > > after get_inode_object() has checked > > rcu_dereference(inode_sec->object). > > > >> The worse case would be if > >> get_inode_object(inode) is called just before the > >> rcu_assign_pointer(landlock_inode(inode)->object, NULL) from > >> hook_sb_delete(), which would result in an object with a NULL underobj, > >> which is the expected behavior (and checked by release_inode). > > > > The scenario I'm talking about doesn't involve hook_sb_delete(). > > > >> The line rcu_assign_pointer(inode_sec->object, new_object) from > >> get_inode_object() can only be reached if the underlying inode doesn't > >> reference an object, > > > > Yes. > > > >> in which case hook_sb_delete() will not reach the > >> rcu_assign_pointer(landlock_inode(inode)->object, NULL) line for this > >> same inode. > >> > >> This works because get_inode_object(inode) is mutually exclusive to > >> itself with the same inode (i.e. an inode can only point to an object > >> that references this same inode). > > > > To clarify: You can concurrently call get_inode_object() multiple > > times on the same inode, right? There are no locks held on entry to > > that function. > > > >> I tried to explain this with the comment "Protects against concurrent > >> get_inode_object() calls" in get_inode_object(), and the comments just > >> before both rcu_assign_pointer(landlock_inode(inode)->object, NULL). > > > > The scenario I'm talking about is: > > > > Initially the inode does not have an associated landlock_object. There > > are two threads A and B. Thread A is going to execute > > get_inode_object(). Thread B is going to execute get_inode_object() > > followed immediately by landlock_put_object(). 
> > > > thread A: enters get_inode_object() > > thread A: rcu_dereference(inode_sec->object) returns NULL > > thread A: enters landlock_create_object() > > thread B: enters get_inode_object() > > thread B: rcu_dereference(inode_sec->object) returns NULL > > thread B: calls landlock_create_object() > > thread B: sets inode_sec->object while holding inode->i_lock > > thread B: leaves get_inode_object() > > thread B: enters landlock_put_object() > > thread B: object->usage drops to 0, object->lock is taken > > thread B: calls release_inode() > > thread B: drops object->lock > > thread A: returns from landlock_create_object() > > thread A: takes inode->i_lock > > > > At this point, thread B will run: > > > > rcu_assign_pointer(landlock_inode(inode)->object, NULL); > > > > while thread A runs: > > > > rcu_dereference_protected(inode_sec->object, > > lockdep_is_held(&inode->i_lock)); > > > > meaning there is a (theoretical) data race, since > > rcu_dereference_protected() doesn't use READ_ONCE(). > > Hum, I see, that is what I was missing. And that explain why there is > (in practice) no impact on winning the race. > > I would prefer to use rcu_access_pointer() instead of > rcu_dereference_protected() to avoid pitfall, and it reflects what I was > expecting: > > --- a/security/landlock/fs.c > +++ b/security/landlock/fs.c > @@ -117,9 +117,7 @@ static struct landlock_object > *get_inode_object(struct inode *const inode) > > /* Protects against concurrent get_inode_object() calls. */ > spin_lock(&inode->i_lock); > - object = rcu_dereference_protected(inode_sec->object, > - lockdep_is_held(&inode->i_lock)); > - if (unlikely(object)) { > + if (unlikely(rcu_access_pointer(inode_sec->object))) { > /* Someone else just created the object, bail out and > retry. */ > spin_unlock(&inode->i_lock); > kfree(new_object); Ah, yeah, that should work. I had forgotten about rcu_access_pointer(). > But I'm not sure about your proposition to move the NULL store in > release_inode() back up. Do you mean to add back the inode lock in > release_inode() like this? > > --- a/security/landlock/fs.c > +++ b/security/landlock/fs.c > @@ -59,16 +59,12 @@ static void release_inode(struct landlock_object > *const object) > * Makes sure that if the filesystem is concurrently unmounted, > * hook_sb_delete() will wait for us to finish iput(). > */ > + spin_lock(&inode->i_lock); > sb = inode->i_sb; > atomic_long_inc(&landlock_superblock(sb)->inode_refs); > spin_unlock(&object->lock); > - /* > - * Because object->underobj was not NULL, hook_sb_delete() and > - * get_inode_object() guarantee that it is safe to reset > - * landlock_inode(inode)->object while it is not NULL. It is therefore > - * not necessary to lock inode->i_lock. > - */ > rcu_assign_pointer(landlock_inode(inode)->object, NULL); > + spin_unlock(&inode->i_lock); > /* > * Now, new rules can safely be tied to @inode with get_inode_object(). > */ > > > I would prefer to avoid nested locks if it is not necessary though. Hm, yeah, you have a point there. Doing it locklessly does make the locking rules a little complicated though, and you'll have to update the comment inside struct landlock_inode_security. At the moment, it says: * @object: Weak pointer to an allocated object. All writes (i.e. * creating a new object or removing one) are protected by the * underlying inode->i_lock. Disassociating @object from the inode is * additionally protected by @object->lock, from the time @object's * usage refcount drops to zero to the time this pointer is nulled out. 
which isn't true anymore.
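If the lockless NULL store is kept, the comment could be updated along these lines (a sketch of possible wording, not final text):

struct landlock_inode_security {
	/**
	 * @object: Weak pointer to an allocated object.  Tying a new
	 * object to the inode is protected by the underlying
	 * inode->i_lock, but the release path may reset a non-NULL
	 * @object to NULL without holding it (it holds @object->lock
	 * instead).  Code holding only inode->i_lock may therefore just
	 * test the pointer for NULL-ness (rcu_access_pointer()) and must
	 * not assume its value is stable.
	 */
	struct landlock_object __rcu *object;
};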
diff --git a/MAINTAINERS b/MAINTAINERS
index 87a2738dfdec..70ec117efa8a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10003,6 +10003,7 @@ L:	linux-security-module@vger.kernel.org
 S:	Supported
 W:	https://landlock.io
 T:	git https://github.com/landlock-lsm/linux.git
+F:	include/uapi/linux/landlock.h
 F:	security/landlock/
 K:	landlock
 K:	LANDLOCK
diff --git a/arch/Kconfig b/arch/Kconfig
index ecfd3520b676..8160ab7e3e03 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1013,6 +1013,13 @@ config COMPAT_32BIT_TIME
 config ARCH_NO_PREEMPT
 	bool
 
+config ARCH_EPHEMERAL_INODES
+	def_bool n
+	help
+	  An arch should select this symbol if it doesn't keep track of inode
+	  instances on its own, but instead relies on something else (e.g. the
+	  host kernel for a UML kernel).
+
 config ARCH_SUPPORTS_RT
 	bool
diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index c3030db3325f..57cfd9a1c082 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -5,6 +5,7 @@ menu "UML-specific options"
 config UML
 	bool
 	default y
+	select ARCH_EPHEMERAL_INODES
 	select ARCH_HAS_KCOV
 	select ARCH_NO_PREEMPT
 	select HAVE_ARCH_AUDITSYSCALL
diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
new file mode 100644
index 000000000000..f69877099c8e
--- /dev/null
+++ b/include/uapi/linux/landlock.h
@@ -0,0 +1,75 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Landlock - User space API
+ *
+ * Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#ifndef _UAPI_LINUX_LANDLOCK_H
+#define _UAPI_LINUX_LANDLOCK_H
+
+/**
+ * DOC: fs_access
+ *
+ * A set of actions on kernel objects may be defined by an attribute (e.g.
+ * &struct landlock_path_beneath_attr) including a bitmask of access rights.
+ *
+ * Filesystem flags
+ * ~~~~~~~~~~~~~~~~
+ *
+ * These flags make it possible to restrict a sandboxed process to a set of
+ * actions on files and directories. Files or directories opened before the
+ * sandboxing are not subject to these restrictions.
+ *
+ * A file can only receive these access rights:
+ *
+ * - %LANDLOCK_ACCESS_FS_EXECUTE: Execute a file.
+ * - %LANDLOCK_ACCESS_FS_WRITE_FILE: Open a file with write access.
+ * - %LANDLOCK_ACCESS_FS_READ_FILE: Open a file with read access.
+ *
+ * A directory can receive access rights related to files or directories. The
+ * following access right is applied to the directory itself, and the
+ * directories beneath it:
+ *
+ * - %LANDLOCK_ACCESS_FS_READ_DIR: Open a directory or list its content.
+ *
+ * However, the following access rights only apply to the content of a
+ * directory, not the directory itself:
+ *
+ * - %LANDLOCK_ACCESS_FS_REMOVE_DIR: Remove an empty directory or rename one.
+ * - %LANDLOCK_ACCESS_FS_REMOVE_FILE: Unlink (or rename) a file.
+ * - %LANDLOCK_ACCESS_FS_MAKE_CHAR: Create (or rename or link) a character
+ *   device.
+ * - %LANDLOCK_ACCESS_FS_MAKE_DIR: Create (or rename) a directory.
+ * - %LANDLOCK_ACCESS_FS_MAKE_REG: Create (or rename or link) a regular file.
+ * - %LANDLOCK_ACCESS_FS_MAKE_SOCK: Create (or rename or link) a UNIX domain
+ *   socket.
+ * - %LANDLOCK_ACCESS_FS_MAKE_FIFO: Create (or rename or link) a named pipe.
+ * - %LANDLOCK_ACCESS_FS_MAKE_BLOCK: Create (or rename or link) a block device.
+ * - %LANDLOCK_ACCESS_FS_MAKE_SYM: Create (or rename or link) a symbolic link.
+ *
+ * .. warning::
+ *
+ *   It is currently not possible to restrict some file-related actions
+ *   accessible through these syscall families: :manpage:`chdir(2)`,
+ *   :manpage:`truncate(2)`, :manpage:`stat(2)`, :manpage:`flock(2)`,
+ *   :manpage:`chmod(2)`, :manpage:`chown(2)`, :manpage:`setxattr(2)`,
+ *   :manpage:`utime(2)`, :manpage:`ioctl(2)`, :manpage:`fcntl(2)`.
+ *   Future Landlock evolutions will make it possible to restrict them.
+ */
+#define LANDLOCK_ACCESS_FS_EXECUTE		(1ULL << 0)
+#define LANDLOCK_ACCESS_FS_WRITE_FILE		(1ULL << 1)
+#define LANDLOCK_ACCESS_FS_READ_FILE		(1ULL << 2)
+#define LANDLOCK_ACCESS_FS_READ_DIR		(1ULL << 3)
+#define LANDLOCK_ACCESS_FS_REMOVE_DIR		(1ULL << 4)
+#define LANDLOCK_ACCESS_FS_REMOVE_FILE		(1ULL << 5)
+#define LANDLOCK_ACCESS_FS_MAKE_CHAR		(1ULL << 6)
+#define LANDLOCK_ACCESS_FS_MAKE_DIR		(1ULL << 7)
+#define LANDLOCK_ACCESS_FS_MAKE_REG		(1ULL << 8)
+#define LANDLOCK_ACCESS_FS_MAKE_SOCK		(1ULL << 9)
+#define LANDLOCK_ACCESS_FS_MAKE_FIFO		(1ULL << 10)
+#define LANDLOCK_ACCESS_FS_MAKE_BLOCK		(1ULL << 11)
+#define LANDLOCK_ACCESS_FS_MAKE_SYM		(1ULL << 12)
+
+#endif /* _UAPI_LINUX_LANDLOCK_H */
diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
index c1e862a38410..8e33c4e8ffb8 100644
--- a/security/landlock/Kconfig
+++ b/security/landlock/Kconfig
@@ -2,7 +2,7 @@
 
 config SECURITY_LANDLOCK
 	bool "Landlock support"
-	depends on SECURITY
+	depends on SECURITY && !ARCH_EPHEMERAL_INODES
 	select SECURITY_PATH
 	help
 	  Landlock is a sandboxing mechanism that enables processes to restrict
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index f1d1eb72fa76..92e3d80ab8ed 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
 landlock-y := setup.o object.o ruleset.o \
-	cred.o ptrace.o
+	cred.o ptrace.o fs.o
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
new file mode 100644
index 000000000000..e3b710d6f56d
--- /dev/null
+++ b/security/landlock/fs.c
@@ -0,0 +1,687 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Filesystem management and hooks
+ *
+ * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#include <linux/atomic.h>
+#include <linux/bitops.h>
+#include <linux/bits.h>
+#include <linux/compiler_types.h>
+#include <linux/dcache.h>
+#include <linux/err.h>
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/limits.h>
+#include <linux/list.h>
+#include <linux/lsm_hooks.h>
+#include <linux/mount.h>
+#include <linux/namei.h>
+#include <linux/path.h>
+#include <linux/rcupdate.h>
+#include <linux/spinlock.h>
+#include <linux/stat.h>
+#include <linux/types.h>
+#include <linux/wait_bit.h>
+#include <linux/workqueue.h>
+#include <uapi/linux/landlock.h>
+
+#include "common.h"
+#include "cred.h"
+#include "fs.h"
+#include "limits.h"
+#include "object.h"
+#include "ruleset.h"
+#include "setup.h"
+
+/* Underlying object management */
+
+static void release_inode(struct landlock_object *const object)
+	__releases(object->lock)
+{
+	struct inode *const inode = object->underobj;
+	struct super_block *sb;
+
+	if (!inode) {
+		spin_unlock(&object->lock);
+		return;
+	}
+
+	/*
+	 * Protects against concurrent use by hook_sb_delete() of the reference
+	 * to the underlying inode.
+	 */
+	object->underobj = NULL;
+	/*
+	 * Makes sure that if the filesystem is concurrently unmounted,
+	 * hook_sb_delete() will wait for us to finish iput().
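+	 * The inode_refs counter incremented below pairs with the
+	 * wait_var_event() call at the end of hook_sb_delete().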
+	 */
+	sb = inode->i_sb;
+	atomic_long_inc(&landlock_superblock(sb)->inode_refs);
+	spin_unlock(&object->lock);
+	/*
+	 * Because object->underobj was not NULL, hook_sb_delete() and
+	 * get_inode_object() guarantee that it is safe to reset
+	 * landlock_inode(inode)->object while it is not NULL. It is therefore
+	 * not necessary to lock inode->i_lock.
+	 */
+	rcu_assign_pointer(landlock_inode(inode)->object, NULL);
+	/*
+	 * Now, new rules can safely be tied to @inode with get_inode_object().
+	 */
+
+	iput(inode);
+	if (atomic_long_dec_and_test(&landlock_superblock(sb)->inode_refs))
+		wake_up_var(&landlock_superblock(sb)->inode_refs);
+}
+
+static const struct landlock_object_underops landlock_fs_underops = {
+	.release = release_inode
+};
+
+/* Ruleset management */
+
+static struct landlock_object *get_inode_object(struct inode *const inode)
+{
+	struct landlock_object *object, *new_object;
+	struct landlock_inode_security *inode_sec = landlock_inode(inode);
+
+	rcu_read_lock();
+retry:
+	object = rcu_dereference(inode_sec->object);
+	if (object) {
+		if (likely(refcount_inc_not_zero(&object->usage))) {
+			rcu_read_unlock();
+			return object;
+		}
+		/*
+		 * We are racing with release_inode(), the object is going
+		 * away. Wait for release_inode(), then retry.
+		 */
+		spin_lock(&object->lock);
+		spin_unlock(&object->lock);
+		goto retry;
+	}
+	rcu_read_unlock();
+
+	/*
+	 * If there is no object tied to @inode, then create a new one (without
+	 * holding any locks).
+	 */
+	new_object = landlock_create_object(&landlock_fs_underops, inode);
+	if (IS_ERR(new_object))
+		return new_object;
+
+	/* Protects against concurrent get_inode_object() calls. */
+	spin_lock(&inode->i_lock);
+	object = rcu_dereference_protected(inode_sec->object,
+			lockdep_is_held(&inode->i_lock));
+	if (unlikely(object)) {
+		/* Someone else just created the object, bail out and retry. */
+		spin_unlock(&inode->i_lock);
+		kfree(new_object);
+
+		rcu_read_lock();
+		goto retry;
+	}
+
+	rcu_assign_pointer(inode_sec->object, new_object);
+	/*
+	 * @inode will be released by hook_sb_delete() on its superblock
+	 * shutdown.
+	 */
+	ihold(inode);
+	spin_unlock(&inode->i_lock);
+	return new_object;
+}
+
+/* All access rights that can be tied to files. */
+#define ACCESS_FILE ( \
+	LANDLOCK_ACCESS_FS_EXECUTE | \
+	LANDLOCK_ACCESS_FS_WRITE_FILE | \
+	LANDLOCK_ACCESS_FS_READ_FILE)
+
+/*
+ * @path: Should have been checked by get_path_from_fd().
+ */
+int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
+		const struct path *const path, u32 access_rights)
+{
+	int err;
+	struct landlock_object *object;
+
+	/* Files only get access rights that make sense. */
+	if (!d_is_dir(path->dentry) && (access_rights | ACCESS_FILE) !=
+			ACCESS_FILE)
+		return -EINVAL;
+	if (WARN_ON_ONCE(ruleset->num_layers != 1))
+		return -EINVAL;
+
+	/* Transforms relative access rights to absolute ones. */
+	access_rights |= LANDLOCK_MASK_ACCESS_FS & ~ruleset->fs_access_masks[0];
+	object = get_inode_object(d_backing_inode(path->dentry));
+	if (IS_ERR(object))
+		return PTR_ERR(object);
+	mutex_lock(&ruleset->lock);
+	err = landlock_insert_rule(ruleset, object, access_rights);
+	mutex_unlock(&ruleset->lock);
+	/*
+	 * No need to check for an error because landlock_insert_rule()
+	 * increments the refcount for the new object if needed.
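+	 * The object can therefore be put unconditionally below.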
+	 */
+	landlock_put_object(object);
+	return err;
+}
+
+/* Access-control management */
+
+static inline u64 unmask_layers(
+		const struct landlock_ruleset *const domain,
+		const struct path *const path, const u32 access_request,
+		u64 layer_mask)
+{
+	const struct landlock_rule *rule;
+	const struct inode *inode;
+	size_t i;
+
+	if (d_is_negative(path->dentry))
+		/* Continues to walk while there is no mapped inode. */
+		return layer_mask;
+	inode = d_backing_inode(path->dentry);
+	rcu_read_lock();
+	rule = landlock_find_rule(domain,
+			rcu_dereference(landlock_inode(inode)->object));
+	rcu_read_unlock();
+	if (!rule)
+		return layer_mask;
+
+	/*
+	 * An access is granted if, for each policy layer, at least one rule
+	 * encountered on the pathwalk grants the requested accesses,
+	 * regardless of their position in the layer stack. We must then check
+	 * the remaining layers for each inode, from the first added layer to
+	 * the last one.
+	 */
+	for (i = 0; i < rule->num_layers; i++) {
+		const struct landlock_layer *const layer = &rule->layers[i];
+		const u64 layer_level = BIT_ULL(layer->level - 1);
+
+		/* Checks that the layer grants access to the full request. */
+		if ((layer->access & access_request) == access_request) {
+			layer_mask &= ~layer_level;
+
+			if (layer_mask == 0)
+				return layer_mask;
+		}
+	}
+	return layer_mask;
+}
+
+static int check_access_path(const struct landlock_ruleset *const domain,
+		const struct path *const path, u32 access_request)
+{
+	bool allowed = false;
+	struct path walker_path;
+	u64 layer_mask;
+	size_t i;
+
+	/* Make sure all layers can be checked. */
+	BUILD_BUG_ON(BITS_PER_TYPE(layer_mask) < LANDLOCK_MAX_NUM_LAYERS);
+
+	if (!access_request)
+		return 0;
+	if (WARN_ON_ONCE(!domain || !path))
+		return 0;
+	/*
+	 * Allows access to pseudo filesystems that will never be mountable
+	 * (e.g. sockfs, pipefs), but can still be reachable through
+	 * /proc/self/fd .
+	 */
+	if ((path->dentry->d_sb->s_flags & SB_NOUSER) ||
+			(d_is_positive(path->dentry) &&
+			 unlikely(IS_PRIVATE(d_backing_inode(path->dentry)))))
+		return 0;
+	if (WARN_ON_ONCE(domain->num_layers < 1))
+		return -EACCES;
+
+	/* Saves all layers handling a subset of requested accesses. */
+	layer_mask = 0;
+	for (i = 0; i < domain->num_layers; i++) {
+		if (domain->fs_access_masks[i] & access_request)
+			layer_mask |= BIT_ULL(i);
+	}
+	/* An access request not handled by the domain is allowed. */
+	if (layer_mask == 0)
+		return 0;
+
+	walker_path = *path;
+	path_get(&walker_path);
+	/*
+	 * We need to walk through the whole hierarchy to not miss any
+	 * relevant restriction.
+	 */
+	while (true) {
+		struct dentry *parent_dentry;
+
+		layer_mask = unmask_layers(domain, &walker_path,
+				access_request, layer_mask);
+		if (layer_mask == 0) {
+			/* Stops when a rule from each layer grants access. */
+			allowed = true;
+			break;
+		}
+
+jump_up:
+		if (walker_path.dentry == walker_path.mnt->mnt_root) {
+			if (follow_up(&walker_path)) {
+				/* Ignores hidden mount points. */
+				goto jump_up;
+			} else {
+				/*
+				 * Stops at the real root. Denies access
+				 * because not all layers have granted access.
+				 */
+				allowed = false;
+				break;
+			}
+		}
+		if (unlikely(IS_ROOT(walker_path.dentry))) {
+			/*
+			 * Stops at disconnected root directories. Only allows
+			 * access to internal filesystems (e.g. nsfs, which is
+			 * reachable through /proc/self/ns).
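+			 * MNT_INTERNAL identifies such kernel-internal
+			 * (kern_mount'd) mounts.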
+			 */
+			allowed = !!(walker_path.mnt->mnt_flags & MNT_INTERNAL);
+			break;
+		}
+		parent_dentry = dget_parent(walker_path.dentry);
+		dput(walker_path.dentry);
+		walker_path.dentry = parent_dentry;
+	}
+	path_put(&walker_path);
+	return allowed ? 0 : -EACCES;
+}
+
+static inline int current_check_access_path(const struct path *const path,
+		const u32 access_request)
+{
+	const struct landlock_ruleset *const dom =
+		landlock_get_current_domain();
+
+	if (!dom)
+		return 0;
+	return check_access_path(dom, path, access_request);
+}
+
+/* Inode hooks */
+
+static void hook_inode_free_security(struct inode *const inode)
+{
+	/*
+	 * All inodes must already have been untied from their object by
+	 * release_inode() or hook_sb_delete().
+	 */
+	WARN_ON_ONCE(landlock_inode(inode)->object);
+}
+
+/* Super-block hooks */
+
+/*
+ * Release the inodes used in a security policy.
+ *
+ * Cf. fsnotify_unmount_inodes() and invalidate_inodes()
+ */
+static void hook_sb_delete(struct super_block *const sb)
+{
+	struct inode *inode, *prev_inode = NULL;
+
+	if (!landlock_initialized)
+		return;
+
+	spin_lock(&sb->s_inode_list_lock);
+	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
+		struct landlock_object *object;
+
+		/* Only handles referenced inodes. */
+		if (!atomic_read(&inode->i_count))
+			continue;
+
+		/*
+		 * Checks I_FREEING and I_WILL_FREE to protect against a race
+		 * condition when release_inode() just called iput(), which
+		 * could lead to a NULL dereference of inode->security or a
+		 * second call to iput() for the same Landlock object. Also
+		 * checks I_NEW because such an inode cannot be tied to an
+		 * object.
+		 */
+		spin_lock(&inode->i_lock);
+		if (inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) {
+			spin_unlock(&inode->i_lock);
+			continue;
+		}
+
+		rcu_read_lock();
+		object = rcu_dereference(landlock_inode(inode)->object);
+		if (!object) {
+			rcu_read_unlock();
+			spin_unlock(&inode->i_lock);
+			continue;
+		}
+		/* Keeps a reference to this inode until the next loop walk. */
+		__iget(inode);
+		spin_unlock(&inode->i_lock);
+
+		/*
+		 * If there is no concurrent release_inode() ongoing, then we
+		 * are in charge of calling iput() on this inode, otherwise we
+		 * will just wait for it to finish.
+		 */
+		spin_lock(&object->lock);
+		if (object->underobj == inode) {
+			object->underobj = NULL;
+			spin_unlock(&object->lock);
+			rcu_read_unlock();
+
+			/*
+			 * Because object->underobj was not NULL,
+			 * release_inode() and get_inode_object() guarantee
+			 * that it is safe to reset
+			 * landlock_inode(inode)->object while it is not NULL.
+			 * It is therefore not necessary to lock inode->i_lock.
+			 */
+			rcu_assign_pointer(landlock_inode(inode)->object, NULL);
+			/*
+			 * At this point, we own the ihold() reference that was
+			 * originally set up by get_inode_object() and the
+			 * __iget() reference that we just set in this loop
+			 * walk. Therefore the following call to iput() will
+			 * not sleep nor drop the inode because there are now
+			 * at least two references to it.
+			 */
+			iput(inode);
+		} else {
+			spin_unlock(&object->lock);
+			rcu_read_unlock();
+		}
+
+		if (prev_inode) {
+			/*
+			 * At this point, we still own the __iget() reference
+			 * that we just set in this loop walk. Therefore we
+			 * can drop the list lock and know that the inode won't
+			 * disappear from under us until the next loop walk.
+			 */
+			spin_unlock(&sb->s_inode_list_lock);
+			/*
+			 * We can now actually put the inode reference from the
+			 * previous loop walk, which is not needed anymore.
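+			 * This iput() may evict the inode and sleep, which is
+			 * safe now that s_inode_list_lock has been dropped.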
+			 */
+			iput(prev_inode);
+			cond_resched();
+			spin_lock(&sb->s_inode_list_lock);
+		}
+		prev_inode = inode;
+	}
+	spin_unlock(&sb->s_inode_list_lock);
+
+	/* Puts the inode reference from the last loop walk, if any. */
+	if (prev_inode)
+		iput(prev_inode);
+	/* Waits for pending iput() in release_inode(). */
+	wait_var_event(&landlock_superblock(sb)->inode_refs, !atomic_long_read(
+				&landlock_superblock(sb)->inode_refs));
+}
+
+/*
+ * Because a Landlock security policy is defined according to the filesystem
+ * layout (i.e. the mount namespace), changing it may grant access to files not
+ * previously allowed.
+ *
+ * To make it simple, deny any filesystem layout modification by landlocked
+ * processes. Non-landlocked processes may still change the namespace of a
+ * landlocked process, but this kind of threat must be handled by a system-wide
+ * access-control security policy.
+ *
+ * This could be lifted in the future if Landlock can safely handle mount
+ * namespace updates requested by a landlocked process. Indeed, we could
+ * update the current domain (which is currently read-only) by taking into
+ * account the accesses of the source and the destination of a new mount point.
+ * However, it would also require making all the child domains dynamically
+ * inherit these new constraints. Anyway, for backward compatibility reasons,
+ * a dedicated user space option would be required (e.g. as a ruleset command
+ * option).
+ */
+static int hook_sb_mount(const char *const dev_name,
+		const struct path *const path, const char *const type,
+		const unsigned long flags, void *const data)
+{
+	if (!landlock_get_current_domain())
+		return 0;
+	return -EPERM;
+}
+
+static int hook_move_mount(const struct path *const from_path,
+		const struct path *const to_path)
+{
+	if (!landlock_get_current_domain())
+		return 0;
+	return -EPERM;
+}
+
+/*
+ * Removing a mount point may reveal a previously hidden file hierarchy, and
+ * may thus grant access to files that were previously forbidden.
+ */
+static int hook_sb_umount(struct vfsmount *const mnt, const int flags)
+{
+	if (!landlock_get_current_domain())
+		return 0;
+	return -EPERM;
+}
+
+static int hook_sb_remount(struct super_block *const sb, void *const mnt_opts)
+{
+	if (!landlock_get_current_domain())
+		return 0;
+	return -EPERM;
+}
+
+/*
+ * pivot_root(2), like mount(2), changes the current mount namespace. It must
+ * then be forbidden for a landlocked process.
+ *
+ * However, chroot(2) may be allowed because it only changes the relative root
+ * directory of the current process. Moreover, it can be used to restrict the
+ * view of the filesystem.
+ */
+static int hook_sb_pivotroot(const struct path *const old_path,
+		const struct path *const new_path)
+{
+	if (!landlock_get_current_domain())
+		return 0;
+	return -EPERM;
+}
+
+/* Path hooks */
+
+static inline u32 get_mode_access(const umode_t mode)
+{
+	switch (mode & S_IFMT) {
+	case S_IFLNK:
+		return LANDLOCK_ACCESS_FS_MAKE_SYM;
+	case 0:
+		/* A zero mode translates to S_IFREG. */
+	case S_IFREG:
+		return LANDLOCK_ACCESS_FS_MAKE_REG;
+	case S_IFDIR:
+		return LANDLOCK_ACCESS_FS_MAKE_DIR;
+	case S_IFCHR:
+		return LANDLOCK_ACCESS_FS_MAKE_CHAR;
+	case S_IFBLK:
+		return LANDLOCK_ACCESS_FS_MAKE_BLOCK;
+	case S_IFIFO:
+		return LANDLOCK_ACCESS_FS_MAKE_FIFO;
+	case S_IFSOCK:
+		return LANDLOCK_ACCESS_FS_MAKE_SOCK;
+	default:
+		WARN_ON_ONCE(1);
+		return 0;
+	}
+}
+
+/*
+ * Creating multiple links or renaming may lead to privilege escalations if not
+ * handled properly. Indeed, we must be sure that the source doesn't gain more
+ * privileges by being accessible from the destination. This is getting more
+ * complex when dealing with multiple layers. The whole picture can be seen as
+ * a multilayer partial ordering problem. A future version of Landlock will
+ * deal with that.
+ */
+static int hook_path_link(struct dentry *const old_dentry,
+		const struct path *const new_dir,
+		struct dentry *const new_dentry)
+{
+	const struct landlock_ruleset *const dom =
+		landlock_get_current_domain();
+
+	if (!dom)
+		return 0;
+	/* The mount points are the same for old and new paths, cf. EXDEV. */
+	if (old_dentry->d_parent != new_dir->dentry)
+		/* For now, forbids reparenting. */
+		return -EACCES;
+	if (unlikely(d_is_negative(old_dentry)))
+		return -EACCES;
+	return check_access_path(dom, new_dir,
+			get_mode_access(d_backing_inode(old_dentry)->i_mode));
+}
+
+static inline u32 maybe_remove(const struct dentry *const dentry)
+{
+	if (d_is_negative(dentry))
+		return 0;
+	return d_is_dir(dentry) ? LANDLOCK_ACCESS_FS_REMOVE_DIR :
+		LANDLOCK_ACCESS_FS_REMOVE_FILE;
+}
+
+static int hook_path_rename(const struct path *const old_dir,
+		struct dentry *const old_dentry,
+		const struct path *const new_dir,
+		struct dentry *const new_dentry)
+{
+	const struct landlock_ruleset *const dom =
+		landlock_get_current_domain();
+
+	if (!dom)
+		return 0;
+	/* The mount points are the same for old and new paths, cf. EXDEV. */
+	if (old_dir->dentry != new_dir->dentry)
+		/* For now, forbids reparenting. */
+		return -EACCES;
+	if (WARN_ON_ONCE(d_is_negative(old_dentry)))
+		return -EACCES;
+	/* RENAME_EXCHANGE is handled because directories are the same. */
+	return check_access_path(dom, old_dir, maybe_remove(old_dentry) |
+			maybe_remove(new_dentry) |
+			get_mode_access(d_backing_inode(old_dentry)->i_mode));
+}
+
+static int hook_path_mkdir(const struct path *const dir,
+		struct dentry *const dentry, const umode_t mode)
+{
+	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_MAKE_DIR);
+}
+
+static int hook_path_mknod(const struct path *const dir,
+		struct dentry *const dentry, const umode_t mode,
+		const unsigned int dev)
+{
+	const struct landlock_ruleset *const dom =
+		landlock_get_current_domain();
+
+	if (!dom)
+		return 0;
+	return check_access_path(dom, dir, get_mode_access(mode));
+}
+
+static int hook_path_symlink(const struct path *const dir,
+		struct dentry *const dentry, const char *const old_name)
+{
+	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_MAKE_SYM);
+}
+
+static int hook_path_unlink(const struct path *const dir,
+		struct dentry *const dentry)
+{
+	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_REMOVE_FILE);
+}
+
+static int hook_path_rmdir(const struct path *const dir,
+		struct dentry *const dentry)
+{
+	return current_check_access_path(dir, LANDLOCK_ACCESS_FS_REMOVE_DIR);
+}
+
+/* File hooks */
+
+static inline u32 get_file_access(const struct file *const file)
+{
+	u32 access = 0;
+
+	if (file->f_mode & FMODE_READ) {
+		/* A directory can only be opened in read mode. */
+		if (S_ISDIR(file_inode(file)->i_mode))
+			return LANDLOCK_ACCESS_FS_READ_DIR;
+		access = LANDLOCK_ACCESS_FS_READ_FILE;
+	}
+	if (file->f_mode & FMODE_WRITE)
+		access |= LANDLOCK_ACCESS_FS_WRITE_FILE;
+	/* __FMODE_EXEC is indeed part of f_flags, not f_mode. */
+	if (file->f_flags & __FMODE_EXEC)
+		access |= LANDLOCK_ACCESS_FS_EXECUTE;
+	return access;
+}
+
+static int hook_file_open(struct file *const file)
+{
+	const struct landlock_ruleset *const dom =
+		landlock_get_current_domain();
+
+	if (!dom)
+		return 0;
+	/*
+	 * Because a file may be opened with O_PATH, get_file_access() may
+	 * return 0. This case will be handled with a future Landlock
+	 * evolution.
+	 */
+	return check_access_path(dom, &file->f_path, get_file_access(file));
+}
+
+static struct security_hook_list landlock_hooks[] __lsm_ro_after_init = {
+	LSM_HOOK_INIT(inode_free_security, hook_inode_free_security),
+
+	LSM_HOOK_INIT(sb_delete, hook_sb_delete),
+	LSM_HOOK_INIT(sb_mount, hook_sb_mount),
+	LSM_HOOK_INIT(move_mount, hook_move_mount),
+	LSM_HOOK_INIT(sb_umount, hook_sb_umount),
+	LSM_HOOK_INIT(sb_remount, hook_sb_remount),
+	LSM_HOOK_INIT(sb_pivotroot, hook_sb_pivotroot),
+
+	LSM_HOOK_INIT(path_link, hook_path_link),
+	LSM_HOOK_INIT(path_rename, hook_path_rename),
+	LSM_HOOK_INIT(path_mkdir, hook_path_mkdir),
+	LSM_HOOK_INIT(path_mknod, hook_path_mknod),
+	LSM_HOOK_INIT(path_symlink, hook_path_symlink),
+	LSM_HOOK_INIT(path_unlink, hook_path_unlink),
+	LSM_HOOK_INIT(path_rmdir, hook_path_rmdir),
+
+	LSM_HOOK_INIT(file_open, hook_file_open),
+};
+
+__init void landlock_add_fs_hooks(void)
+{
+	security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
+			LANDLOCK_NAME);
+}
diff --git a/security/landlock/fs.h b/security/landlock/fs.h
new file mode 100644
index 000000000000..9f14ec4d8d48
--- /dev/null
+++ b/security/landlock/fs.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - Filesystem management and hooks
+ *
+ * Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_FS_H
+#define _SECURITY_LANDLOCK_FS_H
+
+#include <linux/fs.h>
+#include <linux/init.h>
+#include <linux/rcupdate.h>
+
+#include "ruleset.h"
+#include "setup.h"
+
+struct landlock_inode_security {
+	/*
+	 * @object: Weak pointer to an allocated object. All writes (i.e.
+	 * creating a new object or removing one) are protected by the
+	 * underlying inode->i_lock. Disassociating @object from the inode is
+	 * additionally protected by @object->lock, from the time @object's
+	 * usage refcount drops to zero to the time this pointer is nulled out.
+	 * Cf. release_inode().
+	 */
+	struct landlock_object __rcu *object;
+};
+
+struct landlock_superblock_security {
+	/*
+	 * @inode_refs: References to Landlock underlying objects.
+	 * Cf. struct super_block->s_fsnotify_inode_refs .
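+	 * It counts pending iput() calls issued by release_inode() so that
+	 * hook_sb_delete() can wait for them to complete before the
+	 * superblock goes away.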
+	 */
+	atomic_long_t inode_refs;
+};
+
+static inline struct landlock_inode_security *landlock_inode(
+		const struct inode *const inode)
+{
+	return inode->i_security + landlock_blob_sizes.lbs_inode;
+}
+
+static inline struct landlock_superblock_security *landlock_superblock(
+		const struct super_block *const superblock)
+{
+	return superblock->s_security + landlock_blob_sizes.lbs_superblock;
+}
+
+__init void landlock_add_fs_hooks(void);
+
+int landlock_append_fs_rule(struct landlock_ruleset *const ruleset,
+		const struct path *const path, u32 access_hierarchy);
+
+#endif /* _SECURITY_LANDLOCK_FS_H */
diff --git a/security/landlock/limits.h b/security/landlock/limits.h
index b734f597bb0e..2a0a1095ee27 100644
--- a/security/landlock/limits.h
+++ b/security/landlock/limits.h
@@ -10,8 +10,12 @@
 #define _SECURITY_LANDLOCK_LIMITS_H
 
 #include <linux/limits.h>
+#include <uapi/linux/landlock.h>
 
 #define LANDLOCK_MAX_NUM_LAYERS		64
 #define LANDLOCK_MAX_NUM_RULES		U32_MAX
 
+#define LANDLOCK_LAST_ACCESS_FS		LANDLOCK_ACCESS_FS_MAKE_SYM
+#define LANDLOCK_MASK_ACCESS_FS		((LANDLOCK_LAST_ACCESS_FS << 1) - 1)
+
 #endif /* _SECURITY_LANDLOCK_LIMITS_H */
diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index 59c86126ea1c..9c15aea180a1 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -116,9 +116,11 @@ static void build_check_ruleset(void)
 		.num_rules = ~0,
 		.num_layers = ~0,
 	};
+	typeof(ruleset.fs_access_masks[0]) fs_access_mask = ~0;
 
 	BUILD_BUG_ON(ruleset.num_rules < LANDLOCK_MAX_NUM_RULES);
 	BUILD_BUG_ON(ruleset.num_layers < LANDLOCK_MAX_NUM_LAYERS);
+	BUILD_BUG_ON(fs_access_mask < LANDLOCK_MASK_ACCESS_FS);
 }
 
 /**
@@ -217,9 +219,11 @@ static void build_check_layer(void)
 {
 	const struct landlock_layer layer = {
 		.level = ~0,
+		.access = ~0,
 	};
 
 	BUILD_BUG_ON(layer.level < LANDLOCK_MAX_NUM_LAYERS);
+	BUILD_BUG_ON(layer.access < LANDLOCK_MASK_ACCESS_FS);
 }
 
 /* @ruleset must be locked by the caller. */
diff --git a/security/landlock/setup.c b/security/landlock/setup.c
index a5d6ef334991..f8e8e980454c 100644
--- a/security/landlock/setup.c
+++ b/security/landlock/setup.c
@@ -11,17 +11,24 @@
 
 #include "common.h"
 #include "cred.h"
+#include "fs.h"
 #include "ptrace.h"
 #include "setup.h"
 
+bool landlock_initialized __lsm_ro_after_init = false;
+
 struct lsm_blob_sizes landlock_blob_sizes __lsm_ro_after_init = {
 	.lbs_cred = sizeof(struct landlock_cred_security),
+	.lbs_inode = sizeof(struct landlock_inode_security),
+	.lbs_superblock = sizeof(struct landlock_superblock_security),
 };
 
 static int __init landlock_init(void)
 {
 	landlock_add_cred_hooks();
 	landlock_add_ptrace_hooks();
+	landlock_add_fs_hooks();
+	landlock_initialized = true;
 	pr_info("Up and running.\n");
 	return 0;
 }
diff --git a/security/landlock/setup.h b/security/landlock/setup.h
index 9fdbf33fcc33..1daffab1ab4b 100644
--- a/security/landlock/setup.h
+++ b/security/landlock/setup.h
@@ -11,6 +11,8 @@
 
 #include <linux/lsm_hooks.h>
 
+extern bool landlock_initialized;
+
 extern struct lsm_blob_sizes landlock_blob_sizes;
 
 #endif /* _SECURITY_LANDLOCK_SETUP_H */
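For readers who want to exercise the semantics above, here is a minimal
user-space sketch. It assumes the landlock_create_ruleset(2),
landlock_add_rule(2) and landlock_restrict_self(2) syscalls, the
LANDLOCK_RULE_PATH_BENEATH rule type, and the struct
landlock_ruleset_attr field names added elsewhere in this series (none
of which are part of this patch), so treat the names as illustrative:

	#include <fcntl.h>
	#include <linux/landlock.h>
	#include <linux/prctl.h>
	#include <sys/prctl.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	int main(void)
	{
		/* Handled accesses: what this sandbox is able to deny. */
		struct landlock_ruleset_attr ruleset_attr = {
			.handled_access_fs = LANDLOCK_ACCESS_FS_WRITE_FILE |
				LANDLOCK_ACCESS_FS_MAKE_REG,
		};
		/* Re-grant write and regular file creation beneath /tmp only. */
		struct landlock_path_beneath_attr path_beneath = {
			.allowed_access = LANDLOCK_ACCESS_FS_WRITE_FILE |
				LANDLOCK_ACCESS_FS_MAKE_REG,
		};
		int ruleset_fd;

		ruleset_fd = syscall(__NR_landlock_create_ruleset,
				&ruleset_attr, sizeof(ruleset_attr), 0);
		if (ruleset_fd < 0)
			return 1;
		path_beneath.parent_fd = open("/tmp", O_PATH | O_CLOEXEC);
		if (path_beneath.parent_fd < 0)
			return 1;
		if (syscall(__NR_landlock_add_rule, ruleset_fd,
				LANDLOCK_RULE_PATH_BENEATH, &path_beneath, 0))
			return 1;
		close(path_beneath.parent_fd);
		/* Mandatory before enforcing an unprivileged policy. */
		if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
			return 1;
		if (syscall(__NR_landlock_restrict_self, ruleset_fd, 0))
			return 1;
		close(ruleset_fd);
		/* From here on, handled accesses outside /tmp fail with EACCES. */
		return 0;
	}

Note how this lines up with landlock_append_fs_rule() and
check_access_path() above: accesses not in handled_access_fs are always
allowed, and an open(2) for write outside the /tmp hierarchy walks up
to the real root without every layer being unmasked, hence -EACCES.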