[v7,bpf-next,03/18] bpf: introduce BPF token object

Add a new kind of BPF kernel object, the BPF token. A BPF token is meant
to allow delegating privileged BPF functionality, like loading a BPF
program or creating a BPF map, from a privileged process to a *trusted*
unprivileged process, all while retaining a good amount of control over
which privileged operations can be performed using the provided BPF token.

This is achieved by mounting a BPF FS instance with extra delegation
mount options, which determine what operations are delegatable, and by
constraining that instance to its owning user namespace (as mentioned in
the previous patch).

A BPF token itself is just a derivative of BPF FS and can be created
through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts
a path specification (using the usual fd + string path combo) to a BPF
FS mount. Currently, a BPF token "inherits" the delegated command, map
type, prog type, and attach type bit sets from BPF FS as is. In the
future, with the BPF token being a separate object with its own FD, we
can allow further restricting the token's allowable set of things, either
at creation time or after the fact, letting a process guard itself
further from, e.g., unintentionally trying to load an undesired kind of
BPF program. But for now we keep things simple and just copy the bit sets
as is.
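
The "copy bit sets as is" step can be modeled in plain user-space C. This
is only a sketch: the struct and field names below are hypothetical
stand-ins chosen for illustration, not the definitions added by this
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the delegation bit sets set at BPF FS mount time. */
struct bpffs_opts_model {
	uint64_t delegate_cmds;    /* one bit per delegatable bpf() command */
	uint64_t delegate_maps;    /* one bit per delegatable map type */
	uint64_t delegate_progs;   /* one bit per delegatable prog type */
	uint64_t delegate_attachs; /* one bit per delegatable attach type */
};

/* Hypothetical model of the bit sets a token carries. */
struct bpf_token_model {
	uint64_t allowed_cmds;
	uint64_t allowed_maps;
	uint64_t allowed_progs;
	uint64_t allowed_attachs;
};

/* Copy every bit set verbatim; no narrowing happens at creation time. */
static void token_model_init(struct bpf_token_model *token,
			     const struct bpffs_opts_model *opts)
{
	assert(token != NULL && opts != NULL);
	token->allowed_cmds = opts->delegate_cmds;
	token->allowed_maps = opts->delegate_maps;
	token->allowed_progs = opts->delegate_progs;
	token->allowed_attachs = opts->delegate_attachs;
}
```

A future narrowing step, if added, would amount to AND-ing each field with
a caller-supplied mask after this copy.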

When a BPF token is created from a BPF FS mount, we take a reference to
the BPF super block's owning user namespace, and then use that namespace
for checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN}
capabilities. These are normally only checked against init userns (using
capable()), but when a BPF token is provided we check them using
ns_capable() instead. See bpf_token_capable() for details.
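
The namespace selection described above can be sketched as a small
user-space model. All types and helpers here are hypothetical; the real
bpf_token_capable() may differ in details (for instance, in how
CAP_SYS_ADMIN is treated relative to the other capabilities), so treat
this as an illustration of the idea only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability indices, for this model only. */
enum model_cap {
	MODEL_CAP_NET_ADMIN,
	MODEL_CAP_SYS_ADMIN,
	MODEL_CAP_PERFMON,
	MODEL_CAP_BPF,
	MODEL_CAP_COUNT,
};

/* Capabilities the calling process holds within one user namespace. */
struct userns_model {
	uint32_t caps;
};

/* A token pins the user namespace that owns the BPF FS super block. */
struct token_model {
	const struct userns_model *userns;
};

static bool model_ns_capable(const struct userns_model *ns, enum model_cap cap)
{
	return ns != NULL && (ns->caps & (1u << cap)) != 0;
}

/*
 * With a token, check the capability in the token's namespace;
 * without one, check it in the init namespace, as capable() would.
 */
static bool token_capable_model(const struct token_model *token,
				const struct userns_model *init_ns,
				enum model_cap cap)
{
	assert(cap < MODEL_CAP_COUNT);
	return model_ns_capable(token ? token->userns : init_ns, cap);
}
```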

Such a setup means that a BPF token by itself is not sufficient to grant
BPF functionality. A user-namespaced process has to *also* have the
necessary combination of capabilities inside that user namespace. So
while previously CAP_BPF was useless when granted within a user
namespace, now it gains a meaning and gives container managers and sys
admins flexible control over which processes can and need to use BPF
functionality within the user namespace (i.e., container in practice).
And BPF FS delegation mount options and derived BPF tokens serve as
a per-container "flag" to grant the overall ability to use bpf() (plus
further restrictions on which parts of the bpf() syscall are treated as
namespaced).
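
The "token plus capability" requirement boils down to a two-part gate,
sketched below as a user-space model. The struct, the helper, and the
command numbers used with it are illustrative assumptions, not code from
this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: what a token delegates and what its holder has. */
struct gate_token_model {
	uint64_t allowed_cmds;   /* one bit per delegated bpf() command */
	bool holder_has_cap_bpf; /* caller holds CAP_BPF in the token's userns */
};

/*
 * A command passes only if the token delegates it AND the caller holds
 * the capability in the token's user namespace. Either one alone is
 * not enough.
 */
static bool gate_allows_cmd(const struct gate_token_model *token,
			    unsigned int cmd)
{
	assert(cmd < 64);
	if (token == NULL)
		return false;
	if ((token->allowed_cmds & (1ULL << cmd)) == 0)
		return false;
	return token->holder_has_cap_bpf;
}
```

In this model a container manager controls the first condition through
the mount options and the second through the capabilities it grants to
individual processes.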

Note also that the BPF_TOKEN_CREATE command itself requires
ns_capable(CAP_BPF) within the BPF FS owning user namespace, rounding out
the ns_capable() story of the BPF token.

The alternatives to creating a BPF token object were:
  a) not having any extra object and just passing a BPF FS path to each
     relevant bpf() command. This seems suboptimal as it's racy (the mount
     under the same path might change between checking it and using it
     for a bpf() command). It is also less flexible if we'd like to
     further restrict ourselves compared to all the delegated
     functionality allowed on BPF FS.
  b) using a non-bpf() interface, e.g., ioctl(), but otherwise also
     creating a dedicated FD that would represent token-like
     functionality. This doesn't seem superior to having a proper bpf()
     command, so BPF_TOKEN_CREATE was chosen.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/bpf.h            |  41 +++++++
 include/uapi/linux/bpf.h       |  39 +++++++
 kernel/bpf/Makefile            |   2 +-
 kernel/bpf/inode.c             |  17 ++-
 kernel/bpf/syscall.c           |  17 +++
 kernel/bpf/token.c             | 208 +++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  39 +++++++
 7 files changed, 353 insertions(+), 10 deletions(-)
 create mode 100644 kernel/bpf/token.c

Message ID	20231012222810.4120312-4-andrii@kernel.org (mailing list archive)
State	Changes Requested
Delegated to:	Paul Moore
Headers
  Return-Path: <linux-security-module-owner@vger.kernel.org>
  From: Andrii Nakryiko <andrii@kernel.org>
  To: <bpf@vger.kernel.org>, <netdev@vger.kernel.org>
  CC: <linux-fsdevel@vger.kernel.org>, <linux-security-module@vger.kernel.org>, <keescook@chromium.org>, <brauner@kernel.org>, <lennart@poettering.net>, <kernel-team@meta.com>, <sargun@sargun.me>
  Subject: [PATCH v7 bpf-next 03/18] bpf: introduce BPF token object
  Date: Thu, 12 Oct 2023 15:27:55 -0700
  Message-ID: <20231012222810.4120312-4-andrii@kernel.org>
  In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org>
  References: <20231012222810.4120312-1-andrii@kernel.org>
  MIME-Version: 1.0
  Content-Transfer-Encoding: 8BIT
  Content-Type: text/plain
  Precedence: bulk
Series	BPF token and BPF FS-based delegation
  [v7,bpf-next,00/18] BPF token and BPF FS-based delegation
  [v7,bpf-next,01/18] bpf: align CAP_NET_ADMIN checks with bpf_capable() approach
  [v7,bpf-next,02/18] bpf: add BPF token delegation mount options to BPF FS
  [v7,bpf-next,03/18] bpf: introduce BPF token object
  [v7,bpf-next,04/18] bpf: add BPF token support to BPF_MAP_CREATE command
  [v7,bpf-next,05/18] bpf: add BPF token support to BPF_BTF_LOAD command
  [v7,bpf-next,06/18] bpf: add BPF token support to BPF_PROG_LOAD command
  [v7,bpf-next,07/18] bpf: take into account BPF token when fetching helper protos
  [v7,bpf-next,08/18] bpf: consistenly use BPF token throughout BPF verifier logic
  [v7,bpf-next,09/18] bpf,lsm: refactor bpf_prog_alloc/bpf_prog_free LSM hooks
  [v7,bpf-next,10/18] bpf,lsm: refactor bpf_map_alloc/bpf_map_free LSM hooks
  [v7,bpf-next,11/18] bpf,lsm: add bpf_token_create and bpf_token_free LSM hooks
  [v7,bpf-next,12/18] libbpf: add bpf_token_create() API
  [v7,bpf-next,13/18] selftests/bpf: fix test_maps' use of bpf_map_create_opts
  [v7,bpf-next,14/18] libbpf: add BPF token support to bpf_map_create() API
  [v7,bpf-next,15/18] libbpf: add BPF token support to bpf_btf_load() API
  [v7,bpf-next,16/18] libbpf: add BPF token support to bpf_prog_load() API
  [v7,bpf-next,17/18] selftests/bpf: add BPF token-enabled tests
  [v7,bpf-next,18/18] bpf,selinux: allocate bpf_security_struct per BPF token
