From patchwork Thu Oct 12 22:27:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419915 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DF817CDB474 for ; Thu, 12 Oct 2023 22:31:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443019AbjJLWbl convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:31:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55372 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443016AbjJLWbg (ORCPT ); Thu, 12 Oct 2023 18:31:36 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5DEF7E5 for ; Thu, 12 Oct 2023 15:31:35 -0700 (PDT) Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLuD7B010333 for ; Thu, 12 Oct 2023 15:31:34 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tpjt7aufu-5 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:31:34 -0700 Received: from twshared40933.03.prn6.facebook.com (2620:10d:c0a8:1c::11) by mail.thefacebook.com (2620:10d:c0a8:83::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:31:31 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id A58CD39A3A41D; Thu, 12 Oct 2023 15:29:04 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 01/18] bpf: align CAP_NET_ADMIN checks with bpf_capable() approach Date: Thu, 12 Oct 2023 15:27:53 -0700 Message-ID: <20231012222810.4120312-2-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: Pibg9uoNwWwN4hgWTLvkiM3upxmSb-tO X-Proofpoint-ORIG-GUID: Pibg9uoNwWwN4hgWTLvkiM3upxmSb-tO X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Within BPF syscall handling code CAP_NET_ADMIN checks stand out a bit compared to CAP_BPF and CAP_PERFMON checks. For the latter, CAP_BPF or CAP_PERFMON are checked first, but if they are not set, CAP_SYS_ADMIN takes over and grants whatever part of BPF syscall is required. Similar kind of checks that involve CAP_NET_ADMIN are not so consistent. One out of four uses does follow CAP_BPF/CAP_PERFMON model: during BPF_PROG_LOAD, if the type of BPF program is "network-related" either CAP_NET_ADMIN or CAP_SYS_ADMIN is required to proceed. But in three other cases CAP_NET_ADMIN is required even if CAP_SYS_ADMIN is set: - when creating DEVMAP/XDKMAP/CPU_MAP maps; - when attaching CGROUP_SKB programs; - when handling BPF_PROG_QUERY command. This patch is changing the latter three cases to follow BPF_PROG_LOAD model, that is allowing to proceed under either CAP_NET_ADMIN or CAP_SYS_ADMIN. This also makes it cleaner in subsequent BPF token patches to switch wholesomely to a generic bpf_token_capable(int cap) check, that always falls back to CAP_SYS_ADMIN if requested capability is missing. Cc: Jakub Kicinski Signed-off-by: Andrii Nakryiko --- kernel/bpf/syscall.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 8677837f3deb..d5cf8f4a548b 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1097,6 +1097,11 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf, return ret; } +static bool bpf_net_capable(void) +{ + return capable(CAP_NET_ADMIN) || capable(CAP_SYS_ADMIN); +} + #define BPF_MAP_CREATE_LAST_FIELD map_extra /* called via syscall */ static int map_create(union bpf_attr *attr) @@ -1200,7 +1205,7 @@ static int map_create(union bpf_attr *attr) case BPF_MAP_TYPE_DEVMAP: case BPF_MAP_TYPE_DEVMAP_HASH: case BPF_MAP_TYPE_XSKMAP: - if (!capable(CAP_NET_ADMIN)) + if (!bpf_net_capable()) return -EPERM; break; default: @@ -2600,7 +2605,7 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) !bpf_capable()) return -EPERM; - if (is_net_admin_prog_type(type) && !capable(CAP_NET_ADMIN) && !capable(CAP_SYS_ADMIN)) + if (is_net_admin_prog_type(type) && !bpf_net_capable()) return -EPERM; if (is_perfmon_prog_type(type) && !perfmon_capable()) return -EPERM; @@ -3750,7 +3755,7 @@ static int bpf_prog_attach_check_attach_type(const struct bpf_prog *prog, case BPF_PROG_TYPE_SK_LOOKUP: return attach_type == prog->expected_attach_type ? 0 : -EINVAL; case BPF_PROG_TYPE_CGROUP_SKB: - if (!capable(CAP_NET_ADMIN)) + if (!bpf_net_capable()) /* cg-skb progs can be loaded by unpriv user. * check permissions at attach time. */ @@ -3934,7 +3939,7 @@ static int bpf_prog_detach(const union bpf_attr *attr) static int bpf_prog_query(const union bpf_attr *attr, union bpf_attr __user *uattr) { - if (!capable(CAP_NET_ADMIN)) + if (!bpf_net_capable()) return -EPERM; if (CHECK_ATTR(BPF_PROG_QUERY)) return -EINVAL; From patchwork Thu Oct 12 22:27:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419911 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2299CCDB47E for ; Thu, 12 Oct 2023 22:31:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443014AbjJLWb3 convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:31:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55034 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1442981AbjJLWb2 (ORCPT ); Thu, 12 Oct 2023 18:31:28 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2F17CD6 for ; Thu, 12 Oct 2023 15:31:26 -0700 (PDT) Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLu7XT010229 for ; Thu, 12 Oct 2023 15:31:25 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tpjt7athy-5 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:31:25 -0700 Received: from twshared15247.17.frc2.facebook.com (2620:10d:c0a8:1b::30) by mail.thefacebook.com (2620:10d:c0a8:82::b) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:31:23 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id F3FC239A3A46E; Thu, 12 Oct 2023 15:29:06 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 02/18] bpf: add BPF token delegation mount options to BPF FS Date: Thu, 12 Oct 2023 15:27:54 -0700 Message-ID: <20231012222810.4120312-3-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> X-FB-Internal: Safe X-Proofpoint-GUID: woMI7rui86hA9_10cyMrjEveGgvXy9NZ X-Proofpoint-ORIG-GUID: woMI7rui86hA9_10cyMrjEveGgvXy9NZ X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Add few new mount options to BPF FS that allow to specify that a given BPF FS instance allows creation of BPF token (added in the next patch), and what sort of operations are allowed under BPF token. As such, we get 4 new mount options, each is a bit mask - `delegate_cmds` allow to specify which bpf() syscall commands are allowed with BPF token derived from this BPF FS instance; - if BPF_MAP_CREATE command is allowed, `delegate_maps` specifies a set of allowable BPF map types that could be created with BPF token; - if BPF_PROG_LOAD command is allowed, `delegate_progs` specifies a set of allowable BPF program types that could be loaded with BPF token; - if BPF_PROG_LOAD command is allowed, `delegate_attachs` specifies a set of allowable BPF program attach types that could be loaded with BPF token; delegate_progs and delegate_attachs are meant to be used together, as full BPF program type is, in general, determined through both program type and program attach type. Currently, these mount options accept the following forms of values: - a special value "any", that enables all possible values of a given bit set; - numeric value (decimal or hexadecimal, determined by kernel automatically) that specifies a bit mask value directly; - all the values for a given mount option are combined, if specified multiple times. E.g., `mount -t bpf nodev /path/to/mount -o delegate_maps=0x1 -o delegate_maps=0x2` will result in a combined 0x3 mask. Ideally, more convenient (for humans) symbolic form derived from corresponding UAPI enums would be accepted (e.g., `-o delegate_progs=kprobe|tracepoint`) and I intend to implement this, but it requires a bunch of UAPI header churn, so I postponed it until this feature lands upstream or at least there is a definite consensus that this feature is acceptable and is going to make it, just to minimize amount of wasted effort and not increase amount of non-essential code to be reviewed. Attentive reader will notice that BPF FS is now marked as FS_USERNS_MOUNT, which theoretically makes it mountable inside non-init user namespace as long as the process has sufficient *namespaced* capabilities within that user namespace. But in reality we still restrict BPF FS to be mountable only by processes with CAP_SYS_ADMIN *in init userns* (extra check in bpf_fill_super()). FS_USERNS_MOUNT is added to allow creating BPF FS context object (i.e., fsopen("bpf")) from inside unprivileged process inside non-init userns, to capture that userns as the owning userns. It will still be required to pass this context object back to privileged process to instantiate and mount it. This manipulation is important, because capturing non-init userns as the owning userns of BPF FS instance (super block) allows to use that userns to constraint BPF token to that userns later on (see next patch). So creating BPF FS with delegation inside unprivileged userns will restrict derived BPF token objects to only "work" inside that intended userns, making it scoped to a intended "container". There is a set of selftests at the end of the patch set that simulates this sequence of steps and validates that everything works as intended. But careful review is requested to make sure there are no missed gaps in the implementation and testing. All this is based on suggestions and discussions with Christian Brauner ([0]), to the best of my ability to follow all the implications. [0] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/ Signed-off-by: Andrii Nakryiko --- include/linux/bpf.h | 10 ++++++ kernel/bpf/inode.c | 88 +++++++++++++++++++++++++++++++++++++++------ 2 files changed, 88 insertions(+), 10 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 61bde4520f5c..3b3270ef11cc 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1562,6 +1562,16 @@ struct bpf_link_primer { u32 id; }; +struct bpf_mount_opts { + umode_t mode; + + /* BPF token-related delegation options */ + u64 delegate_cmds; + u64 delegate_maps; + u64 delegate_progs; + u64 delegate_attachs; +}; + struct bpf_struct_ops_value; struct btf_member; diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c index 99d0625b6c82..24b3faf901f4 100644 --- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -20,6 +20,7 @@ #include #include #include +#include #include "preload/bpf_preload.h" enum bpf_type { @@ -600,10 +601,31 @@ EXPORT_SYMBOL(bpf_prog_get_type_path); */ static int bpf_show_options(struct seq_file *m, struct dentry *root) { + struct bpf_mount_opts *opts = root->d_sb->s_fs_info; umode_t mode = d_inode(root)->i_mode & S_IALLUGO & ~S_ISVTX; if (mode != S_IRWXUGO) seq_printf(m, ",mode=%o", mode); + + if (opts->delegate_cmds == ~0ULL) + seq_printf(m, ",delegate_cmds=any"); + else if (opts->delegate_cmds) + seq_printf(m, ",delegate_cmds=0x%llx", opts->delegate_cmds); + + if (opts->delegate_maps == ~0ULL) + seq_printf(m, ",delegate_maps=any"); + else if (opts->delegate_maps) + seq_printf(m, ",delegate_maps=0x%llx", opts->delegate_maps); + + if (opts->delegate_progs == ~0ULL) + seq_printf(m, ",delegate_progs=any"); + else if (opts->delegate_progs) + seq_printf(m, ",delegate_progs=0x%llx", opts->delegate_progs); + + if (opts->delegate_attachs == ~0ULL) + seq_printf(m, ",delegate_attachs=any"); + else if (opts->delegate_attachs) + seq_printf(m, ",delegate_attachs=0x%llx", opts->delegate_attachs); return 0; } @@ -627,22 +649,27 @@ static const struct super_operations bpf_super_ops = { enum { OPT_MODE, + OPT_DELEGATE_CMDS, + OPT_DELEGATE_MAPS, + OPT_DELEGATE_PROGS, + OPT_DELEGATE_ATTACHS, }; static const struct fs_parameter_spec bpf_fs_parameters[] = { fsparam_u32oct ("mode", OPT_MODE), + fsparam_string ("delegate_cmds", OPT_DELEGATE_CMDS), + fsparam_string ("delegate_maps", OPT_DELEGATE_MAPS), + fsparam_string ("delegate_progs", OPT_DELEGATE_PROGS), + fsparam_string ("delegate_attachs", OPT_DELEGATE_ATTACHS), {} }; -struct bpf_mount_opts { - umode_t mode; -}; - static int bpf_parse_param(struct fs_context *fc, struct fs_parameter *param) { - struct bpf_mount_opts *opts = fc->fs_private; + struct bpf_mount_opts *opts = fc->s_fs_info; struct fs_parse_result result; - int opt; + int opt, err; + u64 msk; opt = fs_parse(fc, bpf_fs_parameters, param, &result); if (opt < 0) { @@ -666,6 +693,25 @@ static int bpf_parse_param(struct fs_context *fc, struct fs_parameter *param) case OPT_MODE: opts->mode = result.uint_32 & S_IALLUGO; break; + case OPT_DELEGATE_CMDS: + case OPT_DELEGATE_MAPS: + case OPT_DELEGATE_PROGS: + case OPT_DELEGATE_ATTACHS: + if (strcmp(param->string, "any") == 0) { + msk = ~0ULL; + } else { + err = kstrtou64(param->string, 0, &msk); + if (err) + return err; + } + switch (opt) { + case OPT_DELEGATE_CMDS: opts->delegate_cmds |= msk; break; + case OPT_DELEGATE_MAPS: opts->delegate_maps |= msk; break; + case OPT_DELEGATE_PROGS: opts->delegate_progs |= msk; break; + case OPT_DELEGATE_ATTACHS: opts->delegate_attachs |= msk; break; + default: return -EINVAL; + } + break; } return 0; @@ -740,10 +786,14 @@ static int populate_bpffs(struct dentry *parent) static int bpf_fill_super(struct super_block *sb, struct fs_context *fc) { static const struct tree_descr bpf_rfiles[] = { { "" } }; - struct bpf_mount_opts *opts = fc->fs_private; + struct bpf_mount_opts *opts = sb->s_fs_info; struct inode *inode; int ret; + /* Mounting an instance of BPF FS requires privileges */ + if (fc->user_ns != &init_user_ns && !capable(CAP_SYS_ADMIN)) + return -EPERM; + ret = simple_fill_super(sb, BPF_FS_MAGIC, bpf_rfiles); if (ret) return ret; @@ -765,7 +815,10 @@ static int bpf_get_tree(struct fs_context *fc) static void bpf_free_fc(struct fs_context *fc) { - kfree(fc->fs_private); + struct bpf_mount_opts *opts = fc->s_fs_info; + + if (opts) + kfree(opts); } static const struct fs_context_operations bpf_context_ops = { @@ -787,17 +840,32 @@ static int bpf_init_fs_context(struct fs_context *fc) opts->mode = S_IRWXUGO; - fc->fs_private = opts; + /* start out with no BPF token delegation enabled */ + opts->delegate_cmds = 0; + opts->delegate_maps = 0; + opts->delegate_progs = 0; + opts->delegate_attachs = 0; + + fc->s_fs_info = opts; fc->ops = &bpf_context_ops; return 0; } +static void bpf_kill_super(struct super_block *sb) +{ + struct bpf_mount_opts *opts = sb->s_fs_info; + + kill_litter_super(sb); + kfree(opts); +} + static struct file_system_type bpf_fs_type = { .owner = THIS_MODULE, .name = "bpf", .init_fs_context = bpf_init_fs_context, .parameters = bpf_fs_parameters, - .kill_sb = kill_litter_super, + .kill_sb = bpf_kill_super, + .fs_flags = FS_USERNS_MOUNT, }; static int __init bpf_init(void) From patchwork Thu Oct 12 22:27:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419914 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2575FCDB474 for ; Thu, 12 Oct 2023 22:31:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443000AbjJLWbd convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:31:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55074 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443016AbjJLWbc (ORCPT ); Thu, 12 Oct 2023 18:31:32 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1739CC for ; Thu, 12 Oct 2023 15:31:29 -0700 (PDT) Received: from pps.filterd (m0109334.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLtvkd019886 for ; Thu, 12 Oct 2023 15:31:29 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tp16cjvk2-10 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:31:29 -0700 Received: from twshared11278.41.prn1.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:11d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:31:25 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 0F50E39A3A4A2; Thu, 12 Oct 2023 15:29:09 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 03/18] bpf: introduce BPF token object Date: Thu, 12 Oct 2023 15:27:55 -0700 Message-ID: <20231012222810.4120312-4-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: vHzLmLaWrIz1qW0lqxafbsb3qtdHQIFa X-Proofpoint-ORIG-GUID: vHzLmLaWrIz1qW0lqxafbsb3qtdHQIFa X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Add new kind of BPF kernel object, BPF token. BPF token is meant to allow delegating privileged BPF functionality, like loading a BPF program or creating a BPF map, from privileged process to a *trusted* unprivileged process, all while have a good amount of control over which privileged operations could be performed using provided BPF token. This is achieved through mounting BPF FS instance with extra delegation mount options, which determine what operations are delegatable, and also constraining it to the owning user namespace (as mentioned in the previous patch). BPF token itself is just a derivative from BPF FS and can be created through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts a path specification (using the usual fd + string path combo) to a BPF FS mount. Currently, BPF token "inherits" delegated command, map types, prog type, and attach type bit sets from BPF FS as is. In the future, having an BPF token as a separate object with its own FD, we can allow to further restrict BPF token's allowable set of things either at the creation time or after the fact, allowing the process to guard itself further from, e.g., unintentionally trying to load undesired kind of BPF programs. But for now we keep things simple and just copy bit sets as is. When BPF token is created from BPF FS mount, we take reference to the BPF super block's owning user namespace, and then use that namespace for checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN} capabilities that are normally only checked against init userns (using capable()), but now we check them using ns_capable() instead (if BPF token is provided). See bpf_token_capable() for details. Such setup means that BPF token in itself is not sufficient to grant BPF functionality. User namespaced process has to *also* have necessary combination of capabilities inside that user namespace. So while previously CAP_BPF was useless when granted within user namespace, now it gains a meaning and allows container managers and sys admins to have a flexible control over which processes can and need to use BPF functionality within the user namespace (i.e., container in practice). And BPF FS delegation mount options and derived BPF tokens serve as a per-container "flag" to grant overall ability to use bpf() (plus further restrict on which parts of bpf() syscalls are treated as namespaced). Note also, BPF_TOKEN_CREATE command itself requires ns_capable(CAP_BPF) within the BPF FS owning user namespace, rounding up the ns_capable() story of BPF token. The alternative to creating BPF token object was: a) not having any extra object and just pasing BPF FS path to each relevant bpf() command. This seems suboptimal as it's racy (mount under the same path might change in between checking it and using it for bpf() command). And also less flexible if we'd like to further restrict ourselves compared to all the delegated functionality allowed on BPF FS. b) use non-bpf() interface, e.g., ioctl(), but otherwise also create a dedicated FD that would represent a token-like functionality. This doesn't seem superior to having a proper bpf() command, so BPF_TOKEN_CREATE was chosen. Signed-off-by: Andrii Nakryiko --- include/linux/bpf.h | 41 +++++++ include/uapi/linux/bpf.h | 39 +++++++ kernel/bpf/Makefile | 2 +- kernel/bpf/inode.c | 17 ++- kernel/bpf/syscall.c | 17 +++ kernel/bpf/token.c | 208 +++++++++++++++++++++++++++++++++ tools/include/uapi/linux/bpf.h | 39 +++++++ 7 files changed, 353 insertions(+), 10 deletions(-) create mode 100644 kernel/bpf/token.c diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3b3270ef11cc..afa4c68cdbf4 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -51,6 +51,10 @@ struct module; struct bpf_func_state; struct ftrace_ops; struct cgroup; +struct bpf_token; +struct user_namespace; +struct super_block; +struct inode; extern struct idr btf_idr; extern spinlock_t btf_idr_lock; @@ -1572,6 +1576,13 @@ struct bpf_mount_opts { u64 delegate_attachs; }; +struct bpf_token { + struct work_struct work; + atomic64_t refcnt; + struct user_namespace *userns; + u64 allowed_cmds; +}; + struct bpf_struct_ops_value; struct btf_member; @@ -2029,6 +2040,7 @@ static inline void bpf_enable_instrumentation(void) migrate_enable(); } +extern const struct super_operations bpf_super_ops; extern const struct file_operations bpf_map_fops; extern const struct file_operations bpf_prog_fops; extern const struct file_operations bpf_iter_fops; @@ -2162,6 +2174,8 @@ static inline void bpf_map_dec_elem_count(struct bpf_map *map) extern int sysctl_unprivileged_bpf_disabled; +bool bpf_token_capable(const struct bpf_token *token, int cap); + static inline bool bpf_allow_ptr_leaks(void) { return perfmon_capable(); @@ -2196,8 +2210,17 @@ int bpf_link_new_fd(struct bpf_link *link); struct bpf_link *bpf_link_get_from_fd(u32 ufd); struct bpf_link *bpf_link_get_curr_or_next(u32 *id); +void bpf_token_inc(struct bpf_token *token); +void bpf_token_put(struct bpf_token *token); +int bpf_token_create(union bpf_attr *attr); +struct bpf_token *bpf_token_get_from_fd(u32 ufd); + +bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd); + int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname); int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags); +struct inode *bpf_get_inode(struct super_block *sb, const struct inode *dir, + umode_t mode); #define BPF_ITER_FUNC_PREFIX "bpf_iter_" #define DEFINE_BPF_ITER_FUNC(target, args...) \ @@ -2557,6 +2580,24 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags) return -EOPNOTSUPP; } +static inline bool bpf_token_capable(const struct bpf_token *token, int cap) +{ + return capable(cap) || (cap != CAP_SYS_ADMIN && capable(CAP_SYS_ADMIN)); +} + +static inline void bpf_token_inc(struct bpf_token *token) +{ +} + +static inline void bpf_token_put(struct bpf_token *token) +{ +} + +static inline struct bpf_token *bpf_token_get_from_fd(u32 ufd) +{ + return ERR_PTR(-EOPNOTSUPP); +} + static inline void __dev_flush(void) { } diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 7ba61b75bc0e..12d6c3b755e6 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -847,6 +847,37 @@ union bpf_iter_link_info { * Returns zero on success. On error, -1 is returned and *errno* * is set appropriately. * + * BPF_TOKEN_CREATE + * Description + * Create BPF token with embedded information about what + * BPF-related functionality it allows: + * - a set of allowed bpf() syscall commands; + * - a set of allowed BPF map types to be created with + * BPF_MAP_CREATE command, if BPF_MAP_CREATE itself is allowed; + * - a set of allowed BPF program types and BPF program attach + * types to be loaded with BPF_PROG_LOAD command, if + * BPF_PROG_LOAD itself is allowed. + * + * BPF token is created (derived) from an instance of BPF FS, + * assuming it has necessary delegation mount options specified. + * BPF FS mount is specified with openat()-style path FD + string. + * This BPF token can be passed as an extra parameter to various + * bpf() syscall commands to grant BPF subsystem functionality to + * unprivileged processes. + * + * When created, BPF token is "associated" with the owning + * user namespace of BPF FS instance (super block) that it was + * derived from, and subsequent BPF operations performed with + * BPF token would be performing capabilities checks (i.e., + * CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN) within + * that user namespace. Without BPF token, such capabilities + * have to be granted in init user namespace, making bpf() + * syscall incompatible with user namespace, for the most part. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * * NOTES * eBPF objects (maps and programs) can be shared between processes. * @@ -901,6 +932,8 @@ enum bpf_cmd { BPF_ITER_CREATE, BPF_LINK_DETACH, BPF_PROG_BIND_MAP, + BPF_TOKEN_CREATE, + __MAX_BPF_CMD, }; enum bpf_map_type { @@ -1699,6 +1732,12 @@ union bpf_attr { __u32 flags; /* extra flags */ } prog_bind_map; + struct { /* struct used by BPF_TOKEN_CREATE command */ + __u32 flags; + __u32 bpffs_path_fd; + __aligned_u64 bpffs_pathname; + } token_create; + } __attribute__((aligned(8))); /* The description below is an attempt at providing documentation to eBPF diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index f526b7573e97..4ce95acfcaa7 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -6,7 +6,7 @@ cflags-nogcse-$(CONFIG_X86)$(CONFIG_CC_IS_GCC) := -fno-gcse endif CFLAGS_core.o += $(call cc-disable-warning, override-init) $(cflags-nogcse-yy) -obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o log.o +obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o log.o token.o obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c index 24b3faf901f4..1dcd00c7f782 100644 --- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -99,9 +99,9 @@ static const struct inode_operations bpf_prog_iops = { }; static const struct inode_operations bpf_map_iops = { }; static const struct inode_operations bpf_link_iops = { }; -static struct inode *bpf_get_inode(struct super_block *sb, - const struct inode *dir, - umode_t mode) +struct inode *bpf_get_inode(struct super_block *sb, + const struct inode *dir, + umode_t mode) { struct inode *inode; @@ -603,11 +603,13 @@ static int bpf_show_options(struct seq_file *m, struct dentry *root) { struct bpf_mount_opts *opts = root->d_sb->s_fs_info; umode_t mode = d_inode(root)->i_mode & S_IALLUGO & ~S_ISVTX; + u64 mask; if (mode != S_IRWXUGO) seq_printf(m, ",mode=%o", mode); - if (opts->delegate_cmds == ~0ULL) + mask = (1ULL << __MAX_BPF_CMD) - 1; + if ((opts->delegate_cmds & mask) == mask) seq_printf(m, ",delegate_cmds=any"); else if (opts->delegate_cmds) seq_printf(m, ",delegate_cmds=0x%llx", opts->delegate_cmds); @@ -640,7 +642,7 @@ static void bpf_free_inode(struct inode *inode) free_inode_nonrcu(inode); } -static const struct super_operations bpf_super_ops = { +const struct super_operations bpf_super_ops = { .statfs = simple_statfs, .drop_inode = generic_delete_inode, .show_options = bpf_show_options, @@ -815,10 +817,7 @@ static int bpf_get_tree(struct fs_context *fc) static void bpf_free_fc(struct fs_context *fc) { - struct bpf_mount_opts *opts = fc->s_fs_info; - - if (opts) - kfree(opts); + kfree(fc->s_fs_info); } static const struct fs_context_operations bpf_context_ops = { diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index d5cf8f4a548b..6d51d83b1406 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -5319,6 +5319,20 @@ static int bpf_prog_bind_map(union bpf_attr *attr) return ret; } +#define BPF_TOKEN_CREATE_LAST_FIELD token_create.bpffs_pathname + +static int token_create(union bpf_attr *attr) +{ + if (CHECK_ATTR(BPF_TOKEN_CREATE)) + return -EINVAL; + + /* no flags are supported yet */ + if (attr->token_create.flags) + return -EINVAL; + + return bpf_token_create(attr); +} + static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size) { union bpf_attr attr; @@ -5452,6 +5466,9 @@ static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size) case BPF_PROG_BIND_MAP: err = bpf_prog_bind_map(&attr); break; + case BPF_TOKEN_CREATE: + err = token_create(&attr); + break; default: err = -EINVAL; break; diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c new file mode 100644 index 000000000000..638b9379a507 --- /dev/null +++ b/kernel/bpf/token.c @@ -0,0 +1,208 @@ +#include +#include +#include +#include +#include +#include +#include +#include +#include + +bool bpf_token_capable(const struct bpf_token *token, int cap) +{ + /* BPF token allows ns_capable() level of capabilities */ + if (token) { + if (ns_capable(token->userns, cap)) + return true; + if (cap != CAP_SYS_ADMIN && ns_capable(token->userns, CAP_SYS_ADMIN)) + return true; + } + /* otherwise fallback to capable() checks */ + return capable(cap) || (cap != CAP_SYS_ADMIN && capable(CAP_SYS_ADMIN)); +} + +void bpf_token_inc(struct bpf_token *token) +{ + atomic64_inc(&token->refcnt); +} + +static void bpf_token_free(struct bpf_token *token) +{ + put_user_ns(token->userns); + kvfree(token); +} + +static void bpf_token_put_deferred(struct work_struct *work) +{ + struct bpf_token *token = container_of(work, struct bpf_token, work); + + bpf_token_free(token); +} + +void bpf_token_put(struct bpf_token *token) +{ + if (!token) + return; + + if (!atomic64_dec_and_test(&token->refcnt)) + return; + + INIT_WORK(&token->work, bpf_token_put_deferred); + schedule_work(&token->work); +} + +static int bpf_token_release(struct inode *inode, struct file *filp) +{ + struct bpf_token *token = filp->private_data; + + bpf_token_put(token); + return 0; +} + +static void bpf_token_show_fdinfo(struct seq_file *m, struct file *filp) +{ + struct bpf_token *token = filp->private_data; + u64 mask; + + BUILD_BUG_ON(__MAX_BPF_CMD >= 64); + mask = (1ULL << __MAX_BPF_CMD) - 1; + if ((token->allowed_cmds & mask) == mask) + seq_printf(m, "allowed_cmds:\tany\n"); + else + seq_printf(m, "allowed_cmds:\t0x%llx\n", token->allowed_cmds); +} + +static struct bpf_token *bpf_token_alloc(void) +{ + struct bpf_token *token; + + token = kvzalloc(sizeof(*token), GFP_USER); + if (!token) + return NULL; + + atomic64_set(&token->refcnt, 1); + + return token; +} + +#define BPF_TOKEN_INODE_NAME "bpf-token" + +static const struct inode_operations bpf_token_iops = { }; + +static const struct file_operations bpf_token_fops = { + .release = bpf_token_release, + .show_fdinfo = bpf_token_show_fdinfo, +}; + +int bpf_token_create(union bpf_attr *attr) +{ + struct bpf_mount_opts *mnt_opts; + struct bpf_token *token = NULL; + struct user_namespace *userns; + struct inode *inode; + struct file *file; + struct path path; + umode_t mode; + int err, fd; + + err = user_path_at(attr->token_create.bpffs_path_fd, + u64_to_user_ptr(attr->token_create.bpffs_pathname), + LOOKUP_FOLLOW | LOOKUP_EMPTY, &path); + if (err) + return err; + + if (path.mnt->mnt_root != path.dentry) { + err = -EINVAL; + goto out_path; + } + if (path.mnt->mnt_sb->s_op != &bpf_super_ops) { + err = -EINVAL; + goto out_path; + } + err = path_permission(&path, MAY_ACCESS); + if (err) + goto out_path; + + userns = path.dentry->d_sb->s_user_ns; + if (!ns_capable(userns, CAP_BPF)) { + err = -EPERM; + goto out_path; + } + + mode = S_IFREG | ((S_IRUSR | S_IWUSR) & ~current_umask()); + inode = bpf_get_inode(path.mnt->mnt_sb, NULL, mode); + if (IS_ERR(inode)) { + err = PTR_ERR(inode); + goto out_path; + } + + inode->i_op = &bpf_token_iops; + inode->i_fop = &bpf_token_fops; + clear_nlink(inode); /* make sure it is unlinked */ + + file = alloc_file_pseudo(inode, path.mnt, BPF_TOKEN_INODE_NAME, O_RDWR, &bpf_token_fops); + if (IS_ERR(file)) { + iput(inode); + err = PTR_ERR(file); + goto out_path; + } + + token = bpf_token_alloc(); + if (!token) { + err = -ENOMEM; + goto out_file; + } + + /* remember bpffs owning userns for future ns_capable() checks */ + token->userns = get_user_ns(userns); + + mnt_opts = path.dentry->d_sb->s_fs_info; + token->allowed_cmds = mnt_opts->delegate_cmds; + + fd = get_unused_fd_flags(O_CLOEXEC); + if (fd < 0) { + err = fd; + goto out_token; + } + + file->private_data = token; + fd_install(fd, file); + + path_put(&path); + return fd; + +out_token: + bpf_token_free(token); +out_file: + fput(file); +out_path: + path_put(&path); + return err; +} + +struct bpf_token *bpf_token_get_from_fd(u32 ufd) +{ + struct fd f = fdget(ufd); + struct bpf_token *token; + + if (!f.file) + return ERR_PTR(-EBADF); + if (f.file->f_op != &bpf_token_fops) { + fdput(f); + return ERR_PTR(-EINVAL); + } + + token = f.file->private_data; + bpf_token_inc(token); + fdput(f); + + return token; +} + +bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd) +{ + if (!token) + return false; + + return token->allowed_cmds & (1ULL << cmd); +} diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 7ba61b75bc0e..12d6c3b755e6 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -847,6 +847,37 @@ union bpf_iter_link_info { * Returns zero on success. On error, -1 is returned and *errno* * is set appropriately. * + * BPF_TOKEN_CREATE + * Description + * Create BPF token with embedded information about what + * BPF-related functionality it allows: + * - a set of allowed bpf() syscall commands; + * - a set of allowed BPF map types to be created with + * BPF_MAP_CREATE command, if BPF_MAP_CREATE itself is allowed; + * - a set of allowed BPF program types and BPF program attach + * types to be loaded with BPF_PROG_LOAD command, if + * BPF_PROG_LOAD itself is allowed. + * + * BPF token is created (derived) from an instance of BPF FS, + * assuming it has necessary delegation mount options specified. + * BPF FS mount is specified with openat()-style path FD + string. + * This BPF token can be passed as an extra parameter to various + * bpf() syscall commands to grant BPF subsystem functionality to + * unprivileged processes. + * + * When created, BPF token is "associated" with the owning + * user namespace of BPF FS instance (super block) that it was + * derived from, and subsequent BPF operations performed with + * BPF token would be performing capabilities checks (i.e., + * CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN) within + * that user namespace. Without BPF token, such capabilities + * have to be granted in init user namespace, making bpf() + * syscall incompatible with user namespace, for the most part. + * + * Return + * A new file descriptor (a nonnegative integer), or -1 if an + * error occurred (in which case, *errno* is set appropriately). + * * NOTES * eBPF objects (maps and programs) can be shared between processes. * @@ -901,6 +932,8 @@ enum bpf_cmd { BPF_ITER_CREATE, BPF_LINK_DETACH, BPF_PROG_BIND_MAP, + BPF_TOKEN_CREATE, + __MAX_BPF_CMD, }; enum bpf_map_type { @@ -1699,6 +1732,12 @@ union bpf_attr { __u32 flags; /* extra flags */ } prog_bind_map; + struct { /* struct used by BPF_TOKEN_CREATE command */ + __u32 flags; + __u32 bpffs_path_fd; + __aligned_u64 bpffs_pathname; + } token_create; + } __attribute__((aligned(8))); /* The description below is an attempt at providing documentation to eBPF From patchwork Thu Oct 12 22:27:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419913 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CE809CDB47E for ; Thu, 12 Oct 2023 22:31:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442989AbjJLWbd convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:31:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55066 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443000AbjJLWba (ORCPT ); Thu, 12 Oct 2023 18:31:30 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3B29BCA for ; Thu, 12 Oct 2023 15:31:29 -0700 (PDT) Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLu4ZV019398 for ; Thu, 12 Oct 2023 15:31:29 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tpjtjj0wr-10 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:31:28 -0700 Received: from twshared9518.03.prn6.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:31:25 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 1FC1B39A3A4B0; Thu, 12 Oct 2023 15:29:11 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 04/18] bpf: add BPF token support to BPF_MAP_CREATE command Date: Thu, 12 Oct 2023 15:27:56 -0700 Message-ID: <20231012222810.4120312-5-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: g2dWALVeR3MAV7GA_kEFuiMB8L_f_gYQ X-Proofpoint-GUID: g2dWALVeR3MAV7GA_kEFuiMB8L_f_gYQ X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Allow providing token_fd for BPF_MAP_CREATE command to allow controlled BPF map creation from unprivileged process through delegated BPF token. Wire through a set of allowed BPF map types to BPF token, derived from BPF FS at BPF token creation time. This, in combination with allowed_cmds allows to create a narrowly-focused BPF token (controlled by privileged agent) with a restrictive set of BPF maps that application can attempt to create. Signed-off-by: Andrii Nakryiko --- include/linux/bpf.h | 2 + include/uapi/linux/bpf.h | 2 + kernel/bpf/inode.c | 3 +- kernel/bpf/syscall.c | 52 ++++++++++++++----- kernel/bpf/token.c | 16 ++++++ tools/include/uapi/linux/bpf.h | 2 + .../selftests/bpf/prog_tests/libbpf_probes.c | 2 + .../selftests/bpf/prog_tests/libbpf_str.c | 3 ++ 8 files changed, 67 insertions(+), 15 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index afa4c68cdbf4..35fc0b06365c 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1581,6 +1581,7 @@ struct bpf_token { atomic64_t refcnt; struct user_namespace *userns; u64 allowed_cmds; + u64 allowed_maps; }; struct bpf_struct_ops_value; @@ -2216,6 +2217,7 @@ int bpf_token_create(union bpf_attr *attr); struct bpf_token *bpf_token_get_from_fd(u32 ufd); bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd); +bool bpf_token_allow_map_type(const struct bpf_token *token, enum bpf_map_type type); int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname); int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 12d6c3b755e6..78deb9f74379 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -984,6 +984,7 @@ enum bpf_map_type { BPF_MAP_TYPE_BLOOM_FILTER, BPF_MAP_TYPE_USER_RINGBUF, BPF_MAP_TYPE_CGRP_STORAGE, + __MAX_BPF_MAP_TYPE }; /* Note that tracing related programs such as @@ -1428,6 +1429,7 @@ union bpf_attr { * to using 5 hash functions). */ __u64 map_extra; + __u32 map_token_fd; }; struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */ diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c index 1dcd00c7f782..e30f0821a35e 100644 --- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -614,7 +614,8 @@ static int bpf_show_options(struct seq_file *m, struct dentry *root) else if (opts->delegate_cmds) seq_printf(m, ",delegate_cmds=0x%llx", opts->delegate_cmds); - if (opts->delegate_maps == ~0ULL) + mask = (1ULL << __MAX_BPF_MAP_TYPE) - 1; + if ((opts->delegate_maps & mask) == mask) seq_printf(m, ",delegate_maps=any"); else if (opts->delegate_maps) seq_printf(m, ",delegate_maps=0x%llx", opts->delegate_maps); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 6d51d83b1406..e63e4280c43c 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -985,8 +985,8 @@ int map_check_no_btf(const struct bpf_map *map, return -ENOTSUPP; } -static int map_check_btf(struct bpf_map *map, const struct btf *btf, - u32 btf_key_id, u32 btf_value_id) +static int map_check_btf(struct bpf_map *map, struct bpf_token *token, + const struct btf *btf, u32 btf_key_id, u32 btf_value_id) { const struct btf_type *key_type, *value_type; u32 key_size, value_size; @@ -1014,7 +1014,7 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf, if (!IS_ERR_OR_NULL(map->record)) { int i; - if (!bpf_capable()) { + if (!bpf_token_capable(token, CAP_BPF)) { ret = -EPERM; goto free_map_tab; } @@ -1102,11 +1102,12 @@ static bool bpf_net_capable(void) return capable(CAP_NET_ADMIN) || capable(CAP_SYS_ADMIN); } -#define BPF_MAP_CREATE_LAST_FIELD map_extra +#define BPF_MAP_CREATE_LAST_FIELD map_token_fd /* called via syscall */ static int map_create(union bpf_attr *attr) { const struct bpf_map_ops *ops; + struct bpf_token *token = NULL; int numa_node = bpf_map_attr_numa_node(attr); u32 map_type = attr->map_type; struct bpf_map *map; @@ -1157,14 +1158,32 @@ static int map_create(union bpf_attr *attr) if (!ops->map_mem_usage) return -EINVAL; + if (attr->map_token_fd) { + token = bpf_token_get_from_fd(attr->map_token_fd); + if (IS_ERR(token)) + return PTR_ERR(token); + + /* if current token doesn't grant map creation permissions, + * then we can't use this token, so ignore it and rely on + * system-wide capabilities checks + */ + if (!bpf_token_allow_cmd(token, BPF_MAP_CREATE) || + !bpf_token_allow_map_type(token, attr->map_type)) { + bpf_token_put(token); + token = NULL; + } + } + + err = -EPERM; + /* Intent here is for unprivileged_bpf_disabled to block BPF map * creation for unprivileged users; other actions depend * on fd availability and access to bpffs, so are dependent on * object creation success. Even with unprivileged BPF disabled, * capability checks are still carried out. */ - if (sysctl_unprivileged_bpf_disabled && !bpf_capable()) - return -EPERM; + if (sysctl_unprivileged_bpf_disabled && !bpf_token_capable(token, CAP_BPF)) + goto put_token; /* check privileged map type permissions */ switch (map_type) { @@ -1197,25 +1216,27 @@ static int map_create(union bpf_attr *attr) case BPF_MAP_TYPE_LRU_PERCPU_HASH: case BPF_MAP_TYPE_STRUCT_OPS: case BPF_MAP_TYPE_CPUMAP: - if (!bpf_capable()) - return -EPERM; + if (!bpf_token_capable(token, CAP_BPF)) + goto put_token; break; case BPF_MAP_TYPE_SOCKMAP: case BPF_MAP_TYPE_SOCKHASH: case BPF_MAP_TYPE_DEVMAP: case BPF_MAP_TYPE_DEVMAP_HASH: case BPF_MAP_TYPE_XSKMAP: - if (!bpf_net_capable()) - return -EPERM; + if (!bpf_token_capable(token, CAP_NET_ADMIN)) + goto put_token; break; default: WARN(1, "unsupported map type %d", map_type); - return -EPERM; + goto put_token; } map = ops->map_alloc(attr); - if (IS_ERR(map)) - return PTR_ERR(map); + if (IS_ERR(map)) { + err = PTR_ERR(map); + goto put_token; + } map->ops = ops; map->map_type = map_type; @@ -1252,7 +1273,7 @@ static int map_create(union bpf_attr *attr) map->btf = btf; if (attr->btf_value_type_id) { - err = map_check_btf(map, btf, attr->btf_key_type_id, + err = map_check_btf(map, token, btf, attr->btf_key_type_id, attr->btf_value_type_id); if (err) goto free_map; @@ -1273,6 +1294,7 @@ static int map_create(union bpf_attr *attr) goto free_map_sec; bpf_map_save_memcg(map); + bpf_token_put(token); err = bpf_map_new_fd(map, f_flags); if (err < 0) { @@ -1293,6 +1315,8 @@ static int map_create(union bpf_attr *attr) free_map: btf_put(map->btf); map->ops->map_free(map); +put_token: + bpf_token_put(token); return err; } diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c index 638b9379a507..17b66e11a55b 100644 --- a/kernel/bpf/token.c +++ b/kernel/bpf/token.c @@ -70,6 +70,13 @@ static void bpf_token_show_fdinfo(struct seq_file *m, struct file *filp) seq_printf(m, "allowed_cmds:\tany\n"); else seq_printf(m, "allowed_cmds:\t0x%llx\n", token->allowed_cmds); + + BUILD_BUG_ON(__MAX_BPF_MAP_TYPE >= 64); + mask = (1ULL << __MAX_BPF_MAP_TYPE) - 1; + if ((token->allowed_maps & mask) == mask) + seq_printf(m, "allowed_maps:\tany\n"); + else + seq_printf(m, "allowed_maps:\t0x%llx\n", token->allowed_maps); } static struct bpf_token *bpf_token_alloc(void) @@ -158,6 +165,7 @@ int bpf_token_create(union bpf_attr *attr) mnt_opts = path.dentry->d_sb->s_fs_info; token->allowed_cmds = mnt_opts->delegate_cmds; + token->allowed_maps = mnt_opts->delegate_maps; fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) { @@ -206,3 +214,11 @@ bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd) return token->allowed_cmds & (1ULL << cmd); } + +bool bpf_token_allow_map_type(const struct bpf_token *token, enum bpf_map_type type) +{ + if (!token || type >= __MAX_BPF_MAP_TYPE) + return false; + + return token->allowed_maps & (1ULL << type); +} diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 12d6c3b755e6..78deb9f74379 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -984,6 +984,7 @@ enum bpf_map_type { BPF_MAP_TYPE_BLOOM_FILTER, BPF_MAP_TYPE_USER_RINGBUF, BPF_MAP_TYPE_CGRP_STORAGE, + __MAX_BPF_MAP_TYPE }; /* Note that tracing related programs such as @@ -1428,6 +1429,7 @@ union bpf_attr { * to using 5 hash functions). */ __u64 map_extra; + __u32 map_token_fd; }; struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */ diff --git a/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c b/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c index 9f766ddd946a..573249a2814d 100644 --- a/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c +++ b/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c @@ -68,6 +68,8 @@ void test_libbpf_probe_map_types(void) if (map_type == BPF_MAP_TYPE_UNSPEC) continue; + if (strcmp(map_type_name, "__MAX_BPF_MAP_TYPE") == 0) + continue; if (!test__start_subtest(map_type_name)) continue; diff --git a/tools/testing/selftests/bpf/prog_tests/libbpf_str.c b/tools/testing/selftests/bpf/prog_tests/libbpf_str.c index c440ea3311ed..2a0633f43c73 100644 --- a/tools/testing/selftests/bpf/prog_tests/libbpf_str.c +++ b/tools/testing/selftests/bpf/prog_tests/libbpf_str.c @@ -132,6 +132,9 @@ static void test_libbpf_bpf_map_type_str(void) const char *map_type_str; char buf[256]; + if (map_type == __MAX_BPF_MAP_TYPE) + continue; + map_type_name = btf__str_by_offset(btf, e->name_off); map_type_str = libbpf_bpf_map_type_str(map_type); ASSERT_OK_PTR(map_type_str, map_type_name); From patchwork Thu Oct 12 22:27:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419912 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B9775CDB482 for ; Thu, 12 Oct 2023 22:31:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442981AbjJLWbb convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:31:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55050 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1442989AbjJLWba (ORCPT ); Thu, 12 Oct 2023 18:31:30 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7E2FD6 for ; Thu, 12 Oct 2023 15:31:28 -0700 (PDT) Received: from pps.filterd (m0001303.ppops.net [127.0.0.1]) by m0001303.ppops.net (8.17.1.19/8.17.1.19) with ESMTP id 39CLuAQ8015866 for ; Thu, 12 Oct 2023 15:31:28 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by m0001303.ppops.net (PPS) with ESMTPS id 3tpbsne34c-9 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:31:27 -0700 Received: from twshared15247.17.frc2.facebook.com (2620:10d:c0a8:1c::11) by mail.thefacebook.com (2620:10d:c0a8:83::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:31:22 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 2A0FC39A3A4B9; Thu, 12 Oct 2023 15:29:13 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 05/18] bpf: add BPF token support to BPF_BTF_LOAD command Date: Thu, 12 Oct 2023 15:27:57 -0700 Message-ID: <20231012222810.4120312-6-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: RftBjWBCT0U8d61n54Xq0RhhS0LAR8Ud X-Proofpoint-ORIG-GUID: RftBjWBCT0U8d61n54Xq0RhhS0LAR8Ud X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Accept BPF token FD in BPF_BTF_LOAD command to allow BTF data loading through delegated BPF token. BTF loading is a pretty straightforward operation, so as long as BPF token is created with allow_cmds granting BPF_BTF_LOAD command, kernel proceeds to parsing BTF data and creating BTF object. Signed-off-by: Andrii Nakryiko --- include/uapi/linux/bpf.h | 1 + kernel/bpf/syscall.c | 20 ++++++++++++++++++-- tools/include/uapi/linux/bpf.h | 1 + 3 files changed, 20 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 78deb9f74379..e6d35086152b 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1611,6 +1611,7 @@ union bpf_attr { * truncated), or smaller (if log buffer wasn't filled completely). */ __u32 btf_log_true_size; + __u32 btf_token_fd; }; struct { diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index e63e4280c43c..a2c9edcbcd77 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -4728,15 +4728,31 @@ static int bpf_obj_get_info_by_fd(const union bpf_attr *attr, return err; } -#define BPF_BTF_LOAD_LAST_FIELD btf_log_true_size +#define BPF_BTF_LOAD_LAST_FIELD btf_token_fd static int bpf_btf_load(const union bpf_attr *attr, bpfptr_t uattr, __u32 uattr_size) { + struct bpf_token *token = NULL; + if (CHECK_ATTR(BPF_BTF_LOAD)) return -EINVAL; - if (!bpf_capable()) + if (attr->btf_token_fd) { + token = bpf_token_get_from_fd(attr->btf_token_fd); + if (IS_ERR(token)) + return PTR_ERR(token); + if (!bpf_token_allow_cmd(token, BPF_BTF_LOAD)) { + bpf_token_put(token); + token = NULL; + } + } + + if (!bpf_token_capable(token, CAP_BPF)) { + bpf_token_put(token); return -EPERM; + } + + bpf_token_put(token); return btf_new_fd(attr, uattr, uattr_size); } diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 78deb9f74379..e6d35086152b 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1611,6 +1611,7 @@ union bpf_attr { * truncated), or smaller (if log buffer wasn't filled completely). */ __u32 btf_log_true_size; + __u32 btf_token_fd; }; struct { From patchwork Thu Oct 12 22:27:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419922 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 42C24CDB474 for ; Thu, 12 Oct 2023 22:32:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443018AbjJLWch convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48938 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443024AbjJLWcb (ORCPT ); Thu, 12 Oct 2023 18:32:31 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 79FF1DA for ; Thu, 12 Oct 2023 15:32:29 -0700 (PDT) Received: from pps.filterd (m0089730.ppops.net [127.0.0.1]) by m0089730.ppops.net (8.17.1.19/8.17.1.19) with ESMTP id 39CLuMD7022489 for ; Thu, 12 Oct 2023 15:32:28 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by m0089730.ppops.net (PPS) with ESMTPS id 3tpbry9a19-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:28 -0700 Received: from twshared11278.41.prn1.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:25 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 3879A39A3A4C1; Thu, 12 Oct 2023 15:29:15 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 06/18] bpf: add BPF token support to BPF_PROG_LOAD command Date: Thu, 12 Oct 2023 15:27:58 -0700 Message-ID: <20231012222810.4120312-7-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: qpTZ1bI-ejrAxIdjeCd4mjzlPtzqhgF9 X-Proofpoint-ORIG-GUID: qpTZ1bI-ejrAxIdjeCd4mjzlPtzqhgF9 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Add basic support of BPF token to BPF_PROG_LOAD. Wire through a set of allowed BPF program types and attach types, derived from BPF FS at BPF token creation time. Then make sure we perform bpf_token_capable() checks everywhere where it's relevant. Signed-off-by: Andrii Nakryiko --- include/linux/bpf.h | 6 ++ include/uapi/linux/bpf.h | 2 + kernel/bpf/core.c | 1 + kernel/bpf/inode.c | 6 +- kernel/bpf/syscall.c | 87 ++++++++++++++----- kernel/bpf/token.c | 27 ++++++ tools/include/uapi/linux/bpf.h | 2 + .../selftests/bpf/prog_tests/libbpf_probes.c | 2 + .../selftests/bpf/prog_tests/libbpf_str.c | 3 + 9 files changed, 110 insertions(+), 26 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 35fc0b06365c..d21efa0ca4d3 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1442,6 +1442,7 @@ struct bpf_prog_aux { #ifdef CONFIG_SECURITY void *security; #endif + struct bpf_token *token; struct bpf_prog_offload *offload; struct btf *btf; struct bpf_func_info *func_info; @@ -1582,6 +1583,8 @@ struct bpf_token { struct user_namespace *userns; u64 allowed_cmds; u64 allowed_maps; + u64 allowed_progs; + u64 allowed_attachs; }; struct bpf_struct_ops_value; @@ -2218,6 +2221,9 @@ struct bpf_token *bpf_token_get_from_fd(u32 ufd); bool bpf_token_allow_cmd(const struct bpf_token *token, enum bpf_cmd cmd); bool bpf_token_allow_map_type(const struct bpf_token *token, enum bpf_map_type type); +bool bpf_token_allow_prog_type(const struct bpf_token *token, + enum bpf_prog_type prog_type, + enum bpf_attach_type attach_type); int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname); int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index e6d35086152b..45ac44f03530 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1029,6 +1029,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_SK_LOOKUP, BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */ BPF_PROG_TYPE_NETFILTER, + __MAX_BPF_PROG_TYPE }; enum bpf_attach_type { @@ -1499,6 +1500,7 @@ union bpf_attr { * truncated), or smaller (if log buffer wasn't filled completely). */ __u32 log_true_size; + __u32 prog_token_fd; }; struct { /* anonymous struct used by BPF_OBJ_* commands */ diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 08626b519ce2..fc8de25b7948 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2747,6 +2747,7 @@ void bpf_prog_free(struct bpf_prog *fp) if (aux->dst_prog) bpf_prog_put(aux->dst_prog); + bpf_token_put(aux->token); INIT_WORK(&aux->work, bpf_prog_free_deferred); schedule_work(&aux->work); } diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c index e30f0821a35e..0dfcc85593ee 100644 --- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -620,12 +620,14 @@ static int bpf_show_options(struct seq_file *m, struct dentry *root) else if (opts->delegate_maps) seq_printf(m, ",delegate_maps=0x%llx", opts->delegate_maps); - if (opts->delegate_progs == ~0ULL) + mask = (1ULL << __MAX_BPF_PROG_TYPE) - 1; + if ((opts->delegate_progs & mask) == mask) seq_printf(m, ",delegate_progs=any"); else if (opts->delegate_progs) seq_printf(m, ",delegate_progs=0x%llx", opts->delegate_progs); - if (opts->delegate_attachs == ~0ULL) + mask = (1ULL << __MAX_BPF_ATTACH_TYPE) - 1; + if ((opts->delegate_attachs & mask) == mask) seq_printf(m, ",delegate_attachs=any"); else if (opts->delegate_attachs) seq_printf(m, ",delegate_attachs=0x%llx", opts->delegate_attachs); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index a2c9edcbcd77..c6b00aee3b62 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2584,13 +2584,15 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type) } /* last field in 'union bpf_attr' used by this command */ -#define BPF_PROG_LOAD_LAST_FIELD log_true_size +#define BPF_PROG_LOAD_LAST_FIELD prog_token_fd static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) { enum bpf_prog_type type = attr->prog_type; struct bpf_prog *prog, *dst_prog = NULL; struct btf *attach_btf = NULL; + struct bpf_token *token = NULL; + bool bpf_cap; int err; char license[128]; @@ -2606,10 +2608,31 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) BPF_F_XDP_DEV_BOUND_ONLY)) return -EINVAL; + bpf_prog_load_fixup_attach_type(attr); + + if (attr->prog_token_fd) { + token = bpf_token_get_from_fd(attr->prog_token_fd); + if (IS_ERR(token)) + return PTR_ERR(token); + /* if current token doesn't grant prog loading permissions, + * then we can't use this token, so ignore it and rely on + * system-wide capabilities checks + */ + if (!bpf_token_allow_cmd(token, BPF_PROG_LOAD) || + !bpf_token_allow_prog_type(token, attr->prog_type, + attr->expected_attach_type)) { + bpf_token_put(token); + token = NULL; + } + } + + bpf_cap = bpf_token_capable(token, CAP_BPF); + err = -EPERM; + if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && (attr->prog_flags & BPF_F_ANY_ALIGNMENT) && - !bpf_capable()) - return -EPERM; + !bpf_cap) + goto put_token; /* Intent here is for unprivileged_bpf_disabled to block BPF program * creation for unprivileged users; other actions depend @@ -2618,21 +2641,23 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) * capability checks are still carried out for these * and other operations. */ - if (sysctl_unprivileged_bpf_disabled && !bpf_capable()) - return -EPERM; + if (sysctl_unprivileged_bpf_disabled && !bpf_cap) + goto put_token; if (attr->insn_cnt == 0 || - attr->insn_cnt > (bpf_capable() ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS)) - return -E2BIG; + attr->insn_cnt > (bpf_cap ? BPF_COMPLEXITY_LIMIT_INSNS : BPF_MAXINSNS)) { + err = -E2BIG; + goto put_token; + } if (type != BPF_PROG_TYPE_SOCKET_FILTER && type != BPF_PROG_TYPE_CGROUP_SKB && - !bpf_capable()) - return -EPERM; + !bpf_cap) + goto put_token; - if (is_net_admin_prog_type(type) && !bpf_net_capable()) - return -EPERM; - if (is_perfmon_prog_type(type) && !perfmon_capable()) - return -EPERM; + if (is_net_admin_prog_type(type) && !bpf_token_capable(token, CAP_NET_ADMIN)) + goto put_token; + if (is_perfmon_prog_type(type) && !bpf_token_capable(token, CAP_PERFMON)) + goto put_token; /* attach_prog_fd/attach_btf_obj_fd can specify fd of either bpf_prog * or btf, we need to check which one it is @@ -2642,27 +2667,33 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) if (IS_ERR(dst_prog)) { dst_prog = NULL; attach_btf = btf_get_by_fd(attr->attach_btf_obj_fd); - if (IS_ERR(attach_btf)) - return -EINVAL; + if (IS_ERR(attach_btf)) { + err = -EINVAL; + goto put_token; + } if (!btf_is_kernel(attach_btf)) { /* attaching through specifying bpf_prog's BTF * objects directly might be supported eventually */ btf_put(attach_btf); - return -ENOTSUPP; + err = -ENOTSUPP; + goto put_token; } } } else if (attr->attach_btf_id) { /* fall back to vmlinux BTF, if BTF type ID is specified */ attach_btf = bpf_get_btf_vmlinux(); - if (IS_ERR(attach_btf)) - return PTR_ERR(attach_btf); - if (!attach_btf) - return -EINVAL; + if (IS_ERR(attach_btf)) { + err = PTR_ERR(attach_btf); + goto put_token; + } + if (!attach_btf) { + err = -EINVAL; + goto put_token; + } btf_get(attach_btf); } - bpf_prog_load_fixup_attach_type(attr); if (bpf_prog_load_check_attach(type, attr->expected_attach_type, attach_btf, attr->attach_btf_id, dst_prog)) { @@ -2670,7 +2701,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) bpf_prog_put(dst_prog); if (attach_btf) btf_put(attach_btf); - return -EINVAL; + err = -EINVAL; + goto put_token; } /* plain bpf_prog allocation */ @@ -2680,7 +2712,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) bpf_prog_put(dst_prog); if (attach_btf) btf_put(attach_btf); - return -ENOMEM; + err = -EINVAL; + goto put_token; } prog->expected_attach_type = attr->expected_attach_type; @@ -2691,6 +2724,10 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) prog->aux->sleepable = attr->prog_flags & BPF_F_SLEEPABLE; prog->aux->xdp_has_frags = attr->prog_flags & BPF_F_XDP_HAS_FRAGS; + /* move token into prog->aux, reuse taken refcnt */ + prog->aux->token = token; + token = NULL; + err = security_bpf_prog_alloc(prog->aux); if (err) goto free_prog; @@ -2792,6 +2829,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) if (prog->aux->attach_btf) btf_put(prog->aux->attach_btf); bpf_prog_free(prog); +put_token: + bpf_token_put(token); return err; } @@ -3779,7 +3818,7 @@ static int bpf_prog_attach_check_attach_type(const struct bpf_prog *prog, case BPF_PROG_TYPE_SK_LOOKUP: return attach_type == prog->expected_attach_type ? 0 : -EINVAL; case BPF_PROG_TYPE_CGROUP_SKB: - if (!bpf_net_capable()) + if (!bpf_token_capable(prog->aux->token, CAP_NET_ADMIN)) /* cg-skb progs can be loaded by unpriv user. * check permissions at attach time. */ diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c index 17b66e11a55b..d4e0cc8075d3 100644 --- a/kernel/bpf/token.c +++ b/kernel/bpf/token.c @@ -77,6 +77,20 @@ static void bpf_token_show_fdinfo(struct seq_file *m, struct file *filp) seq_printf(m, "allowed_maps:\tany\n"); else seq_printf(m, "allowed_maps:\t0x%llx\n", token->allowed_maps); + + BUILD_BUG_ON(__MAX_BPF_PROG_TYPE >= 64); + mask = (1ULL << __MAX_BPF_PROG_TYPE) - 1; + if ((token->allowed_progs & mask) == mask) + seq_printf(m, "allowed_progs:\tany\n"); + else + seq_printf(m, "allowed_progs:\t0x%llx\n", token->allowed_progs); + + BUILD_BUG_ON(__MAX_BPF_ATTACH_TYPE >= 64); + mask = (1ULL << __MAX_BPF_ATTACH_TYPE) - 1; + if ((token->allowed_attachs & mask) == mask) + seq_printf(m, "allowed_attachs:\tany\n"); + else + seq_printf(m, "allowed_attachs:\t0x%llx\n", token->allowed_attachs); } static struct bpf_token *bpf_token_alloc(void) @@ -166,6 +180,8 @@ int bpf_token_create(union bpf_attr *attr) mnt_opts = path.dentry->d_sb->s_fs_info; token->allowed_cmds = mnt_opts->delegate_cmds; token->allowed_maps = mnt_opts->delegate_maps; + token->allowed_progs = mnt_opts->delegate_progs; + token->allowed_attachs = mnt_opts->delegate_attachs; fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) { @@ -222,3 +238,14 @@ bool bpf_token_allow_map_type(const struct bpf_token *token, enum bpf_map_type t return token->allowed_maps & (1ULL << type); } + +bool bpf_token_allow_prog_type(const struct bpf_token *token, + enum bpf_prog_type prog_type, + enum bpf_attach_type attach_type) +{ + if (!token || prog_type >= __MAX_BPF_PROG_TYPE || attach_type >= __MAX_BPF_ATTACH_TYPE) + return false; + + return (token->allowed_progs & (1ULL << prog_type)) && + (token->allowed_attachs & (1ULL << attach_type)); +} diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index e6d35086152b..45ac44f03530 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1029,6 +1029,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_SK_LOOKUP, BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */ BPF_PROG_TYPE_NETFILTER, + __MAX_BPF_PROG_TYPE }; enum bpf_attach_type { @@ -1499,6 +1500,7 @@ union bpf_attr { * truncated), or smaller (if log buffer wasn't filled completely). */ __u32 log_true_size; + __u32 prog_token_fd; }; struct { /* anonymous struct used by BPF_OBJ_* commands */ diff --git a/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c b/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c index 573249a2814d..4ed46ed58a7b 100644 --- a/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c +++ b/tools/testing/selftests/bpf/prog_tests/libbpf_probes.c @@ -30,6 +30,8 @@ void test_libbpf_probe_prog_types(void) if (prog_type == BPF_PROG_TYPE_UNSPEC) continue; + if (strcmp(prog_type_name, "__MAX_BPF_PROG_TYPE") == 0) + continue; if (!test__start_subtest(prog_type_name)) continue; diff --git a/tools/testing/selftests/bpf/prog_tests/libbpf_str.c b/tools/testing/selftests/bpf/prog_tests/libbpf_str.c index 2a0633f43c73..384bc1f7a65e 100644 --- a/tools/testing/selftests/bpf/prog_tests/libbpf_str.c +++ b/tools/testing/selftests/bpf/prog_tests/libbpf_str.c @@ -189,6 +189,9 @@ static void test_libbpf_bpf_prog_type_str(void) const char *prog_type_str; char buf[256]; + if (prog_type == __MAX_BPF_PROG_TYPE) + continue; + prog_type_name = btf__str_by_offset(btf, e->name_off); prog_type_str = libbpf_bpf_prog_type_str(prog_type); ASSERT_OK_PTR(prog_type_str, prog_type_name); From patchwork Thu Oct 12 22:27:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419925 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C605DCDB483 for ; Thu, 12 Oct 2023 22:32:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443039AbjJLWcj convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38510 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443020AbjJLWci (ORCPT ); Thu, 12 Oct 2023 18:32:38 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E3ACF1 for ; Thu, 12 Oct 2023 15:32:35 -0700 (PDT) Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLuEBq010348 for ; Thu, 12 Oct 2023 15:32:35 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tpjt7b0rp-17 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:35 -0700 Received: from twshared68648.02.prn6.facebook.com (2620:10d:c085:208::f) by mail.thefacebook.com (2620:10d:c085:11d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:27 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 46D8539A3A4E6; Thu, 12 Oct 2023 15:29:17 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 07/18] bpf: take into account BPF token when fetching helper protos Date: Thu, 12 Oct 2023 15:27:59 -0700 Message-ID: <20231012222810.4120312-8-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: QKoZXagG0L95u_DgqWhUEzxUOurP3o8S X-Proofpoint-ORIG-GUID: QKoZXagG0L95u_DgqWhUEzxUOurP3o8S X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Instead of performing unconditional system-wide bpf_capable() and perfmon_capable() calls inside bpf_base_func_proto() function (and other similar ones) to determine eligibility of a given BPF helper for a given program, use previously recorded BPF token during BPF_PROG_LOAD command handling to inform the decision. Signed-off-by: Andrii Nakryiko --- drivers/media/rc/bpf-lirc.c | 2 +- include/linux/bpf.h | 5 +++-- kernel/bpf/cgroup.c | 6 +++--- kernel/bpf/helpers.c | 6 +++--- kernel/bpf/syscall.c | 5 +++-- kernel/trace/bpf_trace.c | 2 +- net/core/filter.c | 32 ++++++++++++++++---------------- net/ipv4/bpf_tcp_ca.c | 2 +- net/netfilter/nf_bpf_link.c | 2 +- 9 files changed, 32 insertions(+), 30 deletions(-) diff --git a/drivers/media/rc/bpf-lirc.c b/drivers/media/rc/bpf-lirc.c index fe17c7f98e81..6d07693c6b9f 100644 --- a/drivers/media/rc/bpf-lirc.c +++ b/drivers/media/rc/bpf-lirc.c @@ -110,7 +110,7 @@ lirc_mode2_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_get_prandom_u32: return &bpf_get_prandom_u32_proto; case BPF_FUNC_trace_printk: - if (perfmon_capable()) + if (bpf_token_capable(prog->aux->token, CAP_PERFMON)) return bpf_get_trace_printk_proto(); fallthrough; default: diff --git a/include/linux/bpf.h b/include/linux/bpf.h index d21efa0ca4d3..3060820e0052 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2472,7 +2472,8 @@ const char *btf_find_decl_tag_value(const struct btf *btf, const struct btf_type struct bpf_prog *bpf_prog_by_id(u32 id); struct bpf_link *bpf_link_by_id(u32 id); -const struct bpf_func_proto *bpf_base_func_proto(enum bpf_func_id func_id); +const struct bpf_func_proto *bpf_base_func_proto(enum bpf_func_id func_id, + const struct bpf_prog *prog); void bpf_task_storage_free(struct task_struct *task); void bpf_cgrp_storage_free(struct cgroup *cgroup); bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog); @@ -2729,7 +2730,7 @@ static inline int btf_struct_access(struct bpf_verifier_log *log, } static inline const struct bpf_func_proto * -bpf_base_func_proto(enum bpf_func_id func_id) +bpf_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { return NULL; } diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c index 74ad2215e1ba..bf629f66b2d8 100644 --- a/kernel/bpf/cgroup.c +++ b/kernel/bpf/cgroup.c @@ -1630,7 +1630,7 @@ cgroup_dev_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_perf_event_output: return &bpf_event_output_data_proto; default: - return bpf_base_func_proto(func_id); + return bpf_base_func_proto(func_id, prog); } } @@ -2188,7 +2188,7 @@ sysctl_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_perf_event_output: return &bpf_event_output_data_proto; default: - return bpf_base_func_proto(func_id); + return bpf_base_func_proto(func_id, prog); } } @@ -2345,7 +2345,7 @@ cg_sockopt_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_perf_event_output: return &bpf_event_output_data_proto; default: - return bpf_base_func_proto(func_id); + return bpf_base_func_proto(func_id, prog); } } diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index d2840dd5b00d..310bf8ed43a5 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -1669,7 +1669,7 @@ const struct bpf_func_proto bpf_probe_read_kernel_str_proto __weak; const struct bpf_func_proto bpf_task_pt_regs_proto __weak; const struct bpf_func_proto * -bpf_base_func_proto(enum bpf_func_id func_id) +bpf_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { switch (func_id) { case BPF_FUNC_map_lookup_elem: @@ -1720,7 +1720,7 @@ bpf_base_func_proto(enum bpf_func_id func_id) break; } - if (!bpf_capable()) + if (!bpf_token_capable(prog->aux->token, CAP_BPF)) return NULL; switch (func_id) { @@ -1778,7 +1778,7 @@ bpf_base_func_proto(enum bpf_func_id func_id) break; } - if (!perfmon_capable()) + if (!bpf_token_capable(prog->aux->token, CAP_PERFMON)) return NULL; switch (func_id) { diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index c6b00aee3b62..f0e22d116041 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -5654,7 +5654,7 @@ static const struct bpf_func_proto bpf_sys_bpf_proto = { const struct bpf_func_proto * __weak tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { - return bpf_base_func_proto(func_id); + return bpf_base_func_proto(func_id, prog); } BPF_CALL_1(bpf_sys_close, u32, fd) @@ -5704,7 +5704,8 @@ syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { switch (func_id) { case BPF_FUNC_sys_bpf: - return !perfmon_capable() ? NULL : &bpf_sys_bpf_proto; + return !bpf_token_capable(prog->aux->token, CAP_PERFMON) + ? NULL : &bpf_sys_bpf_proto; case BPF_FUNC_btf_find_by_name_kind: return &bpf_btf_find_by_name_kind_proto; case BPF_FUNC_sys_close: diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index df697c74d519..ef22a2e870b3 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -1557,7 +1557,7 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_trace_vprintk: return bpf_get_trace_vprintk_proto(); default: - return bpf_base_func_proto(func_id); + return bpf_base_func_proto(func_id, prog); } } diff --git a/net/core/filter.c b/net/core/filter.c index cc2e4babc85f..135d5283e0e3 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -84,7 +84,7 @@ #include static const struct bpf_func_proto * -bpf_sk_base_func_proto(enum bpf_func_id func_id); +bpf_sk_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog); int copy_bpf_fprog_from_user(struct sock_fprog *dst, sockptr_t src, int len) { @@ -7823,7 +7823,7 @@ sock_filter_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_ktime_get_coarse_ns: return &bpf_ktime_get_coarse_ns_proto; default: - return bpf_base_func_proto(func_id); + return bpf_base_func_proto(func_id, prog); } } @@ -7916,7 +7916,7 @@ sock_addr_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return NULL; } default: - return bpf_sk_base_func_proto(func_id); + return bpf_sk_base_func_proto(func_id, prog); } } @@ -7935,7 +7935,7 @@ sk_filter_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_perf_event_output: return &bpf_skb_event_output_proto; default: - return bpf_sk_base_func_proto(func_id); + return bpf_sk_base_func_proto(func_id, prog); } } @@ -8122,7 +8122,7 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) #endif #endif default: - return bpf_sk_base_func_proto(func_id); + return bpf_sk_base_func_proto(func_id, prog); } } @@ -8181,7 +8181,7 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) #endif #endif default: - return bpf_sk_base_func_proto(func_id); + return bpf_sk_base_func_proto(func_id, prog); } #if IS_MODULE(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES) @@ -8242,7 +8242,7 @@ sock_ops_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_tcp_sock_proto; #endif /* CONFIG_INET */ default: - return bpf_sk_base_func_proto(func_id); + return bpf_sk_base_func_proto(func_id, prog); } } @@ -8284,7 +8284,7 @@ sk_msg_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_get_cgroup_classid_curr_proto; #endif default: - return bpf_sk_base_func_proto(func_id); + return bpf_sk_base_func_proto(func_id, prog); } } @@ -8328,7 +8328,7 @@ sk_skb_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_skc_lookup_tcp_proto; #endif default: - return bpf_sk_base_func_proto(func_id); + return bpf_sk_base_func_proto(func_id, prog); } } @@ -8339,7 +8339,7 @@ flow_dissector_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_skb_load_bytes: return &bpf_flow_dissector_load_bytes_proto; default: - return bpf_sk_base_func_proto(func_id); + return bpf_sk_base_func_proto(func_id, prog); } } @@ -8366,7 +8366,7 @@ lwt_out_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_skb_under_cgroup: return &bpf_skb_under_cgroup_proto; default: - return bpf_sk_base_func_proto(func_id); + return bpf_sk_base_func_proto(func_id, prog); } } @@ -11197,7 +11197,7 @@ sk_reuseport_func_proto(enum bpf_func_id func_id, case BPF_FUNC_ktime_get_coarse_ns: return &bpf_ktime_get_coarse_ns_proto; default: - return bpf_base_func_proto(func_id); + return bpf_base_func_proto(func_id, prog); } } @@ -11379,7 +11379,7 @@ sk_lookup_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_sk_release: return &bpf_sk_release_proto; default: - return bpf_sk_base_func_proto(func_id); + return bpf_sk_base_func_proto(func_id, prog); } } @@ -11713,7 +11713,7 @@ const struct bpf_func_proto bpf_sock_from_file_proto = { }; static const struct bpf_func_proto * -bpf_sk_base_func_proto(enum bpf_func_id func_id) +bpf_sk_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { const struct bpf_func_proto *func; @@ -11742,10 +11742,10 @@ bpf_sk_base_func_proto(enum bpf_func_id func_id) case BPF_FUNC_ktime_get_coarse_ns: return &bpf_ktime_get_coarse_ns_proto; default: - return bpf_base_func_proto(func_id); + return bpf_base_func_proto(func_id, prog); } - if (!perfmon_capable()) + if (!bpf_token_capable(prog->aux->token, CAP_PERFMON)) return NULL; return func; diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c index 39dcccf0f174..c7bbd8f3c708 100644 --- a/net/ipv4/bpf_tcp_ca.c +++ b/net/ipv4/bpf_tcp_ca.c @@ -191,7 +191,7 @@ bpf_tcp_ca_get_func_proto(enum bpf_func_id func_id, case BPF_FUNC_ktime_get_coarse_ns: return &bpf_ktime_get_coarse_ns_proto; default: - return bpf_base_func_proto(func_id); + return bpf_base_func_proto(func_id, prog); } } diff --git a/net/netfilter/nf_bpf_link.c b/net/netfilter/nf_bpf_link.c index e502ec00b2fe..1969facac91c 100644 --- a/net/netfilter/nf_bpf_link.c +++ b/net/netfilter/nf_bpf_link.c @@ -314,7 +314,7 @@ static bool nf_is_valid_access(int off, int size, enum bpf_access_type type, static const struct bpf_func_proto * bpf_nf_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { - return bpf_base_func_proto(func_id); + return bpf_base_func_proto(func_id, prog); } const struct bpf_verifier_ops netfilter_verifier_ops = { From patchwork Thu Oct 12 22:28:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419916 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6B307CDB47E for ; Thu, 12 Oct 2023 22:32:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443016AbjJLWcV convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57498 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1442975AbjJLWcU (ORCPT ); Thu, 12 Oct 2023 18:32:20 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 00D35CF for ; Thu, 12 Oct 2023 15:32:18 -0700 (PDT) Received: from pps.filterd (m0109332.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLuSmR031584 for ; Thu, 12 Oct 2023 15:32:18 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tpbs0250c-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:18 -0700 Received: from twshared32169.15.frc2.facebook.com (2620:10d:c0a8:1b::2d) by mail.thefacebook.com (2620:10d:c0a8:83::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:17 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 532FA39A3A502; Thu, 12 Oct 2023 15:29:19 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 08/18] bpf: consistenly use BPF token throughout BPF verifier logic Date: Thu, 12 Oct 2023 15:28:00 -0700 Message-ID: <20231012222810.4120312-9-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: 3s6PtVKC2Xm0B7Y63G4XlZQagRATiI2l X-Proofpoint-ORIG-GUID: 3s6PtVKC2Xm0B7Y63G4XlZQagRATiI2l X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Remove remaining direct queries to perfmon_capable() and bpf_capable() in BPF verifier logic and instead use BPF token (if available) to make decisions about privileges. Signed-off-by: Andrii Nakryiko --- include/linux/bpf.h | 16 ++++++++-------- include/linux/filter.h | 2 +- kernel/bpf/arraymap.c | 2 +- kernel/bpf/core.c | 2 +- kernel/bpf/verifier.c | 13 ++++++------- net/core/filter.c | 4 ++-- 6 files changed, 19 insertions(+), 20 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3060820e0052..c87af564f464 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2180,24 +2180,24 @@ extern int sysctl_unprivileged_bpf_disabled; bool bpf_token_capable(const struct bpf_token *token, int cap); -static inline bool bpf_allow_ptr_leaks(void) +static inline bool bpf_allow_ptr_leaks(const struct bpf_token *token) { - return perfmon_capable(); + return bpf_token_capable(token, CAP_PERFMON); } -static inline bool bpf_allow_uninit_stack(void) +static inline bool bpf_allow_uninit_stack(const struct bpf_token *token) { - return perfmon_capable(); + return bpf_token_capable(token, CAP_PERFMON); } -static inline bool bpf_bypass_spec_v1(void) +static inline bool bpf_bypass_spec_v1(const struct bpf_token *token) { - return perfmon_capable() || cpu_mitigations_off(); + return cpu_mitigations_off() || bpf_token_capable(token, CAP_PERFMON); } -static inline bool bpf_bypass_spec_v4(void) +static inline bool bpf_bypass_spec_v4(const struct bpf_token *token) { - return perfmon_capable() || cpu_mitigations_off(); + return cpu_mitigations_off() || bpf_token_capable(token, CAP_PERFMON); } int bpf_map_new_fd(struct bpf_map *map, int flags); diff --git a/include/linux/filter.h b/include/linux/filter.h index bcd2bc15ff56..cf3962de56e6 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -1145,7 +1145,7 @@ static inline bool bpf_jit_blinding_enabled(struct bpf_prog *prog) return false; if (!bpf_jit_harden) return false; - if (bpf_jit_harden == 1 && bpf_capable()) + if (bpf_jit_harden == 1 && bpf_token_capable(prog->aux->token, CAP_BPF)) return false; return true; diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c index 2058e89b5ddd..f0c64df6b6ff 100644 --- a/kernel/bpf/arraymap.c +++ b/kernel/bpf/arraymap.c @@ -82,7 +82,7 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr) bool percpu = attr->map_type == BPF_MAP_TYPE_PERCPU_ARRAY; int numa_node = bpf_map_attr_numa_node(attr); u32 elem_size, index_mask, max_entries; - bool bypass_spec_v1 = bpf_bypass_spec_v1(); + bool bypass_spec_v1 = bpf_bypass_spec_v1(NULL); u64 array_size, mask64; struct bpf_array *array; diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index fc8de25b7948..ce307440fa8d 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -675,7 +675,7 @@ static bool bpf_prog_kallsyms_candidate(const struct bpf_prog *fp) void bpf_prog_kallsyms_add(struct bpf_prog *fp) { if (!bpf_prog_kallsyms_candidate(fp) || - !bpf_capable()) + !bpf_token_capable(fp->aux->token, CAP_BPF)) return; bpf_prog_ksym_set_addr(fp); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index e777f50401b6..28755bf1b5a5 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -20138,7 +20138,12 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3 env->prog = *prog; env->ops = bpf_verifier_ops[env->prog->type]; env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel); - is_priv = bpf_capable(); + + env->allow_ptr_leaks = bpf_allow_ptr_leaks(env->prog->aux->token); + env->allow_uninit_stack = bpf_allow_uninit_stack(env->prog->aux->token); + env->bypass_spec_v1 = bpf_bypass_spec_v1(env->prog->aux->token); + env->bypass_spec_v4 = bpf_bypass_spec_v4(env->prog->aux->token); + env->bpf_capable = is_priv = bpf_token_capable(env->prog->aux->token, CAP_BPF); bpf_get_btf_vmlinux(); @@ -20170,12 +20175,6 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3 if (attr->prog_flags & BPF_F_ANY_ALIGNMENT) env->strict_alignment = false; - env->allow_ptr_leaks = bpf_allow_ptr_leaks(); - env->allow_uninit_stack = bpf_allow_uninit_stack(); - env->bypass_spec_v1 = bpf_bypass_spec_v1(); - env->bypass_spec_v4 = bpf_bypass_spec_v4(); - env->bpf_capable = bpf_capable(); - if (is_priv) env->test_state_freq = attr->prog_flags & BPF_F_TEST_STATE_FREQ; diff --git a/net/core/filter.c b/net/core/filter.c index 135d5283e0e3..6d3c1e15ff19 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -8541,7 +8541,7 @@ static bool cg_skb_is_valid_access(int off, int size, return false; case bpf_ctx_range(struct __sk_buff, data): case bpf_ctx_range(struct __sk_buff, data_end): - if (!bpf_capable()) + if (!bpf_token_capable(prog->aux->token, CAP_BPF)) return false; break; } @@ -8553,7 +8553,7 @@ static bool cg_skb_is_valid_access(int off, int size, case bpf_ctx_range_till(struct __sk_buff, cb[0], cb[4]): break; case bpf_ctx_range(struct __sk_buff, tstamp): - if (!bpf_capable()) + if (!bpf_token_capable(prog->aux->token, CAP_BPF)) return false; break; default: From patchwork Thu Oct 12 22:28:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419917 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BA4F4CDB474 for ; Thu, 12 Oct 2023 22:32:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443017AbjJLWcX convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57512 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1442975AbjJLWcW (ORCPT ); Thu, 12 Oct 2023 18:32:22 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C834CF for ; Thu, 12 Oct 2023 15:32:20 -0700 (PDT) Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLu3dv019290 for ; Thu, 12 Oct 2023 15:32:20 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tpjtjj53c-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:20 -0700 Received: from twshared19681.14.frc2.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:11d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:19 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 619B439A3A52D; Thu, 12 Oct 2023 15:29:21 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 09/18] bpf,lsm: refactor bpf_prog_alloc/bpf_prog_free LSM hooks Date: Thu, 12 Oct 2023 15:28:01 -0700 Message-ID: <20231012222810.4120312-10-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: RtH88aQTdSxd0Y942XfyaZvve_B_d1hH X-Proofpoint-GUID: RtH88aQTdSxd0Y942XfyaZvve_B_d1hH X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Based on upstream discussion ([0]), rework existing bpf_prog_alloc_security LSM hook. Rename it to bpf_prog_load and instead of passing bpf_prog_aux, pass proper bpf_prog pointer for a full BPF program struct. Also, we pass bpf_attr union with all the user-provided arguments for BPF_PROG_LOAD command. This will give LSMs as much information as we can basically provide. The hook is also BPF token-aware now, and optional bpf_token struct is passed as a third argument. bpf_prog_load LSM hook is called after a bunch of sanity checks were performed, bpf_prog and bpf_prog_aux were allocated and filled out, but right before performing full-fledged BPF verification step. bpf_prog_free LSM hook is now accepting struct bpf_prog argument, for consistency. SELinux code is adjusted to all new names, types, and signatures. Note, given that bpf_prog_load (previously bpf_prog_alloc) hook can be used by some LSMs to allocate extra security blob, but also by other LSMs to reject BPF program loading, we need to make sure that bpf_prog_free LSM hook is called after bpf_prog_load/bpf_prog_alloc one *even* if the hook itself returned error. If we don't do that, we run the risk of leaking memory. This seems to be possible today when combining SELinux and BPF LSM, as one example, depending on their relative ordering. Also, for BPF LSM setup, add bpf_prog_load and bpf_prog_free to sleepable LSM hooks list, as they are both executed in sleepable context. Also drop bpf_prog_load hook from untrusted, as there is no issue with refcount or anything else anymore, that originally forced us to add it to untrusted list in c0c852dd1876 ("bpf: Do not mark certain LSM hook arguments as trusted"). We now trigger this hook much later and it should not be an issue anymore. [0] https://lore.kernel.org/bpf/9fe88aef7deabbe87d3fc38c4aea3c69.paul@paul-moore.com/ Signed-off-by: Andrii Nakryiko --- include/linux/lsm_hook_defs.h | 5 +++-- include/linux/security.h | 12 +++++++----- kernel/bpf/bpf_lsm.c | 5 +++-- kernel/bpf/syscall.c | 25 +++++++++++++------------ security/security.c | 25 +++++++++++++++---------- security/selinux/hooks.c | 15 ++++++++------- 6 files changed, 49 insertions(+), 38 deletions(-) diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h index ac962c4cb44b..f7088a06c593 100644 --- a/include/linux/lsm_hook_defs.h +++ b/include/linux/lsm_hook_defs.h @@ -400,8 +400,9 @@ LSM_HOOK(int, 0, bpf_map, struct bpf_map *map, fmode_t fmode) LSM_HOOK(int, 0, bpf_prog, struct bpf_prog *prog) LSM_HOOK(int, 0, bpf_map_alloc_security, struct bpf_map *map) LSM_HOOK(void, LSM_RET_VOID, bpf_map_free_security, struct bpf_map *map) -LSM_HOOK(int, 0, bpf_prog_alloc_security, struct bpf_prog_aux *aux) -LSM_HOOK(void, LSM_RET_VOID, bpf_prog_free_security, struct bpf_prog_aux *aux) +LSM_HOOK(int, 0, bpf_prog_load, struct bpf_prog *prog, union bpf_attr *attr, + struct bpf_token *token) +LSM_HOOK(void, LSM_RET_VOID, bpf_prog_free, struct bpf_prog *prog) #endif /* CONFIG_BPF_SYSCALL */ LSM_HOOK(int, 0, locked_down, enum lockdown_reason what) diff --git a/include/linux/security.h b/include/linux/security.h index 5f16eecde00b..b9fb101561b6 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -2020,15 +2020,16 @@ static inline void securityfs_remove(struct dentry *dentry) union bpf_attr; struct bpf_map; struct bpf_prog; -struct bpf_prog_aux; +struct bpf_token; #ifdef CONFIG_SECURITY extern int security_bpf(int cmd, union bpf_attr *attr, unsigned int size); extern int security_bpf_map(struct bpf_map *map, fmode_t fmode); extern int security_bpf_prog(struct bpf_prog *prog); extern int security_bpf_map_alloc(struct bpf_map *map); extern void security_bpf_map_free(struct bpf_map *map); -extern int security_bpf_prog_alloc(struct bpf_prog_aux *aux); -extern void security_bpf_prog_free(struct bpf_prog_aux *aux); +extern int security_bpf_prog_load(struct bpf_prog *prog, union bpf_attr *attr, + struct bpf_token *token); +extern void security_bpf_prog_free(struct bpf_prog *prog); #else static inline int security_bpf(int cmd, union bpf_attr *attr, unsigned int size) @@ -2054,12 +2055,13 @@ static inline int security_bpf_map_alloc(struct bpf_map *map) static inline void security_bpf_map_free(struct bpf_map *map) { } -static inline int security_bpf_prog_alloc(struct bpf_prog_aux *aux) +static inline int security_bpf_prog_load(struct bpf_prog *prog, union bpf_attr *attr, + struct bpf_token *token) { return 0; } -static inline void security_bpf_prog_free(struct bpf_prog_aux *aux) +static inline void security_bpf_prog_free(struct bpf_prog *prog) { } #endif /* CONFIG_SECURITY */ #endif /* CONFIG_BPF_SYSCALL */ diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c index e14c822f8911..3e956f6302f3 100644 --- a/kernel/bpf/bpf_lsm.c +++ b/kernel/bpf/bpf_lsm.c @@ -263,6 +263,8 @@ BTF_ID(func, bpf_lsm_bpf_map) BTF_ID(func, bpf_lsm_bpf_map_alloc_security) BTF_ID(func, bpf_lsm_bpf_map_free_security) BTF_ID(func, bpf_lsm_bpf_prog) +BTF_ID(func, bpf_lsm_bpf_prog_load) +BTF_ID(func, bpf_lsm_bpf_prog_free) BTF_ID(func, bpf_lsm_bprm_check_security) BTF_ID(func, bpf_lsm_bprm_committed_creds) BTF_ID(func, bpf_lsm_bprm_committing_creds) @@ -346,8 +348,7 @@ BTF_SET_END(sleepable_lsm_hooks) BTF_SET_START(untrusted_lsm_hooks) BTF_ID(func, bpf_lsm_bpf_map_free_security) -BTF_ID(func, bpf_lsm_bpf_prog_alloc_security) -BTF_ID(func, bpf_lsm_bpf_prog_free_security) +BTF_ID(func, bpf_lsm_bpf_prog_free) BTF_ID(func, bpf_lsm_file_alloc_security) BTF_ID(func, bpf_lsm_file_free_security) #ifdef CONFIG_SECURITY_NETWORK diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index f0e22d116041..ee0fea87b17b 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2138,7 +2138,7 @@ static void __bpf_prog_put_rcu(struct rcu_head *rcu) kvfree(aux->func_info); kfree(aux->func_info_aux); free_uid(aux->user); - security_bpf_prog_free(aux); + security_bpf_prog_free(aux->prog); bpf_prog_free(aux->prog); } @@ -2728,10 +2728,6 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) prog->aux->token = token; token = NULL; - err = security_bpf_prog_alloc(prog->aux); - if (err) - goto free_prog; - prog->aux->user = get_current_user(); prog->len = attr->insn_cnt; @@ -2739,12 +2735,12 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) if (copy_from_bpfptr(prog->insns, make_bpfptr(attr->insns, uattr.is_kernel), bpf_prog_insn_size(prog)) != 0) - goto free_prog_sec; + goto free_prog; /* copy eBPF program license from user space */ if (strncpy_from_bpfptr(license, make_bpfptr(attr->license, uattr.is_kernel), sizeof(license) - 1) < 0) - goto free_prog_sec; + goto free_prog; license[sizeof(license) - 1] = 0; /* eBPF programs must be GPL compatible to use GPL-ed functions */ @@ -2758,25 +2754,29 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) if (bpf_prog_is_dev_bound(prog->aux)) { err = bpf_prog_dev_bound_init(prog, attr); if (err) - goto free_prog_sec; + goto free_prog; } if (type == BPF_PROG_TYPE_EXT && dst_prog && bpf_prog_is_dev_bound(dst_prog->aux)) { err = bpf_prog_dev_bound_inherit(prog, dst_prog); if (err) - goto free_prog_sec; + goto free_prog; } /* find program type: socket_filter vs tracing_filter */ err = find_prog_type(type, prog); if (err < 0) - goto free_prog_sec; + goto free_prog; prog->aux->load_time = ktime_get_boottime_ns(); err = bpf_obj_name_cpy(prog->aux->name, attr->prog_name, sizeof(attr->prog_name)); if (err < 0) + goto free_prog; + + err = security_bpf_prog_load(prog, attr, token); + if (err) goto free_prog_sec; /* run eBPF verifier */ @@ -2822,10 +2822,11 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) */ __bpf_prog_put_noref(prog, prog->aux->real_func_cnt); return err; + free_prog_sec: - free_uid(prog->aux->user); - security_bpf_prog_free(prog->aux); + security_bpf_prog_free(prog); free_prog: + free_uid(prog->aux->user); if (prog->aux->attach_btf) btf_put(prog->aux->attach_btf); bpf_prog_free(prog); diff --git a/security/security.c b/security/security.c index 23b129d482a7..bc8b72325c3b 100644 --- a/security/security.c +++ b/security/security.c @@ -5180,16 +5180,21 @@ int security_bpf_map_alloc(struct bpf_map *map) } /** - * security_bpf_prog_alloc() - Allocate a bpf program LSM blob - * @aux: bpf program aux info struct + * security_bpf_prog_load() - Check if loading of BPF program is allowed + * @prog BPF program object + * @attr: BPF syscall attributes used to create BPF program + * @token: BPF token used to grant user access to BPF subsystem * - * Initialize the security field inside bpf program. + * Do a check when the kernel allocates BPF program object and is about to + * pass it to BPF verifier for additional correctness checks. This is also the + * point where LSM blob is allocated for LSMs that need them. * * Return: Returns 0 on success, error on failure. */ -int security_bpf_prog_alloc(struct bpf_prog_aux *aux) +int security_bpf_prog_load(struct bpf_prog *prog, union bpf_attr *attr, + struct bpf_token *token) { - return call_int_hook(bpf_prog_alloc_security, 0, aux); + return call_int_hook(bpf_prog_load, 0, prog, attr, token); } /** @@ -5204,14 +5209,14 @@ void security_bpf_map_free(struct bpf_map *map) } /** - * security_bpf_prog_free() - Free a bpf program's LSM blob - * @aux: bpf program aux info struct + * security_bpf_prog_free() - Free a BPF program's LSM blob + * @prog: BPF program struct * - * Clean up the security information stored inside bpf prog. + * Clean up the security information stored inside BPF program. */ -void security_bpf_prog_free(struct bpf_prog_aux *aux) +void security_bpf_prog_free(struct bpf_prog *prog) { - call_void_hook(bpf_prog_free_security, aux); + call_void_hook(bpf_prog_free, prog); } #endif /* CONFIG_BPF_SYSCALL */ diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 2aa0e219d721..f386adb0f27d 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -6805,7 +6805,8 @@ static void selinux_bpf_map_free(struct bpf_map *map) kfree(bpfsec); } -static int selinux_bpf_prog_alloc(struct bpf_prog_aux *aux) +static int selinux_bpf_prog_load(struct bpf_prog *prog, union bpf_attr *attr, + struct bpf_token *token) { struct bpf_security_struct *bpfsec; @@ -6814,16 +6815,16 @@ static int selinux_bpf_prog_alloc(struct bpf_prog_aux *aux) return -ENOMEM; bpfsec->sid = current_sid(); - aux->security = bpfsec; + prog->aux->security = bpfsec; return 0; } -static void selinux_bpf_prog_free(struct bpf_prog_aux *aux) +static void selinux_bpf_prog_free(struct bpf_prog *prog) { - struct bpf_security_struct *bpfsec = aux->security; + struct bpf_security_struct *bpfsec = prog->aux->security; - aux->security = NULL; + prog->aux->security = NULL; kfree(bpfsec); } #endif @@ -7180,7 +7181,7 @@ static struct security_hook_list selinux_hooks[] __ro_after_init = { LSM_HOOK_INIT(bpf_map, selinux_bpf_map), LSM_HOOK_INIT(bpf_prog, selinux_bpf_prog), LSM_HOOK_INIT(bpf_map_free_security, selinux_bpf_map_free), - LSM_HOOK_INIT(bpf_prog_free_security, selinux_bpf_prog_free), + LSM_HOOK_INIT(bpf_prog_free, selinux_bpf_prog_free), #endif #ifdef CONFIG_PERF_EVENTS @@ -7238,7 +7239,7 @@ static struct security_hook_list selinux_hooks[] __ro_after_init = { #endif #ifdef CONFIG_BPF_SYSCALL LSM_HOOK_INIT(bpf_map_alloc_security, selinux_bpf_map_alloc), - LSM_HOOK_INIT(bpf_prog_alloc_security, selinux_bpf_prog_alloc), + LSM_HOOK_INIT(bpf_prog_load, selinux_bpf_prog_load), #endif #ifdef CONFIG_PERF_EVENTS LSM_HOOK_INIT(perf_event_alloc, selinux_perf_event_alloc), From patchwork Thu Oct 12 22:28:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419919 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 649B0C46CA1 for ; Thu, 12 Oct 2023 22:32:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442975AbjJLWc2 convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48882 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443018AbjJLWc0 (ORCPT ); Thu, 12 Oct 2023 18:32:26 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D7DA1E8 for ; Thu, 12 Oct 2023 15:32:24 -0700 (PDT) Received: from pps.filterd (m0109332.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLuWUn031838 for ; Thu, 12 Oct 2023 15:32:24 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tpbs025hf-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:24 -0700 Received: from twshared15247.17.frc2.facebook.com (2620:10d:c0a8:1b::30) by mail.thefacebook.com (2620:10d:c0a8:83::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:22 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 6DF7439A3A539; Thu, 12 Oct 2023 15:29:23 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 10/18] bpf,lsm: refactor bpf_map_alloc/bpf_map_free LSM hooks Date: Thu, 12 Oct 2023 15:28:02 -0700 Message-ID: <20231012222810.4120312-11-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: mjRNGENBx7n8V5Z6SaVi7EAQktbps_aQ X-Proofpoint-ORIG-GUID: mjRNGENBx7n8V5Z6SaVi7EAQktbps_aQ X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Similarly to bpf_prog_alloc LSM hook, rename and extend bpf_map_alloc hook into bpf_map_create, taking not just struct bpf_map, but also bpf_attr and bpf_token, to give a fuller context to LSMs. Unlike bpf_prog_alloc, there is no need to move the hook around, as it currently is firing right before allocating BPF map ID and FD, which seems to be a sweet spot. But like bpf_prog_alloc/bpf_prog_free combo, make sure that bpf_map_free LSM hook is called even if bpf_map_create hook returned error, as if few LSMs are combined together it could be that one LSM successfully allocated security blob for its needs, while subsequent LSM rejected BPF map creation. The former LSM would still need to free up LSM blob, so we need to ensure security_bpf_map_free() is called regardless of the outcome. Signed-off-by: Andrii Nakryiko --- include/linux/lsm_hook_defs.h | 5 +++-- include/linux/security.h | 6 ++++-- kernel/bpf/bpf_lsm.c | 6 +++--- kernel/bpf/syscall.c | 4 ++-- security/security.c | 16 ++++++++++------ security/selinux/hooks.c | 7 ++++--- 6 files changed, 26 insertions(+), 18 deletions(-) diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h index f7088a06c593..0adfb136521a 100644 --- a/include/linux/lsm_hook_defs.h +++ b/include/linux/lsm_hook_defs.h @@ -398,8 +398,9 @@ LSM_HOOK(void, LSM_RET_VOID, audit_rule_free, void *lsmrule) LSM_HOOK(int, 0, bpf, int cmd, union bpf_attr *attr, unsigned int size) LSM_HOOK(int, 0, bpf_map, struct bpf_map *map, fmode_t fmode) LSM_HOOK(int, 0, bpf_prog, struct bpf_prog *prog) -LSM_HOOK(int, 0, bpf_map_alloc_security, struct bpf_map *map) -LSM_HOOK(void, LSM_RET_VOID, bpf_map_free_security, struct bpf_map *map) +LSM_HOOK(int, 0, bpf_map_create, struct bpf_map *map, union bpf_attr *attr, + struct bpf_token *token) +LSM_HOOK(void, LSM_RET_VOID, bpf_map_free, struct bpf_map *map) LSM_HOOK(int, 0, bpf_prog_load, struct bpf_prog *prog, union bpf_attr *attr, struct bpf_token *token) LSM_HOOK(void, LSM_RET_VOID, bpf_prog_free, struct bpf_prog *prog) diff --git a/include/linux/security.h b/include/linux/security.h index b9fb101561b6..59c5fab2c4d6 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -2025,7 +2025,8 @@ struct bpf_token; extern int security_bpf(int cmd, union bpf_attr *attr, unsigned int size); extern int security_bpf_map(struct bpf_map *map, fmode_t fmode); extern int security_bpf_prog(struct bpf_prog *prog); -extern int security_bpf_map_alloc(struct bpf_map *map); +extern int security_bpf_map_create(struct bpf_map *map, union bpf_attr *attr, + struct bpf_token *token); extern void security_bpf_map_free(struct bpf_map *map); extern int security_bpf_prog_load(struct bpf_prog *prog, union bpf_attr *attr, struct bpf_token *token); @@ -2047,7 +2048,8 @@ static inline int security_bpf_prog(struct bpf_prog *prog) return 0; } -static inline int security_bpf_map_alloc(struct bpf_map *map) +static inline int security_bpf_map_create(struct bpf_map *map, union bpf_attr *attr, + struct bpf_token *token) { return 0; } diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c index 3e956f6302f3..9e4e615f11eb 100644 --- a/kernel/bpf/bpf_lsm.c +++ b/kernel/bpf/bpf_lsm.c @@ -260,8 +260,8 @@ bpf_lsm_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) BTF_SET_START(sleepable_lsm_hooks) BTF_ID(func, bpf_lsm_bpf) BTF_ID(func, bpf_lsm_bpf_map) -BTF_ID(func, bpf_lsm_bpf_map_alloc_security) -BTF_ID(func, bpf_lsm_bpf_map_free_security) +BTF_ID(func, bpf_lsm_bpf_map_create) +BTF_ID(func, bpf_lsm_bpf_map_free) BTF_ID(func, bpf_lsm_bpf_prog) BTF_ID(func, bpf_lsm_bpf_prog_load) BTF_ID(func, bpf_lsm_bpf_prog_free) @@ -347,7 +347,7 @@ BTF_ID(func, bpf_lsm_userns_create) BTF_SET_END(sleepable_lsm_hooks) BTF_SET_START(untrusted_lsm_hooks) -BTF_ID(func, bpf_lsm_bpf_map_free_security) +BTF_ID(func, bpf_lsm_bpf_map_free) BTF_ID(func, bpf_lsm_bpf_prog_free) BTF_ID(func, bpf_lsm_file_alloc_security) BTF_ID(func, bpf_lsm_file_free_security) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index ee0fea87b17b..160571b65b41 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1285,9 +1285,9 @@ static int map_create(union bpf_attr *attr) attr->btf_vmlinux_value_type_id; } - err = security_bpf_map_alloc(map); + err = security_bpf_map_create(map, attr, token); if (err) - goto free_map; + goto free_map_sec; err = bpf_map_alloc_id(map); if (err) diff --git a/security/security.c b/security/security.c index bc8b72325c3b..145e8082b9a6 100644 --- a/security/security.c +++ b/security/security.c @@ -5167,16 +5167,20 @@ int security_bpf_prog(struct bpf_prog *prog) } /** - * security_bpf_map_alloc() - Allocate a bpf map LSM blob - * @map: bpf map + * security_bpf_map_create() - Check if BPF map creation is allowed + * @map BPF map object + * @attr: BPF syscall attributes used to create BPF map + * @token: BPF token used to grant user access * - * Initialize the security field inside bpf map. + * Do a check when the kernel creates a new BPF map. This is also the + * point where LSM blob is allocated for LSMs that need them. * * Return: Returns 0 on success, error on failure. */ -int security_bpf_map_alloc(struct bpf_map *map) +int security_bpf_map_create(struct bpf_map *map, union bpf_attr *attr, + struct bpf_token *token) { - return call_int_hook(bpf_map_alloc_security, 0, map); + return call_int_hook(bpf_map_create, 0, map, attr, token); } /** @@ -5205,7 +5209,7 @@ int security_bpf_prog_load(struct bpf_prog *prog, union bpf_attr *attr, */ void security_bpf_map_free(struct bpf_map *map) { - call_void_hook(bpf_map_free_security, map); + call_void_hook(bpf_map_free, map); } /** diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index f386adb0f27d..3d5dea4f1df0 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -6783,7 +6783,8 @@ static int selinux_bpf_prog(struct bpf_prog *prog) BPF__PROG_RUN, NULL); } -static int selinux_bpf_map_alloc(struct bpf_map *map) +static int selinux_bpf_map_create(struct bpf_map *map, union bpf_attr *attr, + struct bpf_token *token) { struct bpf_security_struct *bpfsec; @@ -7180,7 +7181,7 @@ static struct security_hook_list selinux_hooks[] __ro_after_init = { LSM_HOOK_INIT(bpf, selinux_bpf), LSM_HOOK_INIT(bpf_map, selinux_bpf_map), LSM_HOOK_INIT(bpf_prog, selinux_bpf_prog), - LSM_HOOK_INIT(bpf_map_free_security, selinux_bpf_map_free), + LSM_HOOK_INIT(bpf_map_free, selinux_bpf_map_free), LSM_HOOK_INIT(bpf_prog_free, selinux_bpf_prog_free), #endif @@ -7238,7 +7239,7 @@ static struct security_hook_list selinux_hooks[] __ro_after_init = { LSM_HOOK_INIT(audit_rule_init, selinux_audit_rule_init), #endif #ifdef CONFIG_BPF_SYSCALL - LSM_HOOK_INIT(bpf_map_alloc_security, selinux_bpf_map_alloc), + LSM_HOOK_INIT(bpf_map_create, selinux_bpf_map_create), LSM_HOOK_INIT(bpf_prog_load, selinux_bpf_prog_load), #endif #ifdef CONFIG_PERF_EVENTS From patchwork Thu Oct 12 22:28:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419927 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8CDACCDB484 for ; Thu, 12 Oct 2023 22:32:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443032AbjJLWcm convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38538 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443033AbjJLWci (ORCPT ); Thu, 12 Oct 2023 18:32:38 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2210AEB for ; Thu, 12 Oct 2023 15:32:37 -0700 (PDT) Received: from pps.filterd (m0148460.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLuoPN014798 for ; Thu, 12 Oct 2023 15:32:36 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tppq2a9pk-10 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:36 -0700 Received: from twshared40933.03.prn6.facebook.com (2620:10d:c0a8:1c::11) by mail.thefacebook.com (2620:10d:c0a8:82::b) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:31 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 7A4A139A3A544; Thu, 12 Oct 2023 15:29:25 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 11/18] bpf,lsm: add bpf_token_create and bpf_token_free LSM hooks Date: Thu, 12 Oct 2023 15:28:03 -0700 Message-ID: <20231012222810.4120312-12-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: LquUfczoDlhCENPMbX6__-97ivHgSdOm X-Proofpoint-ORIG-GUID: LquUfczoDlhCENPMbX6__-97ivHgSdOm X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Wire up bpf_token_create and bpf_token_free LSM hooks, which allow to allocate LSM security blob (we add `void *security` field to struct bpf_token for that), but also control who can instantiate BPF token. This follows existing pattern for BPF map and BPF prog. Signed-off-by: Andrii Nakryiko --- include/linux/bpf.h | 3 +++ include/linux/lsm_hook_defs.h | 3 +++ include/linux/security.h | 11 +++++++++++ kernel/bpf/bpf_lsm.c | 2 ++ kernel/bpf/token.c | 6 ++++++ security/security.c | 28 ++++++++++++++++++++++++++++ 6 files changed, 53 insertions(+) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index c87af564f464..dfcac60f1857 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1585,6 +1585,9 @@ struct bpf_token { u64 allowed_maps; u64 allowed_progs; u64 allowed_attachs; +#ifdef CONFIG_SECURITY + void *security; +#endif }; struct bpf_struct_ops_value; diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h index 0adfb136521a..d776c9b7b856 100644 --- a/include/linux/lsm_hook_defs.h +++ b/include/linux/lsm_hook_defs.h @@ -404,6 +404,9 @@ LSM_HOOK(void, LSM_RET_VOID, bpf_map_free, struct bpf_map *map) LSM_HOOK(int, 0, bpf_prog_load, struct bpf_prog *prog, union bpf_attr *attr, struct bpf_token *token) LSM_HOOK(void, LSM_RET_VOID, bpf_prog_free, struct bpf_prog *prog) +LSM_HOOK(int, 0, bpf_token_create, struct bpf_token *token, union bpf_attr *attr, + struct path *path) +LSM_HOOK(void, LSM_RET_VOID, bpf_token_free, struct bpf_token *token) #endif /* CONFIG_BPF_SYSCALL */ LSM_HOOK(int, 0, locked_down, enum lockdown_reason what) diff --git a/include/linux/security.h b/include/linux/security.h index 59c5fab2c4d6..b9cb1446bb78 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -2031,6 +2031,9 @@ extern void security_bpf_map_free(struct bpf_map *map); extern int security_bpf_prog_load(struct bpf_prog *prog, union bpf_attr *attr, struct bpf_token *token); extern void security_bpf_prog_free(struct bpf_prog *prog); +extern int security_bpf_token_create(struct bpf_token *token, union bpf_attr *attr, + struct path *path); +extern void security_bpf_token_free(struct bpf_token *token); #else static inline int security_bpf(int cmd, union bpf_attr *attr, unsigned int size) @@ -2065,6 +2068,14 @@ static inline int security_bpf_prog_load(struct bpf_prog *prog, union bpf_attr * static inline void security_bpf_prog_free(struct bpf_prog *prog) { } + +static inline int security_bpf_token_create(struct bpf_token *token, union bpf_attr *attr, + struct path *path) +{ + return 0; +} +static inline void security_bpf_token_free(struct bpf_token *token) +{ } #endif /* CONFIG_SECURITY */ #endif /* CONFIG_BPF_SYSCALL */ diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c index 9e4e615f11eb..2a019528953e 100644 --- a/kernel/bpf/bpf_lsm.c +++ b/kernel/bpf/bpf_lsm.c @@ -265,6 +265,8 @@ BTF_ID(func, bpf_lsm_bpf_map_free) BTF_ID(func, bpf_lsm_bpf_prog) BTF_ID(func, bpf_lsm_bpf_prog_load) BTF_ID(func, bpf_lsm_bpf_prog_free) +BTF_ID(func, bpf_lsm_bpf_token_create) +BTF_ID(func, bpf_lsm_bpf_token_free) BTF_ID(func, bpf_lsm_bprm_check_security) BTF_ID(func, bpf_lsm_bprm_committed_creds) BTF_ID(func, bpf_lsm_bprm_committing_creds) diff --git a/kernel/bpf/token.c b/kernel/bpf/token.c index d4e0cc8075d3..18fd1e04f92d 100644 --- a/kernel/bpf/token.c +++ b/kernel/bpf/token.c @@ -7,6 +7,7 @@ #include #include #include +#include bool bpf_token_capable(const struct bpf_token *token, int cap) { @@ -28,6 +29,7 @@ void bpf_token_inc(struct bpf_token *token) static void bpf_token_free(struct bpf_token *token) { + security_bpf_token_free(token); put_user_ns(token->userns); kvfree(token); } @@ -183,6 +185,10 @@ int bpf_token_create(union bpf_attr *attr) token->allowed_progs = mnt_opts->delegate_progs; token->allowed_attachs = mnt_opts->delegate_attachs; + err = security_bpf_token_create(token, attr, &path); + if (err) + goto out_token; + fd = get_unused_fd_flags(O_CLOEXEC); if (fd < 0) { err = fd; diff --git a/security/security.c b/security/security.c index 145e8082b9a6..83a2403b7261 100644 --- a/security/security.c +++ b/security/security.c @@ -5201,6 +5201,23 @@ int security_bpf_prog_load(struct bpf_prog *prog, union bpf_attr *attr, return call_int_hook(bpf_prog_load, 0, prog, attr, token); } +/** + * security_bpf_token_create() - Check if creating of BPF token is allowed + * @token BPF token object + * @attr: BPF syscall attributes used to create BPF token + * @path: path pointing to BPF FS mount point from which BPF token is created + * + * Do a check when the kernel instantiates a new BPF token object from BPF FS + * instance. This is also the point where LSM blob can be allocated for LSMs. + * + * Return: Returns 0 on success, error on failure. + */ +int security_bpf_token_create(struct bpf_token *token, union bpf_attr *attr, + struct path *path) +{ + return call_int_hook(bpf_token_create, 0, token, attr, path); +} + /** * security_bpf_map_free() - Free a bpf map's LSM blob * @map: bpf map @@ -5222,6 +5239,17 @@ void security_bpf_prog_free(struct bpf_prog *prog) { call_void_hook(bpf_prog_free, prog); } + +/** + * security_bpf_token_free() - Free a BPF token's LSM blob + * @token: BPF token struct + * + * Clean up the security information stored inside BPF token. + */ +void security_bpf_token_free(struct bpf_token *token) +{ + call_void_hook(bpf_token_free, token); +} #endif /* CONFIG_BPF_SYSCALL */ /** From patchwork Thu Oct 12 22:28:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419924 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BF6BFCDB47E for ; Thu, 12 Oct 2023 22:32:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443031AbjJLWci convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38496 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443034AbjJLWcg (ORCPT ); Thu, 12 Oct 2023 18:32:36 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 790B3EE for ; Thu, 12 Oct 2023 15:32:34 -0700 (PDT) Received: from pps.filterd (m0089730.ppops.net [127.0.0.1]) by m0089730.ppops.net (8.17.1.19/8.17.1.19) with ESMTP id 39CLuMDH022489 for ; Thu, 12 Oct 2023 15:32:33 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by m0089730.ppops.net (PPS) with ESMTPS id 3tpbry9a19-10 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:33 -0700 Received: from twshared9518.03.prn6.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:27 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 8CC6939A3A56F; Thu, 12 Oct 2023 15:29:27 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 12/18] libbpf: add bpf_token_create() API Date: Thu, 12 Oct 2023 15:28:04 -0700 Message-ID: <20231012222810.4120312-13-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: -G5U81qNskaFuktHws6DE8ixZj9TT-2T X-Proofpoint-ORIG-GUID: -G5U81qNskaFuktHws6DE8ixZj9TT-2T X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Add low-level wrapper API for BPF_TOKEN_CREATE command in bpf() syscall. Signed-off-by: Andrii Nakryiko --- tools/lib/bpf/bpf.c | 19 +++++++++++++++++++ tools/lib/bpf/bpf.h | 28 ++++++++++++++++++++++++++++ tools/lib/bpf/libbpf.map | 1 + 3 files changed, 48 insertions(+) diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index b0f1913763a3..593ff9ea120d 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -1271,3 +1271,22 @@ int bpf_prog_bind_map(int prog_fd, int map_fd, ret = sys_bpf(BPF_PROG_BIND_MAP, &attr, attr_sz); return libbpf_err_errno(ret); } + +int bpf_token_create(int bpffs_path_fd, const char *bpffs_pathname, + struct bpf_token_create_opts *opts) +{ + const size_t attr_sz = offsetofend(union bpf_attr, token_create); + union bpf_attr attr; + int fd; + + if (!OPTS_VALID(opts, bpf_token_create_opts)) + return libbpf_err(-EINVAL); + + memset(&attr, 0, attr_sz); + attr.token_create.bpffs_path_fd = bpffs_path_fd; + attr.token_create.bpffs_pathname = ptr_to_u64(bpffs_pathname); + attr.token_create.flags = OPTS_GET(opts, flags, 0); + + fd = sys_bpf_fd(BPF_TOKEN_CREATE, &attr, attr_sz); + return libbpf_err_errno(fd); +} diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 74c2887cfd24..a5ddb0393fee 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -635,6 +635,34 @@ struct bpf_test_run_opts { LIBBPF_API int bpf_prog_test_run_opts(int prog_fd, struct bpf_test_run_opts *opts); +struct bpf_token_create_opts { + size_t sz; /* size of this struct for forward/backward compatibility */ + __u32 flags; + size_t :0; +}; +#define bpf_token_create_opts__last_field flags + +/** + * @brief **bpf_token_create()** creates a new instance of BPF token derived + * from specified BPF FS mount point. + * + * BPF token created with this API can be passed to bpf() syscall for + * commands like BPF_PROG_LOAD, BPF_MAP_CREATE, etc. + * + * @param bpffs_path_fd O_PATH FD (see man 2 openat() for semantics) specifying, + * in combination with *bpffs_pathname*, BPF FS mount from which to derive + * a BPF token instance. + * @param pin_pathname absolute or relative path specifying,in combination + * with *bpffs_pathname*, BPF FS mount from which to derive a BPF token + * instance. + * @param opts optional BPF token creation options, can be NULL + * + * @return BPF token FD > 0, on success; negative error code, otherwise (errno + * is also set to the error code) + */ +LIBBPF_API int bpf_token_create(int bpffs_path_fd, const char *bpffs_pathname, + struct bpf_token_create_opts *opts); + #ifdef __cplusplus } /* extern "C" */ #endif diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index cc973b678a39..9d05ebcd3a9e 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -400,6 +400,7 @@ LIBBPF_1.3.0 { bpf_program__attach_netfilter; bpf_program__attach_tcx; bpf_program__attach_uprobe_multi; + bpf_token_create; ring__avail_data_size; ring__consume; ring__consumer_pos; From patchwork Thu Oct 12 22:28:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419921 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B7B83CDB483 for ; Thu, 12 Oct 2023 22:32:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443025AbjJLWca convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48912 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443024AbjJLWc2 (ORCPT ); Thu, 12 Oct 2023 18:32:28 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5D008CA for ; Thu, 12 Oct 2023 15:32:27 -0700 (PDT) Received: from pps.filterd (m0109334.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLtvl5019886 for ; Thu, 12 Oct 2023 15:32:27 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tp16ck25p-19 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:27 -0700 Received: from twshared11278.41.prn1.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:24 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 971AF39A3A5CF; Thu, 12 Oct 2023 15:29:29 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 13/18] selftests/bpf: fix test_maps' use of bpf_map_create_opts Date: Thu, 12 Oct 2023 15:28:05 -0700 Message-ID: <20231012222810.4120312-14-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: _ij35dci08QPtB6Lug3x4b6hZLOj4KKH X-Proofpoint-ORIG-GUID: _ij35dci08QPtB6Lug3x4b6hZLOj4KKH X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Use LIBBPF_OPTS() macro to properly initialize bpf_map_create_opts in test_maps' tests. Signed-off-by: Andrii Nakryiko --- .../bpf/map_tests/map_percpu_stats.c | 20 +++++-------------- 1 file changed, 5 insertions(+), 15 deletions(-) diff --git a/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c b/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c index 1a9eeefda9a8..8bf497a9843e 100644 --- a/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c +++ b/tools/testing/selftests/bpf/map_tests/map_percpu_stats.c @@ -326,20 +326,14 @@ static int map_create(__u32 type, const char *name, struct bpf_map_create_opts * static int create_hash(void) { - struct bpf_map_create_opts map_opts = { - .sz = sizeof(map_opts), - .map_flags = BPF_F_NO_PREALLOC, - }; + LIBBPF_OPTS(bpf_map_create_opts, map_opts, .map_flags = BPF_F_NO_PREALLOC); return map_create(BPF_MAP_TYPE_HASH, "hash", &map_opts); } static int create_percpu_hash(void) { - struct bpf_map_create_opts map_opts = { - .sz = sizeof(map_opts), - .map_flags = BPF_F_NO_PREALLOC, - }; + LIBBPF_OPTS(bpf_map_create_opts, map_opts, .map_flags = BPF_F_NO_PREALLOC); return map_create(BPF_MAP_TYPE_PERCPU_HASH, "percpu_hash", &map_opts); } @@ -356,21 +350,17 @@ static int create_percpu_hash_prealloc(void) static int create_lru_hash(__u32 type, __u32 map_flags) { - struct bpf_map_create_opts map_opts = { - .sz = sizeof(map_opts), - .map_flags = map_flags, - }; + LIBBPF_OPTS(bpf_map_create_opts, map_opts, .map_flags = map_flags); return map_create(type, "lru_hash", &map_opts); } static int create_hash_of_maps(void) { - struct bpf_map_create_opts map_opts = { - .sz = sizeof(map_opts), + LIBBPF_OPTS(bpf_map_create_opts, map_opts, .map_flags = BPF_F_NO_PREALLOC, .inner_map_fd = create_small_hash(), - }; + ); int ret; ret = map_create_opts(BPF_MAP_TYPE_HASH_OF_MAPS, "hash_of_maps", From patchwork Thu Oct 12 22:28:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419920 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54F0DCDB47E for ; Thu, 12 Oct 2023 22:32:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443021AbjJLWc3 convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48890 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443020AbjJLWc1 (ORCPT ); Thu, 12 Oct 2023 18:32:27 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E586EEA for ; Thu, 12 Oct 2023 15:32:25 -0700 (PDT) Received: from pps.filterd (m0109334.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLtvl1019886 for ; Thu, 12 Oct 2023 15:32:25 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tp16ck25p-14 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:25 -0700 Received: from twshared19681.14.frc2.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:19 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id ADA2639A3A5E0; Thu, 12 Oct 2023 15:29:31 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 14/18] libbpf: add BPF token support to bpf_map_create() API Date: Thu, 12 Oct 2023 15:28:06 -0700 Message-ID: <20231012222810.4120312-15-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: 3Gv6GwNdkExU5JKVaoCYMf9guX3RzAxv X-Proofpoint-ORIG-GUID: 3Gv6GwNdkExU5JKVaoCYMf9guX3RzAxv X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Add ability to provide token_fd for BPF_MAP_CREATE command through bpf_map_create() API. Signed-off-by: Andrii Nakryiko --- tools/lib/bpf/bpf.c | 4 +++- tools/lib/bpf/bpf.h | 5 ++++- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 593ff9ea120d..f9ee7608a96a 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -169,7 +169,7 @@ int bpf_map_create(enum bpf_map_type map_type, __u32 max_entries, const struct bpf_map_create_opts *opts) { - const size_t attr_sz = offsetofend(union bpf_attr, map_extra); + const size_t attr_sz = offsetofend(union bpf_attr, map_token_fd); union bpf_attr attr; int fd; @@ -198,6 +198,8 @@ int bpf_map_create(enum bpf_map_type map_type, attr.numa_node = OPTS_GET(opts, numa_node, 0); attr.map_ifindex = OPTS_GET(opts, map_ifindex, 0); + attr.map_token_fd = OPTS_GET(opts, token_fd, 0); + fd = sys_bpf_fd(BPF_MAP_CREATE, &attr, attr_sz); return libbpf_err_errno(fd); } diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index a5ddb0393fee..156bc82c4895 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -51,8 +51,11 @@ struct bpf_map_create_opts { __u32 numa_node; __u32 map_ifindex; + + __u32 token_fd; + size_t :0; }; -#define bpf_map_create_opts__last_field map_ifindex +#define bpf_map_create_opts__last_field token_fd LIBBPF_API int bpf_map_create(enum bpf_map_type map_type, const char *map_name, From patchwork Thu Oct 12 22:28:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419923 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 02770CDB482 for ; Thu, 12 Oct 2023 22:32:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443026AbjJLWci convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38466 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443031AbjJLWcg (ORCPT ); Thu, 12 Oct 2023 18:32:36 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 252B4DC for ; Thu, 12 Oct 2023 15:32:33 -0700 (PDT) Received: from pps.filterd (m0089730.ppops.net [127.0.0.1]) by m0089730.ppops.net (8.17.1.19/8.17.1.19) with ESMTP id 39CLuMDE022489 for ; Thu, 12 Oct 2023 15:32:32 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by m0089730.ppops.net (PPS) with ESMTPS id 3tpbry9a19-8 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:32 -0700 Received: from twshared9518.03.prn6.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:27 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 4C4B339A3A614; Thu, 12 Oct 2023 15:29:33 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 15/18] libbpf: add BPF token support to bpf_btf_load() API Date: Thu, 12 Oct 2023 15:28:07 -0700 Message-ID: <20231012222810.4120312-16-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: 5_WzJGprjoUsYH-UT7oapgWdgmq2BtL9 X-Proofpoint-ORIG-GUID: 5_WzJGprjoUsYH-UT7oapgWdgmq2BtL9 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Allow user to specify token_fd for bpf_btf_load() API that wraps kernel's BPF_BTF_LOAD command. This allows loading BTF from unprivileged process as long as it has BPF token allowing BPF_BTF_LOAD command, which can be created and delegated by privileged process. Signed-off-by: Andrii Nakryiko --- tools/lib/bpf/bpf.c | 4 +++- tools/lib/bpf/bpf.h | 3 ++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index f9ee7608a96a..4547ae1037af 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -1168,7 +1168,7 @@ int bpf_raw_tracepoint_open(const char *name, int prog_fd) int bpf_btf_load(const void *btf_data, size_t btf_size, struct bpf_btf_load_opts *opts) { - const size_t attr_sz = offsetofend(union bpf_attr, btf_log_true_size); + const size_t attr_sz = offsetofend(union bpf_attr, btf_token_fd); union bpf_attr attr; char *log_buf; size_t log_size; @@ -1193,6 +1193,8 @@ int bpf_btf_load(const void *btf_data, size_t btf_size, struct bpf_btf_load_opts attr.btf = ptr_to_u64(btf_data); attr.btf_size = btf_size; + attr.btf_token_fd = OPTS_GET(opts, token_fd, 0); + /* log_level == 0 and log_buf != NULL means "try loading without * log_buf, but retry with log_buf and log_level=1 on error", which is * consistent across low-level and high-level BTF and program loading diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 156bc82c4895..d7df5543f402 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -133,9 +133,10 @@ struct bpf_btf_load_opts { * If kernel doesn't support this feature, log_size is left unchanged. */ __u32 log_true_size; + __u32 token_fd; size_t :0; }; -#define bpf_btf_load_opts__last_field log_true_size +#define bpf_btf_load_opts__last_field token_fd LIBBPF_API int bpf_btf_load(const void *btf_data, size_t btf_size, struct bpf_btf_load_opts *opts); From patchwork Thu Oct 12 22:28:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419928 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5C081CDB47E for ; Thu, 12 Oct 2023 22:32:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443033AbjJLWcx convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38644 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443047AbjJLWcl (ORCPT ); Thu, 12 Oct 2023 18:32:41 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DD232E9 for ; Thu, 12 Oct 2023 15:32:40 -0700 (PDT) Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLu3sg019251 for ; Thu, 12 Oct 2023 15:32:40 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tpjtjj6k4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:40 -0700 Received: from twshared11278.41.prn1.facebook.com (2620:10d:c085:208::11) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:30 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id CB91439A3A720; Thu, 12 Oct 2023 15:29:38 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 16/18] libbpf: add BPF token support to bpf_prog_load() API Date: Thu, 12 Oct 2023 15:28:08 -0700 Message-ID: <20231012222810.4120312-17-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: rrFf0M6qmZlSaxN6NqyFPQthD1S_dVYG X-Proofpoint-GUID: rrFf0M6qmZlSaxN6NqyFPQthD1S_dVYG X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Wire through token_fd into bpf_prog_load(). Signed-off-by: Andrii Nakryiko --- tools/lib/bpf/bpf.c | 3 ++- tools/lib/bpf/bpf.h | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 4547ae1037af..5a238831b4ff 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -234,7 +234,7 @@ int bpf_prog_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns, size_t insn_cnt, struct bpf_prog_load_opts *opts) { - const size_t attr_sz = offsetofend(union bpf_attr, log_true_size); + const size_t attr_sz = offsetofend(union bpf_attr, prog_token_fd); void *finfo = NULL, *linfo = NULL; const char *func_info, *line_info; __u32 log_size, log_level, attach_prog_fd, attach_btf_obj_fd; @@ -263,6 +263,7 @@ int bpf_prog_load(enum bpf_prog_type prog_type, attr.prog_flags = OPTS_GET(opts, prog_flags, 0); attr.prog_ifindex = OPTS_GET(opts, prog_ifindex, 0); attr.kern_version = OPTS_GET(opts, kern_version, 0); + attr.prog_token_fd = OPTS_GET(opts, token_fd, 0); if (prog_name && kernel_supports(NULL, FEAT_PROG_NAME)) libbpf_strlcpy(attr.prog_name, prog_name, sizeof(attr.prog_name)); diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index d7df5543f402..edc0cab465d6 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -105,9 +105,10 @@ struct bpf_prog_load_opts { * If kernel doesn't support this feature, log_size is left unchanged. */ __u32 log_true_size; + __u32 token_fd; size_t :0; }; -#define bpf_prog_load_opts__last_field log_true_size +#define bpf_prog_load_opts__last_field token_fd LIBBPF_API int bpf_prog_load(enum bpf_prog_type prog_type, const char *prog_name, const char *license, From patchwork Thu Oct 12 22:28:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419918 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5FBECDB474 for ; Thu, 12 Oct 2023 22:32:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443022AbjJLWc2 convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57524 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1442975AbjJLWcY (ORCPT ); Thu, 12 Oct 2023 18:32:24 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 56D5CCA for ; Thu, 12 Oct 2023 15:32:21 -0700 (PDT) Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLu3dx019290 for ; Thu, 12 Oct 2023 15:32:21 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tpjtjj53c-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:20 -0700 Received: from twshared19681.14.frc2.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:11d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:19 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 2DD1539A3A759; Thu, 12 Oct 2023 15:29:40 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 17/18] selftests/bpf: add BPF token-enabled tests Date: Thu, 12 Oct 2023 15:28:09 -0700 Message-ID: <20231012222810.4120312-18-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: aD2_gNjjicK9_e72p7VWL4kS0biTi9Va X-Proofpoint-GUID: aD2_gNjjicK9_e72p7VWL4kS0biTi9Va X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Add a selftest that attempts to conceptually replicate intended BPF token use cases inside user namespaced container. Child process is forked. It is then put into its own userns and mountns. Child creates BPF FS context object and sets it up as desired. This ensures child userns is captures as owning userns for this instance of BPF FS. This context is passed back to privileged parent process through Unix socket, where parent creates and mounts it as a detached mount. This mount FD is passed back to the child to be used for BPF token creation, which allows otherwise privileged BPF operations to succeed inside userns. We validate that all of token-enabled privileged commands (BPF_BTF_LOAD, BPF_MAP_CREATE, and BPF_PROG_LOAD) work as intended. They should only succeed inside the userns if a) BPF token is provided with proper allowed sets of commands and types; and b) namespaces CAP_BPF and other privileges are set. Lacking a) or b) should lead to -EPERM failures. Based on suggested workflow by Christian Brauner ([0]). [0] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/ Signed-off-by: Andrii Nakryiko --- .../testing/selftests/bpf/prog_tests/token.c | 629 ++++++++++++++++++ 1 file changed, 629 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/token.c diff --git a/tools/testing/selftests/bpf/prog_tests/token.c b/tools/testing/selftests/bpf/prog_tests/token.c new file mode 100644 index 000000000000..41cee6b4731e --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/token.c @@ -0,0 +1,629 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ +#define _GNU_SOURCE +#include +#include +#include "cap_helpers.h" +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* copied from include/uapi/linux/mount.h, as including it conflicts with + * sys/mount.h include + */ +enum fsconfig_command { + FSCONFIG_SET_FLAG = 0, /* Set parameter, supplying no value */ + FSCONFIG_SET_STRING = 1, /* Set parameter, supplying a string value */ + FSCONFIG_SET_BINARY = 2, /* Set parameter, supplying a binary blob value */ + FSCONFIG_SET_PATH = 3, /* Set parameter, supplying an object by path */ + FSCONFIG_SET_PATH_EMPTY = 4, /* Set parameter, supplying an object by (empty) path */ + FSCONFIG_SET_FD = 5, /* Set parameter, supplying an object by fd */ + FSCONFIG_CMD_CREATE = 6, /* Invoke superblock creation */ + FSCONFIG_CMD_RECONFIGURE = 7, /* Invoke superblock reconfiguration */ +}; + +static inline int sys_fsopen(const char *fsname, unsigned flags) +{ + return syscall(__NR_fsopen, fsname, flags); +} + +static inline int sys_fsconfig(int fs_fd, unsigned cmd, const char *key, const void *val, int aux) +{ + return syscall(__NR_fsconfig, fs_fd, cmd, key, val, aux); +} + +static inline int sys_fsmount(int fs_fd, unsigned flags, unsigned ms_flags) +{ + return syscall(__NR_fsmount, fs_fd, flags, ms_flags); +} + +static int drop_priv_caps(__u64 *old_caps) +{ + return cap_disable_effective((1ULL << CAP_BPF) | + (1ULL << CAP_PERFMON) | + (1ULL << CAP_NET_ADMIN) | + (1ULL << CAP_SYS_ADMIN), old_caps); +} + +static int restore_priv_caps(__u64 old_caps) +{ + return cap_enable_effective(old_caps, NULL); +} + +static int set_delegate_mask(int fs_fd, const char *key, __u64 mask) +{ + char buf[32]; + int err; + + snprintf(buf, sizeof(buf), "0x%llx", (unsigned long long)mask); + err = sys_fsconfig(fs_fd, FSCONFIG_SET_STRING, key, + mask == ~0ULL ? "any" : buf, 0); + if (err < 0) + err = -errno; + return err; +} + +#define zclose(fd) do { if (fd >= 0) close(fd); fd = -1; } while (0) + +struct bpffs_opts { + __u64 cmds; + __u64 maps; + __u64 progs; + __u64 attachs; +}; + +static int setup_bpffs_fd(struct bpffs_opts *opts) +{ + int fs_fd = -1, err; + + /* create VFS context */ + fs_fd = sys_fsopen("bpf", 0); + if (!ASSERT_GE(fs_fd, 0, "fs_fd")) + goto cleanup; + + /* set up token delegation mount options */ + err = set_delegate_mask(fs_fd, "delegate_cmds", opts->cmds); + if (!ASSERT_OK(err, "fs_cfg_cmds")) + goto cleanup; + err = set_delegate_mask(fs_fd, "delegate_maps", opts->maps); + if (!ASSERT_OK(err, "fs_cfg_maps")) + goto cleanup; + err = set_delegate_mask(fs_fd, "delegate_progs", opts->progs); + if (!ASSERT_OK(err, "fs_cfg_progs")) + goto cleanup; + err = set_delegate_mask(fs_fd, "delegate_attachs", opts->attachs); + if (!ASSERT_OK(err, "fs_cfg_attachs")) + goto cleanup; + + return fs_fd; +cleanup: + zclose(fs_fd); + return -1; +} + +static int materialize_bpffs_fd(int fs_fd) +{ + int mnt_fd, err; + + /* instantiate FS object */ + err = sys_fsconfig(fs_fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); + if (err < 0) + return -errno; + + /* create O_PATH fd for detached mount */ + mnt_fd = sys_fsmount(fs_fd, 0, 0); + if (err < 0) + return -errno; + + return mnt_fd; +} + +/* send FD over Unix domain (AF_UNIX) socket */ +static int sendfd(int sockfd, int fd) +{ + struct msghdr msg = {}; + struct cmsghdr *cmsg; + int fds[1] = { fd }, err; + char iobuf[1]; + struct iovec io = { + .iov_base = iobuf, + .iov_len = sizeof(iobuf), + }; + union { + char buf[CMSG_SPACE(sizeof(fds))]; + struct cmsghdr align; + } u; + + msg.msg_iov = &io; + msg.msg_iovlen = 1; + msg.msg_control = u.buf; + msg.msg_controllen = sizeof(u.buf); + cmsg = CMSG_FIRSTHDR(&msg); + cmsg->cmsg_level = SOL_SOCKET; + cmsg->cmsg_type = SCM_RIGHTS; + cmsg->cmsg_len = CMSG_LEN(sizeof(fds)); + memcpy(CMSG_DATA(cmsg), fds, sizeof(fds)); + + err = sendmsg(sockfd, &msg, 0); + if (err < 0) + err = -errno; + if (!ASSERT_EQ(err, 1, "sendmsg")) + return -EINVAL; + + return 0; +} + +/* receive FD over Unix domain (AF_UNIX) socket */ +static int recvfd(int sockfd, int *fd) +{ + struct msghdr msg = {}; + struct cmsghdr *cmsg; + int fds[1], err; + char iobuf[1]; + struct iovec io = { + .iov_base = iobuf, + .iov_len = sizeof(iobuf), + }; + union { + char buf[CMSG_SPACE(sizeof(fds))]; + struct cmsghdr align; + } u; + + msg.msg_iov = &io; + msg.msg_iovlen = 1; + msg.msg_control = u.buf; + msg.msg_controllen = sizeof(u.buf); + + err = recvmsg(sockfd, &msg, 0); + if (err < 0) + err = -errno; + if (!ASSERT_EQ(err, 1, "recvmsg")) + return -EINVAL; + + cmsg = CMSG_FIRSTHDR(&msg); + if (!ASSERT_OK_PTR(cmsg, "cmsg_null") || + !ASSERT_EQ(cmsg->cmsg_len, CMSG_LEN(sizeof(fds)), "cmsg_len") || + !ASSERT_EQ(cmsg->cmsg_level, SOL_SOCKET, "cmsg_level") || + !ASSERT_EQ(cmsg->cmsg_type, SCM_RIGHTS, "cmsg_type")) + return -EINVAL; + + memcpy(fds, CMSG_DATA(cmsg), sizeof(fds)); + *fd = fds[0]; + + return 0; +} + +static ssize_t write_nointr(int fd, const void *buf, size_t count) +{ + ssize_t ret; + + do { + ret = write(fd, buf, count); + } while (ret < 0 && errno == EINTR); + + return ret; +} + +static int write_file(const char *path, const void *buf, size_t count) +{ + int fd; + ssize_t ret; + + fd = open(path, O_WRONLY | O_CLOEXEC | O_NOCTTY | O_NOFOLLOW); + if (fd < 0) + return -1; + + ret = write_nointr(fd, buf, count); + close(fd); + if (ret < 0 || (size_t)ret != count) + return -1; + + return 0; +} + +static int create_and_enter_userns(void) +{ + uid_t uid; + gid_t gid; + char map[100]; + + uid = getuid(); + gid = getgid(); + + if (unshare(CLONE_NEWUSER)) + return -1; + + if (write_file("/proc/self/setgroups", "deny", sizeof("deny") - 1) && + errno != ENOENT) + return -1; + + snprintf(map, sizeof(map), "0 %d 1", uid); + if (write_file("/proc/self/uid_map", map, strlen(map))) + return -1; + + + snprintf(map, sizeof(map), "0 %d 1", gid); + if (write_file("/proc/self/gid_map", map, strlen(map))) + return -1; + + if (setgid(0)) + return -1; + + if (setuid(0)) + return -1; + + return 0; +} + +typedef int (*child_callback_fn)(int); + +static void child(int sock_fd, struct bpffs_opts *bpffs_opts, child_callback_fn callback) +{ + LIBBPF_OPTS(bpf_map_create_opts, map_opts); + int mnt_fd = -1, fs_fd = -1, err = 0; + + /* setup userns with root mappings */ + err = create_and_enter_userns(); + if (!ASSERT_OK(err, "create_and_enter_userns")) + goto cleanup; + + /* setup mountns to allow creating BPF FS (fsopen("bpf")) from unpriv process */ + err = unshare(CLONE_NEWNS); + if (!ASSERT_OK(err, "create_mountns")) + goto cleanup; + + err = mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, 0); + if (!ASSERT_OK(err, "remount_root")) + goto cleanup; + + fs_fd = setup_bpffs_fd(bpffs_opts); + if (!ASSERT_GE(fs_fd, 0, "setup_bpffs")) { + err = -EINVAL; + goto cleanup; + } + + /* pass BPF FS context object to parent */ + err = sendfd(sock_fd, fs_fd); + if (!ASSERT_OK(err, "send_fs_fd")) + goto cleanup; + + /* avoid mucking around with mount namespaces and mounting at + * well-known path, just get detach-mounted BPF FS fd back from parent + */ + err = recvfd(sock_fd, &mnt_fd); + if (!ASSERT_OK(err, "recv_mnt_fd")) + goto cleanup; + + /* do custom test logic with customly set up BPF FS instance */ + err = callback(mnt_fd); + if (!ASSERT_OK(err, "test_callback")) + goto cleanup; + + err = 0; +cleanup: + zclose(sock_fd); + zclose(mnt_fd); + + exit(-err); +} + +static int wait_for_pid(pid_t pid) +{ + int status, ret; + +again: + ret = waitpid(pid, &status, 0); + if (ret == -1) { + if (errno == EINTR) + goto again; + + return -1; + } + + if (!WIFEXITED(status)) + return -1; + + return WEXITSTATUS(status); +} + +static void parent(int child_pid, int sock_fd) +{ + int fs_fd = -1, mnt_fd = -1, err; + + err = recvfd(sock_fd, &fs_fd); + if (!ASSERT_OK(err, "recv_bpffs_fd")) + goto cleanup; + + mnt_fd = materialize_bpffs_fd(fs_fd); + if (!ASSERT_GE(mnt_fd, 0, "materialize_bpffs_fd")) { + err = -EINVAL; + goto cleanup; + } + zclose(fs_fd); + + /* pass BPF FS context object to parent */ + err = sendfd(sock_fd, mnt_fd); + if (!ASSERT_OK(err, "send_mnt_fd")) + goto cleanup; + zclose(mnt_fd); + + err = wait_for_pid(child_pid); + ASSERT_OK(err, "waitpid_child"); + +cleanup: + zclose(sock_fd); + zclose(fs_fd); + zclose(mnt_fd); + + if (child_pid > 0) + (void)kill(child_pid, SIGKILL); +} + +static void subtest_userns(struct bpffs_opts *bpffs_opts, child_callback_fn cb) +{ + int sock_fds[2] = { -1, -1 }; + int child_pid = 0, err; + + err = socketpair(AF_UNIX, SOCK_STREAM, 0, sock_fds); + if (!ASSERT_OK(err, "socketpair")) + goto cleanup; + + child_pid = fork(); + if (!ASSERT_GE(child_pid, 0, "fork")) + goto cleanup; + + if (child_pid == 0) { + zclose(sock_fds[0]); + return child(sock_fds[1], bpffs_opts, cb); + + } else { + zclose(sock_fds[1]); + return parent(child_pid, sock_fds[0]); + } + +cleanup: + zclose(sock_fds[0]); + zclose(sock_fds[1]); + if (child_pid > 0) + (void)kill(child_pid, SIGKILL); +} + +static int userns_map_create(int mnt_fd) +{ + LIBBPF_OPTS(bpf_map_create_opts, map_opts); + int err, token_fd = -1, map_fd = -1; + __u64 old_caps = 0; + + /* create BPF token from BPF FS mount */ + token_fd = bpf_token_create(mnt_fd, "", NULL); + if (!ASSERT_GT(token_fd, 0, "token_create")) { + err = -EINVAL; + goto cleanup; + } + + /* while inside non-init userns, we need both a BPF token *and* + * CAP_BPF inside current userns to create privileged map; let's test + * that neither BPF token alone nor namespaced CAP_BPF is sufficient + */ + err = drop_priv_caps(&old_caps); + if (!ASSERT_OK(err, "drop_caps")) + goto cleanup; + + /* no token, no CAP_BPF -> fail */ + map_opts.token_fd = 0; + map_fd = bpf_map_create(BPF_MAP_TYPE_STACK, "wo_token_wo_bpf", 0, 8, 1, &map_opts); + if (!ASSERT_LT(map_fd, 0, "stack_map_wo_token_wo_cap_bpf_should_fail")) { + err = -EINVAL; + goto cleanup; + } + + /* token without CAP_BPF -> fail */ + map_opts.token_fd = token_fd; + map_fd = bpf_map_create(BPF_MAP_TYPE_STACK, "w_token_wo_bpf", 0, 8, 1, &map_opts); + if (!ASSERT_LT(map_fd, 0, "stack_map_w_token_wo_cap_bpf_should_fail")) { + err = -EINVAL; + goto cleanup; + } + + /* get back effective local CAP_BPF (and CAP_SYS_ADMIN) */ + err = restore_priv_caps(old_caps); + if (!ASSERT_OK(err, "restore_caps")) + goto cleanup; + + /* CAP_BPF without token -> fail */ + map_opts.token_fd = 0; + map_fd = bpf_map_create(BPF_MAP_TYPE_STACK, "wo_token_w_bpf", 0, 8, 1, &map_opts); + if (!ASSERT_LT(map_fd, 0, "stack_map_wo_token_w_cap_bpf_should_fail")) { + err = -EINVAL; + goto cleanup; + } + + /* finally, namespaced CAP_BPF + token -> success */ + map_opts.token_fd = token_fd; + map_fd = bpf_map_create(BPF_MAP_TYPE_STACK, "w_token_w_bpf", 0, 8, 1, &map_opts); + if (!ASSERT_GT(map_fd, 0, "stack_map_w_token_w_cap_bpf")) { + err = -EINVAL; + goto cleanup; + } + +cleanup: + zclose(token_fd); + zclose(map_fd); + return err; +} + +static int userns_btf_load(int mnt_fd) +{ + LIBBPF_OPTS(bpf_btf_load_opts, btf_opts); + int err, token_fd = -1, btf_fd = -1; + const void *raw_btf_data; + struct btf *btf = NULL; + __u32 raw_btf_size; + __u64 old_caps = 0; + + /* create BPF token from BPF FS mount */ + token_fd = bpf_token_create(mnt_fd, "", NULL); + if (!ASSERT_GT(token_fd, 0, "token_create")) { + err = -EINVAL; + goto cleanup; + } + + /* while inside non-init userns, we need both a BPF token *and* + * CAP_BPF inside current userns to create privileged map; let's test + * that neither BPF token alone nor namespaced CAP_BPF is sufficient + */ + err = drop_priv_caps(&old_caps); + if (!ASSERT_OK(err, "drop_caps")) + goto cleanup; + + /* setup a trivial BTF data to load to the kernel */ + btf = btf__new_empty(); + if (!ASSERT_OK_PTR(btf, "empty_btf")) + goto cleanup; + + ASSERT_GT(btf__add_int(btf, "int", 4, 0), 0, "int_type"); + + raw_btf_data = btf__raw_data(btf, &raw_btf_size); + if (!ASSERT_OK_PTR(raw_btf_data, "raw_btf_data")) + goto cleanup; + + /* no token + no CAP_BPF -> failure */ + btf_opts.token_fd = 0; + btf_fd = bpf_btf_load(raw_btf_data, raw_btf_size, &btf_opts); + if (!ASSERT_LT(btf_fd, 0, "no_token_no_cap_should_fail")) + goto cleanup; + + /* token + no CAP_BPF -> failure */ + btf_opts.token_fd = token_fd; + btf_fd = bpf_btf_load(raw_btf_data, raw_btf_size, &btf_opts); + if (!ASSERT_LT(btf_fd, 0, "token_no_cap_should_fail")) + goto cleanup; + + /* get back effective local CAP_BPF (and CAP_SYS_ADMIN) */ + err = restore_priv_caps(old_caps); + if (!ASSERT_OK(err, "restore_caps")) + goto cleanup; + + /* token + CAP_BPF -> success */ + btf_opts.token_fd = token_fd; + btf_fd = bpf_btf_load(raw_btf_data, raw_btf_size, &btf_opts); + if (!ASSERT_GT(btf_fd, 0, "token_and_cap_success")) + goto cleanup; + + err = 0; +cleanup: + btf__free(btf); + zclose(btf_fd); + zclose(token_fd); + return err; +} + +static int userns_prog_load(int mnt_fd) +{ + LIBBPF_OPTS(bpf_prog_load_opts, prog_opts); + int err, token_fd = -1, prog_fd = -1; + struct bpf_insn insns[] = { + /* bpf_jiffies64() requires CAP_BPF */ + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64), + /* bpf_get_current_task() requires CAP_PERFMON */ + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_current_task), + /* r0 = 0; exit; */ + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }; + size_t insn_cnt = ARRAY_SIZE(insns); + __u64 old_caps = 0; + + /* create BPF token from BPF FS mount */ + token_fd = bpf_token_create(mnt_fd, "", NULL); + if (!ASSERT_GT(token_fd, 0, "token_create")) { + err = -EINVAL; + goto cleanup; + } + + /* validate we can successfully load BPF program with token; this + * being XDP program (CAP_NET_ADMIN) using bpf_jiffies64() (CAP_BPF) + * and bpf_get_current_task() (CAP_PERFMON) helpers validates we have + * BPF token wired properly in a bunch of places in the kernel + */ + prog_opts.token_fd = token_fd; + prog_opts.expected_attach_type = BPF_XDP; + prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, "token_prog", "GPL", + insns, insn_cnt, &prog_opts); + if (!ASSERT_GT(prog_fd, 0, "prog_fd")) { + err = -EPERM; + goto cleanup; + } + + /* no token + caps -> failure */ + prog_opts.token_fd = 0; + prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, "token_prog", "GPL", + insns, insn_cnt, &prog_opts); + if (!ASSERT_EQ(prog_fd, -EPERM, "prog_fd_eperm")) { + err = -EPERM; + goto cleanup; + } + + err = drop_priv_caps(&old_caps); + if (!ASSERT_OK(err, "drop_caps")) + goto cleanup; + + /* no caps + token -> failure */ + prog_opts.token_fd = token_fd; + prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, "token_prog", "GPL", + insns, insn_cnt, &prog_opts); + if (!ASSERT_EQ(prog_fd, -EPERM, "prog_fd_eperm")) { + err = -EPERM; + goto cleanup; + } + + /* no caps + no token -> definitely a failure */ + prog_opts.token_fd = 0; + prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, "token_prog", "GPL", + insns, insn_cnt, &prog_opts); + if (!ASSERT_EQ(prog_fd, -EPERM, "prog_fd_eperm")) { + err = -EPERM; + goto cleanup; + } + + err = 0; +cleanup: + zclose(prog_fd); + zclose(token_fd); + return err; +} + +void test_token(void) +{ + if (test__start_subtest("map_token")) { + struct bpffs_opts opts = { + .cmds = 1ULL << BPF_MAP_CREATE, + .maps = 1ULL << BPF_MAP_TYPE_STACK, + }; + + subtest_userns(&opts, userns_map_create); + } + if (test__start_subtest("btf_token")) { + struct bpffs_opts opts = { + .cmds = 1ULL << BPF_BTF_LOAD, + }; + + subtest_userns(&opts, userns_btf_load); + } + if (test__start_subtest("prog_token")) { + struct bpffs_opts opts = { + .cmds = 1ULL << BPF_PROG_LOAD, + .progs = 1ULL << BPF_PROG_TYPE_XDP, + .attachs = 1ULL << BPF_XDP, + }; + + subtest_userns(&opts, userns_prog_load); + } +} From patchwork Thu Oct 12 22:28:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13419926 X-Patchwork-Delegate: paul@paul-moore.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78DEACDB474 for ; Thu, 12 Oct 2023 22:32:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1443020AbjJLWck convert rfc822-to-8bit (ORCPT ); Thu, 12 Oct 2023 18:32:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38526 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1443024AbjJLWci (ORCPT ); Thu, 12 Oct 2023 18:32:38 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B1040D6 for ; Thu, 12 Oct 2023 15:32:36 -0700 (PDT) Received: from pps.filterd (m0148460.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39CLuoPM014798 for ; Thu, 12 Oct 2023 15:32:36 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tppq2a9pk-9 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 12 Oct 2023 15:32:35 -0700 Received: from twshared40933.03.prn6.facebook.com (2620:10d:c0a8:1b::2d) by mail.thefacebook.com (2620:10d:c0a8:82::b) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 12 Oct 2023 15:32:31 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 81C3F39A3A767; Thu, 12 Oct 2023 15:29:43 -0700 (PDT) From: Andrii Nakryiko To: , CC: , , , , , , Subject: [PATCH v7 bpf-next 18/18] bpf,selinux: allocate bpf_security_struct per BPF token Date: Thu, 12 Oct 2023 15:28:10 -0700 Message-ID: <20231012222810.4120312-19-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231012222810.4120312-1-andrii@kernel.org> References: <20231012222810.4120312-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: Zbgo-qZwEqmczgMZs838aCw9_sib4xHq X-Proofpoint-ORIG-GUID: Zbgo-qZwEqmczgMZs838aCw9_sib4xHq X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-12_14,2023-10-12_01,2023-05-22_02 Precedence: bulk List-ID: Utilize newly added bpf_token_create/bpf_token_free LSM hooks to allocate struct bpf_security_struct for each BPF token object in SELinux. This just follows similar pattern for BPF prog and map. Signed-off-by: Andrii Nakryiko --- security/selinux/hooks.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 3d5dea4f1df0..acd84839ae2c 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -6828,6 +6828,29 @@ static void selinux_bpf_prog_free(struct bpf_prog *prog) prog->aux->security = NULL; kfree(bpfsec); } + +static int selinux_bpf_token_create(struct bpf_token *token, union bpf_attr *attr, + struct path *path) +{ + struct bpf_security_struct *bpfsec; + + bpfsec = kzalloc(sizeof(*bpfsec), GFP_KERNEL); + if (!bpfsec) + return -ENOMEM; + + bpfsec->sid = current_sid(); + token->security = bpfsec; + + return 0; +} + +static void selinux_bpf_token_free(struct bpf_token *token) +{ + struct bpf_security_struct *bpfsec = token->security; + + token->security = NULL; + kfree(bpfsec); +} #endif struct lsm_blob_sizes selinux_blob_sizes __ro_after_init = { @@ -7183,6 +7206,7 @@ static struct security_hook_list selinux_hooks[] __ro_after_init = { LSM_HOOK_INIT(bpf_prog, selinux_bpf_prog), LSM_HOOK_INIT(bpf_map_free, selinux_bpf_map_free), LSM_HOOK_INIT(bpf_prog_free, selinux_bpf_prog_free), + LSM_HOOK_INIT(bpf_token_free, selinux_bpf_token_free), #endif #ifdef CONFIG_PERF_EVENTS @@ -7241,6 +7265,7 @@ static struct security_hook_list selinux_hooks[] __ro_after_init = { #ifdef CONFIG_BPF_SYSCALL LSM_HOOK_INIT(bpf_map_create, selinux_bpf_map_create), LSM_HOOK_INIT(bpf_prog_load, selinux_bpf_prog_load), + LSM_HOOK_INIT(bpf_token_create, selinux_bpf_token_create), #endif #ifdef CONFIG_PERF_EVENTS LSM_HOOK_INIT(perf_event_alloc, selinux_perf_event_alloc),