From patchwork Thu May 18 21:54:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13247432 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D671C77B7D for ; Thu, 18 May 2023 21:55:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230135AbjERVzG convert rfc822-to-8bit (ORCPT ); Thu, 18 May 2023 17:55:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57504 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230242AbjERVzE (ORCPT ); Thu, 18 May 2023 17:55:04 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7265E10C3 for ; Thu, 18 May 2023 14:55:01 -0700 (PDT) Received: from pps.filterd (m0109333.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 34IFpoKG017801 for ; Thu, 18 May 2023 14:55:01 -0700 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3qnk7ecfmg-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 18 May 2023 14:55:01 -0700 Received: from twshared24695.38.frc1.facebook.com (2620:10d:c085:208::f) by mail.thefacebook.com (2620:10d:c085:21d::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 18 May 2023 14:54:59 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id D9C8230EEE4C4; Thu, 18 May 2023 14:54:47 -0700 (PDT) From: Andrii Nakryiko To: , , , CC: , , , , Andrii Nakryiko Subject: [PATCH v2 bpf-next 1/3] bpf: support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands Date: Thu, 18 May 2023 14:54:42 -0700 Message-ID: <20230518215444.1418789-2-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230518215444.1418789-1-andrii@kernel.org> References: <20230518215444.1418789-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: oCu3bDBhzHoqy5ORblxqnpLmTYypJnvl X-Proofpoint-ORIG-GUID: oCu3bDBhzHoqy5ORblxqnpLmTYypJnvl X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-05-18_15,2023-05-17_02,2023-02-09_01 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Current UAPI of BPF_OBJ_PIN and BPF_OBJ_GET commands of bpf() syscall forces users to specify pinning location as a string-based absolute or relative (to current working directory) path. This has various implications related to security (e.g., symlink-based attacks), forces BPF FS to be exposed in the file system, which can cause races with other applications. One of the feedbacks we got from folks working with containers heavily was that inability to use purely FD-based location specification was an unfortunate limitation and hindrance for BPF_OBJ_PIN and BPF_OBJ_GET commands. This patch closes this oversight, adding path_fd field to BPF_OBJ_PIN and BPF_OBJ_GET UAPI, following conventions established by *at() syscalls for dirfd + pathname combinations. This now allows interesting possibilities like working with detached BPF FS mount (e.g., to perform multiple pinnings without running a risk of someone interfering with them), and generally making pinning/getting more secure and not prone to any races and/or security attacks. This is demonstrated by a selftest added in subsequent patch that takes advantage of new mount APIs (fsopen, fsconfig, fsmount) to demonstrate creating detached BPF FS mount, pinning, and then getting BPF map out of it, all while never exposing this private instance of BPF FS to outside worlds. Signed-off-by: Andrii Nakryiko --- include/linux/bpf.h | 4 ++-- include/uapi/linux/bpf.h | 10 ++++++++++ kernel/bpf/inode.c | 16 ++++++++-------- kernel/bpf/syscall.c | 25 ++++++++++++++++++++----- tools/include/uapi/linux/bpf.h | 10 ++++++++++ 5 files changed, 50 insertions(+), 15 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 36e4b2d8cca2..f58895830ada 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2077,8 +2077,8 @@ struct file *bpf_link_new_file(struct bpf_link *link, int *reserved_fd); struct bpf_link *bpf_link_get_from_fd(u32 ufd); struct bpf_link *bpf_link_get_curr_or_next(u32 *id); -int bpf_obj_pin_user(u32 ufd, const char __user *pathname); -int bpf_obj_get_user(const char __user *pathname, int flags); +int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname); +int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags); #define BPF_ITER_FUNC_PREFIX "bpf_iter_" #define DEFINE_BPF_ITER_FUNC(target, args...) \ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 1bb11a6ee667..3731284671e4 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1272,6 +1272,9 @@ enum { /* Create a map that will be registered/unregesitered by the backed bpf_link */ BPF_F_LINK = (1U << 13), + +/* Get path from provided FD in BPF_OBJ_PIN/BPF_OBJ_GET commands */ + BPF_F_PATH_FD = (1U << 14), }; /* Flags for BPF_PROG_QUERY. */ @@ -1420,6 +1423,13 @@ union bpf_attr { __aligned_u64 pathname; __u32 bpf_fd; __u32 file_flags; + /* Same as dirfd in openat() syscall; see openat(2) + * manpage for details of path FD and pathname semantics; + * path_fd should accompanied by BPF_F_PATH_FD flag set in + * file_flags field, otherwise it should be set to zero; + * if BPF_F_PATH_FD flag is not set, AT_FDCWD is assumed. + */ + __u32 path_fd; }; struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */ diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c index 9948b542a470..13bb54f6bd17 100644 --- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -435,7 +435,7 @@ static int bpf_iter_link_pin_kernel(struct dentry *parent, return ret; } -static int bpf_obj_do_pin(const char __user *pathname, void *raw, +static int bpf_obj_do_pin(int path_fd, const char __user *pathname, void *raw, enum bpf_type type) { struct dentry *dentry; @@ -444,7 +444,7 @@ static int bpf_obj_do_pin(const char __user *pathname, void *raw, umode_t mode; int ret; - dentry = user_path_create(AT_FDCWD, pathname, &path, 0); + dentry = user_path_create(path_fd, pathname, &path, 0); if (IS_ERR(dentry)) return PTR_ERR(dentry); @@ -478,7 +478,7 @@ static int bpf_obj_do_pin(const char __user *pathname, void *raw, return ret; } -int bpf_obj_pin_user(u32 ufd, const char __user *pathname) +int bpf_obj_pin_user(u32 ufd, int path_fd, const char __user *pathname) { enum bpf_type type; void *raw; @@ -488,14 +488,14 @@ int bpf_obj_pin_user(u32 ufd, const char __user *pathname) if (IS_ERR(raw)) return PTR_ERR(raw); - ret = bpf_obj_do_pin(pathname, raw, type); + ret = bpf_obj_do_pin(path_fd, pathname, raw, type); if (ret != 0) bpf_any_put(raw, type); return ret; } -static void *bpf_obj_do_get(const char __user *pathname, +static void *bpf_obj_do_get(int path_fd, const char __user *pathname, enum bpf_type *type, int flags) { struct inode *inode; @@ -503,7 +503,7 @@ static void *bpf_obj_do_get(const char __user *pathname, void *raw; int ret; - ret = user_path_at(AT_FDCWD, pathname, LOOKUP_FOLLOW, &path); + ret = user_path_at(path_fd, pathname, LOOKUP_FOLLOW, &path); if (ret) return ERR_PTR(ret); @@ -527,7 +527,7 @@ static void *bpf_obj_do_get(const char __user *pathname, return ERR_PTR(ret); } -int bpf_obj_get_user(const char __user *pathname, int flags) +int bpf_obj_get_user(int path_fd, const char __user *pathname, int flags) { enum bpf_type type = BPF_TYPE_UNSPEC; int f_flags; @@ -538,7 +538,7 @@ int bpf_obj_get_user(const char __user *pathname, int flags) if (f_flags < 0) return f_flags; - raw = bpf_obj_do_get(pathname, &type, f_flags); + raw = bpf_obj_do_get(path_fd, pathname, &type, f_flags); if (IS_ERR(raw)) return PTR_ERR(raw); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 909c112ef537..559705606fa5 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2697,23 +2697,38 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size) return err; } -#define BPF_OBJ_LAST_FIELD file_flags +#define BPF_OBJ_LAST_FIELD path_fd static int bpf_obj_pin(const union bpf_attr *attr) { - if (CHECK_ATTR(BPF_OBJ) || attr->file_flags != 0) + int path_fd; + + if (CHECK_ATTR(BPF_OBJ) || attr->file_flags & ~BPF_F_PATH_FD) + return -EINVAL; + + /* path_fd has to be accompanied by BPF_F_PATH_FD flag */ + if (!(attr->file_flags & BPF_F_PATH_FD) && attr->path_fd) return -EINVAL; - return bpf_obj_pin_user(attr->bpf_fd, u64_to_user_ptr(attr->pathname)); + path_fd = attr->file_flags & BPF_F_PATH_FD ? attr->path_fd : AT_FDCWD; + return bpf_obj_pin_user(attr->bpf_fd, path_fd, + u64_to_user_ptr(attr->pathname)); } static int bpf_obj_get(const union bpf_attr *attr) { + int path_fd; + if (CHECK_ATTR(BPF_OBJ) || attr->bpf_fd != 0 || - attr->file_flags & ~BPF_OBJ_FLAG_MASK) + attr->file_flags & ~(BPF_OBJ_FLAG_MASK | BPF_F_PATH_FD)) + return -EINVAL; + + /* path_fd has to be accompanied by BPF_F_PATH_FD flag */ + if (!(attr->file_flags & BPF_F_PATH_FD) && attr->path_fd) return -EINVAL; - return bpf_obj_get_user(u64_to_user_ptr(attr->pathname), + path_fd = attr->file_flags & BPF_F_PATH_FD ? attr->path_fd : AT_FDCWD; + return bpf_obj_get_user(attr->path_fd, u64_to_user_ptr(attr->pathname), attr->file_flags); } diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 1bb11a6ee667..3731284671e4 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1272,6 +1272,9 @@ enum { /* Create a map that will be registered/unregesitered by the backed bpf_link */ BPF_F_LINK = (1U << 13), + +/* Get path from provided FD in BPF_OBJ_PIN/BPF_OBJ_GET commands */ + BPF_F_PATH_FD = (1U << 14), }; /* Flags for BPF_PROG_QUERY. */ @@ -1420,6 +1423,13 @@ union bpf_attr { __aligned_u64 pathname; __u32 bpf_fd; __u32 file_flags; + /* Same as dirfd in openat() syscall; see openat(2) + * manpage for details of path FD and pathname semantics; + * path_fd should accompanied by BPF_F_PATH_FD flag set in + * file_flags field, otherwise it should be set to zero; + * if BPF_F_PATH_FD flag is not set, AT_FDCWD is assumed. + */ + __u32 path_fd; }; struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */ From patchwork Thu May 18 21:54:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13247435 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21BDCC77B7D for ; Thu, 18 May 2023 21:55:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230273AbjERVzU convert rfc822-to-8bit (ORCPT ); Thu, 18 May 2023 17:55:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57644 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230056AbjERVzO (ORCPT ); Thu, 18 May 2023 17:55:14 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 77DCE10D2 for ; Thu, 18 May 2023 14:55:05 -0700 (PDT) Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 34IFpqGm013798 for ; Thu, 18 May 2023 14:55:05 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3qnnmckbm0-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 18 May 2023 14:55:05 -0700 Received: from twshared34392.14.frc2.facebook.com (2620:10d:c0a8:1b::2d) by mail.thefacebook.com (2620:10d:c0a8:82::e) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 18 May 2023 14:55:02 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id E419D30EEE4CE; Thu, 18 May 2023 14:54:49 -0700 (PDT) From: Andrii Nakryiko To: , , , CC: , , , , Andrii Nakryiko Subject: [PATCH v2 bpf-next 2/3] libbpf: add opts-based bpf_obj_pin() API and add support for path_fd Date: Thu, 18 May 2023 14:54:43 -0700 Message-ID: <20230518215444.1418789-3-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230518215444.1418789-1-andrii@kernel.org> References: <20230518215444.1418789-1-andrii@kernel.org> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: UOZMrRsfvA8m7A5jt3zdkUGK0CL3iCDS X-Proofpoint-GUID: UOZMrRsfvA8m7A5jt3zdkUGK0CL3iCDS X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-05-18_15,2023-05-17_02,2023-02-09_01 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add path_fd support for bpf_obj_pin() and bpf_obj_get() operations (through their opts-based variants). This allows to take advantage of new kernel-side support for O_PATH-based pin/get location specification. Signed-off-by: Andrii Nakryiko --- tools/lib/bpf/bpf.c | 17 ++++++++++++++--- tools/lib/bpf/bpf.h | 18 ++++++++++++++++-- tools/lib/bpf/libbpf.map | 1 + 3 files changed, 31 insertions(+), 5 deletions(-) diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 128ac723c4ea..ed86b37d8024 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -572,20 +572,30 @@ int bpf_map_update_batch(int fd, const void *keys, const void *values, __u32 *co (void *)keys, (void *)values, count, opts); } -int bpf_obj_pin(int fd, const char *pathname) +int bpf_obj_pin_opts(int fd, const char *pathname, const struct bpf_obj_pin_opts *opts) { - const size_t attr_sz = offsetofend(union bpf_attr, file_flags); + const size_t attr_sz = offsetofend(union bpf_attr, path_fd); union bpf_attr attr; int ret; + if (!OPTS_VALID(opts, bpf_obj_pin_opts)) + return libbpf_err(-EINVAL); + memset(&attr, 0, attr_sz); + attr.path_fd = OPTS_GET(opts, path_fd, 0); attr.pathname = ptr_to_u64((void *)pathname); + attr.file_flags = OPTS_GET(opts, file_flags, 0); attr.bpf_fd = fd; ret = sys_bpf(BPF_OBJ_PIN, &attr, attr_sz); return libbpf_err_errno(ret); } +int bpf_obj_pin(int fd, const char *pathname) +{ + return bpf_obj_pin_opts(fd, pathname, NULL); +} + int bpf_obj_get(const char *pathname) { return bpf_obj_get_opts(pathname, NULL); @@ -593,7 +603,7 @@ int bpf_obj_get(const char *pathname) int bpf_obj_get_opts(const char *pathname, const struct bpf_obj_get_opts *opts) { - const size_t attr_sz = offsetofend(union bpf_attr, file_flags); + const size_t attr_sz = offsetofend(union bpf_attr, path_fd); union bpf_attr attr; int fd; @@ -601,6 +611,7 @@ int bpf_obj_get_opts(const char *pathname, const struct bpf_obj_get_opts *opts) return libbpf_err(-EINVAL); memset(&attr, 0, attr_sz); + attr.path_fd = OPTS_GET(opts, path_fd, 0); attr.pathname = ptr_to_u64((void *)pathname); attr.file_flags = OPTS_GET(opts, file_flags, 0); diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index a2c091389b18..9aa0ee473754 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -284,16 +284,30 @@ LIBBPF_API int bpf_map_update_batch(int fd, const void *keys, const void *values __u32 *count, const struct bpf_map_batch_opts *opts); -struct bpf_obj_get_opts { +struct bpf_obj_pin_opts { size_t sz; /* size of this struct for forward/backward compatibility */ __u32 file_flags; + int path_fd; size_t :0; }; -#define bpf_obj_get_opts__last_field file_flags +#define bpf_obj_pin_opts__last_field path_fd LIBBPF_API int bpf_obj_pin(int fd, const char *pathname); +LIBBPF_API int bpf_obj_pin_opts(int fd, const char *pathname, + const struct bpf_obj_pin_opts *opts); + +struct bpf_obj_get_opts { + size_t sz; /* size of this struct for forward/backward compatibility */ + + __u32 file_flags; + int path_fd; + + size_t :0; +}; +#define bpf_obj_get_opts__last_field path_fd + LIBBPF_API int bpf_obj_get(const char *pathname); LIBBPF_API int bpf_obj_get_opts(const char *pathname, const struct bpf_obj_get_opts *opts); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index a5aa3a383d69..7a4fe80da360 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -389,5 +389,6 @@ LIBBPF_1.2.0 { bpf_link__update_map; bpf_link_get_info_by_fd; bpf_map_get_info_by_fd; + bpf_obj_pin_opts; bpf_prog_get_info_by_fd; } LIBBPF_1.1.0; From patchwork Thu May 18 21:54:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13247434 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FAA5C7EE29 for ; Thu, 18 May 2023 21:55:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229487AbjERVzH convert rfc822-to-8bit (ORCPT ); Thu, 18 May 2023 17:55:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57504 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229527AbjERVzF (ORCPT ); Thu, 18 May 2023 17:55:05 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 417A610C8 for ; Thu, 18 May 2023 14:55:03 -0700 (PDT) Received: from pps.filterd (m0148461.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 34IFq9fK001985 for ; Thu, 18 May 2023 14:55:03 -0700 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3qncww6jvy-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 18 May 2023 14:55:02 -0700 Received: from twshared52565.14.frc2.facebook.com (2620:10d:c0a8:1c::11) by mail.thefacebook.com (2620:10d:c0a8:83::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Thu, 18 May 2023 14:55:00 -0700 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 005DE30EEE4D6; Thu, 18 May 2023 14:54:51 -0700 (PDT) From: Andrii Nakryiko To: , , , CC: , , , , Andrii Nakryiko Subject: [PATCH v2 bpf-next 3/3] selftests/bpf: add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests Date: Thu, 18 May 2023 14:54:44 -0700 Message-ID: <20230518215444.1418789-4-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230518215444.1418789-1-andrii@kernel.org> References: <20230518215444.1418789-1-andrii@kernel.org> X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: PCIWVyXDPOrZw_4UHEOqY3WAHoEG4Y_B X-Proofpoint-GUID: PCIWVyXDPOrZw_4UHEOqY3WAHoEG4Y_B X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-05-18_15,2023-05-17_02,2023-02-09_01 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add a selftest demonstrating using detach-mounted BPF FS using new mount APIs, and pinning and getting BPF map using such mount. This demonstrates how something like container manager could setup BPF FS, pin and adjust all the necessary objects in it, all before exposing BPF FS to a particular mount namespace. Signed-off-by: Andrii Nakryiko --- .../bpf/prog_tests/bpf_obj_pinning.c | 112 ++++++++++++++++++ 1 file changed, 112 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/bpf_obj_pinning.c diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_obj_pinning.c b/tools/testing/selftests/bpf/prog_tests/bpf_obj_pinning.c new file mode 100644 index 000000000000..a5f6063198aa --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/bpf_obj_pinning.c @@ -0,0 +1,112 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ + +#include +#include +#include +#include +#include +#include +#include + +static inline int sys_fsopen(const char *fsname, unsigned flags) +{ + return syscall(__NR_fsopen, fsname, flags); +} + +static inline int sys_fsconfig(int fs_fd, unsigned cmd, const char *key, const void *val, int aux) +{ + return syscall(__NR_fsconfig, fs_fd, cmd, key, val, aux); +} + +static inline int sys_fsmount(int fs_fd, unsigned flags, unsigned ms_flags) +{ + return syscall(__NR_fsmount, fs_fd, flags, ms_flags); +} + +static inline int sys_move_mount(int from_dfd, const char *from_path, + int to_dfd, const char *to_path, + unsigned int ms_flags) +{ + return syscall(__NR_move_mount, from_dfd, from_path, to_dfd, to_path, ms_flags); +} + +void test_bpf_obj_pinning(void) +{ + LIBBPF_OPTS(bpf_obj_pin_opts, pin_opts); + LIBBPF_OPTS(bpf_obj_get_opts, get_opts); + int fs_fd = -1, mnt_fd = -1; + int map_fd = -1, map_fd2 = -1; + int zero = 0, src_value, dst_value, err; + const char *map_name = "fsmount_map"; + + /* A bunch of below UAPI calls are constructed based on reading: + * https://brauner.io/2023/02/28/mounting-into-mount-namespaces.html + */ + + /* create VFS context */ + fs_fd = sys_fsopen("bpf", 0); + if (!ASSERT_GE(fs_fd, 0, "fs_fd")) + goto cleanup; + + /* instantiate FS object */ + err = sys_fsconfig(fs_fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); + if (!ASSERT_OK(err, "fs_create")) + goto cleanup; + + /* create O_PATH fd for detached mount */ + mnt_fd = sys_fsmount(fs_fd, 0, 0); + if (!ASSERT_GE(mnt_fd, 0, "mnt_fd")) + goto cleanup; + + /* If we wanted to expose detached mount in the file system, we'd do + * something like below. But the whole point is that we actually don't + * even have to expose BPF FS in the file system to be able to work + * (pin/get objects) with it. + * + * err = sys_move_mount(mnt_fd, "", -EBADF, mnt_path, MOVE_MOUNT_F_EMPTY_PATH); + * if (!ASSERT_OK(err, "move_mount")) + * goto cleanup; + */ + + /* create BPF map to pin */ + map_fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, map_name, 4, 4, 1, NULL); + if (!ASSERT_GE(map_fd, 0, "map_fd")) + goto cleanup; + + /* pin BPF map into detached BPF FS through mnt_fd */ + pin_opts.file_flags = BPF_F_PATH_FD; + pin_opts.path_fd = mnt_fd; + err = bpf_obj_pin_opts(map_fd, map_name, &pin_opts); + if (!ASSERT_OK(err, "map_pin")) + goto cleanup; + + /* get BPF map from detached BPF FS through mnt_fd */ + get_opts.file_flags = BPF_F_PATH_FD; + get_opts.path_fd = mnt_fd; + map_fd2 = bpf_obj_get_opts(map_name, &get_opts); + if (!ASSERT_GE(map_fd2, 0, "map_get")) + goto cleanup; + + /* update map through one FD */ + src_value = 0xcafebeef; + err = bpf_map_update_elem(map_fd, &zero, &src_value, 0); + ASSERT_OK(err, "map_update"); + + /* check values written/read through different FDs do match */ + dst_value = 0; + err = bpf_map_lookup_elem(map_fd2, &zero, &dst_value); + ASSERT_OK(err, "map_lookup"); + ASSERT_EQ(dst_value, src_value, "map_value_eq1"); + ASSERT_EQ(dst_value, 0xcafebeef, "map_value_eq2"); + +cleanup: + if (map_fd >= 0) + ASSERT_OK(close(map_fd), "close_map_fd"); + if (map_fd2 >= 0) + ASSERT_OK(close(map_fd2), "close_map_fd2"); + if (fs_fd >= 0) + ASSERT_OK(close(fs_fd), "close_fs_fd"); + if (mnt_fd >= 0) + ASSERT_OK(close(mnt_fd), "close_mnt_fd"); +}