From patchwork Wed Dec 20 23:31:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13500787 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 421674AF81 for ; Wed, 20 Dec 2023 23:31:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Received: from pps.filterd (m0109334.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3BKN4IIj029626 for ; Wed, 20 Dec 2023 15:31:44 -0800 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3v49m28406-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 20 Dec 2023 15:31:44 -0800 Received: from twshared19681.14.frc2.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:11d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Wed, 20 Dec 2023 15:31:43 -0800 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 548583D86503F; Wed, 20 Dec 2023 15:31:31 -0800 (PST) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 1/8] libbpf: make uniform use of btf__fd() accessor inside libbpf Date: Wed, 20 Dec 2023 15:31:20 -0800 Message-ID: <20231220233127.1990417-2-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231220233127.1990417-1-andrii@kernel.org> References: <20231220233127.1990417-1-andrii@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: h-SFJQaCFCykuUYDCxX_KjsKRiuhwP7L X-Proofpoint-GUID: h-SFJQaCFCykuUYDCxX_KjsKRiuhwP7L X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-12-20_13,2023-12-20_01,2023-05-22_02 X-Patchwork-Delegate: bpf@iogearbox.net It makes future grepping and code analysis a bit easier. Signed-off-by: Andrii Nakryiko --- tools/lib/bpf/libbpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index ac54ebc0629f..e4fe79d5693f 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -7048,7 +7048,7 @@ static int bpf_object_load_prog(struct bpf_object *obj, struct bpf_program *prog load_attr.prog_ifindex = prog->prog_ifindex; /* specify func_info/line_info only if kernel supports them */ - btf_fd = bpf_object__btf_fd(obj); + btf_fd = btf__fd(obj->btf); if (btf_fd >= 0 && kernel_supports(obj, FEAT_BTF_FUNC)) { load_attr.prog_btf_fd = btf_fd; load_attr.func_info = prog->func_info; From patchwork Wed Dec 20 23:31:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13500785 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 201334B5A1 for ; Wed, 20 Dec 2023 23:31:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3BKLtas8018814 for ; Wed, 20 Dec 2023 15:31:40 -0800 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3v3tv15pkm-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 20 Dec 2023 15:31:40 -0800 Received: from twshared17891.15.frc2.facebook.com (2620:10d:c085:208::f) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Wed, 20 Dec 2023 15:31:39 -0800 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 66EF73D86504A; Wed, 20 Dec 2023 15:31:33 -0800 (PST) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 2/8] libbpf: use explicit map reuse flag to skip map creation steps Date: Wed, 20 Dec 2023 15:31:21 -0800 Message-ID: <20231220233127.1990417-3-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231220233127.1990417-1-andrii@kernel.org> References: <20231220233127.1990417-1-andrii@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: eAI6QxzTN7O0DvVG3KaOMittyz4SFCPy X-Proofpoint-ORIG-GUID: eAI6QxzTN7O0DvVG3KaOMittyz4SFCPy X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-12-20_13,2023-12-20_01,2023-05-22_02 X-Patchwork-Delegate: bpf@iogearbox.net Instead of inferring whether map already point to previously created/pinned BPF map (which user can specify through bpf_map__reuse_fd() or bpf_object__reuse_map() APIs), use explicit map->reused flag that is set in such case. Signed-off-by: Andrii Nakryiko --- tools/lib/bpf/libbpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index e4fe79d5693f..2ead8fc1e344 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5463,7 +5463,7 @@ bpf_object__create_maps(struct bpf_object *obj) } } - if (map->fd >= 0) { + if (map->reused) { pr_debug("map '%s': skipping creation (preset fd=%d)\n", map->name, map->fd); } else { From patchwork Wed Dec 20 23:31:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13500789 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E59444AF81 for ; Wed, 20 Dec 2023 23:31:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Received: from pps.filterd (m0109334.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3BKN4IIp029626 for ; Wed, 20 Dec 2023 15:31:46 -0800 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3v49m28406-7 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 20 Dec 2023 15:31:46 -0800 Received: from twshared19681.14.frc2.facebook.com (2620:10d:c085:208::11) by mail.thefacebook.com (2620:10d:c085:11d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Wed, 20 Dec 2023 15:31:43 -0800 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 87AC93D86505F; Wed, 20 Dec 2023 15:31:35 -0800 (PST) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 3/8] libbpf: don't rely on map->fd as an indicator of map being created Date: Wed, 20 Dec 2023 15:31:22 -0800 Message-ID: <20231220233127.1990417-4-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231220233127.1990417-1-andrii@kernel.org> References: <20231220233127.1990417-1-andrii@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: kvByNBhG6JmM0vIJSixxVr2JIiRt7sM8 X-Proofpoint-GUID: kvByNBhG6JmM0vIJSixxVr2JIiRt7sM8 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-12-20_13,2023-12-20_01,2023-05-22_02 X-Patchwork-Delegate: bpf@iogearbox.net With the upcoming switch to preallocated placeholder FDs for maps, switch various getters/setter away from checking map->fd. Use map_is_created() helper that detect whether BPF map can be modified based on map->obj->loaded state, with special provision for maps set up with bpf_map__reuse_fd(). For backwards compatibility, we take map_is_created() into account in bpf_map__fd() getter as well. This way before bpf_object__load() phase bpf_map__fd() will always return -1, just as before the changes in subsequent patches adding stable map->fd placeholders. We also get rid of all internal uses of bpf_map__fd() getter, as it's more oriented for uses external to libbpf. The above map_is_created() check actually interferes with some of the internal uses, if map FD is fetched through bpf_map__fd(). Signed-off-by: Andrii Nakryiko --- tools/lib/bpf/libbpf.c | 42 +++++++++++++++++++++++++++--------------- 1 file changed, 27 insertions(+), 15 deletions(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 2ead8fc1e344..467bd187b67d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5198,6 +5198,11 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map) static void bpf_map__destroy(struct bpf_map *map); +static bool map_is_created(const struct bpf_map *map) +{ + return map->obj->loaded || map->reused; +} + static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, bool is_inner) { LIBBPF_OPTS(bpf_map_create_opts, create_attr); @@ -5229,7 +5234,7 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b map->name, err); return err; } - map->inner_map_fd = bpf_map__fd(map->inner_map); + map->inner_map_fd = map->inner_map->fd; } if (map->inner_map_fd >= 0) create_attr.inner_map_fd = map->inner_map_fd; @@ -5312,7 +5317,7 @@ static int init_map_in_map_slots(struct bpf_object *obj, struct bpf_map *map) continue; targ_map = map->init_slots[i]; - fd = bpf_map__fd(targ_map); + fd = targ_map->fd; if (obj->gen_loader) { bpf_gen__populate_outer_map(obj->gen_loader, @@ -7133,7 +7138,7 @@ static int bpf_object_load_prog(struct bpf_object *obj, struct bpf_program *prog if (map->libbpf_type != LIBBPF_MAP_RODATA) continue; - if (bpf_prog_bind_map(ret, bpf_map__fd(map), NULL)) { + if (bpf_prog_bind_map(ret, map->fd, NULL)) { cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg)); pr_warn("prog '%s': failed to bind map '%s': %s\n", prog->name, map->real_name, cp); @@ -9599,7 +9604,11 @@ int libbpf_attach_type_by_name(const char *name, int bpf_map__fd(const struct bpf_map *map) { - return map ? map->fd : libbpf_err(-EINVAL); + if (!map) + return libbpf_err(-EINVAL); + if (!map_is_created(map)) + return -1; + return map->fd; } static bool map_uses_real_name(const struct bpf_map *map) @@ -9635,7 +9644,7 @@ enum bpf_map_type bpf_map__type(const struct bpf_map *map) int bpf_map__set_type(struct bpf_map *map, enum bpf_map_type type) { - if (map->fd >= 0) + if (map_is_created(map)) return libbpf_err(-EBUSY); map->def.type = type; return 0; @@ -9648,7 +9657,7 @@ __u32 bpf_map__map_flags(const struct bpf_map *map) int bpf_map__set_map_flags(struct bpf_map *map, __u32 flags) { - if (map->fd >= 0) + if (map_is_created(map)) return libbpf_err(-EBUSY); map->def.map_flags = flags; return 0; @@ -9661,7 +9670,7 @@ __u64 bpf_map__map_extra(const struct bpf_map *map) int bpf_map__set_map_extra(struct bpf_map *map, __u64 map_extra) { - if (map->fd >= 0) + if (map_is_created(map)) return libbpf_err(-EBUSY); map->map_extra = map_extra; return 0; @@ -9674,7 +9683,7 @@ __u32 bpf_map__numa_node(const struct bpf_map *map) int bpf_map__set_numa_node(struct bpf_map *map, __u32 numa_node) { - if (map->fd >= 0) + if (map_is_created(map)) return libbpf_err(-EBUSY); map->numa_node = numa_node; return 0; @@ -9687,7 +9696,7 @@ __u32 bpf_map__key_size(const struct bpf_map *map) int bpf_map__set_key_size(struct bpf_map *map, __u32 size) { - if (map->fd >= 0) + if (map_is_created(map)) return libbpf_err(-EBUSY); map->def.key_size = size; return 0; @@ -9771,7 +9780,7 @@ static int map_btf_datasec_resize(struct bpf_map *map, __u32 size) int bpf_map__set_value_size(struct bpf_map *map, __u32 size) { - if (map->fd >= 0) + if (map->obj->loaded || map->reused) return libbpf_err(-EBUSY); if (map->mmaped) { @@ -9812,8 +9821,11 @@ __u32 bpf_map__btf_value_type_id(const struct bpf_map *map) int bpf_map__set_initial_value(struct bpf_map *map, const void *data, size_t size) { + if (map->obj->loaded || map->reused) + return libbpf_err(-EBUSY); + if (!map->mmaped || map->libbpf_type == LIBBPF_MAP_KCONFIG || - size != map->def.value_size || map->fd >= 0) + size != map->def.value_size) return libbpf_err(-EINVAL); memcpy(map->mmaped, data, size); @@ -9840,7 +9852,7 @@ __u32 bpf_map__ifindex(const struct bpf_map *map) int bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex) { - if (map->fd >= 0) + if (map_is_created(map)) return libbpf_err(-EBUSY); map->map_ifindex = ifindex; return 0; @@ -9945,7 +9957,7 @@ bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name) static int validate_map_op(const struct bpf_map *map, size_t key_sz, size_t value_sz, bool check_value_sz) { - if (map->fd <= 0) + if (!map_is_created(map)) /* map is not yet created */ return -ENOENT; if (map->def.key_size != key_sz) { @@ -12398,7 +12410,7 @@ int bpf_link__update_map(struct bpf_link *link, const struct bpf_map *map) __u32 zero = 0; int err; - if (!bpf_map__is_struct_ops(map) || map->fd < 0) + if (!bpf_map__is_struct_ops(map) || !map_is_created(map)) return -EINVAL; st_ops_link = container_of(link, struct bpf_link_struct_ops, link); @@ -13302,7 +13314,7 @@ int bpf_object__load_skeleton(struct bpf_object_skeleton *s) for (i = 0; i < s->map_cnt; i++) { struct bpf_map *map = *s->maps[i].map; size_t mmap_sz = bpf_map_mmap_sz(map->def.value_size, map->def.max_entries); - int prot, map_fd = bpf_map__fd(map); + int prot, map_fd = map->fd; void **mmaped = s->maps[i].mmaped; if (!mmaped) From patchwork Wed Dec 20 23:31:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13500792 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 502D94A9B2 for ; Wed, 20 Dec 2023 23:31:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3BKN2hbC029798 for ; Wed, 20 Dec 2023 15:31:52 -0800 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3v49kfr4cp-7 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 20 Dec 2023 15:31:52 -0800 Received: from twshared10507.42.prn1.facebook.com (2620:10d:c0a8:1c::1b) by mail.thefacebook.com (2620:10d:c0a8:83::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Wed, 20 Dec 2023 15:31:48 -0800 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id A031E3D86506B; Wed, 20 Dec 2023 15:31:37 -0800 (PST) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 4/8] libbpf: use stable map placeholder FDs Date: Wed, 20 Dec 2023 15:31:23 -0800 Message-ID: <20231220233127.1990417-5-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231220233127.1990417-1-andrii@kernel.org> References: <20231220233127.1990417-1-andrii@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: tXN8MaAGs6hp0BzVt4zYxHlWVa7go6vd X-Proofpoint-ORIG-GUID: tXN8MaAGs6hp0BzVt4zYxHlWVa7go6vd X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-12-20_13,2023-12-20_01,2023-05-22_02 X-Patchwork-Delegate: bpf@iogearbox.net Move map creation to later during BPF object loading by pre-creating stable placeholder FDs (initially pointing to /dev/null). Use dup2() syscall to then atomically make those placeholder FDs point to real kernel BPF map objects. This change allows to delay BPF map creation to after all the BPF program relocations. That, in turn, allows to delay BTF finalization and loading into kernel to after all the relocations as well. We'll take advantage of the latter in subsequent patches to allow libbpf to adjust BTF in a way that helps with BPF global function usage. Clean up a few places where we close map->fd, which now shouldn't happen, because map->fd should be a valid FD regardless of whether map was created or not. Surprisingly and nicely it simplifies a bunch of error handling code. If this change doesn't backfire, I'm tempted to pre-create such stable FDs for other entities (progs, maybe even BTF). We previously did some manipulations to make gen_loader work, with stable map FDs this hack is not necessary for maps (we still have it for BTF, but I left it as is for now). Signed-off-by: Andrii Nakryiko --- tools/lib/bpf/libbpf.c | 94 +++++++++++++++++++-------------- tools/lib/bpf/libbpf_internal.h | 24 +++++++++ 2 files changed, 78 insertions(+), 40 deletions(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 467bd187b67d..98bec2f5fe39 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -1515,7 +1515,21 @@ static struct bpf_map *bpf_object__add_map(struct bpf_object *obj) map = &obj->maps[obj->nr_maps++]; map->obj = obj; - map->fd = -1; + /* Preallocate map FD without actually creating BPF map just yet. + * These map FD "placeholders" will be reused later without changing + * FD value when map is actually created in the kernel. + * + * This is useful to be able to perform BPF program relocations + * without having to create BPF maps before that step. This allows us + * to finalize and load BTF very late in BPF object's loading phase, + * right before BPF maps have to be created and BPF programs have to + * be loaded. By having these map FD placeholders we can perform all + * the sanitizations, relocations, and any other adjustments before we + * start creating actual BPF kernel objects (BTF, maps, progs). + */ + map->fd = create_placeholder_fd(); + if (map->fd < 0) + return ERR_PTR(map->fd); map->inner_map_fd = -1; map->autocreate = true; @@ -2607,7 +2621,9 @@ static int bpf_object__init_user_btf_map(struct bpf_object *obj, map->inner_map = calloc(1, sizeof(*map->inner_map)); if (!map->inner_map) return -ENOMEM; - map->inner_map->fd = -1; + map->inner_map->fd = create_placeholder_fd(); + if (map->inner_map->fd < 0) + return map->inner_map->fd; map->inner_map->sec_idx = sec_idx; map->inner_map->name = malloc(strlen(map_name) + sizeof(".inner") + 1); if (!map->inner_map->name) @@ -4547,14 +4563,12 @@ int bpf_map__reuse_fd(struct bpf_map *map, int fd) goto err_free_new_name; } - err = zclose(map->fd); - if (err) { - err = -errno; - goto err_close_new_fd; - } + err = reuse_fd(map->fd, new_fd); + if (err) + goto err_free_new_name; + free(map->name); - map->fd = new_fd; map->name = new_name; map->def.type = info.type; map->def.key_size = info.key_size; @@ -4568,8 +4582,6 @@ int bpf_map__reuse_fd(struct bpf_map *map, int fd) return 0; -err_close_new_fd: - close(new_fd); err_free_new_name: free(new_name); return libbpf_err(err); @@ -5208,7 +5220,7 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b LIBBPF_OPTS(bpf_map_create_opts, create_attr); struct bpf_map_def *def = &map->def; const char *map_name = NULL; - int err = 0; + int err = 0, map_fd; if (kernel_supports(obj, FEAT_PROG_NAME)) map_name = map->name; @@ -5267,17 +5279,19 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b bpf_gen__map_create(obj->gen_loader, def->type, map_name, def->key_size, def->value_size, def->max_entries, &create_attr, is_inner ? -1 : map - obj->maps); - /* Pretend to have valid FD to pass various fd >= 0 checks. - * This fd == 0 will not be used with any syscall and will be reset to -1 eventually. + /* We keep pretenting we have valid FD to pass various fd >= 0 + * checks by just keeping original placeholder FDs in place. + * See bpf_object_prepare_maps() comments. + * This placeholder fd will not be used with any syscall and + * will be reset to -1 eventually. */ - map->fd = 0; + map_fd = map->fd; } else { - map->fd = bpf_map_create(def->type, map_name, - def->key_size, def->value_size, - def->max_entries, &create_attr); + map_fd = bpf_map_create(def->type, map_name, + def->key_size, def->value_size, + def->max_entries, &create_attr); } - if (map->fd < 0 && (create_attr.btf_key_type_id || - create_attr.btf_value_type_id)) { + if (map_fd < 0 && (create_attr.btf_key_type_id || create_attr.btf_value_type_id)) { char *cp, errmsg[STRERR_BUFSIZE]; err = -errno; @@ -5289,13 +5303,11 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b create_attr.btf_value_type_id = 0; map->btf_key_type_id = 0; map->btf_value_type_id = 0; - map->fd = bpf_map_create(def->type, map_name, - def->key_size, def->value_size, - def->max_entries, &create_attr); + map_fd = bpf_map_create(def->type, map_name, + def->key_size, def->value_size, + def->max_entries, &create_attr); } - err = map->fd < 0 ? -errno : 0; - if (bpf_map_type__is_map_in_map(def->type) && map->inner_map) { if (obj->gen_loader) map->inner_map->fd = -1; @@ -5303,7 +5315,19 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, b zfree(&map->inner_map); } - return err; + if (map_fd < 0) + return -errno; + + /* obj->gen_loader case, prevent reuse_fd() from closing map_fd */ + if (map->fd == map_fd) + return 0; + + /* Keep placeholder FD value but now point it to the BPF map object. + * This way everything that relied on this map's FD (e.g., relocated + * ldimm64 instructions) will stay valid and won't need adjustments. + * map->fd stays valid but now point to what map_fd points to. + */ + return reuse_fd(map->fd, map_fd); } static int init_map_in_map_slots(struct bpf_object *obj, struct bpf_map *map) @@ -5387,10 +5411,8 @@ static int bpf_object_init_prog_arrays(struct bpf_object *obj) continue; err = init_prog_array_slots(obj, map); - if (err < 0) { - zclose(map->fd); + if (err < 0) return err; - } } return 0; } @@ -5413,8 +5435,7 @@ static int map_set_def_max_entries(struct bpf_map *map) return 0; } -static int -bpf_object__create_maps(struct bpf_object *obj) +static int bpf_object_create_maps(struct bpf_object *obj) { struct bpf_map *map; char *cp, errmsg[STRERR_BUFSIZE]; @@ -5481,25 +5502,20 @@ bpf_object__create_maps(struct bpf_object *obj) if (bpf_map__is_internal(map)) { err = bpf_object__populate_internal_map(obj, map); - if (err < 0) { - zclose(map->fd); + if (err < 0) goto err_out; - } } if (map->init_slots_sz && map->def.type != BPF_MAP_TYPE_PROG_ARRAY) { err = init_map_in_map_slots(obj, map); - if (err < 0) { - zclose(map->fd); + if (err < 0) goto err_out; - } } } if (map->pin_path && !map->pinned) { err = bpf_map__pin(map, NULL); if (err) { - zclose(map->fd); if (!retried && err == -EEXIST) { retried = true; goto retry; @@ -8073,8 +8089,8 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch err = err ? : bpf_object__sanitize_and_load_btf(obj); err = err ? : bpf_object__sanitize_maps(obj); err = err ? : bpf_object__init_kern_struct_ops_maps(obj); - err = err ? : bpf_object__create_maps(obj); err = err ? : bpf_object__relocate(obj, obj->btf_custom_path ? : target_btf_path); + err = err ? : bpf_object_create_maps(obj); err = err ? : bpf_object__load_progs(obj, extra_log_level); err = err ? : bpf_object_init_prog_arrays(obj); err = err ? : bpf_object_prepare_struct_ops(obj); @@ -8083,8 +8099,6 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch /* reset FDs */ if (obj->btf) btf__set_fd(obj->btf, -1); - for (i = 0; i < obj->nr_maps; i++) - obj->maps[i].fd = -1; if (!err) err = bpf_gen__finish(obj->gen_loader, obj->nr_programs, obj->nr_maps); } diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index b5d334754e5d..662a3df1e29f 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -555,6 +555,30 @@ static inline int ensure_good_fd(int fd) return fd; } +static inline int create_placeholder_fd(void) +{ + int fd; + + fd = ensure_good_fd(open("/dev/null", O_WRONLY | O_CLOEXEC)); + if (fd < 0) + return -errno; + return fd; +} + +/* Point *fixed_fd* to the same file that *tmp_fd* points to. + * Regardless of success, *tmp_fd* is closed. + * Whatever *fixed_fd* pointed to is closed silently. + */ +static inline int reuse_fd(int fixed_fd, int tmp_fd) +{ + int err; + + err = dup2(tmp_fd, fixed_fd); + err = err < 0 ? -errno : 0; + close(tmp_fd); /* clean up temporary FD */ + return err; +} + /* The following two functions are exposed to bpftool */ int bpf_core_add_cands(struct bpf_core_cand *local_cand, size_t local_essent_len, From patchwork Wed Dec 20 23:31:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13500791 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D63184AF81 for ; Wed, 20 Dec 2023 23:31:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3BKN2hbB029798 for ; Wed, 20 Dec 2023 15:31:51 -0800 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3v49kfr4cp-6 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 20 Dec 2023 15:31:51 -0800 Received: from twshared10507.42.prn1.facebook.com (2620:10d:c0a8:1c::11) by mail.thefacebook.com (2620:10d:c0a8:83::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Wed, 20 Dec 2023 15:31:48 -0800 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id C50533D865073; Wed, 20 Dec 2023 15:31:39 -0800 (PST) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 5/8] libbpf: move exception callbacks assignment logic into relocation step Date: Wed, 20 Dec 2023 15:31:24 -0800 Message-ID: <20231220233127.1990417-6-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231220233127.1990417-1-andrii@kernel.org> References: <20231220233127.1990417-1-andrii@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: 6VsXcge-9_pkXJm-V6viiZxAopGdBNLR X-Proofpoint-ORIG-GUID: 6VsXcge-9_pkXJm-V6viiZxAopGdBNLR X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-12-20_13,2023-12-20_01,2023-05-22_02 X-Patchwork-Delegate: bpf@iogearbox.net Move the logic of finding and assigning exception callback indices from BTF sanitization step to program relocations step, which seems more logical and will unblock moving BTF loading to after relocation step. Exception callbacks discovery and assignment has no dependency on BTF being loaded into the kernel, it only uses BTF information. It does need to happen before subprogram relocations happen, though. Which is why the split. No functional changes. Signed-off-by: Andrii Nakryiko --- tools/lib/bpf/libbpf.c | 165 +++++++++++++++++++++-------------------- 1 file changed, 85 insertions(+), 80 deletions(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 98bec2f5fe39..26a0869626bf 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -3182,86 +3182,6 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj) } } - if (!kernel_supports(obj, FEAT_BTF_DECL_TAG)) - goto skip_exception_cb; - for (i = 0; i < obj->nr_programs; i++) { - struct bpf_program *prog = &obj->programs[i]; - int j, k, n; - - if (prog_is_subprog(obj, prog)) - continue; - n = btf__type_cnt(obj->btf); - for (j = 1; j < n; j++) { - const char *str = "exception_callback:", *name; - size_t len = strlen(str); - struct btf_type *t; - - t = btf_type_by_id(obj->btf, j); - if (!btf_is_decl_tag(t) || btf_decl_tag(t)->component_idx != -1) - continue; - - name = btf__str_by_offset(obj->btf, t->name_off); - if (strncmp(name, str, len)) - continue; - - t = btf_type_by_id(obj->btf, t->type); - if (!btf_is_func(t) || btf_func_linkage(t) != BTF_FUNC_GLOBAL) { - pr_warn("prog '%s': exception_callback: decl tag not applied to the main program\n", - prog->name); - return -EINVAL; - } - if (strcmp(prog->name, btf__str_by_offset(obj->btf, t->name_off))) - continue; - /* Multiple callbacks are specified for the same prog, - * the verifier will eventually return an error for this - * case, hence simply skip appending a subprog. - */ - if (prog->exception_cb_idx >= 0) { - prog->exception_cb_idx = -1; - break; - } - - name += len; - if (str_is_empty(name)) { - pr_warn("prog '%s': exception_callback: decl tag contains empty value\n", - prog->name); - return -EINVAL; - } - - for (k = 0; k < obj->nr_programs; k++) { - struct bpf_program *subprog = &obj->programs[k]; - - if (!prog_is_subprog(obj, subprog)) - continue; - if (strcmp(name, subprog->name)) - continue; - /* Enforce non-hidden, as from verifier point of - * view it expects global functions, whereas the - * mark_btf_static fixes up linkage as static. - */ - if (!subprog->sym_global || subprog->mark_btf_static) { - pr_warn("prog '%s': exception callback %s must be a global non-hidden function\n", - prog->name, subprog->name); - return -EINVAL; - } - /* Let's see if we already saw a static exception callback with the same name */ - if (prog->exception_cb_idx >= 0) { - pr_warn("prog '%s': multiple subprogs with same name as exception callback '%s'\n", - prog->name, subprog->name); - return -EINVAL; - } - prog->exception_cb_idx = k; - break; - } - - if (prog->exception_cb_idx >= 0) - continue; - pr_warn("prog '%s': cannot find exception callback '%s'\n", prog->name, name); - return -ENOENT; - } - } -skip_exception_cb: - sanitize = btf_needs_sanitization(obj); if (sanitize) { const void *raw_data; @@ -6648,6 +6568,88 @@ static void bpf_object__sort_relos(struct bpf_object *obj) } } +static int bpf_prog_assign_exc_cb(struct bpf_object *obj, struct bpf_program *prog) +{ + const char *str = "exception_callback:"; + size_t pfx_len = strlen(str); + int i, j, n; + + if (!obj->btf || !kernel_supports(obj, FEAT_BTF_DECL_TAG)) + return 0; + + n = btf__type_cnt(obj->btf); + for (i = 1; i < n; i++) { + const char *name; + struct btf_type *t; + + t = btf_type_by_id(obj->btf, i); + if (!btf_is_decl_tag(t) || btf_decl_tag(t)->component_idx != -1) + continue; + + name = btf__str_by_offset(obj->btf, t->name_off); + if (strncmp(name, str, pfx_len) != 0) + continue; + + t = btf_type_by_id(obj->btf, t->type); + if (!btf_is_func(t) || btf_func_linkage(t) != BTF_FUNC_GLOBAL) { + pr_warn("prog '%s': exception_callback: decl tag not applied to the main program\n", + prog->name); + return -EINVAL; + } + if (strcmp(prog->name, btf__str_by_offset(obj->btf, t->name_off)) != 0) + continue; + /* Multiple callbacks are specified for the same prog, + * the verifier will eventually return an error for this + * case, hence simply skip appending a subprog. + */ + if (prog->exception_cb_idx >= 0) { + prog->exception_cb_idx = -1; + break; + } + + name += pfx_len; + if (str_is_empty(name)) { + pr_warn("prog '%s': exception_callback: decl tag contains empty value\n", + prog->name); + return -EINVAL; + } + + for (j = 0; j < obj->nr_programs; j++) { + struct bpf_program *subprog = &obj->programs[j]; + + if (!prog_is_subprog(obj, subprog)) + continue; + if (strcmp(name, subprog->name) != 0) + continue; + /* Enforce non-hidden, as from verifier point of + * view it expects global functions, whereas the + * mark_btf_static fixes up linkage as static. + */ + if (!subprog->sym_global || subprog->mark_btf_static) { + pr_warn("prog '%s': exception callback %s must be a global non-hidden function\n", + prog->name, subprog->name); + return -EINVAL; + } + /* Let's see if we already saw a static exception callback with the same name */ + if (prog->exception_cb_idx >= 0) { + pr_warn("prog '%s': multiple subprogs with same name as exception callback '%s'\n", + prog->name, subprog->name); + return -EINVAL; + } + prog->exception_cb_idx = j; + break; + } + + if (prog->exception_cb_idx >= 0) + continue; + + pr_warn("prog '%s': cannot find exception callback '%s'\n", prog->name, name); + return -ENOENT; + } + + return 0; +} + static int bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) { @@ -6708,6 +6710,9 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) return err; } + err = bpf_prog_assign_exc_cb(obj, prog); + if (err) + return err; /* Now, also append exception callback if it has not been done already. */ if (prog->exception_cb_idx >= 0) { struct bpf_program *subprog = &obj->programs[prog->exception_cb_idx]; From patchwork Wed Dec 20 23:31:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13500788 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DE8FA4A988 for ; Wed, 20 Dec 2023 23:31:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Received: from pps.filterd (m0109334.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3BKN4IIo029626 for ; Wed, 20 Dec 2023 15:31:46 -0800 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3v49m28406-6 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 20 Dec 2023 15:31:46 -0800 Received: from twshared19681.14.frc2.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:11d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Wed, 20 Dec 2023 15:31:44 -0800 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id DD9673D86507B; Wed, 20 Dec 2023 15:31:41 -0800 (PST) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 6/8] libbpf: move BTF loading step after relocation step Date: Wed, 20 Dec 2023 15:31:25 -0800 Message-ID: <20231220233127.1990417-7-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231220233127.1990417-1-andrii@kernel.org> References: <20231220233127.1990417-1-andrii@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: n-HxP1hmHnmbg3XqhrH83U-174RvjPhz X-Proofpoint-GUID: n-HxP1hmHnmbg3XqhrH83U-174RvjPhz X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-12-20_13,2023-12-20_01,2023-05-22_02 X-Patchwork-Delegate: bpf@iogearbox.net With all the preparations in previous patches done we are ready to postpone BTF loading and sanitization step until after all the relocations are performed. While at it, simplify step name from bpf_object__sanitize_and_load_btf to bpf_object_load_btf. Map creation and program loading steps also perform sanitization, but they don't explicitly reflect it in overly verbose function name. So keep naming and approch consistent here. Signed-off-by: Andrii Nakryiko --- tools/lib/bpf/libbpf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 26a0869626bf..92171bcf4c25 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -3132,7 +3132,7 @@ static int bpf_object__load_vmlinux_btf(struct bpf_object *obj, bool force) return 0; } -static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj) +static int bpf_object_load_btf(struct bpf_object *obj) { struct btf *kern_btf = obj->btf; bool btf_mandatory, sanitize; @@ -8091,10 +8091,10 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch err = bpf_object__probe_loading(obj); err = err ? : bpf_object__load_vmlinux_btf(obj, false); err = err ? : bpf_object__resolve_externs(obj, obj->kconfig); - err = err ? : bpf_object__sanitize_and_load_btf(obj); err = err ? : bpf_object__sanitize_maps(obj); err = err ? : bpf_object__init_kern_struct_ops_maps(obj); err = err ? : bpf_object__relocate(obj, obj->btf_custom_path ? : target_btf_path); + err = err ? : bpf_object_load_btf(obj); err = err ? : bpf_object_create_maps(obj); err = err ? : bpf_object__load_progs(obj, extra_log_level); err = err ? : bpf_object_init_prog_arrays(obj); From patchwork Wed Dec 20 23:31:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13500790 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B43224A9B2 for ; Wed, 20 Dec 2023 23:31:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Received: from pps.filterd (m0109332.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3BKLuCaM007062 for ; Wed, 20 Dec 2023 15:31:49 -0800 Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3v45gesqxc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 20 Dec 2023 15:31:49 -0800 Received: from twshared17205.35.frc1.facebook.com (2620:10d:c085:208::11) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Wed, 20 Dec 2023 15:31:47 -0800 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 165F13D86508A; Wed, 20 Dec 2023 15:31:43 -0800 (PST) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 7/8] libbpf: implement __arg_ctx fallback logic Date: Wed, 20 Dec 2023 15:31:26 -0800 Message-ID: <20231220233127.1990417-8-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231220233127.1990417-1-andrii@kernel.org> References: <20231220233127.1990417-1-andrii@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: EmMEHB0q-Vc5LblhqouIjfA-LgFLRmE_ X-Proofpoint-ORIG-GUID: EmMEHB0q-Vc5LblhqouIjfA-LgFLRmE_ X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-12-20_13,2023-12-20_01,2023-05-22_02 X-Patchwork-Delegate: bpf@iogearbox.net Out of all special global func arg tag annotations, __arg_ctx is practically is the most immediately useful and most critical to have working across multitude kernel version, if possible. This would allow end users to write much simpler code if __arg_ctx semantics worked for older kernels that don't natively understand btf_decl_tag("arg:ctx") in verifier logic. Luckily, it is possible to ensure __arg_ctx works on old kernels through a bit of extra work done by libbpf, at least in a lot of common cases. To explain the overall idea, we need to go back at how context argument was supported in global funcs before __arg_ctx support was added. This was done based on special struct name checks in kernel. E.g., for BPF_PROG_TYPE_PERF_EVENT the expectation is that argument type `struct bpf_perf_event_data *` mark that argument as PTR_TO_CTX. This is all good as long as global function is used from the same BPF program types only, which is often not the case. If the same subprog has to be called from, say, kprobe and perf_event program types, there is no single definition that would satisfy BPF verifier. Subprog will have context argument either for kprobe (if using bpf_user_pt_regs_t struct name) or perf_event (with bpf_perf_event_data struct name), but not both. This limitation was the reason to add btf_decl_tag("arg:ctx"), making the actual argument type not important, so that user can just define "generic" signature: __noinline int global_subprog(void *ctx __arg_ctx) { ... } I won't belabor how libbpf is implementing subprograms, see a huge comment next to bpf_object__relocate_calls() function. The idea is that each main/entry BPF program gets its own copy of global_subprog's code appended. This per-program copy of global subprog code *and* associated func_info .BTF.ext information, pointing to FUNC -> FUNC_PROTO BTF type chain allows libbpf to simulate __arg_ctx behavior transparently, even if the kernel doesn't yet support __arg_ctx annotation natively. The idea is straightforward: each time we append global subprog's code and func_info information, we adjust its FUNC -> FUNC_PROTO type information, if necessary (that is, libbpf can detect the presence of btf_decl_tag("arg:ctx") just like BPF verifier would do it). The rest is just mechanical and somewhat painful BTF manipulation code. It's painful because we need to clone FUNC -> FUNC_PROTO, instead of reusing it, as same FUNC -> FUNC_PROTO chain might be used by another main BPF program within the same BPF object, so we can't just modify it in-place (and cloning BTF types within the same struct btf object is painful due to constant memory invalidation, see comments in code). Uploaded BPF object's BTF information has to work for all BPF programs at the same time. Once we have FUNC -> FUNC_PROTO clones, we make sure that instead of using some `void *ctx` parameter definition, we have an expected `struct bpf_perf_event_data *ctx` definition (as far as BPF verifier and kernel is concerned), which will mark it as context for BPF verifier. Same global subprog relocated and copied into another main BPF program will get different type information according to main program's type. It all works out in the end in a completely transparent way for end user. Libbpf maintains internal program type -> expected context struct name mapping internally. Note, not all BPF program types have named context struct, so this approach won't work for such programs (just like it didn't before __arg_ctx). So native __arg_ctx is still important to have in kernel to have generic context support across all BPF program types. Signed-off-by: Andrii Nakryiko Acked-by: Jiri Olsa --- tools/lib/bpf/libbpf.c | 239 +++++++++++++++++++++++++++++++++++++++-- 1 file changed, 231 insertions(+), 8 deletions(-) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 92171bcf4c25..1a7354b6a289 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6168,7 +6168,7 @@ reloc_prog_func_and_line_info(const struct bpf_object *obj, int err; /* no .BTF.ext relocation if .BTF.ext is missing or kernel doesn't - * supprot func/line info + * support func/line info */ if (!obj->btf_ext || !kernel_supports(obj, FEAT_BTF_FUNC)) return 0; @@ -6650,8 +6650,223 @@ static int bpf_prog_assign_exc_cb(struct bpf_object *obj, struct bpf_program *pr return 0; } -static int -bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) +static struct { + enum bpf_prog_type prog_type; + const char *ctx_name; +} global_ctx_map[] = { + { BPF_PROG_TYPE_CGROUP_DEVICE, "bpf_cgroup_dev_ctx" }, + { BPF_PROG_TYPE_CGROUP_SKB, "__sk_buff" }, + { BPF_PROG_TYPE_CGROUP_SOCK, "bpf_sock" }, + { BPF_PROG_TYPE_CGROUP_SOCK_ADDR, "bpf_sock_addr" }, + { BPF_PROG_TYPE_CGROUP_SOCKOPT, "bpf_sockopt" }, + { BPF_PROG_TYPE_CGROUP_SYSCTL, "bpf_sysctl" }, + { BPF_PROG_TYPE_FLOW_DISSECTOR, "__sk_buff" }, + { BPF_PROG_TYPE_KPROBE, "bpf_user_pt_regs_t" }, + { BPF_PROG_TYPE_LWT_IN, "__sk_buff" }, + { BPF_PROG_TYPE_LWT_OUT, "__sk_buff" }, + { BPF_PROG_TYPE_LWT_SEG6LOCAL, "__sk_buff" }, + { BPF_PROG_TYPE_LWT_XMIT, "__sk_buff" }, + { BPF_PROG_TYPE_NETFILTER, "bpf_nf_ctx" }, + { BPF_PROG_TYPE_PERF_EVENT, "bpf_perf_event_data" }, + { BPF_PROG_TYPE_RAW_TRACEPOINT, "bpf_raw_tracepoint_args" }, + { BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, "bpf_raw_tracepoint_args" }, + { BPF_PROG_TYPE_SCHED_ACT, "__sk_buff" }, + { BPF_PROG_TYPE_SCHED_CLS, "__sk_buff" }, + { BPF_PROG_TYPE_SK_LOOKUP, "bpf_sk_lookup" }, + { BPF_PROG_TYPE_SK_MSG, "sk_msg_md" }, + { BPF_PROG_TYPE_SK_REUSEPORT, "sk_reuseport_md" }, + { BPF_PROG_TYPE_SK_SKB, "__sk_buff" }, + { BPF_PROG_TYPE_SOCK_OPS, "bpf_sock_ops" }, + { BPF_PROG_TYPE_SOCKET_FILTER, "__sk_buff" }, + { BPF_PROG_TYPE_XDP, "xdp_md" }, + /* all other program types don't have "named" context structs */ +}; + +/* Check if main program or global subprog's function prototype has `arg:ctx` + * argument tags, and, if necessary, substitute correct type to match what BPF + * verifier would expect, taking into account specific program type. This + * allows to support __arg_ctx tag transparently on old kernels that don't yet + * have a native support for it in the verifier, making user's life much + * easier. + */ +static int bpf_program_fixup_func_info(struct bpf_object *obj, struct bpf_program *prog) +{ + const char *ctx_name = NULL, *ctx_tag = "arg:ctx"; + struct bpf_func_info_min *func_rec; + struct btf_type *fn_t, *fn_proto_t; + struct btf *btf = obj->btf; + const struct btf_type *t; + struct btf_param *p; + int ptr_id = 0, struct_id, tag_id, orig_fn_id; + int i, j, n, arg_idx, arg_cnt, err, name_off, rec_idx; + int *orig_ids; + + /* no .BTF.ext, no problem */ + if (!obj->btf_ext || !prog->func_info) + return 0; + + /* some BPF program types just don't have named context structs, so + * this fallback mechanism doesn't work for them + */ + for (i = 0; i < ARRAY_SIZE(global_ctx_map); i++) { + if (global_ctx_map[i].prog_type != prog->type) + continue; + ctx_name = global_ctx_map[i].ctx_name; + break; + } + if (!ctx_name) + return 0; + + /* remember original func BTF IDs to detect if we already cloned them */ + orig_ids = calloc(prog->func_info_cnt, sizeof(*orig_ids)); + if (!orig_ids) + return -ENOMEM; + for (i = 0; i < prog->func_info_cnt; i++) { + func_rec = prog->func_info + prog->func_info_rec_size * i; + orig_ids[i] = func_rec->type_id; + } + + /* go through each DECL_TAG with "arg:ctx" and see if it points to one + * of our subprogs; if yes and subprog is global and needs adjustment, + * clone and adjust FUNC -> FUNC_PROTO combo + */ + for (i = 1, n = btf__type_cnt(btf); i < n; i++) { + /* only DECL_TAG with "arg:ctx" value are interesting */ + t = btf__type_by_id(btf, i); + if (!btf_is_decl_tag(t)) + continue; + if (strcmp(btf__str_by_offset(btf, t->name_off), ctx_tag) != 0) + continue; + + /* only global funcs need adjustment, if at all */ + orig_fn_id = t->type; + fn_t = btf_type_by_id(btf, orig_fn_id); + if (!btf_is_func(fn_t) || btf_func_linkage(fn_t) != BTF_FUNC_GLOBAL) + continue; + + /* sanity check FUNC -> FUNC_PROTO chain, just in case */ + fn_proto_t = btf_type_by_id(btf, fn_t->type); + if (!fn_proto_t || !btf_is_func_proto(fn_proto_t)) + continue; + + /* find corresponding func_info record */ + func_rec = NULL; + for (rec_idx = 0; rec_idx < prog->func_info_cnt; rec_idx++) { + if (orig_ids[rec_idx] == t->type) { + func_rec = prog->func_info + prog->func_info_rec_size * rec_idx; + break; + } + } + /* current main program doesn't call into this subprog */ + if (!func_rec) + continue; + + /* some more sanity checking of DECL_TAG */ + arg_cnt = btf_vlen(fn_proto_t); + arg_idx = btf_decl_tag(t)->component_idx; + if (arg_idx < 0 || arg_idx >= arg_cnt) + continue; + + /* check if existing parameter already matches verifier expectations */ + p = &btf_params(fn_proto_t)[arg_idx]; + t = skip_mods_and_typedefs(btf, p->type, NULL); + if (btf_is_ptr(t) && + (t = skip_mods_and_typedefs(btf, t->type, NULL)) && + btf_is_struct(t) && + strcmp(btf__str_by_offset(btf, t->name_off), ctx_name) == 0) { + continue; /* no need for fix up */ + } + + /* clone fn/fn_proto, unless we already did it for another arg */ + if (func_rec->type_id == orig_fn_id) { + int fn_id, fn_proto_id, ret_type_id, orig_proto_id; + + /* Note that each btf__add_xxx() operation invalidates + * all btf_type and string pointers, so we need to be + * very careful when cloning BTF types. BTF type + * pointers have to be always refetched. And to avoid + * problems with invalidated string pointers, we + * add empty strings initially, then just fix up + * name_off offsets in place. Offsets are stable for + * existing strings, so that works out. + */ + name_off = fn_t->name_off; /* we are about to invalidate fn_t */ + ret_type_id = fn_proto_t->type; /* and fn_proto_t as well */ + orig_proto_id = fn_t->type; /* original FUNC_PROTO ID */ + + /* clone FUNC first, btf__add_func() enforces + * non-empty name, so use entry program's name as + * a placeholder, which we replace immediately + */ + fn_id = btf__add_func(btf, prog->name, btf_func_linkage(fn_t), fn_t->type); + if (fn_id < 0) + return -EINVAL; + fn_t = btf_type_by_id(btf, fn_id); + fn_t->name_off = name_off; /* reuse original string */ + fn_t->type = fn_id + 1; /* we can predict FUNC_PROTO ID */ + + /* clone FUNC_PROTO and its params now */ + fn_proto_id = btf__add_func_proto(btf, ret_type_id); + if (fn_proto_id < 0) { + err = -EINVAL; + goto err_out; + } + for (j = 0; j < arg_cnt; j++) { + /* copy original parameter data */ + t = btf_type_by_id(btf, orig_proto_id); + p = &btf_params(t)[j]; + name_off = p->name_off; + + err = btf__add_func_param(btf, "", p->type); + if (err) + goto err_out; + fn_proto_t = btf_type_by_id(btf, fn_proto_id); + p = &btf_params(fn_proto_t)[j]; + p->name_off = name_off; /* use remembered str offset */ + } + + /* point func_info record to a cloned FUNC type */ + func_rec->type_id = fn_id; + } + + /* create PTR -> STRUCT type chain to mark PTR_TO_CTX argument; + * we do it just once per main BPF program, as all global + * funcs share the same program type, so need only PTR -> + * STRUCT type chain + */ + if (ptr_id == 0) { + struct_id = btf__add_struct(btf, ctx_name, 0); + ptr_id = btf__add_ptr(btf, struct_id); + if (ptr_id < 0 || struct_id < 0) { + err = -EINVAL; + goto err_out; + } + } + + /* for completeness, clone DECL_TAG and point it to cloned param */ + tag_id = btf__add_decl_tag(btf, ctx_tag, func_rec->type_id, arg_idx); + if (tag_id < 0) { + err = -EINVAL; + goto err_out; + } + + /* all the BTF manipulations invalidated pointers, refetch them */ + fn_t = btf_type_by_id(btf, func_rec->type_id); + fn_proto_t = btf_type_by_id(btf, fn_t->type); + + /* fix up type ID pointed to by param */ + p = &btf_params(fn_proto_t)[arg_idx]; + p->type = ptr_id; + } + + free(orig_ids); + return 0; +err_out: + free(orig_ids); + return err; +} + +static int bpf_object_relocate(struct bpf_object *obj, const char *targ_btf_path) { struct bpf_program *prog; size_t i, j; @@ -6732,19 +6947,28 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path) } } } - /* Process data relos for main programs */ for (i = 0; i < obj->nr_programs; i++) { prog = &obj->programs[i]; if (prog_is_subprog(obj, prog)) continue; if (!prog->autoload) continue; + + /* Process data relos for main programs */ err = bpf_object__relocate_data(obj, prog); if (err) { pr_warn("prog '%s': failed to relocate data references: %d\n", prog->name, err); return err; } + + /* Fix up .BTF.ext information, if necessary */ + err = bpf_program_fixup_func_info(obj, prog); + if (err) { + pr_warn("prog '%s': failed to perform .BTF.ext fix ups: %d\n", + prog->name, err); + return err; + } } return 0; @@ -7456,8 +7680,7 @@ static int bpf_program_record_relos(struct bpf_program *prog) return 0; } -static int -bpf_object__load_progs(struct bpf_object *obj, int log_level) +static int bpf_object_load_progs(struct bpf_object *obj, int log_level) { struct bpf_program *prog; size_t i; @@ -8093,10 +8316,10 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch err = err ? : bpf_object__resolve_externs(obj, obj->kconfig); err = err ? : bpf_object__sanitize_maps(obj); err = err ? : bpf_object__init_kern_struct_ops_maps(obj); - err = err ? : bpf_object__relocate(obj, obj->btf_custom_path ? : target_btf_path); + err = err ? : bpf_object_relocate(obj, obj->btf_custom_path ? : target_btf_path); err = err ? : bpf_object_load_btf(obj); err = err ? : bpf_object_create_maps(obj); - err = err ? : bpf_object__load_progs(obj, extra_log_level); + err = err ? : bpf_object_load_progs(obj, extra_log_level); err = err ? : bpf_object_init_prog_arrays(obj); err = err ? : bpf_object_prepare_struct_ops(obj); From patchwork Wed Dec 20 23:31:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrii Nakryiko X-Patchwork-Id: 13500793 X-Patchwork-Delegate: bpf@iogearbox.net Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 208294B14A for ; Wed, 20 Dec 2023 23:31:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=meta.com Received: from pps.filterd (m0109332.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3BKLuBft007006 for ; Wed, 20 Dec 2023 15:31:55 -0800 Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3v45gesqxt-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Wed, 20 Dec 2023 15:31:55 -0800 Received: from twshared2123.40.prn1.facebook.com (2620:10d:c0a8:1c::11) by mail.thefacebook.com (2620:10d:c0a8:83::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Wed, 20 Dec 2023 15:31:53 -0800 Received: by devbig019.vll3.facebook.com (Postfix, from userid 137359) id 2CE3E3D8650A4; Wed, 20 Dec 2023 15:31:46 -0800 (PST) From: Andrii Nakryiko To: , , , CC: , Subject: [PATCH bpf-next 8/8] selftests/bpf: add arg:ctx cases to test_global_funcs tests Date: Wed, 20 Dec 2023 15:31:27 -0800 Message-ID: <20231220233127.1990417-9-andrii@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231220233127.1990417-1-andrii@kernel.org> References: <20231220233127.1990417-1-andrii@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: ad9mS_7hxtbMnT5BGdz_EYjozg2qmPfx X-Proofpoint-ORIG-GUID: ad9mS_7hxtbMnT5BGdz_EYjozg2qmPfx X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-12-20_13,2023-12-20_01,2023-05-22_02 X-Patchwork-Delegate: bpf@iogearbox.net Add a few extra cases of global funcs with context arguments. This time rely on "arg:ctx" decl_tag (__arg_ctx macro), but put it next to "classic" cases where context argument has to be of an exact type that BPF verifier expects (e.g., bpf_user_pt_regs_t for kprobe/uprobe). Colocating all these cases separately from other global func args that rely on arg:xxx decl tags (in verifier_global_subprogs.c) allows for simpler backwards compatibility testing on old kernels. All the cases in test_global_func_ctx_args.c are supposed to work on older kernels, which was manually validated during development. Signed-off-by: Andrii Nakryiko --- .../bpf/progs/test_global_func_ctx_args.c | 49 +++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c index 7faa8eef0598..9a06e5eb1fbe 100644 --- a/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c +++ b/tools/testing/selftests/bpf/progs/test_global_func_ctx_args.c @@ -102,3 +102,52 @@ int perf_event_ctx(void *ctx) { return perf_event_ctx_subprog(ctx); } + +/* this global subprog can be now called from many types of entry progs, each + * with different context type + */ +__weak int subprog_ctx_tag(void *ctx __arg_ctx) +{ + return bpf_get_stack(ctx, stack, sizeof(stack), 0); +} + +struct my_struct { int x; }; + +__weak int subprog_multi_ctx_tags(void *ctx1 __arg_ctx, + struct my_struct *mem, + void *ctx2 __arg_ctx) +{ + if (!mem) + return 0; + + return bpf_get_stack(ctx1, stack, sizeof(stack), 0) + + mem->x + + bpf_get_stack(ctx2, stack, sizeof(stack), 0); +} + +SEC("?raw_tp") +__success __log_level(2) +int arg_tag_ctx_raw_tp(void *ctx) +{ + struct my_struct x = { .x = 123 }; + + return subprog_ctx_tag(ctx) + subprog_multi_ctx_tags(ctx, &x, ctx); +} + +SEC("?perf_event") +__success __log_level(2) +int arg_tag_ctx_perf(void *ctx) +{ + struct my_struct x = { .x = 123 }; + + return subprog_ctx_tag(ctx) + subprog_multi_ctx_tags(ctx, &x, ctx); +} + +SEC("?kprobe") +__success __log_level(2) +int arg_tag_ctx_kprobe(void *ctx) +{ + struct my_struct x = { .x = 123 }; + + return subprog_ctx_tag(ctx) + subprog_multi_ctx_tags(ctx, &x, ctx); +}