[PATCHv4,bpf-next,03/28] bpf: Add multi uprobe link

Message ID	20230720113550.369257-4-jolsa@kernel.org (mailing list archive)
State	Superseded
Delegated to:	BPF
Headers	show Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 053E817744 for <bpf@vger.kernel.org>; Thu, 20 Jul 2023 11:36:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E267AC433C9; Thu, 20 Jul 2023 11:36:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1689852985; bh=KAc5w9SdHtk1BmKjz8E7gz81mfbpnD3PAAcQXO6zE2o=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=cXFxKU8aKhnGX4y3ZfhwO6W25rta8zmhrHqPD9qCJnHxUzJQleslDZyfkEPZwFbCK TbBHg35ksD47e0ICaNwgnxL2RrKbX3Tjq6PSvkFmd4IrPztmSkfd42/TJ2W86BNOrB OeLfrcMqFq44nifAWBtvi6uApWoxC02DpZdDWwtPaSSyvKXLE17u5Xg5gAMcgT7Fqu qMgsWrXm6cLXKlh6pFqFx9rD9Jazo3C7x3/RhNt3WnDm1wvn7n9Lef/e48hR9Gfnay 8hRhGRbvaZ7er+7JRlTqhuC4WnMYd6n7kLz1ZbPpQoYTff/cDIT8fEAHPPAoNMnF/s TyeB1eQZQ8L1Q== From: Jiri Olsa <jolsa@kernel.org> To: Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, Andrii Nakryiko <andrii@kernel.org> Cc: bpf@vger.kernel.org, Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>, John Fastabend <john.fastabend@gmail.com>, KP Singh <kpsingh@chromium.org>, Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com> Subject: [PATCHv4 bpf-next 03/28] bpf: Add multi uprobe link Date: Thu, 20 Jul 2023 13:35:25 +0200 Message-ID: <20230720113550.369257-4-jolsa@kernel.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230720113550.369257-1-jolsa@kernel.org> References: <20230720113550.369257-1-jolsa@kernel.org> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: <bpf.vger.kernel.org> List-Subscribe: <mailto:bpf+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:bpf+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Delegate: bpf@iogearbox.net
Series	bpf: Add multi uprobe link \| expand [PATCHv4,bpf-next,00/28] bpf: Add multi uprobe link [PATCHv4,bpf-next,01/28] bpf: Switch BPF_F_KPROBE_MULTI_RETURN macro to enum [PATCHv4,bpf-next,02/28] bpf: Add attach_type checks under bpf_prog_attach_check_attach_type [PATCHv4,bpf-next,03/28] bpf: Add multi uprobe link [PATCHv4,bpf-next,04/28] bpf: Add cookies support for uprobe_multi link [PATCHv4,bpf-next,05/28] bpf: Add pid filter support for uprobe_multi link [PATCHv4,bpf-next,06/28] bpf: Add bpf_get_func_ip helper support for uprobe link [PATCHv4,bpf-next,07/28] libbpf: Add uprobe_multi attach type and link names [PATCHv4,bpf-next,08/28] libbpf: Move elf_find_func_offset* functions to elf object [PATCHv4,bpf-next,09/28] libbpf: Add elf_open/elf_close functions [PATCHv4,bpf-next,10/28] libbpf: Add elf symbol iterator [PATCHv4,bpf-next,11/28] libbpf: Add elf_resolve_syms_offsets function [PATCHv4,bpf-next,12/28] libbpf: Add elf_resolve_pattern_offsets function [PATCHv4,bpf-next,13/28] libbpf: Add bpf_link_create support for multi uprobes [PATCHv4,bpf-next,14/28] libbpf: Add bpf_program__attach_uprobe_multi function [PATCHv4,bpf-next,15/28] libbpf: Add support for u[ret]probe.multi[.s] program sections [PATCHv4,bpf-next,16/28] libbpf: Add uprobe multi link detection [PATCHv4,bpf-next,17/28] libbpf: Add uprobe multi link support to bpf_program__attach_usdt [PATCHv4,bpf-next,18/28] selftests/bpf: Move get_time_ns to testing_helpers.h [PATCHv4,bpf-next,19/28] selftests/bpf: Add uprobe_multi skel test [PATCHv4,bpf-next,20/28] selftests/bpf: Add uprobe_multi api test [PATCHv4,bpf-next,21/28] selftests/bpf: Add uprobe_multi link test [PATCHv4,bpf-next,22/28] selftests/bpf: Add uprobe_multi test program [PATCHv4,bpf-next,23/28] selftests/bpf: Add uprobe_multi bench test [PATCHv4,bpf-next,24/28] selftests/bpf: Add uprobe_multi usdt test code [PATCHv4,bpf-next,25/28] selftests/bpf: Add uprobe_multi usdt bench test [PATCHv4,bpf-next,26/28] selftests/bpf: Add uprobe_multi cookie test [PATCHv4,bpf-next,27/28] selftests/bpf: Add uprobe_multi pid filter tests [PATCHv4,bpf-next,28/28] selftests/bpf: Add extra link to uprobe_multi tests

Context	Check	Description
netdev/tree_selection	success	Clearly marked for bpf-next, async
netdev/apply	fail	Patch does not apply to bpf-next
bpf/vmtest-bpf-next-PR	success	PR summary
bpf/vmtest-bpf-next-VM_Test-1	success	Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2	success	Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3	success	Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-4	success	Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-5	success	Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-6	success	Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-7	success	Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8	success	Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-9	success	Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-10	success	Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-11	success	Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-12	success	Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-13	success	Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-14	success	Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-15	success	Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-16	success	Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-17	success	Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18	success	Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19	success	Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20	success	Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21	success	Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22	success	Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-23	success	Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24	success	Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-25	success	Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-26	success	Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-27	success	Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28	success	Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29	success	Logs for veristat

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h index e66d04dbe56a..5b85cf18c350 100644 --- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h @@ -752,6 +752,7 @@ int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id, u32 *fd_type, const char **buf, u64 *probe_offset, u64 *probe_addr); int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog); +int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog); #else static inline unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx) { @@ -798,6 +799,11 @@ bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) { return -EOPNOTSUPP; } +static inline int +bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) +{ + return -EOPNOTSUPP; +} #endif enum { diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 730c0ec99a11..e764af513425 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1036,6 +1036,7 @@ enum bpf_attach_type { BPF_LSM_CGROUP, BPF_STRUCT_OPS, BPF_NETFILTER, + BPF_TRACE_UPROBE_MULTI, __MAX_BPF_ATTACH_TYPE }; @@ -1053,6 +1054,7 @@ enum bpf_link_type { BPF_LINK_TYPE_KPROBE_MULTI = 8, BPF_LINK_TYPE_STRUCT_OPS = 9, BPF_LINK_TYPE_NETFILTER = 10, + BPF_LINK_TYPE_UPROBE_MULTI = 11, MAX_BPF_LINK_TYPE, }; @@ -1182,6 +1184,13 @@ enum { BPF_F_KPROBE_MULTI_RETURN = (1U << 0) }; +/* link_create.uprobe_multi.flags used in LINK_CREATE command for + * BPF_TRACE_UPROBE_MULTI attach type to create return probe. + */ +enum { + BPF_F_UPROBE_MULTI_RETURN = (1U << 0) +}; + /* When BPF ldimm64's insn[0].src_reg != 0 then this can have * the following extensions: * @@ -1591,6 +1600,13 @@ union bpf_attr { __s32 priority; __u32 flags; } netfilter; + struct { + __aligned_u64 path; + __aligned_u64 offsets; + __aligned_u64 ref_ctr_offsets; + __u32 cnt; + __u32 flags; + } uprobe_multi; }; } link_create; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 7864cf4d653d..f4c6a265731e 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2813,10 +2813,12 @@ static void bpf_link_free_id(int id) /* Clean up bpf_link and corresponding anon_inode file and FD. After * anon_inode is created, bpf_link can't be just kfree()'d due to deferred - * anon_inode's release() call. This helper marksbpf_link as + * anon_inode's release() call. This helper marks bpf_link as * defunct, releases anon_inode file and puts reserved FD. bpf_prog's refcnt * is not decremented, it's the responsibility of a calling code that failed * to complete bpf_link initialization. + * This helper eventually calls link's dealloc callback, but does not call + * link's release callback. */ void bpf_link_cleanup(struct bpf_link_primer *primer) { @@ -3752,8 +3754,12 @@ static int bpf_prog_attach_check_attach_type(const struct bpf_prog *prog, if (prog->expected_attach_type == BPF_TRACE_KPROBE_MULTI && attach_type != BPF_TRACE_KPROBE_MULTI) return -EINVAL; + if (prog->expected_attach_type == BPF_TRACE_UPROBE_MULTI && + attach_type != BPF_TRACE_UPROBE_MULTI) + return -EINVAL; if (attach_type != BPF_PERF_EVENT && - attach_type != BPF_TRACE_KPROBE_MULTI) + attach_type != BPF_TRACE_KPROBE_MULTI && + attach_type != BPF_TRACE_UPROBE_MULTI) return -EINVAL; return 0; default: @@ -4900,8 +4906,10 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr) case BPF_PROG_TYPE_KPROBE: if (attr->link_create.attach_type == BPF_PERF_EVENT) ret = bpf_perf_link_attach(attr, prog); - else + else if (attr->link_create.attach_type == BPF_TRACE_KPROBE_MULTI) ret = bpf_kprobe_multi_link_attach(attr, prog); + else if (attr->link_create.attach_type == BPF_TRACE_UPROBE_MULTI) + ret = bpf_uprobe_multi_link_attach(attr, prog); break; default: ret = -EINVAL; diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index c92eb8c6ff08..10284fd46f98 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -23,6 +23,7 @@ #include <linux/sort.h> #include <linux/key.h> #include <linux/verification.h> +#include <linux/namei.h> #include <net/bpf_sk_storage.h> @@ -2965,3 +2966,239 @@ static u64 bpf_kprobe_multi_entry_ip(struct bpf_run_ctx *ctx) return 0; } #endif + +#ifdef CONFIG_UPROBES +struct bpf_uprobe_multi_link; + +struct bpf_uprobe { + struct bpf_uprobe_multi_link *link; + loff_t offset; + struct uprobe_consumer consumer; +}; + +struct bpf_uprobe_multi_link { + struct path path; + struct bpf_link link; + u32 cnt; + struct bpf_uprobe *uprobes; +}; + +struct bpf_uprobe_multi_run_ctx { + struct bpf_run_ctx run_ctx; + unsigned long entry_ip; +}; + +static void bpf_uprobe_unregister(struct path *path, struct bpf_uprobe *uprobes, + u32 cnt) +{ + u32 i; + + for (i = 0; i < cnt; i++) { + uprobe_unregister(d_real_inode(path->dentry), uprobes[i].offset, + &uprobes[i].consumer); + } +} + +static void bpf_uprobe_multi_link_release(struct bpf_link *link) +{ + struct bpf_uprobe_multi_link *umulti_link; + + umulti_link = container_of(link, struct bpf_uprobe_multi_link, link); + bpf_uprobe_unregister(&umulti_link->path, umulti_link->uprobes, umulti_link->cnt); +} + +static void bpf_uprobe_multi_link_dealloc(struct bpf_link *link) +{ + struct bpf_uprobe_multi_link *umulti_link; + + umulti_link = container_of(link, struct bpf_uprobe_multi_link, link); + path_put(&umulti_link->path); + kvfree(umulti_link->uprobes); + kfree(umulti_link); +} + +static const struct bpf_link_ops bpf_uprobe_multi_link_lops = { + .release = bpf_uprobe_multi_link_release, + .dealloc = bpf_uprobe_multi_link_dealloc, +}; + +static int uprobe_prog_run(struct bpf_uprobe *uprobe, + unsigned long entry_ip, + struct pt_regs *regs) +{ + struct bpf_uprobe_multi_link *link = uprobe->link; + struct bpf_uprobe_multi_run_ctx run_ctx = { + .entry_ip = entry_ip, + }; + struct bpf_prog *prog = link->link.prog; + bool sleepable = prog->aux->sleepable; + struct bpf_run_ctx *old_run_ctx; + int err = 0; + + might_fault(); + + migrate_disable(); + + if (sleepable) + rcu_read_lock_trace(); + else + rcu_read_lock(); + + old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx); + err = bpf_prog_run(link->link.prog, regs); + bpf_reset_run_ctx(old_run_ctx); + + if (sleepable) + rcu_read_unlock_trace(); + else + rcu_read_unlock(); + + migrate_enable(); + return err; +} + +static int +uprobe_multi_link_handler(struct uprobe_consumer *con, struct pt_regs *regs) +{ + struct bpf_uprobe *uprobe; + + uprobe = container_of(con, struct bpf_uprobe, consumer); + return uprobe_prog_run(uprobe, instruction_pointer(regs), regs); +} + +static int +uprobe_multi_link_ret_handler(struct uprobe_consumer *con, unsigned long func, struct pt_regs *regs) +{ + struct bpf_uprobe *uprobe; + + uprobe = container_of(con, struct bpf_uprobe, consumer); + return uprobe_prog_run(uprobe, func, regs); +} + +int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) +{ + struct bpf_uprobe_multi_link *link = NULL; + unsigned long __user *uref_ctr_offsets; + unsigned long *ref_ctr_offsets = NULL; + struct bpf_link_primer link_primer; + struct bpf_uprobe *uprobes = NULL; + unsigned long __user *uoffsets; + void __user *upath; + u32 flags, cnt, i; + struct path path; + char *name; + int err; + + /* no support for 32bit archs yet */ + if (sizeof(u64) != sizeof(void *)) + return -EOPNOTSUPP; + + if (prog->expected_attach_type != BPF_TRACE_UPROBE_MULTI) + return -EINVAL; + + flags = attr->link_create.uprobe_multi.flags; + if (flags & ~BPF_F_UPROBE_MULTI_RETURN) + return -EINVAL; + + /* + * path, offsets and cnt are mandatory, + * ref_ctr_offsets is optional + */ + upath = u64_to_user_ptr(attr->link_create.uprobe_multi.path); + uoffsets = u64_to_user_ptr(attr->link_create.uprobe_multi.offsets); + cnt = attr->link_create.uprobe_multi.cnt; + + if (!upath || !uoffsets || !cnt) + return -EINVAL; + + uref_ctr_offsets = u64_to_user_ptr(attr->link_create.uprobe_multi.ref_ctr_offsets); + + name = strndup_user(upath, PATH_MAX); + if (IS_ERR(name)) { + err = PTR_ERR(name); + return err; + } + + err = kern_path(name, LOOKUP_FOLLOW, &path); + kfree(name); + if (err) + return err; + + if (!d_is_reg(path.dentry)) { + err = -EBADF; + goto error_path_put; + } + + err = -ENOMEM; + + link = kzalloc(sizeof(*link), GFP_KERNEL); + uprobes = kvcalloc(cnt, sizeof(*uprobes), GFP_KERNEL); + + if (!uprobes || !link) + goto error_free; + + if (uref_ctr_offsets) { + ref_ctr_offsets = kvcalloc(cnt, sizeof(*ref_ctr_offsets), GFP_KERNEL); + if (!ref_ctr_offsets) + goto error_free; + } + + for (i = 0; i < cnt; i++) { + if (uref_ctr_offsets && __get_user(ref_ctr_offsets[i], uref_ctr_offsets + i)) { + err = -EFAULT; + goto error_free; + } + if (__get_user(uprobes[i].offset, uoffsets + i)) { + err = -EFAULT; + goto error_free; + } + + uprobes[i].link = link; + + if (flags & BPF_F_UPROBE_MULTI_RETURN) + uprobes[i].consumer.ret_handler = uprobe_multi_link_ret_handler; + else + uprobes[i].consumer.handler = uprobe_multi_link_handler; + } + + link->cnt = cnt; + link->uprobes = uprobes; + link->path = path; + + bpf_link_init(&link->link, BPF_LINK_TYPE_UPROBE_MULTI, + &bpf_uprobe_multi_link_lops, prog); + + err = bpf_link_prime(&link->link, &link_primer); + if (err) + goto error_free; + + for (i = 0; i < cnt; i++) { + err = uprobe_register_refctr(d_real_inode(link->path.dentry), + uprobes[i].offset, + ref_ctr_offsets ? ref_ctr_offsets[i] : 0, + &uprobes[i].consumer); + if (err) { + bpf_uprobe_unregister(&path, uprobes, i); + bpf_link_cleanup(&link_primer); + kvfree(ref_ctr_offsets); + return err; + } + } + + kvfree(ref_ctr_offsets); + return bpf_link_settle(&link_primer); + +error_free: + kvfree(ref_ctr_offsets); + kvfree(uprobes); + kfree(link); +error_path_put: + path_put(&path); + return err; +} +#else /* !CONFIG_UPROBES */ +int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) +{ + return -EOPNOTSUPP; +} +#endif /* CONFIG_UPROBES */ diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index f1ce5c149eee..659afbf9bb8b 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1036,6 +1036,7 @@ enum bpf_attach_type { BPF_LSM_CGROUP, BPF_STRUCT_OPS, BPF_NETFILTER, + BPF_TRACE_UPROBE_MULTI, __MAX_BPF_ATTACH_TYPE }; @@ -1053,6 +1054,7 @@ enum bpf_link_type { BPF_LINK_TYPE_KPROBE_MULTI = 8, BPF_LINK_TYPE_STRUCT_OPS = 9, BPF_LINK_TYPE_NETFILTER = 10, + BPF_LINK_TYPE_UPROBE_MULTI = 11, MAX_BPF_LINK_TYPE, }; @@ -1182,6 +1184,13 @@ enum { BPF_F_KPROBE_MULTI_RETURN = (1U << 0) }; +/* link_create.uprobe_multi.flags used in LINK_CREATE command for + * BPF_TRACE_UPROBE_MULTI attach type to create return probe. + */ +enum { + BPF_F_UPROBE_MULTI_RETURN = (1U << 0) +}; + /* When BPF ldimm64's insn[0].src_reg != 0 then this can have * the following extensions: * @@ -1591,6 +1600,13 @@ union bpf_attr { __s32 priority; __u32 flags; } netfilter; + struct { + __aligned_u64 path; + __aligned_u64 offsets; + __aligned_u64 ref_ctr_offsets; + __u32 cnt; + __u32 flags; + } uprobe_multi; }; } link_create;

[PATCHv4,bpf-next,03/28] bpf: Add multi uprobe link

Checks

Commit Message

Patch