Message ID | 20230612151608.99661-9-laoar.shao@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Series | bpf: Support ->fill_link_info for kprobe_multi and perf_event links |
On 6/12/23 8:16 AM, Yafang Shao wrote:
> By introducing support for ->fill_link_info to the perf_event link, users
> gain the ability to inspect it using `bpftool link show`. While the current
> approach involves accessing this information via `bpftool perf show`,
> consolidating link information for all link types in one place offers
> greater convenience. Additionally, this patch extends support to the
> generic perf event, which is not currently accommodated by
> `bpftool perf show`. While only the perf type and config are exposed to
> userspace, other attributes such as sample_period and sample_freq are
> ignored. It's important to note that if kptr_restrict is not permitted, the
> probed address will not be exposed, maintaining security measures.
>
> A new enum bpf_link_perf_event_type is introduced to help the user
> understand which struct is relevant.
>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> ---
>  include/uapi/linux/bpf.h       |  32 +++++++++++
>  kernel/bpf/syscall.c           | 124 +++++++++++++++++++++++++++++++++++++++++
>  tools/include/uapi/linux/bpf.h |  32 +++++++++++
>  3 files changed, 188 insertions(+)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 23691ea..8d4556e 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1056,6 +1056,16 @@ enum bpf_link_type {
>  	MAX_BPF_LINK_TYPE,
>  };
>
> +enum bpf_perf_link_type {
> +	BPF_PERF_LINK_UNSPEC = 0,
> +	BPF_PERF_LINK_UPROBE = 1,
> +	BPF_PERF_LINK_KPROBE = 2,
> +	BPF_PERF_LINK_TRACEPOINT = 3,
> +	BPF_PERF_LINK_PERF_EVENT = 4,
> +
> +	MAX_BPF_LINK_PERF_EVENT_TYPE,
> +};
> +
>  /* cgroup-bpf attach flags used in BPF_PROG_ATTACH command
>   *
>   * NONE(default): No further bpf programs allowed in the subtree.
> @@ -6443,7 +6453,29 @@ struct bpf_link_info {
>  			__u32 count;
>  			__u32 flags;
>  		} kprobe_multi;
> +		struct {
> +			__u64 config;
> +			__u32 type;
> +		} perf_event; /* BPF_LINK_PERF_EVENT_PERF_EVENT */
> +		struct {
> +			__aligned_u64 file_name; /* in/out: buff ptr */
> +			__u32 name_len;
> +			__u32 offset; /* offset from name */
> +			__u32 flags;
> +		} uprobe; /* BPF_LINK_PERF_EVENT_UPROBE */
> +		struct {
> +			__aligned_u64 func_name; /* in/out: buff ptr */
> +			__u32 name_len;
> +			__u32 offset; /* offset from name */
> +			__u64 addr;
> +			__u32 flags;
> +		} kprobe; /* BPF_LINK_PERF_EVENT_KPROBE */
> +		struct {
> +			__aligned_u64 tp_name; /* in/out: buff ptr */
> +			__u32 name_len;
> +		} tracepoint; /* BPF_LINK_PERF_EVENT_TRACEPOINT */
>  	};
> +	__u32 perf_link_type; /* enum bpf_perf_link_type */

I think putting perf_link_type into each individual struct is better.
It won't increase the bpf_link_info struct size. It will allow
extensions for all structs in the big union (raw_tracepoint,
tracing, cgroup, iter, ..., kprobe_multi, ...) etc.

>  } __attribute__((aligned(8)));
>
>  /* User bpf_sock_addr struct to access socket fields and sockaddr struct passed

[...]
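The size argument in this reply can be illustrated with a reduced model. The sketch below is not the kernel's `struct bpf_link_info`: `info_trailing`, `info_embedded` and the 24-byte `other` member are hypothetical stand-ins, chosen only to show that a tag placed after the union grows the struct, while a tag placed inside a small union member reuses space the union already occupies (as long as that member stays no larger than the largest one).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: type tag placed after the union, as in the posted patch. */
struct info_trailing {
	union {
		struct { uint64_t config; uint32_t type; } perf_event; /* 16 bytes */
		struct { uint64_t a, b, c; } other;  /* 24 bytes, stands in for a larger member */
	};
	uint32_t perf_link_type;                     /* 4 bytes + 4 bytes of padding */
} __attribute__((aligned(8)));

/* Hypothetical model: type tag placed inside the perf_event member. */
struct info_embedded {
	union {
		struct { uint32_t link_type; uint32_t type; uint64_t config; } perf_event; /* 16 bytes */
		struct { uint64_t a, b, c; } other;  /* still the largest member */
	};
} __attribute__((aligned(8)));

/* Size of the layout with the tag after the union. */
size_t trailing_size(void)
{
	return sizeof(struct info_trailing);
}

/* Size of the layout with the tag inside the union member. */
size_t embedded_size(void)
{
	return sizeof(struct info_embedded);
}
```

With GCC or Clang, `trailing_size()` is 8 bytes larger than `embedded_size()` in this model, because the trailing field and its padding sit outside the union.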
On Tue, Jun 13, 2023 at 1:36 AM Yonghong Song <yhs@meta.com> wrote:
>
> On 6/12/23 8:16 AM, Yafang Shao wrote:
> > [...]
> > @@ -6443,7 +6453,29 @@ struct bpf_link_info {
> > [...]
> >  	};
> > +	__u32 perf_link_type; /* enum bpf_perf_link_type */
>
> I think putting perf_link_type into each individual struct is better.
> It won't increase the bpf_link_info struct size. It will allow
> extensions for all structs in the big union (raw_tracepoint,
> tracing, cgroup, iter, ..., kprobe_multi, ...) etc.

If we put it into each individual struct, we have to choose one
specific struct to get the type before we use the real struct, for
example,

    if (info.perf_event.type == BPF_PERF_LINK_PERF_EVENT)
            goto out;
    if (info.perf_event.type == BPF_PERF_LINK_TRACEPOINT &&
        !info.tracepoint.tp_name) {
            info.tracepoint.tp_name = (unsigned long)&buf;
            info.tracepoint.name_len = sizeof(buf);
            goto again;
    }
    ...

That doesn't look perfect.

However I agree with you that the perf_link_type may disallow the
extensions for the big union. I will think about it.
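The "query the type, then retry with a buffer" flow sketched in this reply can be modelled in userspace. Everything below is an assumption made for illustration: `mock_get_info()` stands in for the kernel side, `demo_link_info` is a reduced copy of the layout in the posted patch (tag after the union), and the link is hard-coded to be a tracepoint named `sched_switch`. It is not the libbpf API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum demo_perf_link_type {
	DEMO_LINK_UNSPEC = 0,
	DEMO_LINK_TRACEPOINT = 3,
	DEMO_LINK_PERF_EVENT = 4,
};

/* Reduced, hypothetical model of the posted layout: tag after the union. */
struct demo_link_info {
	union {
		struct { uint64_t config; uint32_t type; } perf_event;
		struct { uint64_t tp_name; uint32_t name_len; } tracepoint;
	};
	uint32_t perf_link_type;
};

/* Mock of the kernel side: always reports a tracepoint called "sched_switch". */
static int mock_get_info(struct demo_link_info *info)
{
	static const char name[] = "sched_switch";

	/* Pointer and length must be both set or both zero. */
	if (!info->tracepoint.tp_name ^ !info->tracepoint.name_len)
		return -1;

	info->perf_link_type = DEMO_LINK_TRACEPOINT;
	if (info->tracepoint.tp_name) {
		char *dst = (char *)(uintptr_t)info->tracepoint.tp_name;
		size_t cap = info->tracepoint.name_len;
		size_t len = strlen(name);

		if (len >= cap)
			len = cap - 1;   /* truncate, keep room for the terminator */
		memcpy(dst, name, len);
		dst[len] = '\0';
	}
	return 0;
}

/* Two-pass query: learn the type first, then retry with a name buffer. */
int demo_query_name(char *buf, uint32_t buf_len, uint32_t *type)
{
	struct demo_link_info info;
	int err;

	memset(&info, 0, sizeof(info));
	err = mock_get_info(&info);
	if (err)
		return err;
	*type = info.perf_link_type;
	if (info.perf_link_type != DEMO_LINK_TRACEPOINT)
		return 0;                /* nothing more to fetch */

	memset(&info, 0, sizeof(info));
	info.tracepoint.tp_name = (uint64_t)(uintptr_t)buf;
	info.tracepoint.name_len = buf_len;
	return mock_get_info(&info);
}
```

Because the tag lives outside the union in this model, the first pass can read it without touching any union member, which is the property the reply is weighing against struct growth.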
On 6/12/23 19:47, Yafang Shao wrote:
> On Tue, Jun 13, 2023 at 1:36 AM Yonghong Song <yhs@meta.com> wrote:
>> [...]
>> I think putting perf_link_type into each individual struct is better.
>> It won't increase the bpf_link_info struct size. It will allow
>> extensions for all structs in the big union (raw_tracepoint,
>> tracing, cgroup, iter, ..., kprobe_multi, ...) etc.
>
> If we put it into each individual struct, we have to choose one
> specific struct to get the type before we use the real struct, for
> example,
>     if (info.perf_event.type == BPF_PERF_LINK_PERF_EVENT)
>             goto out;
>     if (info.perf_event.type == BPF_PERF_LINK_TRACEPOINT &&
>         !info.tracepoint.tp_name) {
>             info.tracepoint.tp_name = (unsigned long)&buf;
>             info.tracepoint.name_len = sizeof(buf);
>             goto again;
>     }
>     ...
>
> That doesn't look perfect.

How about adding a common struct?

    struct {
            __u32 type;
    } perf_common;

Then you check info.perf_common.type.

>
> However I agree with you that the perf_link_type may disallow the
> extensions for the big union. I will think about it.
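The `perf_common` idea relies on every perf-related union member starting with the same `type` field, so that the tag can be read through one well-known member. The sketch below assumes that layout (the reply itself only shows `perf_common`); `demo_link_info` and `demo_link_kind()` are hypothetical names, and the member fields are reduced for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: every member begins with the same 32-bit tag. */
struct demo_link_info {
	union {
		struct { uint32_t type; } perf_common;
		struct { uint32_t type; uint32_t name_len; uint64_t tp_name; } tracepoint;
		struct { uint32_t type; uint32_t event_type; uint64_t config; } perf_event;
	};
};

/* The tag must sit at the same offset in every member for this to work. */
_Static_assert(offsetof(struct demo_link_info, tracepoint.type) ==
	       offsetof(struct demo_link_info, perf_common.type),
	       "tracepoint tag must alias perf_common.type");
_Static_assert(offsetof(struct demo_link_info, perf_event.type) ==
	       offsetof(struct demo_link_info, perf_common.type),
	       "perf_event tag must alias perf_common.type");

/* Read the tag without knowing which member was filled in. */
uint32_t demo_link_kind(const struct demo_link_info *info)
{
	return info->perf_common.type;
}
```

C permits inspecting the common initial sequence of structs that share a union, which is what makes reading `perf_common.type` valid regardless of which member was written.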
On Wed, Jun 14, 2023 at 10:34 AM Kui-Feng Lee <sinquersw@gmail.com> wrote:
>
> On 6/12/23 19:47, Yafang Shao wrote:
> > On Tue, Jun 13, 2023 at 1:36 AM Yonghong Song <yhs@meta.com> wrote:
> >> [...]
> >> I think putting perf_link_type into each individual struct is better.
> >> It won't increase the bpf_link_info struct size. It will allow
> >> extensions for all structs in the big union (raw_tracepoint,
> >> tracing, cgroup, iter, ..., kprobe_multi, ...) etc.
> >
> > If we put it into each individual struct, we have to choose one
> > specific struct to get the type before we use the real struct, for
> > example,
> > [...]
> >
> > That doesn't look perfect.
>
> How about adding a common struct?
>
>     struct {
>             __u32 type;
>     } perf_common;
>
> Then you check info.perf_common.type.

I prefer the struct below,

    struct {
            __u32 type; /* enum bpf_perf_link_type */
            union {
                    struct {
                            __u64 config;
                            __u32 type;
                    } perf_event; /* BPF_PERF_LINK_PERF_EVENT */
                    struct {
                            __aligned_u64 file_name; /* in/out */
                            __u32 name_len;
                            __u32 offset; /* offset from file_name */
                            __u32 flags;
                    } uprobe; /* BPF_PERF_LINK_UPROBE */
                    struct {
                            __aligned_u64 func_name; /* in/out */
                            __u32 name_len;
                            __u32 offset; /* offset from func_name */
                            __u64 addr;
                            __u32 flags;
                    } kprobe; /* BPF_PERF_LINK_KPROBE */
                    struct {
                            __aligned_u64 tp_name; /* in/out */
                            __u32 name_len;
                    } tracepoint; /* BPF_PERF_LINK_TRACEPOINT */
            };
    } perf_link;

I think that would be clearer.
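The nested layout proposed here is a tagged union: one `type` field selects which inner struct is meaningful. A minimal userspace sketch of how a consumer might dispatch on it follows. The `demo_` names, the reduced field set and the output strings are assumptions for illustration, not bpftool's actual output.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum demo_perf_link_type {
	DEMO_PERF_LINK_UNSPEC = 0,
	DEMO_PERF_LINK_UPROBE = 1,
	DEMO_PERF_LINK_KPROBE = 2,
	DEMO_PERF_LINK_TRACEPOINT = 3,
	DEMO_PERF_LINK_PERF_EVENT = 4,
};

/* Reduced model of the proposed nesting: tag first, then a union of payloads. */
struct demo_link_info {
	struct {
		uint32_t type;  /* enum demo_perf_link_type */
		union {
			struct { uint64_t config; uint32_t type; } perf_event;
			struct { uint64_t file_name; uint32_t name_len; uint32_t offset; } uprobe;
			struct { uint64_t func_name; uint32_t name_len; uint32_t offset; } kprobe;
			struct { uint64_t tp_name; uint32_t name_len; } tracepoint;
		};
	} perf_link;
};

/* Zero the whole struct so unused union bytes are well defined. */
void demo_init(struct demo_link_info *info)
{
	memset(info, 0, sizeof(*info));
}

/* Dispatch on the tag and format only the member it selects. */
int demo_describe(const struct demo_link_info *info, char *out, size_t len)
{
	switch (info->perf_link.type) {
	case DEMO_PERF_LINK_KPROBE:
		return snprintf(out, len, "kprobe offset=%u",
				(unsigned)info->perf_link.kprobe.offset);
	case DEMO_PERF_LINK_UPROBE:
		return snprintf(out, len, "uprobe offset=%u",
				(unsigned)info->perf_link.uprobe.offset);
	case DEMO_PERF_LINK_TRACEPOINT:
		return snprintf(out, len, "tracepoint");
	case DEMO_PERF_LINK_PERF_EVENT:
		return snprintf(out, len, "event type=%u config=%llu",
				(unsigned)info->perf_link.perf_event.type,
				(unsigned long long)info->perf_link.perf_event.config);
	default:
		return snprintf(out, len, "unknown");
	}
}
```

Compared with a tag placed after the big union, this keeps the tag and its payloads together inside one union member, so other link types are unaffected.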
On Mon, Jun 12, 2023 at 03:16:06PM +0000, Yafang Shao wrote:

SNIP

>
>  /* User bpf_sock_addr struct to access socket fields and sockaddr struct passed
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 80c9ec0..fe354d5 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -3303,9 +3303,133 @@ static void bpf_perf_link_dealloc(struct bpf_link *link)
>  	kfree(perf_link);
>  }
>
> +static int bpf_perf_link_fill_name(const struct perf_event *event,
> +				   char __user *uname, u32 ulen,
> +				   u64 *probe_offset, u64 *probe_addr,
> +				   u32 *fd_type)
> +{

this function name sounds misleading, it does query all the link data
plus copying the name.. seems like this should be renamed and separated

> +	const char *buf;
> +	u32 prog_id;
> +	size_t len;
> +	int err;
> +
> +	if (!ulen ^ !uname)
> +		return -EINVAL;
> +	if (!uname)
> +		return 0;
> +
> +	err = bpf_get_perf_event_info(event, &prog_id, fd_type, &buf,
> +				      probe_offset, probe_addr);
> +	if (err)
> +		return err;
> +
> +	len = strlen(buf);
> +	if (buf) {
> +		err = bpf_copy_to_user(uname, buf, ulen, len);
> +		if (err)
> +			return err;
> +	} else {
> +		char zero = '\0';
> +
> +		if (put_user(zero, uname))
> +			return -EFAULT;
> +	}
> +	return 0;
> +}
> +
> +static int bpf_perf_link_fill_probe(const struct perf_event *event,
> +				    struct bpf_link_info *info)
> +{
> +	char __user *uname;
> +	u64 addr, offset;
> +	u32 ulen, type;
> +	int err;
> +
> +#ifdef CONFIG_KPROBE_EVENTS

this will break compilation when CONFIG_KPROBE_EVENTS or CONFIG_UPROBE_EVENTS
options are not defined

jirka

> +	if (event->tp_event->flags & TRACE_EVENT_FL_KPROBE) {
> +		uname = u64_to_user_ptr(info->kprobe.func_name);
> +		ulen = info->kprobe.name_len;
> +		info->perf_link_type = BPF_PERF_LINK_KPROBE;
> +		err = bpf_perf_link_fill_name(event, uname, ulen, &offset,
> +					      &addr, &type);
> +		if (err)
> +			return err;
> +
> +		info->kprobe.offset = offset;
> +		if (type == BPF_FD_TYPE_KRETPROBE)
> +			info->kprobe.flags = 1;
> +		if (!kallsyms_show_value(current_cred()))
> +			return 0;
> +		info->kprobe.addr = addr;
> +		return 0;
> +	}
> +#endif
> +
> +#ifdef CONFIG_UPROBE_EVENTS
> +	if (event->tp_event->flags & TRACE_EVENT_FL_UPROBE) {
> +		uname = u64_to_user_ptr(info->uprobe.file_name);
> +		ulen = info->uprobe.name_len;
> +		info->perf_link_type = BPF_PERF_LINK_UPROBE;
> +		err = bpf_perf_link_fill_name(event, uname, ulen, &offset,
> +					      &addr, &type);
> +		if (err)
> +			return err;
> +
> +		info->uprobe.offset = offset;
> +		if (type == BPF_FD_TYPE_URETPROBE)
> +			info->uprobe.flags = 1;
> +		return 0;
> +	}
> +#endif
> +
> +	return -EOPNOTSUPP;
> +}
> +

SNIP
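One common way to keep a function like `bpf_perf_link_fill_probe()` building under every configuration is to move each conditional branch into its own helper and provide a stub when the option is off, so that no local variable or callee is referenced only inside an `#ifdef`. The sketch below is a userspace model under that assumption: `DEMO_KPROBE_EVENTS`, `DEMO_UPROBE_EVENTS` and all function names are hypothetical, and it is not claimed to be the change made in later revisions of this series.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_KPROBE_EVENTS 1    /* models the option being enabled */
/* DEMO_UPROBE_EVENTS is deliberately left undefined */

#define DEMO_FL_KPROBE 0x1u
#define DEMO_FL_UPROBE 0x2u

struct demo_info {
	unsigned int kind;
	unsigned int offset;
};

#ifdef DEMO_KPROBE_EVENTS
static int demo_fill_kprobe(unsigned int offset, struct demo_info *info)
{
	info->kind = DEMO_FL_KPROBE;
	info->offset = offset;
	return 0;
}
#else
/* Stub: same signature, so the caller needs no #ifdef of its own. */
static int demo_fill_kprobe(unsigned int offset, struct demo_info *info)
{
	(void)offset;
	(void)info;
	return -EOPNOTSUPP;
}
#endif

#ifdef DEMO_UPROBE_EVENTS
static int demo_fill_uprobe(unsigned int offset, struct demo_info *info)
{
	info->kind = DEMO_FL_UPROBE;
	info->offset = offset;
	return 0;
}
#else
static int demo_fill_uprobe(unsigned int offset, struct demo_info *info)
{
	(void)offset;
	(void)info;
	return -EOPNOTSUPP;
}
#endif

/* The dispatcher has no conditional compilation and no conditionally used locals. */
int demo_fill_probe(unsigned int flags, unsigned int offset, struct demo_info *info)
{
	if (flags & DEMO_FL_KPROBE)
		return demo_fill_kprobe(offset, info);
	if (flags & DEMO_FL_UPROBE)
		return demo_fill_uprobe(offset, info);
	return -EOPNOTSUPP;
}
```

With only the kprobe option modelled as enabled, a uprobe request falls through to the stub and reports `-EOPNOTSUPP` instead of failing to compile.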
On Thu, Jun 15, 2023 at 6:21 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Mon, Jun 12, 2023 at 03:16:06PM +0000, Yafang Shao wrote:
>
> SNIP
>
> > +static int bpf_perf_link_fill_name(const struct perf_event *event,
> > +				   char __user *uname, u32 ulen,
> > +				   u64 *probe_offset, u64 *probe_addr,
> > +				   u32 *fd_type)
> > +{
>
> this function name sounds misleading, it does query all the link data
> plus copying the name.. seems like this should be renamed and separated

Will do it.

> > [...]
> > +static int bpf_perf_link_fill_probe(const struct perf_event *event,
> > +				    struct bpf_link_info *info)
> > +{
> > +	char __user *uname;
> > +	u64 addr, offset;
> > +	u32 ulen, type;
> > +	int err;
> > +
> > +#ifdef CONFIG_KPROBE_EVENTS
>
> this will break compilation when CONFIG_KPROBE_EVENTS or CONFIG_UPROBE_EVENTS
> options are not defined

Indeed. Will improve it.
On Tue, Jun 13, 2023 at 7:46 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Wed, Jun 14, 2023 at 10:34 AM Kui-Feng Lee <sinquersw@gmail.com> wrote:
> > [...]
> > How about adding a common struct?
> >
> >     struct {
> >             __u32 type;
> >     } perf_common;
> >
> > Then you check info.perf_common.type.
>
> I prefer the struct below,

+1, we should do it this way

>     struct {
>             __u32 type; /* enum bpf_perf_link_type */
>             union {
>                     struct {
>                             __u64 config;
>                             __u32 type;
>                     } perf_event; /* BPF_PERF_LINK_PERF_EVENT */
>                     struct {
>                             __aligned_u64 file_name; /* in/out */
>                             __u32 name_len;
>                             __u32 offset; /* offset from file_name */
>                             __u32 flags;
>                     } uprobe; /* BPF_PERF_LINK_UPROBE */
>                     struct {
>                             __aligned_u64 func_name; /* in/out */
>                             __u32 name_len;
>                             __u32 offset; /* offset from func_name */
>                             __u64 addr;
>                             __u32 flags;
>                     } kprobe; /* BPF_PERF_LINK_KPROBE */
>                     struct {
>                             __aligned_u64 tp_name; /* in/out */
>                             __u32 name_len;
>                     } tracepoint; /* BPF_PERF_LINK_TRACEPOINT */
>             };
>     } perf_link;

this should be named "perf_event" to match BPF_LINK_TYPE_PERF_EVENT

and "perf_event" above probably could be just "event" then? Similarly
we can s/BPF_PERF_LINK_PERF_EVENT/BPF_PERF_LINK_EVENT/?

>
> I think that would be clearer.
>
> --
> Regards
> Yafang
On Sat, Jun 17, 2023 at 4:36 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Jun 13, 2023 at 7:46 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> > [...]
> > I prefer the struct below,
>
> +1, we should do it this way
>
> >     struct {
> >             __u32 type; /* enum bpf_perf_link_type */
> >             union {
> >                     [...]
> >             };
> >     } perf_link;
>
> this should be named "perf_event" to match BPF_LINK_TYPE_PERF_EVENT
>
> and "perf_event" above probably could be just "event" then? Similarly
> we can s/BPF_PERF_LINK_PERF_EVENT/BPF_PERF_LINK_EVENT/?

Agree. Will change it.
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 23691ea..8d4556e 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1056,6 +1056,16 @@ enum bpf_link_type {
 	MAX_BPF_LINK_TYPE,
 };
 
+enum bpf_perf_link_type {
+	BPF_PERF_LINK_UNSPEC = 0,
+	BPF_PERF_LINK_UPROBE = 1,
+	BPF_PERF_LINK_KPROBE = 2,
+	BPF_PERF_LINK_TRACEPOINT = 3,
+	BPF_PERF_LINK_PERF_EVENT = 4,
+
+	MAX_BPF_LINK_PERF_EVENT_TYPE,
+};
+
 /* cgroup-bpf attach flags used in BPF_PROG_ATTACH command
  *
  * NONE(default): No further bpf programs allowed in the subtree.
@@ -6443,7 +6453,29 @@ struct bpf_link_info {
 			__u32 count;
 			__u32 flags;
 		} kprobe_multi;
+		struct {
+			__u64 config;
+			__u32 type;
+		} perf_event; /* BPF_LINK_PERF_EVENT_PERF_EVENT */
+		struct {
+			__aligned_u64 file_name; /* in/out: buff ptr */
+			__u32 name_len;
+			__u32 offset; /* offset from name */
+			__u32 flags;
+		} uprobe; /* BPF_LINK_PERF_EVENT_UPROBE */
+		struct {
+			__aligned_u64 func_name; /* in/out: buff ptr */
+			__u32 name_len;
+			__u32 offset; /* offset from name */
+			__u64 addr;
+			__u32 flags;
+		} kprobe; /* BPF_LINK_PERF_EVENT_KPROBE */
+		struct {
+			__aligned_u64 tp_name; /* in/out: buff ptr */
+			__u32 name_len;
+		} tracepoint; /* BPF_LINK_PERF_EVENT_TRACEPOINT */
 	};
+	__u32 perf_link_type; /* enum bpf_perf_link_type */
 } __attribute__((aligned(8)));
 
 /* User bpf_sock_addr struct to access socket fields and sockaddr struct passed
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 80c9ec0..fe354d5 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3303,9 +3303,133 @@ static void bpf_perf_link_dealloc(struct bpf_link *link)
 	kfree(perf_link);
 }
 
+static int bpf_perf_link_fill_name(const struct perf_event *event,
+				   char __user *uname, u32 ulen,
+				   u64 *probe_offset, u64 *probe_addr,
+				   u32 *fd_type)
+{
+	const char *buf;
+	u32 prog_id;
+	size_t len;
+	int err;
+
+	if (!ulen ^ !uname)
+		return -EINVAL;
+	if (!uname)
+		return 0;
+
+	err = bpf_get_perf_event_info(event, &prog_id, fd_type, &buf,
+				      probe_offset, probe_addr);
+	if (err)
+		return err;
+
+	len = strlen(buf);
+	if (buf) {
+		err = bpf_copy_to_user(uname, buf, ulen, len);
+		if (err)
+			return err;
+	} else {
+		char zero = '\0';
+
+		if (put_user(zero, uname))
+			return -EFAULT;
+	}
+	return 0;
+}
+
+static int bpf_perf_link_fill_probe(const struct perf_event *event,
+				    struct bpf_link_info *info)
+{
+	char __user *uname;
+	u64 addr, offset;
+	u32 ulen, type;
+	int err;
+
+#ifdef CONFIG_KPROBE_EVENTS
+	if (event->tp_event->flags & TRACE_EVENT_FL_KPROBE) {
+		uname = u64_to_user_ptr(info->kprobe.func_name);
+		ulen = info->kprobe.name_len;
+		info->perf_link_type = BPF_PERF_LINK_KPROBE;
+		err = bpf_perf_link_fill_name(event, uname, ulen, &offset,
+					      &addr, &type);
+		if (err)
+			return err;
+
+		info->kprobe.offset = offset;
+		if (type == BPF_FD_TYPE_KRETPROBE)
+			info->kprobe.flags = 1;
+		if (!kallsyms_show_value(current_cred()))
+			return 0;
+		info->kprobe.addr = addr;
+		return 0;
+	}
+#endif
+
+#ifdef CONFIG_UPROBE_EVENTS
+	if (event->tp_event->flags & TRACE_EVENT_FL_UPROBE) {
+		uname = u64_to_user_ptr(info->uprobe.file_name);
+		ulen = info->uprobe.name_len;
+		info->perf_link_type = BPF_PERF_LINK_UPROBE;
+		err = bpf_perf_link_fill_name(event, uname, ulen, &offset,
+					      &addr, &type);
+		if (err)
+			return err;
+
+		info->uprobe.offset = offset;
+		if (type == BPF_FD_TYPE_URETPROBE)
+			info->uprobe.flags = 1;
+		return 0;
+	}
+#endif
+
+	return -EOPNOTSUPP;
+}
+
+static int bpf_perf_link_fill_tracepoint(const struct perf_event *event,
+					 struct bpf_link_info *info)
+{
+	char __user *uname = u64_to_user_ptr(info->tracepoint.tp_name);
+	u32 ulen = info->tracepoint.name_len;
+	u64 addr, off;
+	u32 type;
+
+	info->perf_link_type = BPF_PERF_LINK_TRACEPOINT;
+	return bpf_perf_link_fill_name(event, uname, ulen, &off, &addr, &type);
+}
+
+static int bpf_perf_link_fill_perf_event(const struct perf_event *event,
+					 struct bpf_link_info *info)
+{
+	info->perf_event.type = event->attr.type;
+	info->perf_event.config = event->attr.config;
+	info->perf_link_type = BPF_PERF_LINK_PERF_EVENT;
+	return 0;
+}
+
+static int bpf_perf_link_fill_link_info(const struct bpf_link *link,
+					struct bpf_link_info *info)
+{
+	struct bpf_perf_link *perf_link;
+	const struct perf_event *event;
+
+	perf_link = container_of(link, struct bpf_perf_link, link);
+	event = perf_get_event(perf_link->perf_file);
+	if (IS_ERR(event))
+		return PTR_ERR(event);
+
+	if (!event->prog)
+		return -EINVAL;
+	if (event->prog->type == BPF_PROG_TYPE_PERF_EVENT)
+		return bpf_perf_link_fill_perf_event(event, info);
+	if (event->prog->type == BPF_PROG_TYPE_TRACEPOINT)
+		return bpf_perf_link_fill_tracepoint(event, info);
+	return bpf_perf_link_fill_probe(event, info);
+}
+
 static const struct bpf_link_ops bpf_perf_link_lops = {
 	.release = bpf_perf_link_release,
 	.dealloc = bpf_perf_link_dealloc,
+	.fill_link_info = bpf_perf_link_fill_link_info,
 };
 
 static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 23691ea..8d4556e 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1056,6 +1056,16 @@ enum bpf_link_type {
 	MAX_BPF_LINK_TYPE,
 };
 
+enum bpf_perf_link_type {
+	BPF_PERF_LINK_UNSPEC = 0,
+	BPF_PERF_LINK_UPROBE = 1,
+	BPF_PERF_LINK_KPROBE = 2,
+	BPF_PERF_LINK_TRACEPOINT = 3,
+	BPF_PERF_LINK_PERF_EVENT = 4,
+
+	MAX_BPF_LINK_PERF_EVENT_TYPE,
+};
+
 /* cgroup-bpf attach flags used in BPF_PROG_ATTACH command
  *
  * NONE(default): No further bpf programs allowed in the subtree.
@@ -6443,7 +6453,29 @@ struct bpf_link_info {
 			__u32 count;
 			__u32 flags;
 		} kprobe_multi;
+		struct {
+			__u64 config;
+			__u32 type;
+		} perf_event; /* BPF_LINK_PERF_EVENT_PERF_EVENT */
+		struct {
+			__aligned_u64 file_name; /* in/out: buff ptr */
+			__u32 name_len;
+			__u32 offset; /* offset from name */
+			__u32 flags;
+		} uprobe; /* BPF_LINK_PERF_EVENT_UPROBE */
+		struct {
+			__aligned_u64 func_name; /* in/out: buff ptr */
+			__u32 name_len;
+			__u32 offset; /* offset from name */
+			__u64 addr;
+			__u32 flags;
+		} kprobe; /* BPF_LINK_PERF_EVENT_KPROBE */
+		struct {
+			__aligned_u64 tp_name; /* in/out: buff ptr */
+			__u32 name_len;
+		} tracepoint; /* BPF_LINK_PERF_EVENT_TRACEPOINT */
 	};
+	__u32 perf_link_type; /* enum bpf_perf_link_type */
 } __attribute__((aligned(8)));
 
 /* User bpf_sock_addr struct to access socket fields and sockaddr struct passed
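The name-buffer contract that bpf_perf_link_fill_name() applies to the "in/out: buff ptr" fields can be modelled in plain userspace C. This is only a sketch: fill_name_model() and fill_name_model_demo() are hypothetical names, the copy is done with memcpy() rather than a user-space copy helper, and the truncation policy (copy what fits, NUL-terminate, return -ENOSPC) is an assumption about bpf_copy_to_user(), which is not shown in this patch. The sketch tests the name for NULL before measuring it with strlen().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of the checks in bpf_perf_link_fill_name(). */
static int fill_name_model(const char *name, char *ubuf, unsigned int ulen)
{
	size_t len;

	/* The buffer pointer and its length must be both set or both unset. */
	if (!ulen ^ !ubuf)
		return -EINVAL;
	if (!ubuf)
		return 0;	/* the caller did not ask for the name */

	if (!name) {
		ubuf[0] = '\0';	/* no name available: hand back an empty string */
		return 0;
	}

	/* Assumed policy: copy what fits and always NUL-terminate. */
	len = strlen(name);
	if (len + 1 > ulen) {
		memcpy(ubuf, name, ulen - 1);
		ubuf[ulen - 1] = '\0';
		return -ENOSPC;
	}
	memcpy(ubuf, name, len + 1);
	return 0;
}

/* Example calls: an 8-byte buffer holds at most 7 characters plus the NUL. */
static void fill_name_model_demo(void)
{
	char buf[8];

	assert(fill_name_model("vfs", NULL, 4) == -EINVAL);
	assert(fill_name_model("do_sys_open", buf, sizeof(buf)) == -ENOSPC);
	assert(strcmp(buf, "do_sys_") == 0);
}
```

Under these assumptions a caller can pass a NULL pointer with a zero length to skip the name entirely, and can retry with a larger buffer when -ENOSPC is returned.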
Add ->fill_link_info support to the perf_event link so that users can
inspect it with `bpftool link show`. This information is currently
available through `bpftool perf show`, but having the link information
for all link types in one place is more convenient. This patch also
covers the generic perf event, which `bpftool perf show` does not
currently support. Only the perf type and config are exposed to
userspace; other attributes, such as sample_period and sample_freq, are
not reported. If kptr_restrict does not permit it, the probed address
is not exposed, so kernel addresses stay hidden from unprivileged users.

A new enum bpf_perf_link_type is introduced to help the user understand
which struct is relevant.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/uapi/linux/bpf.h       |  32 +++++++++++
 kernel/bpf/syscall.c           | 124 +++++++++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  32 +++++++++++
 3 files changed, 188 insertions(+)
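How a consumer might use the new enum to pick the relevant struct can be sketched as follows. Everything here is a simplified stand-in written for illustration: perf_link_type_model, perf_link_info_model and describe_perf_link() are hypothetical names, the enum values mirror the ones proposed in this version of the patch, and the name fields are plain pointers instead of the __aligned_u64 buffer pointer plus name_len pair used in the UAPI layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Local stand-ins for the values proposed in this patch; not the UAPI header. */
enum perf_link_type_model {
	PERF_LINK_UNSPEC = 0,
	PERF_LINK_UPROBE = 1,
	PERF_LINK_KPROBE = 2,
	PERF_LINK_TRACEPOINT = 3,
	PERF_LINK_PERF_EVENT = 4,
};

/* Simplified mirror of the new bpf_link_info members. */
struct perf_link_info_model {
	union {
		struct { uint64_t config; uint32_t type; } perf_event;
		struct { const char *file_name; uint32_t offset; uint32_t flags; } uprobe;
		struct { const char *func_name; uint32_t offset; uint64_t addr; uint32_t flags; } kprobe;
		struct { const char *tp_name; } tracepoint;
	};
	uint32_t perf_link_type;	/* selects the valid union member */
};

/* Writes a one-line description; returns snprintf()'s result or -1 if unknown. */
static int describe_perf_link(const struct perf_link_info_model *info,
			      char *out, size_t n)
{
	assert(info && out);

	switch (info->perf_link_type) {
	case PERF_LINK_KPROBE:
		/* The patch sets flags to 1 for a kretprobe. */
		return snprintf(out, n, "%s %s+0x%x",
				info->kprobe.flags ? "kretprobe" : "kprobe",
				info->kprobe.func_name,
				(unsigned int)info->kprobe.offset);
	case PERF_LINK_UPROBE:
		/* Likewise, flags is 1 for a uretprobe. */
		return snprintf(out, n, "%s %s+0x%x",
				info->uprobe.flags ? "uretprobe" : "uprobe",
				info->uprobe.file_name,
				(unsigned int)info->uprobe.offset);
	case PERF_LINK_TRACEPOINT:
		return snprintf(out, n, "tracepoint %s", info->tracepoint.tp_name);
	case PERF_LINK_PERF_EVENT:
		return snprintf(out, n, "perf_event type %u config %llu",
				(unsigned int)info->perf_event.type,
				(unsigned long long)info->perf_event.config);
	default:
		return -1;	/* unspecified or unknown link type */
	}
}
```

Because the structs share one union, reading a member other than the one selected by perf_link_type yields meaningless data, which is why the type has to be checked first.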