mbox series

[bpf-next,0/5] Attach a cookie to a tracing program.

Message ID 20220126214809.3868787-1-kuifeng@fb.com (mailing list archive)
Headers show
Series Attach a cookie to a tracing program. | expand

Message

Kui-Feng Lee Jan. 26, 2022, 9:48 p.m. UTC
Allow users to attach a 64-bits cookie to a BPF program when link it
to fentry, fexit, or fmod_ret of a function.

This changeset includes several major changes.

 - Add a new field bpf_cookie to struct raw_tracepoint, so that a user
   can attach a cookie to a program.

 - Store flags in trampoline frames to provide the flexibility of
   storing more values in these frames.

 - Store the program ID of the current BPF program in the trampoline
   frame.

 - The implmentation of bpf_get_attach_cookie() for tracing programs
   to read the attached cookie.

Kui-Feng Lee (5):
  bpf: Add a flags value on trampoline frames.
  bpf: Detect if a program needs its program ID.
  bpf, x86: Store program ID to trampoline frames.
  bpf: Attach a cookie to a BPF program.
  bpf: Implement bpf_get_attach_cookie() for tracing programs.

 arch/x86/net/bpf_jit_comp.c                   | 53 ++++++++++++++---
 include/linux/bpf.h                           |  3 +
 include/linux/filter.h                        |  3 +-
 include/uapi/linux/bpf.h                      |  1 +
 kernel/bpf/syscall.c                          | 12 ++--
 kernel/bpf/trampoline.c                       | 10 +++-
 kernel/bpf/verifier.c                         |  5 +-
 kernel/trace/bpf_trace.c                      | 45 ++++++++++++++-
 tools/include/uapi/linux/bpf.h                |  1 +
 tools/lib/bpf/bpf.c                           | 14 +++++
 tools/lib/bpf/bpf.h                           |  1 +
 tools/lib/bpf/libbpf.map                      |  1 +
 .../selftests/bpf/prog_tests/bpf_cookie.c     | 57 +++++++++++++++++++
 .../selftests/bpf/progs/test_bpf_cookie.c     | 24 ++++++++
 14 files changed, 211 insertions(+), 19 deletions(-)

Comments

Alexei Starovoitov Jan. 27, 2022, 5:17 a.m. UTC | #1
On Wed, Jan 26, 2022 at 1:48 PM Kui-Feng Lee <kuifeng@fb.com> wrote:
>
> Allow users to attach a 64-bits cookie to a BPF program when link it
> to fentry, fexit, or fmod_ret of a function.
>
> This changeset includes several major changes.
>
>  - Add a new field bpf_cookie to struct raw_tracepoint, so that a user
>    can attach a cookie to a program.
>
>  - Store flags in trampoline frames to provide the flexibility of
>    storing more values in these frames.
>
>  - Store the program ID of the current BPF program in the trampoline
>    frame.
>
>  - The implmentation of bpf_get_attach_cookie() for tracing programs
>    to read the attached cookie.

flags, prog_id, cookie... I don't follow what's going on here.

cookie is supposed to be per link.
Doing it for fentry only will be inconvenient for users.
For existing kprobes there is no good place to store it. iirc.
For multi attach kprobes there won't be a good place either.
I think cookie should be out of band.
Maybe lets try a resizable map[ip]->cookie and don't add
'cookie' arrays to multi-kprobe attach,
'cookie' field to kprobe, fentry, and other attach apis.
Adding 'cookie' to all of them is quite a bit of churn for kernel
and user space.
I think resizable bpf map[u64]->u64 solves this problem.
Maybe cookie isn't even needed.
If the bpf prog can have a clean map[bpf_get_func_ip()] that
doesn't have to be sized up front it will address the need.
Jiri Olsa Jan. 31, 2022, 4:56 p.m. UTC | #2
On Wed, Jan 26, 2022 at 09:17:07PM -0800, Alexei Starovoitov wrote:
> On Wed, Jan 26, 2022 at 1:48 PM Kui-Feng Lee <kuifeng@fb.com> wrote:
> >
> > Allow users to attach a 64-bits cookie to a BPF program when link it
> > to fentry, fexit, or fmod_ret of a function.
> >
> > This changeset includes several major changes.
> >
> >  - Add a new field bpf_cookie to struct raw_tracepoint, so that a user
> >    can attach a cookie to a program.
> >
> >  - Store flags in trampoline frames to provide the flexibility of
> >    storing more values in these frames.
> >
> >  - Store the program ID of the current BPF program in the trampoline
> >    frame.
> >
> >  - The implmentation of bpf_get_attach_cookie() for tracing programs
> >    to read the attached cookie.
> 
> flags, prog_id, cookie... I don't follow what's going on here.
> 
> cookie is supposed to be per link.
> Doing it for fentry only will be inconvenient for users.
> For existing kprobes there is no good place to store it. iirc.
> For multi attach kprobes there won't be a good place either.
> I think cookie should be out of band.
> Maybe lets try a resizable map[ip]->cookie and don't add
> 'cookie' arrays to multi-kprobe attach,
> 'cookie' field to kprobe, fentry, and other attach apis.
> Adding 'cookie' to all of them is quite a bit of churn for kernel
> and user space.
> I think resizable bpf map[u64]->u64 solves this problem.

so you mean passing such map fd to link and have bpf_get_attach_cookie
using that map to get the cookie? that could be generic way for all
the links

I have the 'cookie arrays to multi-kprobe attach' code ready and I think 
it should be faster than map[ip] hash lookup? I'll need to check

jirka

> Maybe cookie isn't even needed.
> If the bpf prog can have a clean map[bpf_get_func_ip()] that
> doesn't have to be sized up front it will address the need.
>
Andrii Nakryiko Feb. 1, 2022, 6:45 a.m. UTC | #3
On Wed, Jan 26, 2022 at 9:17 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Jan 26, 2022 at 1:48 PM Kui-Feng Lee <kuifeng@fb.com> wrote:
> >
> > Allow users to attach a 64-bits cookie to a BPF program when link it
> > to fentry, fexit, or fmod_ret of a function.
> >
> > This changeset includes several major changes.
> >
> >  - Add a new field bpf_cookie to struct raw_tracepoint, so that a user
> >    can attach a cookie to a program.
> >
> >  - Store flags in trampoline frames to provide the flexibility of
> >    storing more values in these frames.
> >
> >  - Store the program ID of the current BPF program in the trampoline
> >    frame.
> >
> >  - The implmentation of bpf_get_attach_cookie() for tracing programs
> >    to read the attached cookie.
>
> flags, prog_id, cookie... I don't follow what's going on here.
>
> cookie is supposed to be per link.
> Doing it for fentry only will be inconvenient for users.
> For existing kprobes there is no good place to store it. iirc.

Hm... are you talking about current kprobes? We already have
bpf_cookie support for them. We store it in the associated perf_event,
it's actually pretty convenient.

> For multi attach kprobes there won't be a good place either.

As Jiri mentioned, for multi-attach kprobes the idea was to keep a
sorted array of ips and associated cookies to do log(N) search in
bpf_get_attach_cookie() helper.

For multi-attach fentry, we could use the same approach if we let
either bpf_link or bpf_prog available to fentry/fexit program at
runtime. Kui-Feng seems to be storing prog_id in BPF trampoline
generated code, I don't think that's the best approach, it would be
probably better to store bpf_link or bpf_prog pointer, whichever is
more convenient. I think we can't attach the same BPF program twice to
the same BPF trampoline, so storing this array of ip -> cookie
mappings in bpf_prog would work (because the mapping is unique), but
might be a more cumbersome. Storing bpf_link is conceptually better,
probably, but I haven't thought through if there are any problems if
we support updating bpf_link's underlying program.

But keep in mind, this patch set is basically an RFC to arrive at the
best approach that will also be compatible with multi-attach
fentry/fexit. Though it seems like we'll first need to establish the
need for bpf_cookie (I thought that was not controversial, my bad),
which is fine, see below.

> I think cookie should be out of band.
> Maybe lets try a resizable map[ip]->cookie and don't add
> 'cookie' arrays to multi-kprobe attach,
> 'cookie' field to kprobe, fentry, and other attach apis.
> Adding 'cookie' to all of them is quite a bit of churn for kernel
> and user space.

We don't need all BPF program types, but anything that's useful for
generic tracing is much more powerful with cookies. We have them for
kprobe, uprobe and perf_event programs already. For multi-attach
kprobe/kretprobe and fentry/fexit they are essentially a must to let
users use those BPF program types to their fullest.

> I think resizable bpf map[u64]->u64 solves this problem.
> Maybe cookie isn't even needed.
> If the bpf prog can have a clean map[bpf_get_func_ip()] that
> doesn't have to be sized up front it will address the need.

You mean for users to maintain their own BPF map and pre-populated it
before attaching their BPF programs? Or did I misunderstand?

Sizing such a BPF map is just one problem.

Figuring out the correct IP to use as a key is another and arguably
bigger hassle. Everyone will be forced to consult kallsyms, even if
their use case is simple and they don't need to parse kallsyms at all.
Normally, for fentry/fexit programs, users don't even need kallsyms,
as vmlinux BTF is used to find BTF ID and that's enough to load and
attach fentry/fexit.

And while for kprobe/fentry programs it's inconvenient but can be done
with a bunch of extra work, for uprobes there are situations where
it's impossible to know IP at all. One very specific example is USDTs
in shared library. If user needs to attach to USDT defined in a shared
library across *any* PID (current and future), it's impossible to do
without BPF cookie support, because each different process will load
shared library at unknown base address, so it's impossible to
calculate upfront any absolute IP to look up by. This is the reason
BCC allows to attach to USDTs in shared library only for one specific
process (i.e., PID has to be specified in such a case).


Or did you propose to maintain such IP -> cookie mapping inside the
kernel during bpf_link creation? I think the key would have to be at
least a (link ID, IP) pair, no? The question is whether it's going to
be more convenient to get link ID and IP for all supported BPF
programs at runtime and whether it will cause the same or more amount
of churn?
Kui-Feng Lee Feb. 1, 2022, 5:37 p.m. UTC | #4
On Mon, 2022-01-31 at 22:45 -0800, Andrii Nakryiko wrote:
...... cut ......
> For multi-attach fentry, we could use the same approach if we let
> either bpf_link or bpf_prog available to fentry/fexit program at
> runtime. Kui-Feng seems to be storing prog_id in BPF trampoline
> generated code, I don't think that's the best approach, it would be
> probably better to store bpf_link or bpf_prog pointer, whichever is
> more convenient. I think we can't attach the same BPF program twice
> to

FYI, the prog_id used is casting from a bpf_prog pointer.  It gets a
bpf_prog by casting a prog_id back.
Alexei Starovoitov Feb. 1, 2022, 7:32 p.m. UTC | #5
On Mon, Jan 31, 2022 at 10:45:53PM -0800, Andrii Nakryiko wrote:
> 
> As Jiri mentioned, for multi-attach kprobes the idea was to keep a
> sorted array of ips and associated cookies to do log(N) search in
> bpf_get_attach_cookie() helper.
> 
> For multi-attach fentry, we could use the same approach if we let
> either bpf_link or bpf_prog available to fentry/fexit program at
> runtime.

Makes sense to me.
It's probably better to land multi-attach kprobe and fentry first,
so we don't need to refactor trampolines once again.
iirc the trampolines were not easy to refactor for Jiri.
I'm afraid that adding prog_id or a pointer to the trampoline
will complicate things even more for multi attach.

It's easy to store hard coded bpf_tramp_image pointer in the generated
trampoline. Storing prog_id or bpf_prog pointer there is a bit
harder, since the [sp-X] store needs to happen right in there invoke_bpf_prog()
(since there can be multiple progs per trampoline).

From there bpf_get_attach_cookie() can either do binary search
in the ip->cookie array or single load in case of non-multi attach.

Anyway the cookie support in trampoline seems to be easier to design
when there is a clarity on multi-attach fentry.

I would probably add support for cookie in raw_tp/tp_btf first,
since that part is not going to be affected by multi-* work.

> We don't need all BPF program types, but anything that's useful for
> generic tracing is much more powerful with cookies. We have them for
> kprobe, uprobe and perf_event programs already. For multi-attach
> kprobe/kretprobe and fentry/fexit they are essentially a must to let
> users use those BPF program types to their fullest.

agree. I missed the part that cookie is already supported with kuprobes
when attach is done via bpf_link_create.
Andrii Nakryiko Feb. 2, 2022, 1:06 a.m. UTC | #6
On Tue, Feb 1, 2022 at 9:37 AM Kui-Feng Lee <kuifeng@fb.com> wrote:
>
> On Mon, 2022-01-31 at 22:45 -0800, Andrii Nakryiko wrote:
> ...... cut ......
> > For multi-attach fentry, we could use the same approach if we let
> > either bpf_link or bpf_prog available to fentry/fexit program at
> > runtime. Kui-Feng seems to be storing prog_id in BPF trampoline
> > generated code, I don't think that's the best approach, it would be
> > probably better to store bpf_link or bpf_prog pointer, whichever is
> > more convenient. I think we can't attach the same BPF program twice
> > to
>
> FYI, the prog_id used is casting from a bpf_prog pointer.  It gets a
> bpf_prog by casting a prog_id back.
>

Ok, this is why we all got confused. There is already a concept of
"prog ID" (it is also visible to user-space, btw), but you were
actually storing an actual `struct bpf_prog *` pointer. So just a
misleading use of the term in this context.
Andrii Nakryiko Feb. 2, 2022, 1:15 a.m. UTC | #7
On Tue, Feb 1, 2022 at 11:32 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Jan 31, 2022 at 10:45:53PM -0800, Andrii Nakryiko wrote:
> >
> > As Jiri mentioned, for multi-attach kprobes the idea was to keep a
> > sorted array of ips and associated cookies to do log(N) search in
> > bpf_get_attach_cookie() helper.
> >
> > For multi-attach fentry, we could use the same approach if we let
> > either bpf_link or bpf_prog available to fentry/fexit program at
> > runtime.
>
> Makes sense to me.
> It's probably better to land multi-attach kprobe and fentry first,
> so we don't need to refactor trampolines once again.
> iirc the trampolines were not easy to refactor for Jiri.
> I'm afraid that adding prog_id or a pointer to the trampoline
> will complicate things even more for multi attach.

Yep, sure, makes sense to me.

>
> It's easy to store hard coded bpf_tramp_image pointer in the generated
> trampoline. Storing prog_id or bpf_prog pointer there is a bit
> harder, since the [sp-X] store needs to happen right in there invoke_bpf_prog()
> (since there can be multiple progs per trampoline).
>
> From there bpf_get_attach_cookie() can either do binary search
> in the ip->cookie array or single load in case of non-multi attach.
>
> Anyway the cookie support in trampoline seems to be easier to design
> when there is a clarity on multi-attach fentry.
>

True. bpf_tramp_image pointer probably won't work for multi-attach
fentry because one bpf_tramp_image will be shared by multiple
attachments (bpf_links), right?

Ideally we should provide a way to lookup "current bpf_link" at
runtime from BPF helpers (given fentry ctx) and do that binary search
there. There could be few other options (e.g., bpf_tramp_image +
binary search based on prog_id or bpf_prog pointer), but it's a bit
more cumbersome. It still feels like something per-program would need
to come from the context to be able to distinguish between multiple
attachments. But as you said, more clarity on multi-attach fentry
first would be best.

> I would probably add support for cookie in raw_tp/tp_btf first,
> since that part is not going to be affected by multi-* work.

Yep, let's try that first.

>
> > We don't need all BPF program types, but anything that's useful for
> > generic tracing is much more powerful with cookies. We have them for
> > kprobe, uprobe and perf_event programs already. For multi-attach
> > kprobe/kretprobe and fentry/fexit they are essentially a must to let
> > users use those BPF program types to their fullest.
>
> agree. I missed the part that cookie is already supported with kuprobes
> when attach is done via bpf_link_create.