Message ID | 20230712010504.818008-1-liu.yun@linux.dev (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | BPF |
Series | libbpf: Support POSIX regular expressions for multi kprobe |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
bpf/vmtest-bpf-next-VM_Test-9 | success | Logs for test_maps on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-10 | success | Logs for test_maps on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-11 | success | Logs for test_progs on aarch64 with gcc |
bpf/vmtest-bpf-next-VM_Test-13 | success | Logs for test_progs on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-14 | success | Logs for test_progs on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-15 | success | Logs for test_progs_no_alu32 on aarch64 with gcc |
bpf/vmtest-bpf-next-VM_Test-17 | success | Logs for test_progs_no_alu32 on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-18 | success | Logs for test_progs_no_alu32 on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-19 | success | Logs for test_progs_no_alu32_parallel on aarch64 with gcc |
bpf/vmtest-bpf-next-VM_Test-20 | success | Logs for test_progs_no_alu32_parallel on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-21 | success | Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-22 | success | Logs for test_progs_parallel on aarch64 with gcc |
bpf/vmtest-bpf-next-VM_Test-23 | success | Logs for test_progs_parallel on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-24 | success | Logs for test_progs_parallel on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-25 | success | Logs for test_verifier on aarch64 with gcc |
bpf/vmtest-bpf-next-VM_Test-26 | success | Logs for test_verifier on s390x with gcc |
bpf/vmtest-bpf-next-VM_Test-27 | success | Logs for test_verifier on x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-28 | success | Logs for test_verifier on x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-29 | success | Logs for veristat |
bpf/vmtest-bpf-next-VM_Test-16 | success | Logs for test_progs_no_alu32 on s390x with gcc |
bpf/vmtest-bpf-next-VM_Test-12 | success | Logs for test_progs on s390x with gcc |
bpf/vmtest-bpf-next-PR | success | PR summary |
bpf/vmtest-bpf-next-VM_Test-4 | success | Logs for build for x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-1 | success | Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain_full }} |
bpf/vmtest-bpf-next-VM_Test-2 | success | Logs for ShellCheck |
bpf/vmtest-bpf-next-VM_Test-3 | success | Logs for build for aarch64 with gcc |
bpf/vmtest-bpf-next-VM_Test-5 | success | Logs for build for x86_64 with gcc |
bpf/vmtest-bpf-next-VM_Test-6 | success | Logs for build for x86_64 with llvm-16 |
bpf/vmtest-bpf-next-VM_Test-7 | success | Logs for set-matrix |
bpf/vmtest-bpf-next-VM_Test-8 | success | Logs for veristat |
On Tue, Jul 11, 2023 at 6:05 PM Jackie Liu <liu.yun@linux.dev> wrote:
>
> From: Jackie Liu <liuyun01@kylinos.cn>
>
> Now multi kprobe uses glob_match for function matching, it's not enough,
> and sometimes we need more powerful regular expressions to support fuzzy
> matching, and now provides a use_regex in bpf_kprobe_multi_opts to support
> POSIX regular expressions.
>
> This is useful, similar to `funccount.py -r '^vfs.*'` in BCC, and can also
> be implemented with libbpf.
>
> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
> ---
> tools/lib/bpf/libbpf.c | 52 ++++++++++++++++++++++++++++++++++++++----
> tools/lib/bpf/libbpf.h | 4 +++-
> 2 files changed, 51 insertions(+), 5 deletions(-)
>

Let's hold off on adding regex support assumptions into libbpf API.
Globs are pretty flexible already for most cases, and for some more
advanced use cases users can provide an exact list of function names
through opts argument.

We can revisit this decision down the road, but right now it seems
premature to sign up for such relatively heavy-weight API dependency.

> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 81aa52fa6807..fd217e9a232d 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -25,6 +25,7 @@
>  #include <fcntl.h>
>  #include <errno.h>
>  #include <ctype.h>
> +#include <regex.h>
>  #include <asm/unistd.h>
>  #include <linux/err.h>
>  #include <linux/kernel.h>

[...]
On Tue, Jul 11, 2023 at 10:42 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Jul 11, 2023 at 6:05 PM Jackie Liu <liu.yun@linux.dev> wrote:
> >
> > From: Jackie Liu <liuyun01@kylinos.cn>
> >
> > Now multi kprobe uses glob_match for function matching, it's not enough,
> > and sometimes we need more powerful regular expressions to support fuzzy
> > matching, and now provides a use_regex in bpf_kprobe_multi_opts to support
> > POSIX regular expressions.
> >
> > This is useful, similar to `funccount.py -r '^vfs.*'` in BCC, and can also
> > be implemented with libbpf.
> >
> > Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
> > ---
> > tools/lib/bpf/libbpf.c | 52 ++++++++++++++++++++++++++++++++++++++----
> > tools/lib/bpf/libbpf.h | 4 +++-
> > 2 files changed, 51 insertions(+), 5 deletions(-)
> >
>
> Let's hold off on adding regex support assumptions into libbpf API.
> Globs are pretty flexible already for most cases, and for some more
> advanced use cases users can provide an exact list of function names
> through opts argument.
>
> We can revisit this decision down the road, but right now it seems
> premature to sign up for such relatively heavy-weight API dependency.

regexec() is part of glibc and we cannot link it statically,
so no change in libbpf.a/so size.
Are you worried about ulibc-like environment?
On 2023/7/12 23:04, Alexei Starovoitov wrote:
> On Tue, Jul 11, 2023 at 10:42 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
>>
>> On Tue, Jul 11, 2023 at 6:05 PM Jackie Liu <liu.yun@linux.dev> wrote:
>>>
>>> From: Jackie Liu <liuyun01@kylinos.cn>
>>>
>>> Now multi kprobe uses glob_match for function matching, it's not enough,
>>> and sometimes we need more powerful regular expressions to support fuzzy
>>> matching, and now provides a use_regex in bpf_kprobe_multi_opts to support
>>> POSIX regular expressions.
>>>
>>> This is useful, similar to `funccount.py -r '^vfs.*'` in BCC, and can also
>>> be implemented with libbpf.
>>>
>>> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
>>> ---
>>> tools/lib/bpf/libbpf.c | 52 ++++++++++++++++++++++++++++++++++++++----
>>> tools/lib/bpf/libbpf.h | 4 +++-
>>> 2 files changed, 51 insertions(+), 5 deletions(-)
>>>
>>
>> Let's hold off on adding regex support assumptions into libbpf API.
>> Globs are pretty flexible already for most cases, and for some more
>> advanced use cases users can provide an exact list of function names
>> through opts argument.
>>
>> We can revisit this decision down the road, but right now it seems
>> premature to sign up for such relatively heavy-weight API dependency.
>
> regexec() is part of glibc and we cannot link it statically,
> so no change in libbpf.a/so size.
> Are you worried about ulibc-like environment?

uclibc has regexec too.

https://elixir.bootlin.com/uclibc-ng/latest/source/libc/misc/regex/regex.c
On Wed, Jul 12, 2023 at 8:05 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jul 11, 2023 at 10:42 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Jul 11, 2023 at 6:05 PM Jackie Liu <liu.yun@linux.dev> wrote:
> > >
> > > From: Jackie Liu <liuyun01@kylinos.cn>
> > >
> > > Now multi kprobe uses glob_match for function matching, it's not enough,
> > > and sometimes we need more powerful regular expressions to support fuzzy
> > > matching, and now provides a use_regex in bpf_kprobe_multi_opts to support
> > > POSIX regular expressions.
> > >
> > > This is useful, similar to `funccount.py -r '^vfs.*'` in BCC, and can also
> > > be implemented with libbpf.
> > >
> > > Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
> > > ---
> > > tools/lib/bpf/libbpf.c | 52 ++++++++++++++++++++++++++++++++++++++----
> > > tools/lib/bpf/libbpf.h | 4 +++-
> > > 2 files changed, 51 insertions(+), 5 deletions(-)
> > >
> >
> > Let's hold off on adding regex support assumptions into libbpf API.
> > Globs are pretty flexible already for most cases, and for some more
> > advanced use cases users can provide an exact list of function names
> > through opts argument.
> >
> > We can revisit this decision down the road, but right now it seems
> > premature to sign up for such relatively heavy-weight API dependency.
>
> regexec() is part of glibc and we cannot link it statically,
> so no change in libbpf.a/so size.

right, I wasn't worried about the code size increase of libbpf itself

> Are you worried about ulibc-like environment?

This is one part. musl, uclibc, and other alternative implementations
of glibc: do they support same functionality with all the same options
and syntax. I'd feel more comfortable if we understood well all the
implications of relying on this regex API: which glibc versions
support it, same for musl. Are there any extra library dependencies
that we might need to add (like -lm for some math functions). I'm not
very familiar also with what regex flavor is implemented by POSIX
regex, is it the commonly-expected Perl-compatible one, or something
else?

Also, we should have a good story on how this regex syntax is
supported in SEC() definitions for both kprobe.multi and uprobe.multi.

Stuff like this.

But also, looking at bpftrace, I don't think it supports regex-based
probe matching (e.g., I tried 'kprobe:.*sys_bpf' and it matched
nothing; maybe there is some other syntax, but my Google-fu failed
me). So assuming I didn't miss anything obvious with bpftrace, the
fact that it's been around for so long with so many users and lack of
regex doesn't seem to be the problem, I'm just not convinced we want
to add regex to libbpf. At least not yet.

Meanwhile, users can do regex-based symbol resolution on their own and
just provide resolved function names to
bpf_program__attach_kprobe_multi(), so lack of regex is not a blocker
for anything.

So that was my thought process and some reasons for hesitation.
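The user-side resolution suggested in the message above can be sketched roughly as follows. This is only an illustration, not code from the thread or from libbpf: the helper name `filter_syms_by_regex` is hypothetical, and the symbol names are assumed to have been read already (for example from the tracefs file the patch parses). It uses POSIX extended regular expressions (`REG_EXTENDED`), the same mode the rejected patch used; POSIX defines basic and extended syntaxes, and neither is the Perl-compatible flavor.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy into `out` the entries of `names` that match the
 * POSIX extended regular expression `pattern`. Returns the number of matches,
 * or -1 if the pattern does not compile. `out` must have room for `n`
 * pointers; the pointers in `out` alias the entries of `names`.
 */
static int filter_syms_by_regex(const char *pattern,
				const char **names, size_t n,
				const char **out)
{
	regex_t re;
	size_t i;
	int cnt = 0;

	assert(pattern && names && out);

	/* REG_NOSUB: only a yes/no answer is needed, no capture offsets. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return -1;

	for (i = 0; i < n; i++) {
		/* regexec() returns 0 when the string matches. */
		if (regexec(&re, names[i], 0, NULL, 0) == 0)
			out[cnt++] = names[i];
	}

	regfree(&re);
	return cnt;
}
```

The resulting array and count could then presumably be handed to libbpf as an exact function list, with a NULL pattern, through the `syms` and `cnt` fields of `struct bpf_kprobe_multi_opts`; that final call is left out here so the sketch builds without libbpf.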
On Wed, Jul 12, 2023 at 09:13:04PM -0700, Andrii Nakryiko wrote:
> On Wed, Jul 12, 2023 at 8:05 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Jul 11, 2023 at 10:42 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Tue, Jul 11, 2023 at 6:05 PM Jackie Liu <liu.yun@linux.dev> wrote:
> > > >
> > > > From: Jackie Liu <liuyun01@kylinos.cn>
> > > >
> > > > Now multi kprobe uses glob_match for function matching, it's not enough,
> > > > and sometimes we need more powerful regular expressions to support fuzzy
> > > > matching, and now provides a use_regex in bpf_kprobe_multi_opts to support
> > > > POSIX regular expressions.
> > > >
> > > > This is useful, similar to `funccount.py -r '^vfs.*'` in BCC, and can also
> > > > be implemented with libbpf.
> > > >
> > > > Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
> > > > ---
> > > > tools/lib/bpf/libbpf.c | 52 ++++++++++++++++++++++++++++++++++++++----
> > > > tools/lib/bpf/libbpf.h | 4 +++-
> > > > 2 files changed, 51 insertions(+), 5 deletions(-)
> > > >
> > >
> > > Let's hold off on adding regex support assumptions into libbpf API.
> > > Globs are pretty flexible already for most cases, and for some more
> > > advanced use cases users can provide an exact list of function names
> > > through opts argument.
> > >
> > > We can revisit this decision down the road, but right now it seems
> > > premature to sign up for such relatively heavy-weight API dependency.
> >
> > regexec() is part of glibc and we cannot link it statically,
> > so no change in libbpf.a/so size.
>
> right, I wasn't worried about the code size increase of libbpf itself
>
> > Are you worried about ulibc-like environment?
>
> This is one part. musl, uclibc, and other alternative implementations
> of glibc: do they support same functionality with all the same options
> and syntax. I'd feel more comfortable if we understood well all the
> implications of relying on this regex API: which glibc versions
> support it, same for musl. Are there any extra library dependencies
> that we might need to add (like -lm for some math functions). I'm not
> very familiar also with what regex flavor is implemented by POSIX
> regex, is it the commonly-expected Perl-compatible one, or something
> else?
>
> Also, we should have a good story on how this regex syntax is
> supported in SEC() definitions for both kprobe.multi and uprobe.multi.
>
> Stuff like this.
>
> But also, looking at bpftrace, I don't think it supports regex-based
> probe matching (e.g., I tried 'kprobe:.*sys_bpf' and it matched
> nothing; maybe there is some other syntax, but my Google-fu failed
> me). So assuming I didn't miss anything obvious with bpftrace, the
> fact that it's been around for so long with so many users and lack of
> regex doesn't seem to be the problem,

bpftrace only supports wildcard (`*`) operator like in globs. One thing
that might help bpftrace get away with that is being able to specify multiple
attachpoints for a single probe. Eg.

```
tracepoint:foo:bar,
tracepoint:baz:something
{
  print("hi")
}
```

Thanks,
Daniel
On Wed, Jul 12, 2023 at 9:56 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
>
> On Wed, Jul 12, 2023 at 09:13:04PM -0700, Andrii Nakryiko wrote:
> > On Wed, Jul 12, 2023 at 8:05 AM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Tue, Jul 11, 2023 at 10:42 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Tue, Jul 11, 2023 at 6:05 PM Jackie Liu <liu.yun@linux.dev> wrote:
> > > > >
> > > > > From: Jackie Liu <liuyun01@kylinos.cn>
> > > > >
> > > > > Now multi kprobe uses glob_match for function matching, it's not enough,
> > > > > and sometimes we need more powerful regular expressions to support fuzzy
> > > > > matching, and now provides a use_regex in bpf_kprobe_multi_opts to support
> > > > > POSIX regular expressions.
> > > > >
> > > > > This is useful, similar to `funccount.py -r '^vfs.*'` in BCC, and can also
> > > > > be implemented with libbpf.
> > > > >
> > > > > Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
> > > > > ---
> > > > > tools/lib/bpf/libbpf.c | 52 ++++++++++++++++++++++++++++++++++++++----
> > > > > tools/lib/bpf/libbpf.h | 4 +++-
> > > > > 2 files changed, 51 insertions(+), 5 deletions(-)
> > > > >
> > > >
> > > > Let's hold off on adding regex support assumptions into libbpf API.
> > > > Globs are pretty flexible already for most cases, and for some more
> > > > advanced use cases users can provide an exact list of function names
> > > > through opts argument.
> > > >
> > > > We can revisit this decision down the road, but right now it seems
> > > > premature to sign up for such relatively heavy-weight API dependency.
> > >
> > > regexec() is part of glibc and we cannot link it statically,
> > > so no change in libbpf.a/so size.
> >
> > right, I wasn't worried about the code size increase of libbpf itself
> >
> > > Are you worried about ulibc-like environment?
> >
> > This is one part. musl, uclibc, and other alternative implementations
> > of glibc: do they support same functionality with all the same options
> > and syntax. I'd feel more comfortable if we understood well all the
> > implications of relying on this regex API: which glibc versions
> > support it, same for musl. Are there any extra library dependencies
> > that we might need to add (like -lm for some math functions). I'm not
> > very familiar also with what regex flavor is implemented by POSIX
> > regex, is it the commonly-expected Perl-compatible one, or something
> > else?
> >
> > Also, we should have a good story on how this regex syntax is
> > supported in SEC() definitions for both kprobe.multi and uprobe.multi.
> >
> > Stuff like this.
> >
> > But also, looking at bpftrace, I don't think it supports regex-based
> > probe matching (e.g., I tried 'kprobe:.*sys_bpf' and it matched
> > nothing; maybe there is some other syntax, but my Google-fu failed
> > me). So assuming I didn't miss anything obvious with bpftrace, the
> > fact that it's been around for so long with so many users and lack of
> > regex doesn't seem to be the problem,
>
> bpftrace only supports wildcard (`*`) operator like in globs. One thing
> that might help bpftrace get away with that is being able to specify multiple
> attachpoints for a single probe. Eg.

Right, and you can do the same with libbpf. Call
bpf_program__attach_kprobe_multi() multiple times with different globs
and/or define multiple SEC("kprobe.multi/xxx") entry functions that
just call into a common logic-handling subprog.

I find, in practice, that regexes are often completely unnecessary for
selecting subsets of functions, if one has globs already.

>
> ```
> tracepoint:foo:bar,
> tracepoint:baz:something
> {
>   print("hi")
> }
> ```
>
> Thanks,
> Daniel
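As a rough illustration of the point in the reply above, a set of globs can select the same functions that a regex alternation would. The sketch below is an assumption-laden stand-in rather than libbpf code: it uses POSIX `fnmatch()` in place of libbpf's internal glob matcher (`fnmatch()` accepts more syntax, such as bracket expressions), and the helper name `matches_any_glob` is hypothetical.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return true if `name` matches at least one of the
 * `n` glob patterns. This mimics attaching the same program once per glob.
 */
static bool matches_any_glob(const char *name,
			     const char *const *globs, size_t n)
{
	size_t i;

	assert(name && globs);

	for (i = 0; i < n; i++) {
		/* fnmatch() returns 0 when the pattern matches the string. */
		if (fnmatch(globs[i], name, 0) == 0)
			return true;
	}
	return false;
}
```

For example, the pair of globs `vfs_read*` and `vfs_write*` selects the same names as the extended regex `^vfs_(read|write)`, at the cost of one attach call (or one SEC() entry function) per glob.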
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 81aa52fa6807..fd217e9a232d 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -25,6 +25,7 @@
 #include <fcntl.h>
 #include <errno.h>
 #include <ctype.h>
+#include <regex.h>
 #include <asm/unistd.h>
 #include <linux/err.h>
 #include <linux/kernel.h>
@@ -10549,6 +10550,7 @@ struct kprobe_multi_resolve {
 	unsigned long *addrs;
 	size_t cap;
 	size_t cnt;
+	bool use_regex;
 };
 
 struct avail_kallsyms_data {
@@ -10589,6 +10591,7 @@ static int libbpf_available_kallsyms_parse(struct kprobe_multi_resolve *res)
 	int err = 0, ret, i;
 	char **syms = NULL;
 	size_t cap = 0, cnt = 0;
+	regex_t regex;
 
 	f = fopen(available_functions_file, "re");
 	if (!f) {
@@ -10597,6 +10600,18 @@ static int libbpf_available_kallsyms_parse(struct kprobe_multi_resolve *res)
 		return err;
 	}
 
+	if (res->use_regex) {
+		ret = regcomp(&regex, res->pattern, REG_EXTENDED | REG_NOSUB);
+		if (ret) {
+			char errbuf[128];
+
+			regerror(ret, &regex, errbuf, sizeof(errbuf));
+			pr_warn("Failed to compile regex: %s\n", errbuf);
+			fclose(f);
+			return -EINVAL;
+		}
+	}
+
 	while (true) {
 		char *name;
 
@@ -10610,8 +10625,13 @@ static int libbpf_available_kallsyms_parse(struct kprobe_multi_resolve *res)
 			goto cleanup;
 		}
 
-		if (!glob_match(sym_name, res->pattern))
-			continue;
+		if (res->use_regex) {
+			if (regexec(&regex, sym_name, 0, NULL, 0))
+				continue;
+		} else {
+			if (!glob_match(sym_name, res->pattern))
+				continue;
+		}
 
 		err = libbpf_ensure_mem((void **)&syms, &cap, sizeof(*syms), cnt + 1);
 		if (err)
@@ -10644,6 +10664,8 @@ static int libbpf_available_kallsyms_parse(struct kprobe_multi_resolve *res)
 		err = -ENOENT;
 
 cleanup:
+	if (res->use_regex)
+		regfree(&regex);
 	for (i = 0; i < cnt; i++)
 		free((char *)syms[i]);
 	free(syms);
@@ -10664,6 +10686,7 @@ static int libbpf_available_kprobes_parse(struct kprobe_multi_resolve *res)
 	FILE *f;
 	int ret, err = 0;
 	unsigned long long sym_addr;
+	regex_t regex;
 
 	f = fopen(available_path, "re");
 	if (!f) {
@@ -10672,6 +10695,18 @@ static int libbpf_available_kprobes_parse(struct kprobe_multi_resolve *res)
 		return err;
 	}
 
+	if (res->use_regex) {
+		ret = regcomp(&regex, res->pattern, REG_EXTENDED | REG_NOSUB);
+		if (ret) {
+			char errbuf[128];
+
+			regerror(ret, &regex, errbuf, sizeof(errbuf));
+			pr_warn("Failed to compile regex: %s\n", errbuf);
+			fclose(f);
+			return -EINVAL;
+		}
+	}
+
 	while (true) {
 		ret = fscanf(f, "%llx %499s%*[^\n]\n", &sym_addr, sym_name);
 		if (ret == EOF && feof(f))
@@ -10684,8 +10719,13 @@ static int libbpf_available_kprobes_parse(struct kprobe_multi_resolve *res)
 			goto cleanup;
 		}
 
-		if (!glob_match(sym_name, res->pattern))
-			continue;
+		if (res->use_regex) {
+			if (regexec(&regex, sym_name, 0, NULL, 0))
+				continue;
+		} else {
+			if (!glob_match(sym_name, res->pattern))
+				continue;
+		}
 
 		err = libbpf_ensure_mem((void **)&res->addrs, &res->cap,
 					sizeof(*res->addrs), res->cnt + 1);
@@ -10699,6 +10739,8 @@ static int libbpf_available_kprobes_parse(struct kprobe_multi_resolve *res)
 		err = -ENOENT;
 
 cleanup:
+	if (res->use_regex)
+		regfree(&regex);
 	fclose(f);
 	return err;
 }
@@ -10739,6 +10781,8 @@ bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
 		return libbpf_err_ptr(-EINVAL);
 
 	if (pattern) {
+		res.use_regex = OPTS_GET(opts, use_regex, false);
+
 		if (has_available_filter_functions_addrs())
 			err = libbpf_available_kprobes_parse(&res);
 		else
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 754da73c643b..34031c722213 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -519,10 +519,12 @@ struct bpf_kprobe_multi_opts {
 	size_t cnt;
 	/* create return kprobes */
 	bool retprobe;
+	/* use regular expression */
+	bool use_regex;
 	size_t :0;
 };
 
-#define bpf_kprobe_multi_opts__last_field retprobe
+#define bpf_kprobe_multi_opts__last_field use_regex
 
 LIBBPF_API struct bpf_link *
 bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,