Message ID | 1643645554-28723-3-git-send-email-alan.maguire@oracle.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | libbpf: name-based u[ret]probe attach | expand |
On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > Now that u[ret]probes can use name-based specification, it makes > sense to add support for auto-attach based on SEC() definition. > The format proposed is > > SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]") > > For example, to trace malloc() in libc: > > SEC("uprobe//usr/lib64/libc.so.6:malloc") I assume that path to library can be relative path as well, right? Also, should be look at trying to locate library in the system if it's specified as "libc"? Or if the binary is "bash", for example. Just bringing this up, because I think it came up before in the context of one of libbpf-tools. > > Auto-attach is done for all tasks (pid -1). > > Note that there is a backwards-compatibility issue here. Consider a BPF > object consisting of a set of BPF programs, including a uprobe program. > Because uprobes did not previously support auto-attach, it's possible that > because the uprobe section name is not in auto-attachable form, overall > BPF skeleton attach would now fail due to the failure of the uprobe program > to auto-attach. So we need to handle the case of auto-attach failure where > the form of the section name is not suitable for auto-attach without > a complete attach failure. On surveying the code, bpf_program__attach() > already returns -ESRCH in cases where no auto-attach function is > supplied, so for consistency with that - and because that return value > is less likely to collide with actual attach failures than -EOPNOTSUPP - > it is used as the attach function return value signalling auto-attach > is not possible. I'm actually working on generalizing and extending this part of libbpf's SEC() handling, I should post code today or early next week. So we can base your code on those changes and we won't need to worry about error code collisions. > > Signed-off-by: Alan Maguire <alan.maguire@oracle.com> > --- > tools/lib/bpf/libbpf.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++-- > 1 file changed, 74 insertions(+), 3 deletions(-) > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > index eb95629..e2b4415 100644 > --- a/tools/lib/bpf/libbpf.c > +++ b/tools/lib/bpf/libbpf.c > @@ -8581,6 +8581,7 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log > } > [...]
On Fri, 4 Feb 2022, Andrii Nakryiko wrote: > On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > > > Now that u[ret]probes can use name-based specification, it makes > > sense to add support for auto-attach based on SEC() definition. > > The format proposed is > > > > SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]") > > > > For example, to trace malloc() in libc: > > > > SEC("uprobe//usr/lib64/libc.so.6:malloc") > > I assume that path to library can be relative path as well, right? > > Also, should be look at trying to locate library in the system if it's > specified as "libc"? Or if the binary is "bash", for example. Just > bringing this up, because I think it came up before in the context of > one of libbpf-tools. > This is a great suggestion for usability, but I'm trying to puzzle out how to carry out the location search for cases where the path specified is not a relative or absolute path. A few things we can can do - use search paths from PATH and LD_LIBRARY_PATH, with an appended set of standard locations such as /usr/bin, /usr/sbin for cases where those environment variables are missing or incomplete. However, when it comes to libraries, do we search in /usr/lib64 or /usr/lib? We could use whether the version of libbpf is 64-bit or not I suppose, but it's at least conceivable that the user might want to instrument a 32-bit library from a 64-bit libbpf. Do you think that approach is sufficient, or are there other things we should do? Thanks! Alan
On Wed, Feb 23, 2022 at 1:33 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > On Fri, 4 Feb 2022, Andrii Nakryiko wrote: > > > On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > > > > > Now that u[ret]probes can use name-based specification, it makes > > > sense to add support for auto-attach based on SEC() definition. > > > The format proposed is > > > > > > SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]") > > > > > > For example, to trace malloc() in libc: > > > > > > SEC("uprobe//usr/lib64/libc.so.6:malloc") > > > > I assume that path to library can be relative path as well, right? > > > > Also, should be look at trying to locate library in the system if it's > > specified as "libc"? Or if the binary is "bash", for example. Just > > bringing this up, because I think it came up before in the context of > > one of libbpf-tools. > > > > This is a great suggestion for usability, but I'm trying to puzzle > out how to carry out the location search for cases where the path > specified is not a relative or absolute path. > > A few things we can can do - use search paths from PATH and > LD_LIBRARY_PATH, with an appended set of standard locations > such as /usr/bin, /usr/sbin for cases where those environment > variables are missing or incomplete. > > However, when it comes to libraries, do we search in /usr/lib64 or > /usr/lib? We could use whether the version of libbpf is 64-bit or not I > suppose, but it's at least conceivable that the user might want to > instrument a 32-bit library from a 64-bit libbpf. Do you think that > approach is sufficient, or are there other things we should do? Thanks! How does dynamic linker do this? When I specify "libbpf.so", is there some documented algorithm for finding the library? If it's more or less codified, we could implement something like that. If not, well, too bad, we can do some useful heuristic, but ultimately there will be cases that won't be supported. Worst case user will have to specify an absolute path. > > Alan
On Thu, 24 Feb 2022, Andrii Nakryiko wrote: > On Wed, Feb 23, 2022 at 1:33 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > > > On Fri, 4 Feb 2022, Andrii Nakryiko wrote: > > > > > On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > > > > > > > Now that u[ret]probes can use name-based specification, it makes > > > > sense to add support for auto-attach based on SEC() definition. > > > > The format proposed is > > > > > > > > SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]") > > > > > > > > For example, to trace malloc() in libc: > > > > > > > > SEC("uprobe//usr/lib64/libc.so.6:malloc") > > > > > > I assume that path to library can be relative path as well, right? > > > > > > Also, should be look at trying to locate library in the system if it's > > > specified as "libc"? Or if the binary is "bash", for example. Just > > > bringing this up, because I think it came up before in the context of > > > one of libbpf-tools. > > > > > > > This is a great suggestion for usability, but I'm trying to puzzle > > out how to carry out the location search for cases where the path > > specified is not a relative or absolute path. > > > > A few things we can can do - use search paths from PATH and > > LD_LIBRARY_PATH, with an appended set of standard locations > > such as /usr/bin, /usr/sbin for cases where those environment > > variables are missing or incomplete. > > > > However, when it comes to libraries, do we search in /usr/lib64 or > > /usr/lib? We could use whether the version of libbpf is 64-bit or not I > > suppose, but it's at least conceivable that the user might want to > > instrument a 32-bit library from a 64-bit libbpf. Do you think that > > approach is sufficient, or are there other things we should do? Thanks! > > How does dynamic linker do this? When I specify "libbpf.so", is there > some documented algorithm for finding the library? If it's more or > less codified, we could implement something like that. If not, well, > too bad, we can do some useful heuristic, but ultimately there will be > cases that won't be supported. Worst case user will have to specify an > absolute path. > There's a nice description in [1]: If filename is NULL, then the returned handle is for the main program. If filename contains a slash ("/"), then it is interpreted as a (relative or absolute) pathname. Otherwise, the dynamic linker searches for the object as follows (see ld.so(8) for further details): o (ELF only) If the calling object (i.e., the shared library or executable from which dlopen() is called) contains a DT_RPATH tag, and does not contain a DT_RUNPATH tag, then the directories listed in the DT_RPATH tag are searched. o If, at the time that the program was started, the environment variable LD_LIBRARY_PATH was defined to contain a colon- separated list of directories, then these are searched. (As a security measure, this variable is ignored for set-user-ID and set-group-ID programs.) o (ELF only) If the calling object contains a DT_RUNPATH tag, then the directories listed in that tag are searched. o The cache file /etc/ld.so.cache (maintained by ldconfig(8)) is checked to see whether it contains an entry for filename. o The directories /lib and /usr/lib are searched (in that order). Rather than re-inventing all of that however, we could use it by dlopen()ing the file when it is a library (contains .so) and is not a relative/absolute path, and then use dlinfo()'s RTLD_DI_ORIGIN command to extract the path discovered, and then dlclose() it. It would require linking libbpf with -ldl however. What do you think? Alan [1] https://man7.org/linux/man-pages/man3/dlopen.3.html > > > > Alan >
On Thu, Feb 24, 2022 at 7:40 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > On Thu, 24 Feb 2022, Andrii Nakryiko wrote: > > > On Wed, Feb 23, 2022 at 1:33 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > > > > > On Fri, 4 Feb 2022, Andrii Nakryiko wrote: > > > > > > > On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote: > > > > > > > > > > Now that u[ret]probes can use name-based specification, it makes > > > > > sense to add support for auto-attach based on SEC() definition. > > > > > The format proposed is > > > > > > > > > > SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]") > > > > > > > > > > For example, to trace malloc() in libc: > > > > > > > > > > SEC("uprobe//usr/lib64/libc.so.6:malloc") > > > > > > > > I assume that path to library can be relative path as well, right? > > > > > > > > Also, should be look at trying to locate library in the system if it's > > > > specified as "libc"? Or if the binary is "bash", for example. Just > > > > bringing this up, because I think it came up before in the context of > > > > one of libbpf-tools. > > > > > > > > > > This is a great suggestion for usability, but I'm trying to puzzle > > > out how to carry out the location search for cases where the path > > > specified is not a relative or absolute path. > > > > > > A few things we can can do - use search paths from PATH and > > > LD_LIBRARY_PATH, with an appended set of standard locations > > > such as /usr/bin, /usr/sbin for cases where those environment > > > variables are missing or incomplete. > > > > > > However, when it comes to libraries, do we search in /usr/lib64 or > > > /usr/lib? We could use whether the version of libbpf is 64-bit or not I > > > suppose, but it's at least conceivable that the user might want to > > > instrument a 32-bit library from a 64-bit libbpf. Do you think that > > > approach is sufficient, or are there other things we should do? Thanks! > > > > How does dynamic linker do this? When I specify "libbpf.so", is there > > some documented algorithm for finding the library? If it's more or > > less codified, we could implement something like that. If not, well, > > too bad, we can do some useful heuristic, but ultimately there will be > > cases that won't be supported. Worst case user will have to specify an > > absolute path. > > > > There's a nice description in [1]: > > If filename is NULL, then the returned handle is for the main > program. If filename contains a slash ("/"), then it is > interpreted as a (relative or absolute) pathname. Otherwise, the > dynamic linker searches for the object as follows (see ld.so(8) > for further details): > > o (ELF only) If the calling object (i.e., the shared library or > executable from which dlopen() is called) contains a DT_RPATH > tag, and does not contain a DT_RUNPATH tag, then the > directories listed in the DT_RPATH tag are searched. > > o If, at the time that the program was started, the environment > variable LD_LIBRARY_PATH was defined to contain a colon- > separated list of directories, then these are searched. (As > a security measure, this variable is ignored for set-user-ID > and set-group-ID programs.) > > o (ELF only) If the calling object contains a DT_RUNPATH tag, > then the directories listed in that tag are searched. > > o The cache file /etc/ld.so.cache (maintained by ldconfig(8)) > is checked to see whether it contains an entry for filename. > > o The directories /lib and /usr/lib are searched (in that > order). > > Rather than re-inventing all of that however, we could use it > by dlopen()ing the file when it is a library (contains .so) and > is not a relative/absolute path, and then use dlinfo()'s > RTLD_DI_ORIGIN command to extract the path discovered, and then > dlclose() it. It would require linking libbpf with -ldl however. > What do you think? What do I think about dlopen()'ing some random library under root by libbpf into the host process?.. I'd say that's a bad idea. I'd probably start with just checking /lib, /usr/lib (and maybe those 32-bit and 64-bit specific ones, depending on host architecture; not sure about all the details there, tbh). Or just say that the path to the shared library has to be specified. There is a similar problem with doing something like SEC("uprobe/bash:readline"). Do we want to "search" for bash? I think bpftrace is supporting that, but I haven't checked what it is doing. > > Alan > > [1] https://man7.org/linux/man-pages/man3/dlopen.3.html > > > > > > > Alan > >
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index eb95629..e2b4415 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8581,6 +8581,7 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log } static struct bpf_link *attach_kprobe(const struct bpf_program *prog, long cookie); +static struct bpf_link *attach_uprobe(const struct bpf_program *prog, long cookie); static struct bpf_link *attach_tp(const struct bpf_program *prog, long cookie); static struct bpf_link *attach_raw_tp(const struct bpf_program *prog, long cookie); static struct bpf_link *attach_trace(const struct bpf_program *prog, long cookie); @@ -8592,9 +8593,9 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log SEC_DEF("sk_reuseport/migrate", SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT_OR_MIGRATE, SEC_ATTACHABLE | SEC_SLOPPY_PFX), SEC_DEF("sk_reuseport", SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT, SEC_ATTACHABLE | SEC_SLOPPY_PFX), SEC_DEF("kprobe/", KPROBE, 0, SEC_NONE, attach_kprobe), - SEC_DEF("uprobe/", KPROBE, 0, SEC_NONE), + SEC_DEF("uprobe/", KPROBE, 0, SEC_NONE, attach_uprobe), SEC_DEF("kretprobe/", KPROBE, 0, SEC_NONE, attach_kprobe), - SEC_DEF("uretprobe/", KPROBE, 0, SEC_NONE), + SEC_DEF("uretprobe/", KPROBE, 0, SEC_NONE, attach_uprobe), SEC_DEF("tc", SCHED_CLS, 0, SEC_NONE), SEC_DEF("classifier", SCHED_CLS, 0, SEC_NONE | SEC_SLOPPY_PFX), SEC_DEF("action", SCHED_ACT, 0, SEC_NONE | SEC_SLOPPY_PFX), @@ -10525,6 +10526,64 @@ static long elf_find_func_offset(const char *binary_path, const char *name) } +/* Format of u[ret]probe section definition supporting auto-attach: + * u[ret]probe//path/to/prog:function[+offset] + * + * Many uprobe programs do not avail of auto-attach, so we need to handle the + * case where the format is uprobe/myfunc by returning NULL rather than an + * error. + */ +static struct bpf_link *attach_uprobe(const struct bpf_program *prog, long cookie) +{ + DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, opts); + char *func_name, binary_path[512]; + unsigned int raw_offset; + char *func, *probe_name; + struct bpf_link *link; + size_t offset = 0; + int n, err; + + opts.retprobe = str_has_pfx(prog->sec_name, "uretprobe/"); + if (opts.retprobe) + probe_name = prog->sec_name + sizeof("uretprobe/") - 1; + else + probe_name = prog->sec_name + sizeof("uprobe/") - 1; + + snprintf(binary_path, sizeof(binary_path), "%s", probe_name); + /* ':' should be prior to function+offset */ + func_name = strrchr(binary_path, ':'); + if (!func_name) { + pr_debug("section '%s' is old-style u[ret]probe/function, cannot auto-attach\n", + prog->sec_name); + return libbpf_err_ptr(-ESRCH); + } + func_name[0] = '\0'; + func_name++; + n = sscanf(func_name, "%m[a-zA-Z0-9_.]+%li", &func, &offset); + if (n < 1) { + err = -EINVAL; + pr_warn("uprobe name is invalid: %s\n", func_name); + return libbpf_err_ptr(err); + } + /* Is func a raw address? */ + if (n == 1 && sscanf(func, "%x", &raw_offset) == 1) { + free(func); + func = NULL; + offset = (size_t)raw_offset; + } + if (opts.retprobe && offset != 0) { + free(func); + err = -EINVAL; + pr_warn("uretprobes do not support offset specification\n"); + return libbpf_err_ptr(err); + } + opts.func_name = func; + + link = bpf_program__attach_uprobe_opts(prog, -1, binary_path, offset, &opts); + free(func); + return link; +} + struct bpf_link *bpf_program__attach_uprobe(const struct bpf_program *prog, bool retprobe, pid_t pid, const char *binary_path, @@ -12041,7 +12100,19 @@ int bpf_object__attach_skeleton(struct bpf_object_skeleton *s) *link = bpf_program__attach(prog); err = libbpf_get_error(*link); - if (err) { + switch (err) { + case 0: + break; + case -ESRCH: + /* -ESRCH is used as it is less likely to collide with other error + * cases in program attach while being consistent with the value + * returned by bpf_program__attach() where no auto-attach function + * is provided. + */ + pr_warn("auto-attach not supported for program '%s'\n", + bpf_program__name(prog)); + break; + default: pr_warn("failed to auto-attach program '%s': %d\n", bpf_program__name(prog), err); return libbpf_err(err);
Now that u[ret]probes can use name-based specification, it makes sense to add support for auto-attach based on SEC() definition. The format proposed is SEC("u[ret]probe//path/to/prog:[raw_offset|[function_name[+offset]]") For example, to trace malloc() in libc: SEC("uprobe//usr/lib64/libc.so.6:malloc") Auto-attach is done for all tasks (pid -1). Note that there is a backwards-compatibility issue here. Consider a BPF object consisting of a set of BPF programs, including a uprobe program. Because uprobes did not previously support auto-attach, it's possible that because the uprobe section name is not in auto-attachable form, overall BPF skeleton attach would now fail due to the failure of the uprobe program to auto-attach. So we need to handle the case of auto-attach failure where the form of the section name is not suitable for auto-attach without a complete attach failure. On surveying the code, bpf_program__attach() already returns -ESRCH in cases where no auto-attach function is supplied, so for consistency with that - and because that return value is less likely to collide with actual attach failures than -EOPNOTSUPP - it is used as the attach function return value signalling auto-attach is not possible. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> --- tools/lib/bpf/libbpf.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 74 insertions(+), 3 deletions(-)