Message ID | 20250307140120.1261890-1-hengqi.chen@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | libbpf: Fix uprobe offset calculation | expand |
On 3/7/25 6:01 AM, Hengqi Chen wrote: > As reported on libbpf-rs issue([0]), the current implementation > may resolve symbol to a wrong offset and thus missing uprobe > event. Calculate the symbol offset from program header instead. > See the BCC implementation (which in turn used by bpftrace) and > the spec ([1]) for references. > > [0]: https://github.com/libbpf/libbpf-rs/issues/1110 > [1]: https://refspecs.linuxfoundation.org/elf/ > > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Hengqi, There are some test failures in the CI. For example, https://github.com/kernel-patches/bpf/actions/runs/13725803997/job/38392284640?pr=8631 Please take a look. Your below elf_sym_offset change matches some bcc implementation, but maybe maybe this is only under certain condition? Also, it would be great if you can add detailed description in commit message about what is the problem and why a different approach is necessary to fix the issue. > --- > tools/lib/bpf/elf.c | 32 ++++++++++++++++++++++++-------- > 1 file changed, 24 insertions(+), 8 deletions(-) > > diff --git a/tools/lib/bpf/elf.c b/tools/lib/bpf/elf.c > index 823f83ad819c..9b561c8d1eec 100644 > --- a/tools/lib/bpf/elf.c > +++ b/tools/lib/bpf/elf.c > @@ -260,13 +260,29 @@ static bool symbol_match(struct elf_sym_iter *iter, int sh_type, struct elf_sym > * for shared libs) into file offset, which is what kernel is expecting > * for uprobe/uretprobe attachment. > * See Documentation/trace/uprobetracer.rst for more details. This is done > - * by looking up symbol's containing section's header and using iter's virtual > - * address (sh_addr) and corresponding file offset (sh_offset) to transform > + * by looking up symbol's containing program header and using its virtual > + * address (p_vaddr) and corresponding file offset (p_offset) to transform > * sym.st_value (virtual address) into desired final file offset. > */ > -static unsigned long elf_sym_offset(struct elf_sym *sym) > +static unsigned long elf_sym_offset(Elf *elf, struct elf_sym *sym) > { > - return sym->sym.st_value - sym->sh.sh_addr + sym->sh.sh_offset; > + size_t nhdrs, i; > + GElf_Phdr phdr; > + > + if (elf_getphdrnum(elf, &nhdrs)) > + return -1; > + > + for (i = 0; i < nhdrs; i++) { > + if (!gelf_getphdr(elf, (int)i, &phdr)) > + continue; > + if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X)) > + continue; > + if (sym->sym.st_value >= phdr.p_vaddr && > + sym->sym.st_value < (phdr.p_vaddr + phdr.p_memsz)) > + return sym->sym.st_value - phdr.p_vaddr + phdr.p_offset; > + } > + > + return -1; > } > > /* Find offset of function name in the provided ELF object. "binary_path" is > @@ -329,7 +345,7 @@ long elf_find_func_offset(Elf *elf, const char *binary_path, const char *name) > > if (ret > 0) { > /* handle multiple matches */ > - if (elf_sym_offset(sym) == ret) { > + if (elf_sym_offset(elf, sym) == ret) { > /* same offset, no problem */ > continue; > } else if (last_bind != STB_WEAK && cur_bind != STB_WEAK) { [...]
Hi Yonghong, On Sat, Mar 8, 2025 at 2:48 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > > > On 3/7/25 6:01 AM, Hengqi Chen wrote: > > As reported on libbpf-rs issue([0]), the current implementation > > may resolve symbol to a wrong offset and thus missing uprobe > > event. Calculate the symbol offset from program header instead. > > See the BCC implementation (which in turn used by bpftrace) and > > the spec ([1]) for references. > > > > [0]: https://github.com/libbpf/libbpf-rs/issues/1110 > > [1]: https://refspecs.linuxfoundation.org/elf/ > > > > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> > > Hengqi, > > There are some test failures in the CI. For example, > https://github.com/kernel-patches/bpf/actions/runs/13725803997/job/38392284640?pr=8631 > Yes, I've received an email from BPF CI. It seems like the uprobe multi testcase is unhappy with this change. > Please take a look. > Your below elf_sym_offset change matches some bcc implementation, but > maybe maybe this is only under certain condition? > Remove the `phdr.p_flags & PF_X` check fix the issue. Need more investigation. > Also, it would be great if you can add detailed description in commit message > about what is the problem and why a different approach is necessary to > fix the issue. > Will do. Thanks. > > --- > > tools/lib/bpf/elf.c | 32 ++++++++++++++++++++++++-------- > > 1 file changed, 24 insertions(+), 8 deletions(-) > > > > diff --git a/tools/lib/bpf/elf.c b/tools/lib/bpf/elf.c > > index 823f83ad819c..9b561c8d1eec 100644 > > --- a/tools/lib/bpf/elf.c > > +++ b/tools/lib/bpf/elf.c > > @@ -260,13 +260,29 @@ static bool symbol_match(struct elf_sym_iter *iter, int sh_type, struct elf_sym > > * for shared libs) into file offset, which is what kernel is expecting > > * for uprobe/uretprobe attachment. > > * See Documentation/trace/uprobetracer.rst for more details. This is done > > - * by looking up symbol's containing section's header and using iter's virtual > > - * address (sh_addr) and corresponding file offset (sh_offset) to transform > > + * by looking up symbol's containing program header and using its virtual > > + * address (p_vaddr) and corresponding file offset (p_offset) to transform > > * sym.st_value (virtual address) into desired final file offset. > > */ > > -static unsigned long elf_sym_offset(struct elf_sym *sym) > > +static unsigned long elf_sym_offset(Elf *elf, struct elf_sym *sym) > > { > > - return sym->sym.st_value - sym->sh.sh_addr + sym->sh.sh_offset; > > + size_t nhdrs, i; > > + GElf_Phdr phdr; > > + > > + if (elf_getphdrnum(elf, &nhdrs)) > > + return -1; > > + > > + for (i = 0; i < nhdrs; i++) { > > + if (!gelf_getphdr(elf, (int)i, &phdr)) > > + continue; > > + if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X)) > > + continue; > > + if (sym->sym.st_value >= phdr.p_vaddr && > > + sym->sym.st_value < (phdr.p_vaddr + phdr.p_memsz)) > > + return sym->sym.st_value - phdr.p_vaddr + phdr.p_offset; > > + } > > + > > + return -1; > > } > > > > /* Find offset of function name in the provided ELF object. "binary_path" is > > @@ -329,7 +345,7 @@ long elf_find_func_offset(Elf *elf, const char *binary_path, const char *name) > > > > if (ret > 0) { > > /* handle multiple matches */ > > - if (elf_sym_offset(sym) == ret) { > > + if (elf_sym_offset(elf, sym) == ret) { > > /* same offset, no problem */ > > continue; > > } else if (last_bind != STB_WEAK && cur_bind != STB_WEAK) { > > [...] >
On Mon, Mar 10, 2025 at 10:16 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > Hi Yonghong, > > On Sat, Mar 8, 2025 at 2:48 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > > > > > > > On 3/7/25 6:01 AM, Hengqi Chen wrote: > > > As reported on libbpf-rs issue([0]), the current implementation > > > may resolve symbol to a wrong offset and thus missing uprobe > > > event. Calculate the symbol offset from program header instead. > > > See the BCC implementation (which in turn used by bpftrace) and > > > the spec ([1]) for references. > > > > > > [0]: https://github.com/libbpf/libbpf-rs/issues/1110 > > > [1]: https://refspecs.linuxfoundation.org/elf/ > > > > > > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> > > > > Hengqi, > > > > There are some test failures in the CI. For example, > > https://github.com/kernel-patches/bpf/actions/runs/13725803997/job/38392284640?pr=8631 > > > > Yes, I've received an email from BPF CI. > It seems like the uprobe multi testcase is unhappy with this change. > > > Please take a look. > > Your below elf_sym_offset change matches some bcc implementation, but > > maybe maybe this is only under certain condition? > > > > Remove the `phdr.p_flags & PF_X` check fix the issue. Need more investigation. > > > Also, it would be great if you can add detailed description in commit message > > about what is the problem and why a different approach is necessary to > > fix the issue. > > > > Will do. Thanks. > > > > --- > > > tools/lib/bpf/elf.c | 32 ++++++++++++++++++++++++-------- > > > 1 file changed, 24 insertions(+), 8 deletions(-) > > > > > > diff --git a/tools/lib/bpf/elf.c b/tools/lib/bpf/elf.c > > > index 823f83ad819c..9b561c8d1eec 100644 > > > --- a/tools/lib/bpf/elf.c > > > +++ b/tools/lib/bpf/elf.c > > > @@ -260,13 +260,29 @@ static bool symbol_match(struct elf_sym_iter *iter, int sh_type, struct elf_sym > > > * for shared libs) into file offset, which is what kernel is expecting > > > * for uprobe/uretprobe attachment. > > > * See Documentation/trace/uprobetracer.rst for more details. This is done > > > - * by looking up symbol's containing section's header and using iter's virtual > > > - * address (sh_addr) and corresponding file offset (sh_offset) to transform > > > + * by looking up symbol's containing program header and using its virtual > > > + * address (p_vaddr) and corresponding file offset (p_offset) to transform > > > * sym.st_value (virtual address) into desired final file offset. > > > */ > > > -static unsigned long elf_sym_offset(struct elf_sym *sym) > > > +static unsigned long elf_sym_offset(Elf *elf, struct elf_sym *sym) > > > { > > > - return sym->sym.st_value - sym->sh.sh_addr + sym->sh.sh_offset; > > > + size_t nhdrs, i; > > > + GElf_Phdr phdr; > > > + > > > + if (elf_getphdrnum(elf, &nhdrs)) > > > + return -1; > > > + > > > + for (i = 0; i < nhdrs; i++) { > > > + if (!gelf_getphdr(elf, (int)i, &phdr)) > > > + continue; > > > + if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X)) > > > + continue; > > > + if (sym->sym.st_value >= phdr.p_vaddr && > > > + sym->sym.st_value < (phdr.p_vaddr + phdr.p_memsz)) > > > + return sym->sym.st_value - phdr.p_vaddr + phdr.p_offset; Hengqi, Can you please provide an example where existing code doesn't work? I think that sym->sym.st_value - sym->sh.sh_addr + sym->sh.sh_offset and sym->sym.st_value - phdr.p_vaddr + phdr.p_offset Should result in the same, and we don't need to search for a matching segment if we have an ELF symbol and its section. But maybe I'm mistaken, so can you please elaborate a bit? > > > + } > > > + > > > + return -1; > > > } > > > > > > /* Find offset of function name in the provided ELF object. "binary_path" is > > > @@ -329,7 +345,7 @@ long elf_find_func_offset(Elf *elf, const char *binary_path, const char *name) > > > > > > if (ret > 0) { > > > /* handle multiple matches */ > > > - if (elf_sym_offset(sym) == ret) { > > > + if (elf_sym_offset(elf, sym) == ret) { > > > /* same offset, no problem */ > > > continue; > > > } else if (last_bind != STB_WEAK && cur_bind != STB_WEAK) { > > > > [...] > >
Hi Andrii, On Thu, Mar 13, 2025 at 2:47 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Mon, Mar 10, 2025 at 10:16 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > Hi Yonghong, > > > > On Sat, Mar 8, 2025 at 2:48 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > > > > > > > > > > > On 3/7/25 6:01 AM, Hengqi Chen wrote: > > > > As reported on libbpf-rs issue([0]), the current implementation > > > > may resolve symbol to a wrong offset and thus missing uprobe > > > > event. Calculate the symbol offset from program header instead. > > > > See the BCC implementation (which in turn used by bpftrace) and > > > > the spec ([1]) for references. > > > > > > > > [0]: https://github.com/libbpf/libbpf-rs/issues/1110 > > > > [1]: https://refspecs.linuxfoundation.org/elf/ > > > > > > > > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> > > > > > > Hengqi, > > > > > > There are some test failures in the CI. For example, > > > https://github.com/kernel-patches/bpf/actions/runs/13725803997/job/38392284640?pr=8631 > > > > > > > Yes, I've received an email from BPF CI. > > It seems like the uprobe multi testcase is unhappy with this change. > > > > > Please take a look. > > > Your below elf_sym_offset change matches some bcc implementation, but > > > maybe maybe this is only under certain condition? > > > > > > > Remove the `phdr.p_flags & PF_X` check fix the issue. Need more investigation. > > > > > Also, it would be great if you can add detailed description in commit message > > > about what is the problem and why a different approach is necessary to > > > fix the issue. > > > > > > > Will do. Thanks. > > > > > > --- > > > > tools/lib/bpf/elf.c | 32 ++++++++++++++++++++++++-------- > > > > 1 file changed, 24 insertions(+), 8 deletions(-) > > > > > > > > diff --git a/tools/lib/bpf/elf.c b/tools/lib/bpf/elf.c > > > > index 823f83ad819c..9b561c8d1eec 100644 > > > > --- a/tools/lib/bpf/elf.c > > > > +++ b/tools/lib/bpf/elf.c > > > > @@ -260,13 +260,29 @@ static bool symbol_match(struct elf_sym_iter *iter, int sh_type, struct elf_sym > > > > * for shared libs) into file offset, which is what kernel is expecting > > > > * for uprobe/uretprobe attachment. > > > > * See Documentation/trace/uprobetracer.rst for more details. This is done > > > > - * by looking up symbol's containing section's header and using iter's virtual > > > > - * address (sh_addr) and corresponding file offset (sh_offset) to transform > > > > + * by looking up symbol's containing program header and using its virtual > > > > + * address (p_vaddr) and corresponding file offset (p_offset) to transform > > > > * sym.st_value (virtual address) into desired final file offset. > > > > */ > > > > -static unsigned long elf_sym_offset(struct elf_sym *sym) > > > > +static unsigned long elf_sym_offset(Elf *elf, struct elf_sym *sym) > > > > { > > > > - return sym->sym.st_value - sym->sh.sh_addr + sym->sh.sh_offset; > > > > + size_t nhdrs, i; > > > > + GElf_Phdr phdr; > > > > + > > > > + if (elf_getphdrnum(elf, &nhdrs)) > > > > + return -1; > > > > + > > > > + for (i = 0; i < nhdrs; i++) { > > > > + if (!gelf_getphdr(elf, (int)i, &phdr)) > > > > + continue; > > > > + if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X)) > > > > + continue; > > > > + if (sym->sym.st_value >= phdr.p_vaddr && > > > > + sym->sym.st_value < (phdr.p_vaddr + phdr.p_memsz)) > > > > + return sym->sym.st_value - phdr.p_vaddr + phdr.p_offset; > > Hengqi, > > Can you please provide an example where existing code doesn't work? I think that > > sym->sym.st_value - sym->sh.sh_addr + sym->sh.sh_offset > > and > > sym->sym.st_value - phdr.p_vaddr + phdr.p_offset > > > Should result in the same, and we don't need to search for a matching > segment if we have an ELF symbol and its section. But maybe I'm > mistaken, so can you please elaborate a bit? The binary ([0]) provided in the issue shows some counterexamples. I could't find an authoritative documentation describing this though. A modified version ([1]) of this patch could pass the CI now. [0]: https://github.com/libbpf/libbpf-rs/issues/1110#issuecomment-2699221802 [1]: https://github.com/kernel-patches/bpf/pull/8647 > > > > > + } > > > > + > > > > + return -1; > > > > } > > > > > > > > /* Find offset of function name in the provided ELF object. "binary_path" is > > > > @@ -329,7 +345,7 @@ long elf_find_func_offset(Elf *elf, const char *binary_path, const char *name) > > > > > > > > if (ret > 0) { > > > > /* handle multiple matches */ > > > > - if (elf_sym_offset(sym) == ret) { > > > > + if (elf_sym_offset(elf, sym) == ret) { > > > > /* same offset, no problem */ > > > > continue; > > > > } else if (last_bind != STB_WEAK && cur_bind != STB_WEAK) { > > > > > > [...] > > >
On Thu, Mar 13, 2025 at 12:23:10PM +0800, Hengqi Chen wrote: SNIP > > > > > tools/lib/bpf/elf.c | 32 ++++++++++++++++++++++++-------- > > > > > 1 file changed, 24 insertions(+), 8 deletions(-) > > > > > > > > > > diff --git a/tools/lib/bpf/elf.c b/tools/lib/bpf/elf.c > > > > > index 823f83ad819c..9b561c8d1eec 100644 > > > > > --- a/tools/lib/bpf/elf.c > > > > > +++ b/tools/lib/bpf/elf.c > > > > > @@ -260,13 +260,29 @@ static bool symbol_match(struct elf_sym_iter *iter, int sh_type, struct elf_sym > > > > > * for shared libs) into file offset, which is what kernel is expecting > > > > > * for uprobe/uretprobe attachment. > > > > > * See Documentation/trace/uprobetracer.rst for more details. This is done > > > > > - * by looking up symbol's containing section's header and using iter's virtual > > > > > - * address (sh_addr) and corresponding file offset (sh_offset) to transform > > > > > + * by looking up symbol's containing program header and using its virtual > > > > > + * address (p_vaddr) and corresponding file offset (p_offset) to transform > > > > > * sym.st_value (virtual address) into desired final file offset. > > > > > */ > > > > > -static unsigned long elf_sym_offset(struct elf_sym *sym) > > > > > +static unsigned long elf_sym_offset(Elf *elf, struct elf_sym *sym) > > > > > { > > > > > - return sym->sym.st_value - sym->sh.sh_addr + sym->sh.sh_offset; > > > > > + size_t nhdrs, i; > > > > > + GElf_Phdr phdr; > > > > > + > > > > > + if (elf_getphdrnum(elf, &nhdrs)) > > > > > + return -1; > > > > > + > > > > > + for (i = 0; i < nhdrs; i++) { > > > > > + if (!gelf_getphdr(elf, (int)i, &phdr)) > > > > > + continue; > > > > > + if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X)) > > > > > + continue; > > > > > + if (sym->sym.st_value >= phdr.p_vaddr && > > > > > + sym->sym.st_value < (phdr.p_vaddr + phdr.p_memsz)) > > > > > + return sym->sym.st_value - phdr.p_vaddr + phdr.p_offset; > > > > Hengqi, > > > > Can you please provide an example where existing code doesn't work? I think that > > > > sym->sym.st_value - sym->sh.sh_addr + sym->sh.sh_offset > > > > and > > > > sym->sym.st_value - phdr.p_vaddr + phdr.p_offset > > > > > > Should result in the same, and we don't need to search for a matching > > segment if we have an ELF symbol and its section. But maybe I'm > > mistaken, so can you please elaborate a bit? > > The binary ([0]) provided in the issue shows some counterexamples. > I could't find an authoritative documentation describing this though. > A modified version ([1]) of this patch could pass the CI now. yes, I tried that binary and it gives me different offsets IIUC the symbol seems to be from .eh_frame_hdr (odd?) while the new logic base it on offset of .text section.. I'm still not following that binary layout completely.. will try to check on that more later today jirka > > [0]: https://github.com/libbpf/libbpf-rs/issues/1110#issuecomment-2699221802 > [1]: https://github.com/kernel-patches/bpf/pull/8647 > > > > > > > > + } > > > > > + > > > > > + return -1; > > > > > } > > > > > > > > > > /* Find offset of function name in the provided ELF object. "binary_path" is > > > > > @@ -329,7 +345,7 @@ long elf_find_func_offset(Elf *elf, const char *binary_path, const char *name) > > > > > > > > > > if (ret > 0) { > > > > > /* handle multiple matches */ > > > > > - if (elf_sym_offset(sym) == ret) { > > > > > + if (elf_sym_offset(elf, sym) == ret) { > > > > > /* same offset, no problem */ > > > > > continue; > > > > > } else if (last_bind != STB_WEAK && cur_bind != STB_WEAK) { > > > > > > > > [...] > > > > >
diff --git a/tools/lib/bpf/elf.c b/tools/lib/bpf/elf.c index 823f83ad819c..9b561c8d1eec 100644 --- a/tools/lib/bpf/elf.c +++ b/tools/lib/bpf/elf.c @@ -260,13 +260,29 @@ static bool symbol_match(struct elf_sym_iter *iter, int sh_type, struct elf_sym * for shared libs) into file offset, which is what kernel is expecting * for uprobe/uretprobe attachment. * See Documentation/trace/uprobetracer.rst for more details. This is done - * by looking up symbol's containing section's header and using iter's virtual - * address (sh_addr) and corresponding file offset (sh_offset) to transform + * by looking up symbol's containing program header and using its virtual + * address (p_vaddr) and corresponding file offset (p_offset) to transform * sym.st_value (virtual address) into desired final file offset. */ -static unsigned long elf_sym_offset(struct elf_sym *sym) +static unsigned long elf_sym_offset(Elf *elf, struct elf_sym *sym) { - return sym->sym.st_value - sym->sh.sh_addr + sym->sh.sh_offset; + size_t nhdrs, i; + GElf_Phdr phdr; + + if (elf_getphdrnum(elf, &nhdrs)) + return -1; + + for (i = 0; i < nhdrs; i++) { + if (!gelf_getphdr(elf, (int)i, &phdr)) + continue; + if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X)) + continue; + if (sym->sym.st_value >= phdr.p_vaddr && + sym->sym.st_value < (phdr.p_vaddr + phdr.p_memsz)) + return sym->sym.st_value - phdr.p_vaddr + phdr.p_offset; + } + + return -1; } /* Find offset of function name in the provided ELF object. "binary_path" is @@ -329,7 +345,7 @@ long elf_find_func_offset(Elf *elf, const char *binary_path, const char *name) if (ret > 0) { /* handle multiple matches */ - if (elf_sym_offset(sym) == ret) { + if (elf_sym_offset(elf, sym) == ret) { /* same offset, no problem */ continue; } else if (last_bind != STB_WEAK && cur_bind != STB_WEAK) { @@ -346,7 +362,7 @@ long elf_find_func_offset(Elf *elf, const char *binary_path, const char *name) } } - ret = elf_sym_offset(sym); + ret = elf_sym_offset(elf, sym); last_bind = cur_bind; } if (ret > 0) @@ -445,7 +461,7 @@ int elf_resolve_syms_offsets(const char *binary_path, int cnt, goto out; while ((sym = elf_sym_iter_next(&iter))) { - unsigned long sym_offset = elf_sym_offset(sym); + unsigned long sym_offset = elf_sym_offset(elf_fd.elf, sym); int bind = GELF_ST_BIND(sym->sym.st_info); struct symbol *found, tmp = { .name = sym->name, @@ -534,7 +550,7 @@ int elf_resolve_pattern_offsets(const char *binary_path, const char *pattern, if (err) goto out; - offsets[cnt++] = elf_sym_offset(sym); + offsets[cnt++] = elf_sym_offset(elf_fd.elf, sym); } /* If we found anything in the first symbol section,
As reported on libbpf-rs issue([0]), the current implementation may resolve symbol to a wrong offset and thus missing uprobe event. Calculate the symbol offset from program header instead. See the BCC implementation (which in turn used by bpftrace) and the spec ([1]) for references. [0]: https://github.com/libbpf/libbpf-rs/issues/1110 [1]: https://refspecs.linuxfoundation.org/elf/ Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> --- tools/lib/bpf/elf.c | 32 ++++++++++++++++++++++++-------- 1 file changed, 24 insertions(+), 8 deletions(-)