[v3,bpf-next,1/4] libbpf: support function name-based attach uprobes

Message ID: 1643645554-28723-2-git-send-email-alan.maguire@oracle.com
State: Changes Requested
Delegated to: BPF
Series: libbpf: name-based u[ret]probe attach

Checks

Context                         Check    Description
bpf/vmtest-bpf-next-PR          fail     PR summary
netdev/tree_selection           success  Clearly marked for bpf-next
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Series has a cover letter
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           success  CCed 10 of 10 maintainers
netdev/build_clang              success  Errors and warnings before: 0 this patch: 0
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/checkpatch               warning  CHECK: Comparison to NULL could be written "strstr"
                                         CHECK: Please use a blank line after function/struct/union/enum declarations
                                         WARNING: line length of N exceeds 80 columns
                                         (N = 83, 84, 85, 87, 88, 89, 90, 92, 93, 94, 95, 96)
netdev/kdoc                     success  Errors and warnings before: 34 this patch: 34
netdev/source_inline            success  Was 0 now: 0
bpf/vmtest-bpf-next             fail     VM_Test

Commit Message

Alan Maguire Jan. 31, 2022, 4:12 p.m. UTC
kprobe attach is name-based, using lookups of kallsyms to translate
a function name to an address.  Currently uprobe attach is done
via an offset value as described in [1].  Extend uprobe opts
for attach to include a function name which can then be converted
into a uprobe-friendly offset.  The calculation is done in
several steps:

1. First, determine the symbol address using libelf; this gives us
   the offset as reported by objdump.
2. If the function is a shared library function - and the binary
   provided is a shared library - no further work is required;
   the address found is the required address
3. If the function is a shared library function in a program
   (as opposed to a shared library), the Procedure Linkage Table
   (PLT) address is found (it is indexed via the dynamic
   symbol table index).  This allows us to instrument a call
   to a shared library function locally in the calling binary,
   reducing overhead versus having a breakpoint in the global
   library.
4. Finally, if the function is local, subtract the base address
   associated with the object, retrieved from ELF program headers.
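
As an illustration of step 4 only, here is a minimal sketch of the
address-to-offset translation.  It assumes a simplified, hypothetical
"struct seg" in place of libelf's GElf_Phdr, and the helper name
vaddr_to_file_offset() is made up for this example; it is not the code
in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the program header fields that
 * matter here; this is not a libelf type.
 */
struct seg {
	uint64_t vaddr;   /* p_vaddr: virtual address of the segment */
	uint64_t memsz;   /* p_memsz: size of the segment in memory */
	uint64_t offset;  /* p_offset: offset of the segment in the file */
	int executable;   /* non-zero for a PT_LOAD segment with PF_X set */
};

/* Translate a symbol's virtual address into a file offset by locating
 * the executable segment that contains it.  Returns -1 if no executable
 * segment contains addr.
 */
static int64_t vaddr_to_file_offset(const struct seg *segs, size_t n,
				    uint64_t addr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!segs[i].executable)
			continue;
		if (addr >= segs[i].vaddr &&
		    addr < segs[i].vaddr + segs[i].memsz)
			return (int64_t)(addr - segs[i].vaddr + segs[i].offset);
	}
	return -1;
}
```

With an executable segment mapped at 0x401000 whose file offset is
0x1000, a symbol at 0x401136 translates to offset 0x1136.  The sketch
uses 64-bit fields throughout so that large addresses are not truncated.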

The resultant value is then added to the func_offset value passed
in to specify the uprobe attach address.  So specifying a func_offset
of 0 along with a function name "printf" will attach to printf entry.

The supported modes of operation are then:

1. to attach to a local function in a binary; function "foo1" in
   "/usr/bin/foo"
2. to attach to a shared library function in a binary;
   function "malloc" in "/usr/bin/foo"
3. to attach to a shared library function in a shared library -
   function "malloc" in libc.

[1] https://www.kernel.org/doc/html/latest/trace/uprobetracer.html

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/lib/bpf/libbpf.c | 250 +++++++++++++++++++++++++++++++++++++++++++++++++
 tools/lib/bpf/libbpf.h |  10 +-
 2 files changed, 259 insertions(+), 1 deletion(-)

Comments

Andrii Nakryiko Feb. 4, 2022, 7:17 p.m. UTC | #1
On Mon, Jan 31, 2022 at 8:13 AM Alan Maguire <alan.maguire@oracle.com> wrote:

This looks great and very clean. I left a few nits, but otherwise it
looks ready (still need to go through the rest of the patches)

>  tools/lib/bpf/libbpf.c | 250 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.h |  10 +-
>  2 files changed, 259 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 4ce94f4..eb95629 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -10203,6 +10203,241 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
>         return pfd;
>  }
>
> +/* uprobes deal in relative offsets; subtract the base address associated with
> + * the mapped binary.  See Documentation/trace/uprobetracer.rst for more
> + * details.
> + */
> +static long elf_find_relative_offset(Elf *elf,  long addr)

nit: too many spaces after comma

> +{
> +       size_t n;
> +       int i;
> +
> +       if (elf_getphdrnum(elf, &n)) {
> +               pr_warn("elf: failed to find program headers: %s\n",
> +                       elf_errmsg(-1));
> +               return -ENOENT;
> +       }
> +
> +       for (i = 0; i < n; i++) {
> +               int seg_start, seg_end, seg_offset;
> +               GElf_Phdr phdr;
> +
> +               if (!gelf_getphdr(elf, i, &phdr)) {
> +                       pr_warn("elf: failed to get program header %d: %s\n",
> +                               i, elf_errmsg(-1));
> +                       return -ENOENT;
> +               }
> +               if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X))
> +                       continue;
> +
> +               seg_start = phdr.p_vaddr;
> +               seg_end = seg_start + phdr.p_memsz;
> +               seg_offset = phdr.p_offset;
> +               if (addr >= seg_start && addr < seg_end)
> +                       return addr -  seg_start + seg_offset;

nit: double space before seg_start

> +       }
> +       pr_warn("elf: failed to find prog header containing 0x%lx\n", addr);
> +       return -ENOENT;
> +}
> +
> +/* Return next ELF section of sh_type after scn, or first of that type
> + * if scn is NULL.
> + */
> +static Elf_Scn *elf_find_next_scn_by_type(Elf *elf, int sh_type, Elf_Scn *scn)
> +{
> +       while ((scn = elf_nextscn(elf, scn)) != NULL) {
> +               GElf_Shdr sh;
> +
> +               if (!gelf_getshdr(scn, &sh))
> +                       continue;
> +               if (sh.sh_type == sh_type)
> +                       break;
> +       }
> +       return scn;
> +}
> +
> +/* For Position-Independent Code-based libraries, a table of trampolines
> + * (Procedure Linking Table) is used to support resolution of symbol
> + * names at linking time.  The goal here is to find the offset associated
> + * with the jump to the actual library function.  If we can instrument that
> + * locally in the specific binary (rather than instrumenting glibc say),
> + * overheads are greatly reduced.
> + *
> + * The method used is to find the .plt section and determine the offset
> + * of the relevant entry (given by the base address plus the index
> + * of the function multiplied by the size of a .plt entry).
> + */
> +static ssize_t elf_find_plt_offset(Elf *elf, size_t ndx)

nit: ndx -> func_idx? libbpf generally uses "idx" naming, "ndx" is
purely libelf's convention (and it is more obvious if it is explicitly
called out that it's index of a function entry)

> +{
> +       Elf_Scn *scn = NULL;
> +       size_t shstrndx;
> +
> +       if (elf_getshdrstrndx(elf, &shstrndx)) {
> +               pr_debug("elf: failed to get section names section index: %s\n",
> +                        elf_errmsg(-1));
> +               return -LIBBPF_ERRNO__FORMAT;
> +       }
> +       while ((scn = elf_find_next_scn_by_type(elf, SHT_PROGBITS, scn))) {
> +               long plt_entry_sz, plt_base;
> +               const char *name;
> +               GElf_Shdr sh;
> +
> +               if (!gelf_getshdr(scn, &sh))
> +                       continue;
> +               name = elf_strptr(elf, shstrndx, sh.sh_name);
> +               if (!name || strcmp(name, ".plt") != 0)
> +                       continue;

Wouldn't it be simpler to use elf_sec_by_name(elf, ".plt"), then get its
Shdr and check for PROGBITS? Given there will be only one .plt, doesn't
that make more sense than this while loop?

> +               plt_base = sh.sh_addr;
> +               plt_entry_sz = sh.sh_entsize;
> +               return plt_base + (ndx * plt_entry_sz);
> +       }
> +       pr_debug("elf: no .plt section found\n");

Do we really need this, especially without a binary path?
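
For reference, the PLT arithmetic in the hunk above reduces to a single
expression.  The sketch below is only an illustration: plt_entry_addr()
and its parameter names are hypothetical, and it takes at face value the
patch's assumptions that the dynamic symbol table index can be used
directly as the PLT slot index and that sh_entsize is non-zero.

```c
#include <stdint.h>

/* Address of PLT slot func_idx, given the .plt section's start address
 * (sh_addr) and per-entry size (sh_entsize).  Mirrors the arithmetic in
 * the quoted hunk; no validation is done.
 */
static uint64_t plt_entry_addr(uint64_t plt_base, uint64_t plt_entry_sz,
			       uint64_t func_idx)
{
	return plt_base + func_idx * plt_entry_sz;
}
```

For a .plt at 0x401020 with 16-byte entries, index 3 lands at 0x401050.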

> +       return -LIBBPF_ERRNO__FORMAT;
> +}
> +
> +/* Find offset of function name in object specified by path.  "name" matches
> + * symbol name or name@@LIB for library functions.
> + */
> +static long elf_find_func_offset(const char *binary_path, const char *name)
> +{
> +       int fd, i, sh_types[2] = { SHT_DYNSYM, SHT_SYMTAB };
> +       bool is_shared_lib, is_name_qualified;
> +       size_t name_len, sym_ndx = -1;
> +       char errmsg[STRERR_BUFSIZE];
> +       long ret = -ENOENT;
> +       GElf_Ehdr ehdr;
> +       Elf *elf;
> +
> +       fd = open(binary_path, O_RDONLY | O_CLOEXEC);
> +       if (fd < 0) {
> +               ret = -errno;
> +               pr_warn("failed to open %s: %s\n", binary_path,
> +                       libbpf_strerror_r(ret, errmsg, sizeof(errmsg)));
> +               return ret;
> +       }
> +       elf = elf_begin(fd, ELF_C_READ_MMAP, NULL);
> +       if (!elf) {
> +               pr_warn("elf: could not read elf from %s: %s\n", binary_path, elf_errmsg(-1));
> +               close(fd);
> +               return -LIBBPF_ERRNO__FORMAT;
> +       }
> +       if (!gelf_getehdr(elf, &ehdr)) {
> +               pr_warn("elf: failed to get ehdr from %s: %s\n", binary_path, elf_errmsg(-1));
> +               ret = -LIBBPF_ERRNO__FORMAT;
> +               goto out;
> +       }
> +       /* for shared lib case, we do not need to calculate relative offset */
> +       is_shared_lib = ehdr.e_type == ET_DYN;
> +
> +       name_len = strlen(name);
> +       /* Does name specify "@@LIB"? */
> +       is_name_qualified = strstr(name, "@@") != NULL;
> +
> +       /* Search SHT_DYNSYM, SHT_SYMTAB for symbol.  This search order is used because if
> +        * the symbol is found in SHY_DYNSYM, the index in that table tells us which index
> +        * to use in the Procedure Linking Table to instrument calls to the shared library
> +        * function, but locally in the binary rather than in the shared library ifself.

typo: itself

> +        * If a binary is stripped, it may also only have SHT_DYNSYM, and a fully-statically
> +        * linked binary may not have SHT_DYMSYM, so absence of a section should not be
> +        * reported as a warning/error.
> +        */
> +       for (i = 0; i < ARRAY_SIZE(sh_types); i++) {
> +               size_t strtabidx, ndx, nr_syms;
> +               Elf_Data *symbols = NULL;
> +               Elf_Scn *scn = NULL;
> +               int last_bind = -1;
> +               const char *sname;
> +               GElf_Shdr sh;
> +
> +               scn = elf_find_next_scn_by_type(elf, sh_types[i], NULL);
> +               if (!scn) {
> +                       pr_debug("elf: failed to find symbol table ELF sections in %s\n",
> +                                binary_path);

you consistently used '%s' for binary_path, let's do that here as well

> +                       continue;
> +               }
> +               if (!gelf_getshdr(scn, &sh))
> +                       continue;
> +               strtabidx = sh.sh_link;
> +               symbols = elf_getdata(scn, 0);
> +               if (!symbols) {
> +                       pr_warn("elf: failed to get symbols for symtab section in %s: %s\n",
> +                               binary_path, elf_errmsg(-1));

and here

> +                       ret = -LIBBPF_ERRNO__FORMAT;
> +                       goto out;
> +               }
> +               nr_syms = symbols->d_size / sh.sh_entsize;
> +
> +               for (ndx = 0; ndx < nr_syms; ndx++) {
> +                       int curr_bind;
> +                       GElf_Sym sym;
> +
> +                       if (!gelf_getsym(symbols, ndx, &sym))
> +                               continue;
> +                       if (GELF_ST_TYPE(sym.st_info) != STT_FUNC)
> +                               continue;
> +
> +                       sname = elf_strptr(elf, strtabidx, sym.st_name);
> +                       if (!sname)
> +                               continue;
> +                       curr_bind = GELF_ST_BIND(sym.st_info);
> +
> +                       /* User can specify func, func@@LIB or func@@LIB_VERSION. */
> +                       if (strncmp(sname, name, name_len) != 0)
> +                               continue;
> +                       /* ...but we don't want a search for "foo" to match 'foo2" also, so any
> +                        * additional characters in sname should be of the form "@@LIB".
> +                        */
> +                       if (!is_name_qualified && strlen(sname) > name_len &&
> +                           sname[name_len] != '@')
> +                               continue;

if both the symbol name and requested function name have @ in them,
what should be the comparison rule? Shouldn't it be an exact match
including '@@' and part after it?

> +
> +                       if (ret >= 0 && last_bind != -1) {

if ret >= 0, last_bind can't be invalid, so let's drop the last_bind check here
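
To make the multiple-match rule easier to reason about, here it is in
isolation.  This is a rough sketch, not the patch code: pick_match(),
the BIND_* constants and the PICK_* return codes are names invented for
the example, and the input is just an array of symbol bindings for the
candidates that already matched by name.  It tracks only the index of
the current choice, so no separate last_bind validity check is needed.

```c
#include <stddef.h>

/* Local stand-ins for STB_LOCAL, STB_GLOBAL and STB_WEAK from <elf.h>. */
enum { BIND_LOCAL = 0, BIND_GLOBAL = 1, BIND_WEAK = 2 };

#define PICK_NONE      (-1)
#define PICK_AMBIGUOUS (-2)

/* Given the bindings of all name-matching symbols, in table order,
 * return the index of the one to use.  Two non-weak matches are
 * ambiguous; a weak match never replaces an earlier match; a non-weak
 * match replaces an earlier weak one.
 */
static int pick_match(const int *binds, size_t n)
{
	int chosen = PICK_NONE;
	size_t i;

	for (i = 0; i < n; i++) {
		if (chosen >= 0) {
			if (binds[chosen] != BIND_WEAK && binds[i] != BIND_WEAK)
				return PICK_AMBIGUOUS;
			if (binds[i] == BIND_WEAK)
				continue;
		}
		chosen = (int)i;
	}
	return chosen;
}
```

So { WEAK, GLOBAL } picks the second entry, { GLOBAL, WEAK } keeps the
first, and { GLOBAL, GLOBAL } is reported as ambiguous.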

> +                               /* handle multiple matches */
> +                               if (last_bind != STB_WEAK && curr_bind != STB_WEAK) {
> +                                       /* Only accept one non-weak bind. */
> +                                       pr_warn("elf: ambiguous match for '%s': %s\n",
> +                                               sname, name);
> +                                       ret = -LIBBPF_ERRNO__FORMAT;
> +                                       goto out;

[...]

Alan Maguire Feb. 25, 2022, 4:12 p.m. UTC | #2
On Fri, 4 Feb 2022, Andrii Nakryiko wrote:

<snip>
 
> if both the symbol name and requested function name have @ in them,
> what should be the comparison rule? Shouldn't it be an exact match
> including '@@' and part after it?
>

In this case, we might want to support matching on malloc@GLIBC and
malloc@GLIBC_2.3.4; in other words, I think it makes sense to let the
caller decide how specific they want to be.  So the caller dictates
the matching length via the argument they provide, with the proviso that
if it's just a function name without a "@LIBRARY" suffix it must match
fully.  The problem with the version numbers associated with functions is
that they're the versions from the mapfiles, so the same library version
has malloc@GLIBC_2.2.5, epoll_ctl@GLIBC_2.3.2 etc.
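
As a concrete illustration of that rule, here is a small sketch that
follows the two checks in the v3 hunk (a prefix compare, then the
requirement that an unqualified name be followed by '@' or nothing).
The helper name sym_name_matches() is hypothetical, and like the v3 code
it treats a requested name as qualified only if it contains "@@".

```c
#include <stdbool.h>
#include <string.h>

/* Does symbol table entry sname satisfy the caller-supplied name?
 * The caller's name must be a prefix of sname; if the caller gave no
 * "@@" qualifier, whatever follows the prefix in sname must start
 * with '@', so that "foo" does not match "foo2".
 */
static bool sym_name_matches(const char *sname, const char *name)
{
	size_t name_len = strlen(name);
	bool is_name_qualified = strstr(name, "@@") != NULL;

	if (strncmp(sname, name, name_len) != 0)
		return false;
	if (!is_name_qualified && strlen(sname) > name_len &&
	    sname[name_len] != '@')
		return false;
	return true;
}
```

Under these checks, "malloc" and "malloc@@GLIBC" both match
"malloc@@GLIBC_2.2.5", while "malloc" does not match "malloc2".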

Thanks for the review! I'm working on incorporating all of these changes
into v4 now.

Alan

Patch

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 4ce94f4..eb95629 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -10203,6 +10203,241 @@  static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
 	return pfd;
 }
 
+/* uprobes deal in relative offsets; subtract the base address associated with
+ * the mapped binary.  See Documentation/trace/uprobetracer.rst for more
+ * details.
+ */
+static long elf_find_relative_offset(Elf *elf,  long addr)
+{
+	size_t n;
+	int i;
+
+	if (elf_getphdrnum(elf, &n)) {
+		pr_warn("elf: failed to find program headers: %s\n",
+			elf_errmsg(-1));
+		return -ENOENT;
+	}
+
+	for (i = 0; i < n; i++) {
+		int seg_start, seg_end, seg_offset;
+		GElf_Phdr phdr;
+
+		if (!gelf_getphdr(elf, i, &phdr)) {
+			pr_warn("elf: failed to get program header %d: %s\n",
+				i, elf_errmsg(-1));
+			return -ENOENT;
+		}
+		if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X))
+			continue;
+
+		seg_start = phdr.p_vaddr;
+		seg_end = seg_start + phdr.p_memsz;
+		seg_offset = phdr.p_offset;
+		if (addr >= seg_start && addr < seg_end)
+			return addr -  seg_start + seg_offset;
+	}
+	pr_warn("elf: failed to find prog header containing 0x%lx\n", addr);
+	return -ENOENT;
+}
+
+/* Return next ELF section of sh_type after scn, or first of that type
+ * if scn is NULL.
+ */
+static Elf_Scn *elf_find_next_scn_by_type(Elf *elf, int sh_type, Elf_Scn *scn)
+{
+	while ((scn = elf_nextscn(elf, scn)) != NULL) {
+		GElf_Shdr sh;
+
+		if (!gelf_getshdr(scn, &sh))
+			continue;
+		if (sh.sh_type == sh_type)
+			break;
+	}
+	return scn;
+}
+
+/* For Position-Independent Code-based libraries, a table of trampolines
+ * (Procedure Linkage Table) is used to support resolution of symbol
+ * names at linking time.  The goal here is to find the offset associated
+ * with the jump to the actual library function.  If we can instrument that
+ * locally in the specific binary (rather than instrumenting glibc say),
+ * overheads are greatly reduced.
+ *
+ * The method used is to find the .plt section and determine the offset
+ * of the relevant entry (given by the base address plus the index
+ * of the function multiplied by the size of a .plt entry).
+ */
+static ssize_t elf_find_plt_offset(Elf *elf, size_t ndx)
+{
+	Elf_Scn *scn = NULL;
+	size_t shstrndx;
+
+	if (elf_getshdrstrndx(elf, &shstrndx)) {
+		pr_debug("elf: failed to get section names section index: %s\n",
+			 elf_errmsg(-1));
+		return -LIBBPF_ERRNO__FORMAT;
+	}
+	while ((scn = elf_find_next_scn_by_type(elf, SHT_PROGBITS, scn))) {
+		long plt_entry_sz, plt_base;
+		const char *name;
+		GElf_Shdr sh;
+
+		if (!gelf_getshdr(scn, &sh))
+			continue;
+		name = elf_strptr(elf, shstrndx, sh.sh_name);
+		if (!name || strcmp(name, ".plt") != 0)
+			continue;
+		plt_base = sh.sh_addr;
+		plt_entry_sz = sh.sh_entsize;
+		return plt_base + (ndx * plt_entry_sz);
+	}
+	pr_debug("elf: no .plt section found\n");
+	return -LIBBPF_ERRNO__FORMAT;
+}
+
+/* Find offset of function name in object specified by path.  "name" matches
+ * symbol name or name@@LIB for library functions.
+ */
+static long elf_find_func_offset(const char *binary_path, const char *name)
+{
+	int fd, i, sh_types[2] = { SHT_DYNSYM, SHT_SYMTAB };
+	bool is_shared_lib, is_name_qualified;
+	size_t name_len, sym_ndx = -1;
+	char errmsg[STRERR_BUFSIZE];
+	long ret = -ENOENT;
+	GElf_Ehdr ehdr;
+	Elf *elf;
+
+	fd = open(binary_path, O_RDONLY | O_CLOEXEC);
+	if (fd < 0) {
+		ret = -errno;
+		pr_warn("failed to open %s: %s\n", binary_path,
+			libbpf_strerror_r(ret, errmsg, sizeof(errmsg)));
+		return ret;
+	}
+	elf = elf_begin(fd, ELF_C_READ_MMAP, NULL);
+	if (!elf) {
+		pr_warn("elf: could not read elf from %s: %s\n", binary_path, elf_errmsg(-1));
+		close(fd);
+		return -LIBBPF_ERRNO__FORMAT;
+	}
+	if (!gelf_getehdr(elf, &ehdr)) {
+		pr_warn("elf: failed to get ehdr from %s: %s\n", binary_path, elf_errmsg(-1));
+		ret = -LIBBPF_ERRNO__FORMAT;
+		goto out;
+	}
+	/* for shared lib case, we do not need to calculate relative offset */
+	is_shared_lib = ehdr.e_type == ET_DYN;
+
+	name_len = strlen(name);
+	/* Does name specify "@@LIB"? */
+	is_name_qualified = strstr(name, "@@") != NULL;
+
+	/* Search SHT_DYNSYM, SHT_SYMTAB for symbol.  This search order is used because if
+	 * the symbol is found in SHT_DYNSYM, the index in that table tells us which index
+	 * to use in the Procedure Linkage Table to instrument calls to the shared library
+	 * function, but locally in the binary rather than in the shared library itself.
+	 * If a binary is stripped, it may also only have SHT_DYNSYM, and a fully-statically
+	 * linked binary may not have SHT_DYNSYM, so absence of a section should not be
+	 * reported as a warning/error.
+	 */
+	for (i = 0; i < ARRAY_SIZE(sh_types); i++) {
+		size_t strtabidx, ndx, nr_syms;
+		Elf_Data *symbols = NULL;
+		Elf_Scn *scn = NULL;
+		int last_bind = -1;
+		const char *sname;
+		GElf_Shdr sh;
+
+		scn = elf_find_next_scn_by_type(elf, sh_types[i], NULL);
+		if (!scn) {
+			pr_debug("elf: failed to find symbol table ELF sections in %s\n",
+				 binary_path);
+			continue;
+		}
+		if (!gelf_getshdr(scn, &sh))
+			continue;
+		strtabidx = sh.sh_link;
+		symbols = elf_getdata(scn, 0);
+		if (!symbols) {
+			pr_warn("elf: failed to get symbols for symtab section in %s: %s\n",
+				binary_path, elf_errmsg(-1));
+			ret = -LIBBPF_ERRNO__FORMAT;
+			goto out;
+		}
+		nr_syms = symbols->d_size / sh.sh_entsize;
+
+		for (ndx = 0; ndx < nr_syms; ndx++) {
+			int curr_bind;
+			GElf_Sym sym;
+
+			if (!gelf_getsym(symbols, ndx, &sym))
+				continue;
+			if (GELF_ST_TYPE(sym.st_info) != STT_FUNC)
+				continue;
+
+			sname = elf_strptr(elf, strtabidx, sym.st_name);
+			if (!sname)
+				continue;
+			curr_bind = GELF_ST_BIND(sym.st_info);
+
+			/* User can specify func, func@@LIB or func@@LIB_VERSION. */
+			if (strncmp(sname, name, name_len) != 0)
+				continue;
+			/* ...but we don't want a search for "foo" to match "foo2" also, so any
+			 * additional characters in sname should be of the form "@@LIB".
+			 */
+			if (!is_name_qualified && strlen(sname) > name_len &&
+			    sname[name_len] != '@')
+				continue;
+
+			if (ret >= 0 && last_bind != -1) {
+				/* handle multiple matches */
+				if (last_bind != STB_WEAK && curr_bind != STB_WEAK) {
+					/* Only accept one non-weak bind. */
+					pr_warn("elf: ambiguous match for '%s': %s\n",
+						sname, name);
+					ret = -LIBBPF_ERRNO__FORMAT;
+					goto out;
+				} else if (curr_bind == STB_WEAK) {
+					/* already have a non-weak bind, and
+					 * this is a weak bind, so ignore.
+					 */
+					continue;
+				}
+			}
+			ret = sym.st_value;
+			last_bind = curr_bind;
+			sym_ndx = ndx;
+		}
+		/* The index of the entry in SHT_DYNSYM gives us the index into the PLT */
+		if (ret == 0 && sh_types[i] == SHT_DYNSYM)
+			ret = elf_find_plt_offset(elf, sym_ndx);
+		/* For binaries that are not shared libraries, we need relative offset */
+		if (ret > 0 && !is_shared_lib)
+			ret = elf_find_relative_offset(elf, ret);
+		if (ret > 0)
+			break;
+	}
+
+	if (ret > 0) {
+		pr_debug("elf: symbol address match for '%s': 0x%lx\n", name, ret);
+	} else {
+		if (ret == 0) {
+			pr_warn("elf: '%s' is 0 in symtab for '%s': %s\n", name, binary_path,
+				is_shared_lib ? "should not be 0 in a shared library" :
+						"try using shared library path instead");
+			ret = -ENOENT;
+		} else {
+			pr_warn("elf: failed to find symbol '%s' in '%s'\n", name, binary_path);
+		}
+	}
+out:
+	elf_end(elf);
+	close(fd);
+	return ret;
+}
+
 LIBBPF_API struct bpf_link *
 bpf_program__attach_uprobe_opts(const struct bpf_program *prog, pid_t pid,
 				const char *binary_path, size_t func_offset,
@@ -10214,6 +10449,7 @@  static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
 	size_t ref_ctr_off;
 	int pfd, err;
 	bool retprobe, legacy;
+	const char *func_name;
 
 	if (!OPTS_VALID(opts, bpf_uprobe_opts))
 		return libbpf_err_ptr(-EINVAL);
@@ -10222,6 +10458,20 @@  static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe,
 	ref_ctr_off = OPTS_GET(opts, ref_ctr_offset, 0);
 	pe_opts.bpf_cookie = OPTS_GET(opts, bpf_cookie, 0);
 
+	func_name = OPTS_GET(opts, func_name, NULL);
+	if (func_name) {
+		long sym_off;
+
+		if (!binary_path) {
+			pr_warn("name-based attach requires binary_path\n");
+			return libbpf_err_ptr(-EINVAL);
+		}
+		sym_off = elf_find_func_offset(binary_path, func_name);
+		if (sym_off < 0)
+			return libbpf_err_ptr(sym_off);
+		func_offset += (size_t)sym_off;
+	}
+
 	legacy = determine_uprobe_perf_type() < 0;
 	if (!legacy) {
 		pfd = perf_event_open_probe(true /* uprobe */, retprobe, binary_path,
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 5762b57..1de3eeb 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -433,9 +433,17 @@  struct bpf_uprobe_opts {
 	__u64 bpf_cookie;
 	/* uprobe is return probe, invoked at function return time */
 	bool retprobe;
+	/* Function name, or function@@LIBRARY.  Partial matches
+	 * work for library functions, such as printf, printf@@GLIBC.
+	 * To specify function entry, func_offset argument should be 0 and
+	 * func_name should specify function to trace.  To trace an offset
+	 * within the function, specify func_name and use func_offset
+	 * argument to specify the offset _within_ the function.
+	 */
+	const char *func_name;
 	size_t :0;
 };
-#define bpf_uprobe_opts__last_field retprobe
+#define bpf_uprobe_opts__last_field func_name
 
 /**
  * @brief **bpf_program__attach_uprobe()** attaches a BPF program