diff mbox series

[bpf-next] libbpf: Allow Golang symbols in uprobe secdef

Message ID 20230925025722.46580-1-hengqi.chen@gmail.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series [bpf-next] libbpf: Allow Golang symbols in uprobe secdef | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-VM_Test-0 success Logs for ShellCheck
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-5 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-1 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-28 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_progs on s390x with gcc
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 9 this patch: 9
netdev/cc_maintainers warning 8 maintainers not CCed: martin.lau@linux.dev jolsa@kernel.org haoluo@google.com sdf@google.com john.fastabend@gmail.com yonghong.song@linux.dev kpsingh@kernel.org song@kernel.org
netdev/build_clang success Errors and warnings before: 9 this patch: 9
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 9 this patch: 9
netdev/checkpatch warning WARNING: line length of 84 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Hengqi Chen Sept. 25, 2023, 2:57 a.m. UTC
Golang symbols in ELF files are different from C/C++
which contains special characters like '*', '(' and ')'.
With generics, things get more complicated, there are
symbols like:

  github.com/cilium/ebpf/internal.(*Deque[go.shape.interface {
   Format(fmt.State, int32); TypeName() string;
  github.com/cilium/ebpf/btf.copy() github.com/cilium/ebpf/btf.Type
  }]).Grow

Add " ()*,-/;[]{}" (in alphabetical order) to support matching
against such symbols. Note that ']' and '-' should be the first
and last characters in the %m range as sscanf required.

A working example can be found at this repo ([0]).

  [0]: https://github.com/chenhengqi/libbpf-go-symbols

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
---
 tools/lib/bpf/libbpf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jiri Olsa Sept. 25, 2023, 8:19 a.m. UTC | #1
On Mon, Sep 25, 2023 at 02:57:22AM +0000, Hengqi Chen wrote:
> Golang symbols in ELF files are different from C/C++
> which contains special characters like '*', '(' and ')'.
> With generics, things get more complicated, there are
> symbols like:
> 
>   github.com/cilium/ebpf/internal.(*Deque[go.shape.interface {
>    Format(fmt.State, int32); TypeName() string;
>   github.com/cilium/ebpf/btf.copy() github.com/cilium/ebpf/btf.Type
>   }]).Grow
> 
> Add " ()*,-/;[]{}" (in alphabetical order) to support matching
> against such symbols. Note that ']' and '-' should be the first
> and last characters in the %m range as sscanf required.
> 
> A working example can be found at this repo ([0]).
> 
>   [0]: https://github.com/chenhengqi/libbpf-go-symbols
> 
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
> ---
>  tools/lib/bpf/libbpf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index b4758e54a815..de0e068195ab 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -11630,7 +11630,7 @@ static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf
>  
>  	*link = NULL;
>  
> -	n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[a-zA-Z0-9_.@]+%li",
> +	n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[]a-zA-Z0-9 ()*,./;@[_{}-]+%li",
>  		   &probe_type, &binary_path, &func_name, &offset);

could you please make that work for uprobe.multi (attach_uprobe_multi)
as well?

it uses %ms at the moment and it seems it won't get pass the space
in the symbol name

thanks,
jirka

>  	switch (n) {
>  	case 1:
> -- 
> 2.34.1
> 
>
Andrii Nakryiko Sept. 25, 2023, 11:15 p.m. UTC | #2
On Sun, Sep 24, 2023 at 8:19 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
>
> Golang symbols in ELF files are different from C/C++
> which contains special characters like '*', '(' and ')'.
> With generics, things get more complicated, there are
> symbols like:
>
>   github.com/cilium/ebpf/internal.(*Deque[go.shape.interface {
>    Format(fmt.State, int32); TypeName() string;
>   github.com/cilium/ebpf/btf.copy() github.com/cilium/ebpf/btf.Type
>   }]).Grow
>
> Add " ()*,-/;[]{}" (in alphabetical order) to support matching
> against such symbols. Note that ']' and '-' should be the first
> and last characters in the %m range as sscanf required.
>
> A working example can be found at this repo ([0]).
>
>   [0]: https://github.com/chenhengqi/libbpf-go-symbols
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
> ---
>  tools/lib/bpf/libbpf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index b4758e54a815..de0e068195ab 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -11630,7 +11630,7 @@ static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf
>
>         *link = NULL;
>
> -       n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[a-zA-Z0-9_.@]+%li",
> +       n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[]a-zA-Z0-9 ()*,./;@[_{}-]+%li",

This is almost incomprehensible now... wouldn't it be clearer to just
have a catch-all %ms at the end, and then internally checking if we
have '+%li'? I.e., once we match everything after
"uprobe/<path-to-binary>:", we can strchr('+'), if found, try
sscanf("%li") on the remaining suffix. If that doesn't parse properly,
then we have a choice -- either error out, or just assume that
`+<something>` part is just a part of ELF symbol name?

That way we don't hard-code any fixes set of symbols and avoid any
future crazy adjustments.

WDYT?

>                    &probe_type, &binary_path, &func_name, &offset);
>         switch (n) {
>         case 1:
> --
> 2.34.1
>
Hengqi Chen Sept. 27, 2023, 2:12 a.m. UTC | #3
On Mon, Sep 25, 2023 at 4:19 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Mon, Sep 25, 2023 at 02:57:22AM +0000, Hengqi Chen wrote:
> > Golang symbols in ELF files are different from C/C++
> > which contains special characters like '*', '(' and ')'.
> > With generics, things get more complicated, there are
> > symbols like:
> >
> >   github.com/cilium/ebpf/internal.(*Deque[go.shape.interface {
> >    Format(fmt.State, int32); TypeName() string;
> >   github.com/cilium/ebpf/btf.copy() github.com/cilium/ebpf/btf.Type
> >   }]).Grow
> >
> > Add " ()*,-/;[]{}" (in alphabetical order) to support matching
> > against such symbols. Note that ']' and '-' should be the first
> > and last characters in the %m range as sscanf required.
> >
> > A working example can be found at this repo ([0]).
> >
> >   [0]: https://github.com/chenhengqi/libbpf-go-symbols
> >
> > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
> > ---
> >  tools/lib/bpf/libbpf.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index b4758e54a815..de0e068195ab 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -11630,7 +11630,7 @@ static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf
> >
> >       *link = NULL;
> >
> > -     n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[a-zA-Z0-9_.@]+%li",
> > +     n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[]a-zA-Z0-9 ()*,./;@[_{}-]+%li",
> >                  &probe_type, &binary_path, &func_name, &offset);
>
> could you please make that work for uprobe.multi (attach_uprobe_multi)
> as well?
>

I haven't used uprobe.multi before, let me try.

> it uses %ms at the moment and it seems it won't get pass the space
> in the symbol name
>
> thanks,
> jirka
>
> >       switch (n) {
> >       case 1:
> > --
> > 2.34.1
> >
> >
Hengqi Chen Sept. 27, 2023, 2:17 a.m. UTC | #4
On Tue, Sep 26, 2023 at 7:15 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Sun, Sep 24, 2023 at 8:19 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> >
> > Golang symbols in ELF files are different from C/C++
> > which contains special characters like '*', '(' and ')'.
> > With generics, things get more complicated, there are
> > symbols like:
> >
> >   github.com/cilium/ebpf/internal.(*Deque[go.shape.interface {
> >    Format(fmt.State, int32); TypeName() string;
> >   github.com/cilium/ebpf/btf.copy() github.com/cilium/ebpf/btf.Type
> >   }]).Grow
> >
> > Add " ()*,-/;[]{}" (in alphabetical order) to support matching
> > against such symbols. Note that ']' and '-' should be the first
> > and last characters in the %m range as sscanf required.
> >
> > A working example can be found at this repo ([0]).
> >
> >   [0]: https://github.com/chenhengqi/libbpf-go-symbols
> >
> > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
> > ---
> >  tools/lib/bpf/libbpf.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index b4758e54a815..de0e068195ab 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -11630,7 +11630,7 @@ static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf
> >
> >         *link = NULL;
> >
> > -       n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[a-zA-Z0-9_.@]+%li",
> > +       n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[]a-zA-Z0-9 ()*,./;@[_{}-]+%li",
>
> This is almost incomprehensible now... wouldn't it be clearer to just
> have a catch-all %ms at the end, and then internally checking if we
> have '+%li'? I.e., once we match everything after
> "uprobe/<path-to-binary>:", we can strchr('+'), if found, try
> sscanf("%li") on the remaining suffix. If that doesn't parse properly,
> then we have a choice -- either error out, or just assume that
> `+<something>` part is just a part of ELF symbol name?
>
> That way we don't hard-code any fixes set of symbols and avoid any
> future crazy adjustments.
>
> WDYT?

Sounds good. This also solves the matching of unicode identifiers.

As Jiri mentioned above, %ms won't match whitespaces,
so I am wondering if %m[^\n] is acceptable.

>
> >                    &probe_type, &binary_path, &func_name, &offset);
> >         switch (n) {
> >         case 1:
> > --
> > 2.34.1
> >
diff mbox series

Patch

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b4758e54a815..de0e068195ab 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -11630,7 +11630,7 @@  static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf
 
 	*link = NULL;
 
-	n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[a-zA-Z0-9_.@]+%li",
+	n = sscanf(prog->sec_name, "%m[^/]/%m[^:]:%m[]a-zA-Z0-9 ()*,./;@[_{}-]+%li",
 		   &probe_type, &binary_path, &func_name, &offset);
 	switch (n) {
 	case 1: