diff mbox series

[bpf-next] bpf: add --skip_encoding_btf_inconsistent_proto, --btf_gen_optimized to pahole flags for v1.25

Message ID 1675949331-27935-1-git-send-email-alan.maguire@oracle.com (mailing list archive)
State Accepted
Commit 0243d3dfe274832aa0a16214499c208122345173
Delegated to: BPF
Headers show
Series [bpf-next] bpf: add --skip_encoding_btf_inconsistent_proto, --btf_gen_optimized to pahole flags for v1.25 | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 12 of 12 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning WARNING: line length of 103 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-7 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-37 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-36 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_progs_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on s390x with gcc

Commit Message

Alan Maguire Feb. 9, 2023, 1:28 p.m. UTC
v1.25 of pahole supports filtering out functions with multiple
inconsistent function prototypes or optimized-out parameters
from the BTF representation.  These present problems because
there is no additional info in BTF saying which inconsistent
prototype matches which function instance to help guide
attachment, and functions with optimized-out parameters can
lead to incorrect assumptions about register contents.

So for now, filter out such functions while adding BTF
representations for functions that have "."-suffixes
(foo.isra.0) but not optimized-out parameters.

This patch assumes changes in [1] land and pahole is bumped
to v1.25.

[1] https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>

---
 scripts/pahole-flags.sh | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jiri Olsa Feb. 9, 2023, 3:08 p.m. UTC | #1
On Thu, Feb 09, 2023 at 01:28:51PM +0000, Alan Maguire wrote:
> v1.25 of pahole supports filtering out functions with multiple
> inconsistent function prototypes or optimized-out parameters
> from the BTF representation.  These present problems because
> there is no additional info in BTF saying which inconsistent
> prototype matches which function instance to help guide
> attachment, and functions with optimized-out parameters can
> lead to incorrect assumptions about register contents.
> 
> So for now, filter out such functions while adding BTF
> representations for functions that have "."-suffixes
> (foo.isra.0) but not optimized-out parameters.
> 
> This patch assumes changes in [1] land and pahole is bumped
> to v1.25.
> 
> [1] https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/
> 
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

jirka

> 
> ---
>  scripts/pahole-flags.sh | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> index 1f1f1d3..728d551 100755
> --- a/scripts/pahole-flags.sh
> +++ b/scripts/pahole-flags.sh
> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>  	# see PAHOLE_HAS_LANG_EXCLUDE
>  	extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>  fi
> +if [ "${pahole_ver}" -ge "125" ]; then
> +	extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> +fi
>  
>  echo ${extra_paholeopt}
> -- 
> 1.8.3.1
>
patchwork-bot+netdevbpf@kernel.org Feb. 13, 2023, 5 p.m. UTC | #2
Hello:

This patch was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Thu,  9 Feb 2023 13:28:51 +0000 you wrote:
> v1.25 of pahole supports filtering out functions with multiple
> inconsistent function prototypes or optimized-out parameters
> from the BTF representation.  These present problems because
> there is no additional info in BTF saying which inconsistent
> prototype matches which function instance to help guide
> attachment, and functions with optimized-out parameters can
> lead to incorrect assumptions about register contents.
> 
> [...]

Here is the summary with links:
  - [bpf-next] bpf: add --skip_encoding_btf_inconsistent_proto, --btf_gen_optimized to pahole flags for v1.25
    https://git.kernel.org/bpf/bpf-next/c/0243d3dfe274

You are awesome, thank you!
Alexei Starovoitov Feb. 14, 2023, 3:12 a.m. UTC | #3
On Thu, Feb 9, 2023 at 5:29 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> v1.25 of pahole supports filtering out functions with multiple
> inconsistent function prototypes or optimized-out parameters
> from the BTF representation.  These present problems because
> there is no additional info in BTF saying which inconsistent
> prototype matches which function instance to help guide
> attachment, and functions with optimized-out parameters can
> lead to incorrect assumptions about register contents.
>
> So for now, filter out such functions while adding BTF
> representations for functions that have "."-suffixes
> (foo.isra.0) but not optimized-out parameters.
>
> This patch assumes changes in [1] land and pahole is bumped
> to v1.25.
>
> [1] https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ---
>  scripts/pahole-flags.sh | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> index 1f1f1d3..728d551 100755
> --- a/scripts/pahole-flags.sh
> +++ b/scripts/pahole-flags.sh
> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>         # see PAHOLE_HAS_LANG_EXCLUDE
>         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>  fi
> +if [ "${pahole_ver}" -ge "125" ]; then
> +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> +fi

We landed this too soon.
#229     tracing_struct:FAIL
is failing now.
since bpf_testmod.ko is missing a bunch of functions though they're global.

I've tried a bunch of different flags and attributes, but none of them
helped.
The only thing that works is:
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 46500636d8cd..5fd0f75d5d20 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -28,6 +28,7 @@ struct bpf_testmod_struct_arg_2 {
        long b;
 };

+__attribute__((optimize("-O0")))
 noinline int
 bpf_testmod_test_struct_arg_1(struct bpf_testmod_struct_arg_2 a, int
b, int c) {

We cannot do:
--- a/tools/testing/selftests/bpf/bpf_testmod/Makefile
+++ b/tools/testing/selftests/bpf/bpf_testmod/Makefile
@@ -10,7 +10,7 @@ endif
 MODULES = bpf_testmod.ko

 obj-m += bpf_testmod.o
-CFLAGS_bpf_testmod.o = -I$(src)
+CFLAGS_bpf_testmod.o = -I$(src) -O0

The build fails due to asm stuff.

Maybe we should make scripts/pahole-flags.sh selective
and don't apply skip_encoding_btf_inconsiste to bpf_testmod ?

Thoughts?
Alexei Starovoitov Feb. 14, 2023, 6:09 a.m. UTC | #4
On Mon, Feb 13, 2023 at 7:12 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Feb 9, 2023 at 5:29 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >
> > v1.25 of pahole supports filtering out functions with multiple
> > inconsistent function prototypes or optimized-out parameters
> > from the BTF representation.  These present problems because
> > there is no additional info in BTF saying which inconsistent
> > prototype matches which function instance to help guide
> > attachment, and functions with optimized-out parameters can
> > lead to incorrect assumptions about register contents.
> >
> > So for now, filter out such functions while adding BTF
> > representations for functions that have "."-suffixes
> > (foo.isra.0) but not optimized-out parameters.
> >
> > This patch assumes changes in [1] land and pahole is bumped
> > to v1.25.
> >
> > [1] https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/
> >
> > Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> > Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> >
> > ---
> >  scripts/pahole-flags.sh | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> > index 1f1f1d3..728d551 100755
> > --- a/scripts/pahole-flags.sh
> > +++ b/scripts/pahole-flags.sh
> > @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
> >         # see PAHOLE_HAS_LANG_EXCLUDE
> >         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
> >  fi
> > +if [ "${pahole_ver}" -ge "125" ]; then
> > +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> > +fi
>
> We landed this too soon.
> #229     tracing_struct:FAIL
> is failing now.
> since bpf_testmod.ko is missing a bunch of functions though they're global.
>
> I've tried a bunch of different flags and attributes, but none of them
> helped.
> The only thing that works is:
> diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> index 46500636d8cd..5fd0f75d5d20 100644
> --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> @@ -28,6 +28,7 @@ struct bpf_testmod_struct_arg_2 {
>         long b;
>  };
>
> +__attribute__((optimize("-O0")))
>  noinline int
>  bpf_testmod_test_struct_arg_1(struct bpf_testmod_struct_arg_2 a, int
> b, int c) {
>
> We cannot do:
> --- a/tools/testing/selftests/bpf/bpf_testmod/Makefile
> +++ b/tools/testing/selftests/bpf/bpf_testmod/Makefile
> @@ -10,7 +10,7 @@ endif
>  MODULES = bpf_testmod.ko
>
>  obj-m += bpf_testmod.o
> -CFLAGS_bpf_testmod.o = -I$(src)
> +CFLAGS_bpf_testmod.o = -I$(src) -O0
>
> The build fails due to asm stuff.
>
> Maybe we should make scripts/pahole-flags.sh selective
> and don't apply skip_encoding_btf_inconsiste to bpf_testmod ?
>
> Thoughts?

It's even worse with clang compiled kernel:
    WARN: resolve_btfids: unresolved symbol tcp_reno_cong_avoid
    WARN: resolve_btfids: unresolved symbol dctcp_update_alpha
    WARN: resolve_btfids: unresolved symbol cubictcp_cong_avoid
    WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_timestamp
    WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
    WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
    WARN: resolve_btfids: unresolved symbol bpf_task_acquire_not_zero
    WARN: resolve_btfids: unresolved symbol bpf_rdonly_cast
    WARN: resolve_btfids: unresolved symbol
bpf_kfunc_call_test_static_unused_arg
    WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_ref

so I reverted this commit for now.
Looks like pahole with skip_encoding_btf_inconsistent_proto needs
to be more accurate.
It's way too aggressive removing valid functions.
Jiri Olsa Feb. 14, 2023, 12:27 p.m. UTC | #5
On Mon, Feb 13, 2023 at 07:12:33PM -0800, Alexei Starovoitov wrote:
> On Thu, Feb 9, 2023 at 5:29 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >
> > v1.25 of pahole supports filtering out functions with multiple
> > inconsistent function prototypes or optimized-out parameters
> > from the BTF representation.  These present problems because
> > there is no additional info in BTF saying which inconsistent
> > prototype matches which function instance to help guide
> > attachment, and functions with optimized-out parameters can
> > lead to incorrect assumptions about register contents.
> >
> > So for now, filter out such functions while adding BTF
> > representations for functions that have "."-suffixes
> > (foo.isra.0) but not optimized-out parameters.
> >
> > This patch assumes changes in [1] land and pahole is bumped
> > to v1.25.
> >
> > [1] https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/
> >
> > Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> > Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> >
> > ---
> >  scripts/pahole-flags.sh | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> > index 1f1f1d3..728d551 100755
> > --- a/scripts/pahole-flags.sh
> > +++ b/scripts/pahole-flags.sh
> > @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
> >         # see PAHOLE_HAS_LANG_EXCLUDE
> >         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
> >  fi
> > +if [ "${pahole_ver}" -ge "125" ]; then
> > +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> > +fi
> 
> We landed this too soon.
> #229     tracing_struct:FAIL
> is failing now.
> since bpf_testmod.ko is missing a bunch of functions though they're global.
> 

hum, didn't see this one failing.. I'll try that again

jirka

> I've tried a bunch of different flags and attributes, but none of them
> helped.
> The only thing that works is:
> diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> index 46500636d8cd..5fd0f75d5d20 100644
> --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> @@ -28,6 +28,7 @@ struct bpf_testmod_struct_arg_2 {
>         long b;
>  };
> 
> +__attribute__((optimize("-O0")))
>  noinline int
>  bpf_testmod_test_struct_arg_1(struct bpf_testmod_struct_arg_2 a, int
> b, int c) {
> 
> We cannot do:
> --- a/tools/testing/selftests/bpf/bpf_testmod/Makefile
> +++ b/tools/testing/selftests/bpf/bpf_testmod/Makefile
> @@ -10,7 +10,7 @@ endif
>  MODULES = bpf_testmod.ko
> 
>  obj-m += bpf_testmod.o
> -CFLAGS_bpf_testmod.o = -I$(src)
> +CFLAGS_bpf_testmod.o = -I$(src) -O0
> 
> The build fails due to asm stuff.
> 
> Maybe we should make scripts/pahole-flags.sh selective
> and don't apply skip_encoding_btf_inconsiste to bpf_testmod ?
> 
> Thoughts?
Arnaldo Carvalho de Melo Feb. 14, 2023, 4:17 p.m. UTC | #6
Em Tue, Feb 14, 2023 at 01:27:43PM +0100, Jiri Olsa escreveu:
> On Mon, Feb 13, 2023 at 07:12:33PM -0800, Alexei Starovoitov wrote:
> > On Thu, Feb 9, 2023 at 5:29 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > >
> > > v1.25 of pahole supports filtering out functions with multiple
> > > inconsistent function prototypes or optimized-out parameters
> > > from the BTF representation.  These present problems because
> > > there is no additional info in BTF saying which inconsistent
> > > prototype matches which function instance to help guide
> > > attachment, and functions with optimized-out parameters can
> > > lead to incorrect assumptions about register contents.
> > >
> > > So for now, filter out such functions while adding BTF
> > > representations for functions that have "."-suffixes
> > > (foo.isra.0) but not optimized-out parameters.
> > >
> > > This patch assumes changes in [1] land and pahole is bumped
> > > to v1.25.
> > >
> > > [1] https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/
> > >
> > > Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> > > Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > >
> > > ---
> > >  scripts/pahole-flags.sh | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> > > index 1f1f1d3..728d551 100755
> > > --- a/scripts/pahole-flags.sh
> > > +++ b/scripts/pahole-flags.sh
> > > @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
> > >         # see PAHOLE_HAS_LANG_EXCLUDE
> > >         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
> > >  fi
> > > +if [ "${pahole_ver}" -ge "125" ]; then
> > > +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> > > +fi
> > 
> > We landed this too soon.
> > #229     tracing_struct:FAIL
> > is failing now.
> > since bpf_testmod.ko is missing a bunch of functions though they're global.
> > 
> 
> hum, didn't see this one failing.. I'll try that again

/me too, redoing tests her, with gcc and clang, running selftests on a
system booted with a kernel built with pahole 1.25, etc.

- Arnaldo
 
> jirka
> 
> > I've tried a bunch of different flags and attributes, but none of them
> > helped.
> > The only thing that works is:
> > diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > index 46500636d8cd..5fd0f75d5d20 100644
> > --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > @@ -28,6 +28,7 @@ struct bpf_testmod_struct_arg_2 {
> >         long b;
> >  };
> > 
> > +__attribute__((optimize("-O0")))
> >  noinline int
> >  bpf_testmod_test_struct_arg_1(struct bpf_testmod_struct_arg_2 a, int
> > b, int c) {
> > 
> > We cannot do:
> > --- a/tools/testing/selftests/bpf/bpf_testmod/Makefile
> > +++ b/tools/testing/selftests/bpf/bpf_testmod/Makefile
> > @@ -10,7 +10,7 @@ endif
> >  MODULES = bpf_testmod.ko
> > 
> >  obj-m += bpf_testmod.o
> > -CFLAGS_bpf_testmod.o = -I$(src)
> > +CFLAGS_bpf_testmod.o = -I$(src) -O0
> > 
> > The build fails due to asm stuff.
> > 
> > Maybe we should make scripts/pahole-flags.sh selective
> > and don't apply skip_encoding_btf_inconsiste to bpf_testmod ?
> > 
> > Thoughts?
Jiri Olsa Feb. 14, 2023, 10:12 p.m. UTC | #7
On Tue, Feb 14, 2023 at 01:17:56PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Feb 14, 2023 at 01:27:43PM +0100, Jiri Olsa escreveu:
> > On Mon, Feb 13, 2023 at 07:12:33PM -0800, Alexei Starovoitov wrote:
> > > On Thu, Feb 9, 2023 at 5:29 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > > >
> > > > v1.25 of pahole supports filtering out functions with multiple
> > > > inconsistent function prototypes or optimized-out parameters
> > > > from the BTF representation.  These present problems because
> > > > there is no additional info in BTF saying which inconsistent
> > > > prototype matches which function instance to help guide
> > > > attachment, and functions with optimized-out parameters can
> > > > lead to incorrect assumptions about register contents.
> > > >
> > > > So for now, filter out such functions while adding BTF
> > > > representations for functions that have "."-suffixes
> > > > (foo.isra.0) but not optimized-out parameters.
> > > >
> > > > This patch assumes changes in [1] land and pahole is bumped
> > > > to v1.25.
> > > >
> > > > [1] https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/
> > > >
> > > > Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> > > > Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> > > >
> > > > ---
> > > >  scripts/pahole-flags.sh | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> > > > index 1f1f1d3..728d551 100755
> > > > --- a/scripts/pahole-flags.sh
> > > > +++ b/scripts/pahole-flags.sh
> > > > @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
> > > >         # see PAHOLE_HAS_LANG_EXCLUDE
> > > >         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
> > > >  fi
> > > > +if [ "${pahole_ver}" -ge "125" ]; then
> > > > +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> > > > +fi
> > > 
> > > We landed this too soon.
> > > #229     tracing_struct:FAIL
> > > is failing now.
> > > since bpf_testmod.ko is missing a bunch of functions though they're global.
> > > 
> > 
> > hum, didn't see this one failing.. I'll try that again
> 
> /me too, redoing tests her, with gcc and clang, running selftests on a
> system booted with a kernel built with pahole 1.25, etc.

ok, can't see that with gcc, but reproduced with clang 16

resolve_btfids complains because those functions are not in btf

  BTFIDS  vmlinux
WARN: resolve_btfids: unresolved symbol tcp_reno_cong_avoid
WARN: resolve_btfids: unresolved symbol should_failslab
WARN: resolve_btfids: unresolved symbol should_fail_alloc_page
WARN: resolve_btfids: unresolved symbol cubictcp_cong_avoid
WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_timestamp
WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
WARN: resolve_btfids: unresolved symbol bpf_task_acquire_not_zero
WARN: resolve_btfids: unresolved symbol bpf_rdonly_cast
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_static_unused_arg
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_ref
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass_ctx
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass2
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass1
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_pass1
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_fail2
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_fail1
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_kptr_get
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail3
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail2
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail1
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_acquire
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test2
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test1
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_memb_release
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_memb1_release
WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_int_mem_release
  NM      System.map

jirka
Alan Maguire Feb. 17, 2023, 1:45 p.m. UTC | #8
On 14/02/2023 22:12, Jiri Olsa wrote:
> On Tue, Feb 14, 2023 at 01:17:56PM -0300, Arnaldo Carvalho de Melo wrote:
>> Em Tue, Feb 14, 2023 at 01:27:43PM +0100, Jiri Olsa escreveu:
>>> On Mon, Feb 13, 2023 at 07:12:33PM -0800, Alexei Starovoitov wrote:
>>>> On Thu, Feb 9, 2023 at 5:29 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>>>
>>>>> v1.25 of pahole supports filtering out functions with multiple
>>>>> inconsistent function prototypes or optimized-out parameters
>>>>> from the BTF representation.  These present problems because
>>>>> there is no additional info in BTF saying which inconsistent
>>>>> prototype matches which function instance to help guide
>>>>> attachment, and functions with optimized-out parameters can
>>>>> lead to incorrect assumptions about register contents.
>>>>>
>>>>> So for now, filter out such functions while adding BTF
>>>>> representations for functions that have "."-suffixes
>>>>> (foo.isra.0) but not optimized-out parameters.
>>>>>
>>>>> This patch assumes changes in [1] land and pahole is bumped
>>>>> to v1.25.
>>>>>
>>>>> [1] https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/
>>>>>
>>>>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>>>>> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>>>>>
>>>>> ---
>>>>>  scripts/pahole-flags.sh | 3 +++
>>>>>  1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
>>>>> index 1f1f1d3..728d551 100755
>>>>> --- a/scripts/pahole-flags.sh
>>>>> +++ b/scripts/pahole-flags.sh
>>>>> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>>>>>         # see PAHOLE_HAS_LANG_EXCLUDE
>>>>>         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>>>>>  fi
>>>>> +if [ "${pahole_ver}" -ge "125" ]; then
>>>>> +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
>>>>> +fi
>>>>
>>>> We landed this too soon.
>>>> #229     tracing_struct:FAIL
>>>> is failing now.
>>>> since bpf_testmod.ko is missing a bunch of functions though they're global.
>>>>
>>>
>>> hum, didn't see this one failing.. I'll try that again
>>
>> /me too, redoing tests her, with gcc and clang, running selftests on a
>> system booted with a kernel built with pahole 1.25, etc.
> 
> ok, can't see that with gcc, but reproduced with clang 16
> 
> resolve_btfids complains because those functions are not in btf
> 
>   BTFIDS  vmlinux
> WARN: resolve_btfids: unresolved symbol tcp_reno_cong_avoid
> WARN: resolve_btfids: unresolved symbol should_failslab
> WARN: resolve_btfids: unresolved symbol should_fail_alloc_page
> WARN: resolve_btfids: unresolved symbol cubictcp_cong_avoid
> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_timestamp
> WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> WARN: resolve_btfids: unresolved symbol bpf_task_acquire_not_zero
> WARN: resolve_btfids: unresolved symbol bpf_rdonly_cast
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_static_unused_arg
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_ref
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass_ctx
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass2
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_pass1
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_pass1
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_fail2
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_mem_len_fail1
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_kptr_get
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail3
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail2
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_fail1
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_acquire
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test2
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test1
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_memb_release
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_memb1_release
> WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_int_mem_release
>   NM      System.map
>

Jiri kindly provided a clang-generated vmlinux, and I also managed to reproduce
issues by building the kernel with clang 17.

The first question is if we're detecting optimizations correctly. From
an initial look, I _think_ we are, in some cases at least.

For tcp_reno_cong_avoid(), the function signature is

__bpf_kfunc void tcp_reno_cong_avoid(struct sock *sk, u32 ack, u32 acked)

...but our handling of the DWARF generated spots that the "ack" parameter
has no location info; and looking at the source it is never used.  Here is 
the DWARF - note no location info for the second parameter ("ack"):

0x0891dab0:   DW_TAG_subprogram
                DW_AT_low_pc    (0xffffffff82031180)
                DW_AT_high_pc   (0xffffffff820311d8)
                DW_AT_frame_base        (DW_OP_reg7 RSP)
                DW_AT_call_all_calls    (true)
                DW_AT_name      ("tcp_reno_cong_avoid")
                DW_AT_decl_file ("/home/jolsa/kernel/linux-qemu/net/ipv4/tcp_cong.c")
                DW_AT_decl_line (446)
                DW_AT_prototyped        (true)
                DW_AT_external  (true)

0x0891dabe:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x7a) loclist = 0x00f50eb1:
                     [0xffffffff82031185, 0xffffffff8203119e): DW_OP_reg5 RDI
                     [0xffffffff8203119e, 0xffffffff820311cc): DW_OP_reg3 RBX
                     [0xffffffff820311cc, 0xffffffff820311d1): DW_OP_reg5 RDI
                     [0xffffffff820311d1, 0xffffffff820311d2): DW_OP_reg3 RBX
                     [0xffffffff820311d2, 0xffffffff820311d8): DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value)
                  DW_AT_name    ("sk")
                  DW_AT_decl_file       ("/home/jolsa/kernel/linux-qemu/net/ipv4/tcp_cong.c")
                  DW_AT_decl_line       (446)
                  DW_AT_type    (0x08902c4d "sock *")

0x0891dac9:     DW_TAG_formal_parameter
                  DW_AT_name    ("ack")
                  DW_AT_decl_file       ("/home/jolsa/kernel/linux-qemu/net/ipv4/tcp_cong.c")
                  DW_AT_decl_line       (446)
                  DW_AT_type    (0x08902c3d "u32")

0x0891dad4:     DW_TAG_formal_parameter
                  DW_AT_location        (indexed (0x7b) loclist = 0x00f50eda:
                     [0xffffffff82031185, 0xffffffff820311bc): DW_OP_reg1 RDX
                     [0xffffffff820311bc, 0xffffffff820311c8): DW_OP_reg0 RAX
                     [0xffffffff820311c8, 0xffffffff820311d1): DW_OP_reg1 RDX)
                  DW_AT_name    ("acked")
                  DW_AT_decl_file       ("/home/jolsa/kernel/linux-qemu/net/ipv4/tcp_cong.c")
                  DW_AT_decl_line       (446)
                  DW_AT_type    (0x08902c3d "u32")

Disassembling we see the following:

(gdb) disassemble/s tcp_reno_cong_avoid
Dump of assembler code for function tcp_reno_cong_avoid:
net/ipv4/tcp_cong.c:
447	{
   0xffffffff82031180 <+0>:	nopl   0x0(%rax,%rax,1)
   0xffffffff82031185 <+5>:	push   %rbx
   0xffffffff82031186 <+6>:	mov    %rdi,%rbx

./include/net/tcp.h:
1305		if (tp->is_cwnd_limited)
   0xffffffff82031189 <+9>:	cmpb   $0x0,0x95f(%rdi)
   0xffffffff82031190 <+16>:	mov    0x9e4(%rdi),%eax
   0xffffffff82031196 <+22>:	mov    0x9e8(%rdi),%esi
   0xffffffff8203119c <+28>:	js     0xffffffff820311ae <tcp_reno_cong_avoid+46>

1238		return tcp_snd_cwnd(tp) < tp->snd_ssthresh;
   0xffffffff8203119e <+30>:	cmp    %eax,%esi

1306			return true;
1307	
1308		/* If in slow start, ensure cwnd grows to twice what was ACKed. */
1309		if (tcp_in_slow_start(tp))
   0xffffffff820311a0 <+32>:	jae    0xffffffff820311d1 <tcp_reno_cong_avoid+81>

1310			return tcp_snd_cwnd(tp) < 2 * tp->max_packets_out;
   0xffffffff820311a2 <+34>:	mov    0x9b4(%rbx),%ecx
   0xffffffff820311a8 <+40>:	add    %ecx,%ecx
   0xffffffff820311aa <+42>:	cmp    %ecx,%esi

net/ipv4/tcp_cong.c:
450		if (!tcp_is_cwnd_limited(sk))
   0xffffffff820311ac <+44>:	jae    0xffffffff820311d1 <tcp_reno_cong_avoid+81>

./include/net/tcp.h:
1238		return tcp_snd_cwnd(tp) < tp->snd_ssthresh;
   0xffffffff820311ae <+46>:	cmp    %eax,%esi

net/ipv4/tcp_cong.c:
454		if (tcp_in_slow_start(tp)) {
   0xffffffff820311b0 <+48>:	jae    0xffffffff820311c8 <tcp_reno_cong_avoid+72>

455			acked = tcp_slow_start(tp, acked);
   0xffffffff820311b2 <+50>:	mov    %rbx,%rdi
   0xffffffff820311b5 <+53>:	mov    %edx,%esi
   0xffffffff820311b7 <+55>:	call   0xffffffff82031080 <tcp_slow_start>

--Type <RET> for more, q to quit, c to continue without paging--
456			if (!acked)
   0xffffffff820311bc <+60>:	test   %eax,%eax
   0xffffffff820311be <+62>:	je     0xffffffff820311d1 <tcp_reno_cong_avoid+81>
   0xffffffff820311c0 <+64>:	mov    %eax,%edx

./include/net/tcp.h:
1227		return tp->snd_cwnd;
   0xffffffff820311c2 <+66>:	mov    0x9e8(%rbx),%esi

net/ipv4/tcp_cong.c:
460		tcp_cong_avoid_ai(tp, tcp_snd_cwnd(tp), acked);
   0xffffffff820311c8 <+72>:	mov    %rbx,%rdi
   0xffffffff820311cb <+75>:	pop    %rbx
   0xffffffff820311cc <+76>:	jmp    0xffffffff820310d0 <tcp_cong_avoid_ai>

461	}
   0xffffffff820311d1 <+81>:	pop    %rbx
   0xffffffff820311d2 <+82>:	cs jmp 0xffffffff8223c240 <__x86_return_thunk>
End of assembler dump.

From what I can see above - unless I'm misreading it - we see something
interesting here. Note that in preparing the call to tcp_cong_avoid_ai(),
we get the tcp_snd_cwnd() value into %esi, but nothing needs to be done with 
%rdx because it's already got the "acked" value in it. Now this is good
news, because if we're calling this kfunc - that only uses the first and
third parameters - preparing all three will not lead to any nasty surprises
(we just wasted a bit of time preparing the second unused parameter).
If this held true for all kfuncs it would mean that skipping representing
them in BTF due to optimized-out parameters would be the wrong answer.
The key thing to figure out is this - if we optimize out a parameter, will
the subsequent parameters that are not optimized out still use the registers
that they would be expected to if no optimization had happened? So if I 
optimize out the first parameter say, will the second parameter use the 
"right" register (%rsi on x86_64)? If that were guaranteed, we could relax
the cases where we skip BTF generation to the following:

1. cases where multiple inconsistent function prototypes for the same name
   exist.
2. cases where a function has multiple instances with different optimization
   states (optimized out parameter in one CU but not another). This is another
   instance of 1 really, and shouldn't be an issue for kfuncs.
3. cases where an optimized-out parameter has knock-on effects for the
   registers used to handle other unoptimized parameters such that assumptions
   we would make from the function signature are violated for non-optimized out
   parameters. So the parameter would be flagged as using an unexpected
   register, and that unexpected case would instead lead to skipping BTF
   encoding.

I can have a go at implementing the above in pahole and seeing how it effects
our list of functions.

Note though that what's good for kfuncs isn't necessarily good for tracing
accuracy; if assumptions I make about parameter presence are violated, I will
see strange trace results based upon reading the code, whereas if I am
preparing kfunc parameters, a bit of extra work is done preparing an unused
parameter, but no harm is done. For the tracing case, function annotations
flagging optimized-out parameters via BTF tags could help.

However, none of the above will help problematic cases like
bpf_xdp_metadata_rx_timestamp() or bpf_xdp_metadata_rx_hash(); 
again we see missing location info in their DWARF representations

0x07449011:   DW_TAG_subprogram
                DW_AT_low_pc    (0xffffffff81ec8c80)
                DW_AT_high_pc   (0xffffffff81ec8c90)
                DW_AT_frame_base        (DW_OP_reg7 RSP)
                DW_AT_call_all_calls    (true)
                DW_AT_name      ("bpf_xdp_metadata_rx_timestamp")
                DW_AT_decl_file ("/home/jolsa/kernel/linux-qemu/net/core/xdp.c")
                DW_AT_decl_line (725)
                DW_AT_prototyped        (true)
                DW_AT_type      (0x0742f1ae "int")
                DW_AT_external  (true)

0x07449023:     DW_TAG_formal_parameter
                  DW_AT_name    ("ctx")
                  DW_AT_decl_file       ("/home/jolsa/kernel/linux-qemu/net/core/xdp.c")
                  DW_AT_decl_line       (725)
                  DW_AT_type    (0x0743dff0 "const xdp_md *")

0x0744902d:     DW_TAG_formal_parameter
                  DW_AT_name    ("timestamp")
                  DW_AT_decl_file       ("/home/jolsa/kernel/linux-qemu/net/core/xdp.c")
                  DW_AT_decl_line       (725)
                  DW_AT_type    (0x0743217a "u64 *")


...due to the function just being a "return -EOPNOTSUPP;":

Dump of assembler code for function bpf_xdp_metadata_rx_timestamp:
   0xffffffff81ec8c80 <+0>:	nopl   0x0(%rax,%rax,1)
   0xffffffff81ec8c85 <+5>:	mov    $0xffffffa1,%eax
   0xffffffff81ec8c8a <+10>:	cs jmp 0xffffffff8223c240 <__x86_return_thunk>
End of assembler dump.

__bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
{
        return -EOPNOTSUPP;
}

...and playing around with various attributes here does not seem to help.

should_failslab() has a similar story, I suspect because __should_failslab()
got optimized out due to CONFIG_FAILSLAB=n and we get

(gdb) disassemble/s should_failslab
Dump of assembler code for function should_failslab:
mm/slab_common.c:
1462	{
   0xffffffff81422490 <+0>:	nopl   0x0(%rax,%rax,1)

1463		if (__should_failslab(s, gfpflags))
1464			return -ENOMEM;
1465		return 0;
1466	}
   0xffffffff81422495 <+5>:	xor    %eax,%eax
   0xffffffff81422497 <+7>:	cs jmp 0xffffffff8223c240 <__x86_return_thunk>
End of assembler dump.

However, the caller of this function still prepares parameters as if
they were going to be used:

(gdb) disassemble slab_pre_alloc_hook
Dump of assembler code for function slab_pre_alloc_hook:
   0xffffffff813bc5b0 <+0>:	push   %rbp
   0xffffffff813bc5b1 <+1>:	mov    %rsp,%rbp
   0xffffffff813bc5b4 <+4>:	push   %r15
   0xffffffff813bc5b6 <+6>:	push   %r14
   0xffffffff813bc5b8 <+8>:	push   %r13
   0xffffffff813bc5ba <+10>:	push   %r12
   0xffffffff813bc5bc <+12>:	push   %rbx
   0xffffffff813bc5bd <+13>:	sub    $0x20,%rsp
   0xffffffff813bc5c1 <+17>:	mov    %r8d,%r15d
   0xffffffff813bc5c4 <+20>:	mov    %rcx,%r12
   0xffffffff813bc5c7 <+23>:	mov    %rdx,-0x48(%rbp)
   0xffffffff813bc5cb <+27>:	mov    %rsi,%rbx
   0xffffffff813bc5ce <+30>:	mov    %rdi,%r13
   0xffffffff813bc5d1 <+33>:	and    0x219ff00(%rip),%r15d        # 0xffffffff8355c4d8 <gfp_allowed_mask>
   0xffffffff813bc5d8 <+40>:	test   $0x400,%r15d
   0xffffffff813bc5df <+47>:	je     0xffffffff813bc5e6 <slab_pre_alloc_hook+54>
   0xffffffff813bc5e1 <+49>:	callq  0xffffffff81d27b68 <__SCT__might_resched>
   0xffffffff813bc5e6 <+54>:	mov    %r13,%rdi
   0xffffffff813bc5e9 <+57>:	mov    %r15d,%esi
   0xffffffff813bc5ec <+60>:	callq  0xffffffff81340a70 <should_failslab>

...so the function is still being called with the right register
values. This points to a conceptual flaw in the approach I was
taking - we cannot equate register _state_ on entry to a function
with register _use_ by that function. So from a DWARF perspective,
the fact that there is no location information does not necessarily
tell us the function is not _called_ with that parameter; rather it 
tells us it is not used within the body of the function.

This all combines to suggest that the only time we should be
definitive in rejecting a function for BTF encoding due to 
optimizations is when they interfere with the expectations we
have about _used_ parameter->register mappings. So concretely,
where the second parameter uses a register other than the
one the calling conventions dictate should be used by the 
second parameter say, or indeed does not use a register at
all. It is possible that we could be more definitive by examining
DWARF call site info, but there is none for some of the above
functions like should_failslab(), so that will not help here
as far as I can see.

Does this seem reasonable, or am I missing something here?
I'm worried we're going to end up playing whack-a-mole with
increasingly clever compiler optimizations, so my instinct
is that we need to be conservative in choosing when to skip
BTF encoding, only doing so when we are certain it will mislead
badly. I _think_ there is some justification to that based on
the above. What do you think?

Alan
Arnaldo Carvalho de Melo March 9, 2023, 1:53 a.m. UTC | #9
Em Mon, Feb 13, 2023 at 10:09:21PM -0800, Alexei Starovoitov escreveu:
> On Mon, Feb 13, 2023 at 7:12 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > On Thu, Feb 9, 2023 at 5:29 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > > +++ b/scripts/pahole-flags.sh
> > > @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
> > >         # see PAHOLE_HAS_LANG_EXCLUDE
> > >         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
> > >  fi
> > > +if [ "${pahole_ver}" -ge "125" ]; then
> > > +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> > > +fi
> >
> > We landed this too soon.
> > #229     tracing_struct:FAIL
> > is failing now.
> > since bpf_testmod.ko is missing a bunch of functions though they're global.
> >
> > I've tried a bunch of different flags and attributes, but none of them
> > helped.
> > The only thing that works is:
> > diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > index 46500636d8cd..5fd0f75d5d20 100644
> > --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > @@ -28,6 +28,7 @@ struct bpf_testmod_struct_arg_2 {
> >         long b;
> >  };
> >
> > +__attribute__((optimize("-O0")))
> >  noinline int
> >  bpf_testmod_test_struct_arg_1(struct bpf_testmod_struct_arg_2 a, int
> > b, int c) {
> >
> > We cannot do:
> > --- a/tools/testing/selftests/bpf/bpf_testmod/Makefile
> > +++ b/tools/testing/selftests/bpf/bpf_testmod/Makefile
> > @@ -10,7 +10,7 @@ endif
> >  MODULES = bpf_testmod.ko
> >
> >  obj-m += bpf_testmod.o
> > -CFLAGS_bpf_testmod.o = -I$(src)
> > +CFLAGS_bpf_testmod.o = -I$(src) -O0
> >
> > The build fails due to asm stuff.
> >
> > Maybe we should make scripts/pahole-flags.sh selective
> > and don't apply skip_encoding_btf_inconsiste to bpf_testmod ?
> >
> > Thoughts?
> 
> It's even worse with clang compiled kernel:

I tested what is now in the master branch with both gcc and clang, on
fedora:37, Alan also tested it, Jiri, it would be great if you could
check if reverting the revert works for you as well.

Thanks,

- Arnaldo

>     WARN: resolve_btfids: unresolved symbol tcp_reno_cong_avoid
>     WARN: resolve_btfids: unresolved symbol dctcp_update_alpha
>     WARN: resolve_btfids: unresolved symbol cubictcp_cong_avoid
>     WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_timestamp
>     WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
>     WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
>     WARN: resolve_btfids: unresolved symbol bpf_task_acquire_not_zero
>     WARN: resolve_btfids: unresolved symbol bpf_rdonly_cast
>     WARN: resolve_btfids: unresolved symbol
> bpf_kfunc_call_test_static_unused_arg
>     WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_ref
> 
> so I reverted this commit for now.
> Looks like pahole with skip_encoding_btf_inconsistent_proto needs
> to be more accurate.
> It's way too aggressive removing valid functions.
Jiri Olsa March 9, 2023, 8:16 a.m. UTC | #10
On Wed, Mar 08, 2023 at 10:53:51PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Feb 13, 2023 at 10:09:21PM -0800, Alexei Starovoitov escreveu:
> > On Mon, Feb 13, 2023 at 7:12 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > On Thu, Feb 9, 2023 at 5:29 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > > > +++ b/scripts/pahole-flags.sh
> > > > @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
> > > >         # see PAHOLE_HAS_LANG_EXCLUDE
> > > >         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
> > > >  fi
> > > > +if [ "${pahole_ver}" -ge "125" ]; then
> > > > +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> > > > +fi
> > >
> > > We landed this too soon.
> > > #229     tracing_struct:FAIL
> > > is failing now.
> > > since bpf_testmod.ko is missing a bunch of functions though they're global.
> > >
> > > I've tried a bunch of different flags and attributes, but none of them
> > > helped.
> > > The only thing that works is:
> > > diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > index 46500636d8cd..5fd0f75d5d20 100644
> > > --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > @@ -28,6 +28,7 @@ struct bpf_testmod_struct_arg_2 {
> > >         long b;
> > >  };
> > >
> > > +__attribute__((optimize("-O0")))
> > >  noinline int
> > >  bpf_testmod_test_struct_arg_1(struct bpf_testmod_struct_arg_2 a, int
> > > b, int c) {
> > >
> > > We cannot do:
> > > --- a/tools/testing/selftests/bpf/bpf_testmod/Makefile
> > > +++ b/tools/testing/selftests/bpf/bpf_testmod/Makefile
> > > @@ -10,7 +10,7 @@ endif
> > >  MODULES = bpf_testmod.ko
> > >
> > >  obj-m += bpf_testmod.o
> > > -CFLAGS_bpf_testmod.o = -I$(src)
> > > +CFLAGS_bpf_testmod.o = -I$(src) -O0
> > >
> > > The build fails due to asm stuff.
> > >
> > > Maybe we should make scripts/pahole-flags.sh selective
> > > and don't apply skip_encoding_btf_inconsiste to bpf_testmod ?
> > >
> > > Thoughts?
> > 
> > It's even worse with clang compiled kernel:
> 
> I tested what is now in the master branch with both gcc and clang, on
> fedora:37, Alan also tested it, Jiri, it would be great if you could
> check if reverting the revert works for you as well.

ok, will check your master branch

jirka

> 
> Thanks,
> 
> - Arnaldo
> 
> >     WARN: resolve_btfids: unresolved symbol tcp_reno_cong_avoid
> >     WARN: resolve_btfids: unresolved symbol dctcp_update_alpha
> >     WARN: resolve_btfids: unresolved symbol cubictcp_cong_avoid
> >     WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_timestamp
> >     WARN: resolve_btfids: unresolved symbol bpf_xdp_metadata_rx_hash
> >     WARN: resolve_btfids: unresolved symbol bpf_task_kptr_get
> >     WARN: resolve_btfids: unresolved symbol bpf_task_acquire_not_zero
> >     WARN: resolve_btfids: unresolved symbol bpf_rdonly_cast
> >     WARN: resolve_btfids: unresolved symbol
> > bpf_kfunc_call_test_static_unused_arg
> >     WARN: resolve_btfids: unresolved symbol bpf_kfunc_call_test_ref
> > 
> > so I reverted this commit for now.
> > Looks like pahole with skip_encoding_btf_inconsistent_proto needs
> > to be more accurate.
> > It's way too aggressive removing valid functions.
Jiri Olsa March 9, 2023, 10:11 a.m. UTC | #11
On Thu, Mar 09, 2023 at 09:16:53AM +0100, Jiri Olsa wrote:
> On Wed, Mar 08, 2023 at 10:53:51PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Feb 13, 2023 at 10:09:21PM -0800, Alexei Starovoitov escreveu:
> > > On Mon, Feb 13, 2023 at 7:12 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > > On Thu, Feb 9, 2023 at 5:29 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > > > > +++ b/scripts/pahole-flags.sh
> > > > > @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
> > > > >         # see PAHOLE_HAS_LANG_EXCLUDE
> > > > >         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
> > > > >  fi
> > > > > +if [ "${pahole_ver}" -ge "125" ]; then
> > > > > +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> > > > > +fi
> > > >
> > > > We landed this too soon.
> > > > #229     tracing_struct:FAIL
> > > > is failing now.
> > > > since bpf_testmod.ko is missing a bunch of functions though they're global.
> > > >
> > > > I've tried a bunch of different flags and attributes, but none of them
> > > > helped.
> > > > The only thing that works is:
> > > > diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > > b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > > index 46500636d8cd..5fd0f75d5d20 100644
> > > > --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > > +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > > @@ -28,6 +28,7 @@ struct bpf_testmod_struct_arg_2 {
> > > >         long b;
> > > >  };
> > > >
> > > > +__attribute__((optimize("-O0")))
> > > >  noinline int
> > > >  bpf_testmod_test_struct_arg_1(struct bpf_testmod_struct_arg_2 a, int
> > > > b, int c) {
> > > >
> > > > We cannot do:
> > > > --- a/tools/testing/selftests/bpf/bpf_testmod/Makefile
> > > > +++ b/tools/testing/selftests/bpf/bpf_testmod/Makefile
> > > > @@ -10,7 +10,7 @@ endif
> > > >  MODULES = bpf_testmod.ko
> > > >
> > > >  obj-m += bpf_testmod.o
> > > > -CFLAGS_bpf_testmod.o = -I$(src)
> > > > +CFLAGS_bpf_testmod.o = -I$(src) -O0
> > > >
> > > > The build fails due to asm stuff.
> > > >
> > > > Maybe we should make scripts/pahole-flags.sh selective
> > > > and don't apply skip_encoding_btf_inconsiste to bpf_testmod ?
> > > >
> > > > Thoughts?
> > > 
> > > It's even worse with clang compiled kernel:
> > 
> > I tested what is now in the master branch with both gcc and clang, on
> > fedora:37, Alan also tested it, Jiri, it would be great if you could
> > check if reverting the revert works for you as well.
> 
> ok, will check your master branch

looks good.. got no duplicates and passing bpf tests for both
gcc and clang setups

jirka
Arnaldo Carvalho de Melo March 9, 2023, 12:26 p.m. UTC | #12
Em Thu, Mar 09, 2023 at 11:11:27AM +0100, Jiri Olsa escreveu:
> On Thu, Mar 09, 2023 at 09:16:53AM +0100, Jiri Olsa wrote:
> > On Wed, Mar 08, 2023 at 10:53:51PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Mon, Feb 13, 2023 at 10:09:21PM -0800, Alexei Starovoitov escreveu:
> > > > On Mon, Feb 13, 2023 at 7:12 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> > > > > Maybe we should make scripts/pahole-flags.sh selective
> > > > > and don't apply skip_encoding_btf_inconsiste to bpf_testmod ?

> > > > > Thoughts?

> > > > It's even worse with clang compiled kernel:

> > > I tested what is now in the master branch with both gcc and clang, on
> > > fedora:37, Alan also tested it, Jiri, it would be great if you could
> > > check if reverting the revert works for you as well.

> > ok, will check your master branch

> looks good.. got no duplicates and passing bpf tests for both
> gcc and clang setups

Thanks for testing!

Alexei, since you hit those problems, please consider redoing those
tests in your environment so that we triple check all this.

Thanks,

- Arnaldo
Alan Maguire March 9, 2023, 2:58 p.m. UTC | #13
On 09/03/2023 10:11, Jiri Olsa wrote:
> On Thu, Mar 09, 2023 at 09:16:53AM +0100, Jiri Olsa wrote:
>> On Wed, Mar 08, 2023 at 10:53:51PM -0300, Arnaldo Carvalho de Melo wrote:
>>> Em Mon, Feb 13, 2023 at 10:09:21PM -0800, Alexei Starovoitov escreveu:
>>>> On Mon, Feb 13, 2023 at 7:12 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>>>>> On Thu, Feb 9, 2023 at 5:29 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>>>> +++ b/scripts/pahole-flags.sh
>>>>>> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>>>>>>         # see PAHOLE_HAS_LANG_EXCLUDE
>>>>>>         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>>>>>>  fi
>>>>>> +if [ "${pahole_ver}" -ge "125" ]; then
>>>>>> +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
>>>>>> +fi
>>>>>
>>>>> We landed this too soon.
>>>>> #229     tracing_struct:FAIL
>>>>> is failing now.
>>>>> since bpf_testmod.ko is missing a bunch of functions though they're global.
>>>>>
>>>>> I've tried a bunch of different flags and attributes, but none of them
>>>>> helped.
>>>>> The only thing that works is:
>>>>> diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
>>>>> b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
>>>>> index 46500636d8cd..5fd0f75d5d20 100644
>>>>> --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
>>>>> +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
>>>>> @@ -28,6 +28,7 @@ struct bpf_testmod_struct_arg_2 {
>>>>>         long b;
>>>>>  };
>>>>>
>>>>> +__attribute__((optimize("-O0")))
>>>>>  noinline int
>>>>>  bpf_testmod_test_struct_arg_1(struct bpf_testmod_struct_arg_2 a, int
>>>>> b, int c) {
>>>>>
>>>>> We cannot do:
>>>>> --- a/tools/testing/selftests/bpf/bpf_testmod/Makefile
>>>>> +++ b/tools/testing/selftests/bpf/bpf_testmod/Makefile
>>>>> @@ -10,7 +10,7 @@ endif
>>>>>  MODULES = bpf_testmod.ko
>>>>>
>>>>>  obj-m += bpf_testmod.o
>>>>> -CFLAGS_bpf_testmod.o = -I$(src)
>>>>> +CFLAGS_bpf_testmod.o = -I$(src) -O0
>>>>>
>>>>> The build fails due to asm stuff.
>>>>>
>>>>> Maybe we should make scripts/pahole-flags.sh selective
>>>>> and don't apply skip_encoding_btf_inconsiste to bpf_testmod ?
>>>>>
>>>>> Thoughts?
>>>>
>>>> It's even worse with clang compiled kernel:
>>>
>>> I tested what is now in the master branch with both gcc and clang, on
>>> fedora:37, Alan also tested it, Jiri, it would be great if you could
>>> check if reverting the revert works for you as well.
>>
>> ok, will check your master branch
> 
> looks good.. got no duplicates and passing bpf tests for both
> gcc and clang setups
>

thanks for re-testing! I just did the same for latest bpf-next
on x86_64/aarch64; all looks good.

Alan
diff mbox series

Patch

diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
index 1f1f1d3..728d551 100755
--- a/scripts/pahole-flags.sh
+++ b/scripts/pahole-flags.sh
@@ -23,5 +23,8 @@  if [ "${pahole_ver}" -ge "124" ]; then
 	# see PAHOLE_HAS_LANG_EXCLUDE
 	extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
 fi
+if [ "${pahole_ver}" -ge "125" ]; then
+	extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
+fi
 
 echo ${extra_paholeopt}