diff mbox series

[bpf-next] bpf: Add --skip_encoding_btf_inconsistent_proto, --btf_gen_optimized to pahole flags for v1.25

Message ID 20230510130241.1696561-1-alan.maguire@oracle.com (mailing list archive)
State Accepted
Commit 7b99f75942da332e3f4f865e55a10fec95a30d4f
Delegated to: BPF
Headers show
Series [bpf-next] bpf: Add --skip_encoding_btf_inconsistent_proto, --btf_gen_optimized to pahole flags for v1.25 | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers success CCed 12 of 12 maintainers
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch warning WARNING: line length of 103 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-7 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_verifier on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-36 success Logs for veristat
bpf/vmtest-bpf-next-VM_Test-8 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_parallel on aarch64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on s390x with gcc

Commit Message

Alan Maguire May 10, 2023, 1:02 p.m. UTC
v1.25 of pahole supports filtering out functions with multiple inconsistent
function prototypes or optimized-out parameters from the BTF representation.
These present problems because there is no additional info in BTF saying which
inconsistent prototype matches which function instance to help guide attachment,
and functions with optimized-out parameters can lead to incorrect assumptions
about register contents.

So for now, filter out such functions while adding BTF representations for
functions that have "."-suffixes (foo.isra.0) but not optimized-out parameters.
This patch assumes that below linked changes land in pahole for v1.25.

Issues with pahole filtering being too aggressive in removing functions
appear to be resolved now, but CI and further testing will confirm.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 scripts/pahole-flags.sh | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jiri Olsa May 10, 2023, 5:15 p.m. UTC | #1
On Wed, May 10, 2023 at 02:02:41PM +0100, Alan Maguire wrote:
> v1.25 of pahole supports filtering out functions with multiple inconsistent
> function prototypes or optimized-out parameters from the BTF representation.
> These present problems because there is no additional info in BTF saying which
> inconsistent prototype matches which function instance to help guide attachment,
> and functions with optimized-out parameters can lead to incorrect assumptions
> about register contents.
> 
> So for now, filter out such functions while adding BTF representations for
> functions that have "."-suffixes (foo.isra.0) but not optimized-out parameters.
> This patch assumes that below linked changes land in pahole for v1.25.
> 
> Issues with pahole filtering being too aggressive in removing functions
> appear to be resolved now, but CI and further testing will confirm.
> 
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

jirka

> ---
>  scripts/pahole-flags.sh | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> index 1f1f1d397c39..728d55190d97 100755
> --- a/scripts/pahole-flags.sh
> +++ b/scripts/pahole-flags.sh
> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>  	# see PAHOLE_HAS_LANG_EXCLUDE
>  	extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>  fi
> +if [ "${pahole_ver}" -ge "125" ]; then
> +	extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> +fi
>  
>  echo ${extra_paholeopt}
> -- 
> 2.31.1
>
Yafang Shao May 12, 2023, 2:51 a.m. UTC | #2
On Wed, May 10, 2023 at 9:03 PM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> v1.25 of pahole supports filtering out functions with multiple inconsistent
> function prototypes or optimized-out parameters from the BTF representation.
> These present problems because there is no additional info in BTF saying which
> inconsistent prototype matches which function instance to help guide attachment,
> and functions with optimized-out parameters can lead to incorrect assumptions
> about register contents.
>
> So for now, filter out such functions while adding BTF representations for
> functions that have "."-suffixes (foo.isra.0) but not optimized-out parameters.
> This patch assumes that below linked changes land in pahole for v1.25.
>
> Issues with pahole filtering being too aggressive in removing functions
> appear to be resolved now, but CI and further testing will confirm.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  scripts/pahole-flags.sh | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> index 1f1f1d397c39..728d55190d97 100755
> --- a/scripts/pahole-flags.sh
> +++ b/scripts/pahole-flags.sh
> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>         # see PAHOLE_HAS_LANG_EXCLUDE
>         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>  fi
> +if [ "${pahole_ver}" -ge "125" ]; then
> +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> +fi
>
>  echo ${extra_paholeopt}
> --
> 2.31.1
>

That change looks like a workaround to me.
There may be multiple functions that have the same proto, e.g.:

  $ grep -r "bpf_iter_detach_map(struct bpf_iter_aux_info \*aux)"
kernel/bpf/ net/core/
  kernel/bpf/map_iter.c:static void bpf_iter_detach_map(struct
bpf_iter_aux_info *aux)
  net/core/bpf_sk_storage.c:static void bpf_iter_detach_map(struct
bpf_iter_aux_info *aux)

  $ bpftool btf dump file /sys/kernel/btf/vmlinux   |  grep -B 2
bpf_iter_detach_map
  [34691] FUNC_PROTO '(anon)' ret_type_id=0 vlen=1
  'aux' type_id=2638
  [34692] FUNC 'bpf_iter_detach_map' type_id=34691 linkage=static

We don't know which one it is in the BTF.
However, I'm not against this change, as it can avoid some issues.
Alan Maguire May 12, 2023, 4:03 p.m. UTC | #3
On 12/05/2023 03:51, Yafang Shao wrote:
> On Wed, May 10, 2023 at 9:03 PM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> v1.25 of pahole supports filtering out functions with multiple inconsistent
>> function prototypes or optimized-out parameters from the BTF representation.
>> These present problems because there is no additional info in BTF saying which
>> inconsistent prototype matches which function instance to help guide attachment,
>> and functions with optimized-out parameters can lead to incorrect assumptions
>> about register contents.
>>
>> So for now, filter out such functions while adding BTF representations for
>> functions that have "."-suffixes (foo.isra.0) but not optimized-out parameters.
>> This patch assumes that below linked changes land in pahole for v1.25.
>>
>> Issues with pahole filtering being too aggressive in removing functions
>> appear to be resolved now, but CI and further testing will confirm.
>>
>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>> ---
>>  scripts/pahole-flags.sh | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
>> index 1f1f1d397c39..728d55190d97 100755
>> --- a/scripts/pahole-flags.sh
>> +++ b/scripts/pahole-flags.sh
>> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>>         # see PAHOLE_HAS_LANG_EXCLUDE
>>         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>>  fi
>> +if [ "${pahole_ver}" -ge "125" ]; then
>> +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
>> +fi
>>
>>  echo ${extra_paholeopt}
>> --
>> 2.31.1
>>
> 
> That change looks like a workaround to me.
> There may be multiple functions that have the same proto, e.g.:
> 
>   $ grep -r "bpf_iter_detach_map(struct bpf_iter_aux_info \*aux)"
> kernel/bpf/ net/core/
>   kernel/bpf/map_iter.c:static void bpf_iter_detach_map(struct
> bpf_iter_aux_info *aux)
>   net/core/bpf_sk_storage.c:static void bpf_iter_detach_map(struct
> bpf_iter_aux_info *aux)
> 
>   $ bpftool btf dump file /sys/kernel/btf/vmlinux   |  grep -B 2
> bpf_iter_detach_map
>   [34691] FUNC_PROTO '(anon)' ret_type_id=0 vlen=1
>   'aux' type_id=2638
>   [34692] FUNC 'bpf_iter_detach_map' type_id=34691 linkage=static
> 
> We don't know which one it is in the BTF.
> However, I'm not against this change, as it can avoid some issues.
> 

In the above case, the BTF representation is consistent though.
That is, if I attach fentry progs to either of these functions
based on that BTF representation, nothing will crash.

That's ultimately what those changes are about; ensuring
consistency in BTF representation, so when a function is in
BTF we can know the signature of the function can be safely
used by fentry for example.

The question of being able to identify functions (as opposed
to having a consistent representation) is the next step.
Finding a way to link between kallsyms and BTF would allow us to
have multiple inconsistent functions in BTF, since we could map
from BTF -> kallsyms safely. So two functions called "foo"
with different function signatures would be okay, because
we'd know which was which in kallsyms and could attach
safely. Something like a BTF tag for the function that could
clarify that mapping - but just for cases where it would
otherwise be ambiguous - is probably the way forward
longer term.

Jiri's talking about this topic at LSF/MM/BPF this week I believe.

Thanks!

Alan
Alexei Starovoitov May 12, 2023, 6:59 p.m. UTC | #4
On Fri, May 12, 2023 at 9:04 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On 12/05/2023 03:51, Yafang Shao wrote:
> > On Wed, May 10, 2023 at 9:03 PM Alan Maguire <alan.maguire@oracle.com> wrote:
> >>
> >> v1.25 of pahole supports filtering out functions with multiple inconsistent
> >> function prototypes or optimized-out parameters from the BTF representation.
> >> These present problems because there is no additional info in BTF saying which
> >> inconsistent prototype matches which function instance to help guide attachment,
> >> and functions with optimized-out parameters can lead to incorrect assumptions
> >> about register contents.
> >>
> >> So for now, filter out such functions while adding BTF representations for
> >> functions that have "."-suffixes (foo.isra.0) but not optimized-out parameters.
> >> This patch assumes that below linked changes land in pahole for v1.25.
> >>
> >> Issues with pahole filtering being too aggressive in removing functions
> >> appear to be resolved now, but CI and further testing will confirm.
> >>
> >> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> >> ---
> >>  scripts/pahole-flags.sh | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> >> index 1f1f1d397c39..728d55190d97 100755
> >> --- a/scripts/pahole-flags.sh
> >> +++ b/scripts/pahole-flags.sh
> >> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
> >>         # see PAHOLE_HAS_LANG_EXCLUDE
> >>         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
> >>  fi
> >> +if [ "${pahole_ver}" -ge "125" ]; then
> >> +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> >> +fi
> >>
> >>  echo ${extra_paholeopt}
> >> --
> >> 2.31.1
> >>
> >
> > That change looks like a workaround to me.
> > There may be multiple functions that have the same proto, e.g.:
> >
> >   $ grep -r "bpf_iter_detach_map(struct bpf_iter_aux_info \*aux)"
> > kernel/bpf/ net/core/
> >   kernel/bpf/map_iter.c:static void bpf_iter_detach_map(struct
> > bpf_iter_aux_info *aux)
> >   net/core/bpf_sk_storage.c:static void bpf_iter_detach_map(struct
> > bpf_iter_aux_info *aux)
> >
> >   $ bpftool btf dump file /sys/kernel/btf/vmlinux   |  grep -B 2
> > bpf_iter_detach_map
> >   [34691] FUNC_PROTO '(anon)' ret_type_id=0 vlen=1
> >   'aux' type_id=2638
> >   [34692] FUNC 'bpf_iter_detach_map' type_id=34691 linkage=static
> >
> > We don't know which one it is in the BTF.
> > However, I'm not against this change, as it can avoid some issues.
> >
>
> In the above case, the BTF representation is consistent though.
> That is, if I attach fentry progs to either of these functions
> based on that BTF representation, nothing will crash.
>
> That's ultimately what those changes are about; ensuring
> consistency in BTF representation, so when a function is in
> BTF we can know the signature of the function can be safely
> used by fentry for example.
>
> The question of being able to identify functions (as opposed
> to having a consistent representation) is the next step.
> Finding a way to link between kallsyms and BTF would allow us to
> have multiple inconsistent functions in BTF, since we could map
> from BTF -> kallsyms safely. So two functions called "foo"
> with different function signatures would be okay, because
> we'd know which was which in kallsyms and could attach
> safely. Something like a BTF tag for the function that could
> clarify that mapping - but just for cases where it would
> otherwise be ambiguous - is probably the way forward
> longer term.
>
> Jiri's talking about this topic at LSF/MM/BPF this week I believe.

Jiri presented a few ideas during LSFMMBPF.

I feel the best approach is to add a set of addr-s to BTF
via a special decl_tag.
We can also consider extending KIND_FUNC.
The advantage that every BTF func will have one or more addrs
associated with it and bpf prog loading logic wouldn't need to do
fragile name comparison between btf and kallsyms.
pahole can take addrs from dwarf and optionally double check with kallsyms.
patchwork-bot+netdevbpf@kernel.org May 12, 2023, 7 p.m. UTC | #5
Hello:

This patch was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Wed, 10 May 2023 14:02:41 +0100 you wrote:
> v1.25 of pahole supports filtering out functions with multiple inconsistent
> function prototypes or optimized-out parameters from the BTF representation.
> These present problems because there is no additional info in BTF saying which
> inconsistent prototype matches which function instance to help guide attachment,
> and functions with optimized-out parameters can lead to incorrect assumptions
> about register contents.
> 
> [...]

Here is the summary with links:
  - [bpf-next] bpf: Add --skip_encoding_btf_inconsistent_proto, --btf_gen_optimized to pahole flags for v1.25
    https://git.kernel.org/bpf/bpf-next/c/7b99f75942da

You are awesome, thank you!
Jiri Olsa May 12, 2023, 9:54 p.m. UTC | #6
On Fri, May 12, 2023 at 11:59:34AM -0700, Alexei Starovoitov wrote:
> On Fri, May 12, 2023 at 9:04 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >
> > On 12/05/2023 03:51, Yafang Shao wrote:
> > > On Wed, May 10, 2023 at 9:03 PM Alan Maguire <alan.maguire@oracle.com> wrote:
> > >>
> > >> v1.25 of pahole supports filtering out functions with multiple inconsistent
> > >> function prototypes or optimized-out parameters from the BTF representation.
> > >> These present problems because there is no additional info in BTF saying which
> > >> inconsistent prototype matches which function instance to help guide attachment,
> > >> and functions with optimized-out parameters can lead to incorrect assumptions
> > >> about register contents.
> > >>
> > >> So for now, filter out such functions while adding BTF representations for
> > >> functions that have "."-suffixes (foo.isra.0) but not optimized-out parameters.
> > >> This patch assumes that below linked changes land in pahole for v1.25.
> > >>
> > >> Issues with pahole filtering being too aggressive in removing functions
> > >> appear to be resolved now, but CI and further testing will confirm.
> > >>
> > >> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> > >> ---
> > >>  scripts/pahole-flags.sh | 3 +++
> > >>  1 file changed, 3 insertions(+)
> > >>
> > >> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> > >> index 1f1f1d397c39..728d55190d97 100755
> > >> --- a/scripts/pahole-flags.sh
> > >> +++ b/scripts/pahole-flags.sh
> > >> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
> > >>         # see PAHOLE_HAS_LANG_EXCLUDE
> > >>         extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
> > >>  fi
> > >> +if [ "${pahole_ver}" -ge "125" ]; then
> > >> +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
> > >> +fi
> > >>
> > >>  echo ${extra_paholeopt}
> > >> --
> > >> 2.31.1
> > >>
> > >
> > > That change looks like a workaround to me.
> > > There may be multiple functions that have the same proto, e.g.:
> > >
> > >   $ grep -r "bpf_iter_detach_map(struct bpf_iter_aux_info \*aux)"
> > > kernel/bpf/ net/core/
> > >   kernel/bpf/map_iter.c:static void bpf_iter_detach_map(struct
> > > bpf_iter_aux_info *aux)
> > >   net/core/bpf_sk_storage.c:static void bpf_iter_detach_map(struct
> > > bpf_iter_aux_info *aux)
> > >
> > >   $ bpftool btf dump file /sys/kernel/btf/vmlinux   |  grep -B 2
> > > bpf_iter_detach_map
> > >   [34691] FUNC_PROTO '(anon)' ret_type_id=0 vlen=1
> > >   'aux' type_id=2638
> > >   [34692] FUNC 'bpf_iter_detach_map' type_id=34691 linkage=static
> > >
> > > We don't know which one it is in the BTF.
> > > However, I'm not against this change, as it can avoid some issues.
> > >
> >
> > In the above case, the BTF representation is consistent though.
> > That is, if I attach fentry progs to either of these functions
> > based on that BTF representation, nothing will crash.
> >
> > That's ultimately what those changes are about; ensuring
> > consistency in BTF representation, so when a function is in
> > BTF we can know the signature of the function can be safely
> > used by fentry for example.
> >
> > The question of being able to identify functions (as opposed
> > to having a consistent representation) is the next step.
> > Finding a way to link between kallsyms and BTF would allow us to
> > have multiple inconsistent functions in BTF, since we could map
> > from BTF -> kallsyms safely. So two functions called "foo"
> > with different function signatures would be okay, because
> > we'd know which was which in kallsyms and could attach
> > safely. Something like a BTF tag for the function that could
> > clarify that mapping - but just for cases where it would
> > otherwise be ambiguous - is probably the way forward
> > longer term.
> >
> > Jiri's talking about this topic at LSF/MM/BPF this week I believe.
> 
> Jiri presented a few ideas during LSFMMBPF.
> 
> I feel the best approach is to add a set of addr-s to BTF
> via a special decl_tag.
> We can also consider extending KIND_FUNC.
> The advantage that every BTF func will have one or more addrs
> associated with it and bpf prog loading logic wouldn't need to do
> fragile name comparison between btf and kallsyms.
> pahole can take addrs from dwarf and optionally double check with kallsyms.

Yonghong summed it up in another email discussion, pasting it in here:

  So overall we have three options as kallsyms representation now:
    (a) "addr module:foo:dir_a/dir_b/core.c"
    (b) "addr module:foo"
    (c) "addr module:foo:btf_id"

  option (a):
    'dir_a/dir_b/core.c' needs to be encoded in BTF.
    user space either check file path or func signature
    to find attach_btf_id and pass to the kernel.
    kernel can find file path in BTF and then lookup
    kallsyms to find addr.

  option (b):
    "addr" needs to be encoded in BTF.
    user space checks func signature to find
    attach_btf_id and pass to the kernel.
    kernel can find addr in BTF and use it.

  option (c):
    if user can decide which function to attach, e.g.,
    through func signature, then no BTF encoding
    is necessary. attach_btf_id is passed to the
    kernel and search kallsyms to find the matching
    btf_id and 'addr' will be available then.

  For option (b) and (c), user space needs to check
  func signature to find which btf_id to use. If
  same-name static functions having the identical
  signatures, then user space would have a hard time
  to differentiate. I think it should be very
  rare same-name static functions in the kernel will have
  identical signatures. But if we want 100% correctness,
  we may need file path in which case option (a)
  is preferable.

  Current option (a) kallsyms format is under review.
  option (c) also needs kallsyms change...


my thoughts so far is that I like the idea of storing functions address
in BTF (option b), because it's the easiest way

on the other hand, user would need debug info to find address for the function
to trace.. but still just for small subset of functions that share same name

also I like Lorenz's idea of storing BTF ID in kalsyms and verifier being
able to lookup address based on BTF ID.. seems like easier kallsyms change,
but it would still require storing objects paths in BTF to pick up the
correct one

cc-ing other folks

jirka
Yonghong Song May 13, 2023, 2:59 a.m. UTC | #7
On 5/12/23 2:54 PM, Jiri Olsa wrote:
> On Fri, May 12, 2023 at 11:59:34AM -0700, Alexei Starovoitov wrote:
>> On Fri, May 12, 2023 at 9:04 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>
>>> On 12/05/2023 03:51, Yafang Shao wrote:
>>>> On Wed, May 10, 2023 at 9:03 PM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>>>
>>>>> v1.25 of pahole supports filtering out functions with multiple inconsistent
>>>>> function prototypes or optimized-out parameters from the BTF representation.
>>>>> These present problems because there is no additional info in BTF saying which
>>>>> inconsistent prototype matches which function instance to help guide attachment,
>>>>> and functions with optimized-out parameters can lead to incorrect assumptions
>>>>> about register contents.
>>>>>
>>>>> So for now, filter out such functions while adding BTF representations for
>>>>> functions that have "."-suffixes (foo.isra.0) but not optimized-out parameters.
>>>>> This patch assumes that below linked changes land in pahole for v1.25.
>>>>>
>>>>> Issues with pahole filtering being too aggressive in removing functions
>>>>> appear to be resolved now, but CI and further testing will confirm.
>>>>>
>>>>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>>>>> ---
>>>>>   scripts/pahole-flags.sh | 3 +++
>>>>>   1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
>>>>> index 1f1f1d397c39..728d55190d97 100755
>>>>> --- a/scripts/pahole-flags.sh
>>>>> +++ b/scripts/pahole-flags.sh
>>>>> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>>>>>          # see PAHOLE_HAS_LANG_EXCLUDE
>>>>>          extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>>>>>   fi
>>>>> +if [ "${pahole_ver}" -ge "125" ]; then
>>>>> +       extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
>>>>> +fi
>>>>>
>>>>>   echo ${extra_paholeopt}
>>>>> --
>>>>> 2.31.1
>>>>>
>>>>
>>>> That change looks like a workaround to me.
>>>> There may be multiple functions that have the same proto, e.g.:
>>>>
>>>>    $ grep -r "bpf_iter_detach_map(struct bpf_iter_aux_info \*aux)"
>>>> kernel/bpf/ net/core/
>>>>    kernel/bpf/map_iter.c:static void bpf_iter_detach_map(struct
>>>> bpf_iter_aux_info *aux)
>>>>    net/core/bpf_sk_storage.c:static void bpf_iter_detach_map(struct
>>>> bpf_iter_aux_info *aux)
>>>>
>>>>    $ bpftool btf dump file /sys/kernel/btf/vmlinux   |  grep -B 2
>>>> bpf_iter_detach_map
>>>>    [34691] FUNC_PROTO '(anon)' ret_type_id=0 vlen=1
>>>>    'aux' type_id=2638
>>>>    [34692] FUNC 'bpf_iter_detach_map' type_id=34691 linkage=static
>>>>
>>>> We don't know which one it is in the BTF.
>>>> However, I'm not against this change, as it can avoid some issues.
>>>>
>>>
>>> In the above case, the BTF representation is consistent though.
>>> That is, if I attach fentry progs to either of these functions
>>> based on that BTF representation, nothing will crash.
>>>
>>> That's ultimately what those changes are about; ensuring
>>> consistency in BTF representation, so when a function is in
>>> BTF we can know the signature of the function can be safely
>>> used by fentry for example.
>>>
>>> The question of being able to identify functions (as opposed
>>> to having a consistent representation) is the next step.
>>> Finding a way to link between kallsyms and BTF would allow us to
>>> have multiple inconsistent functions in BTF, since we could map
>>> from BTF -> kallsyms safely. So two functions called "foo"
>>> with different function signatures would be okay, because
>>> we'd know which was which in kallsyms and could attach
>>> safely. Something like a BTF tag for the function that could
>>> clarify that mapping - but just for cases where it would
>>> otherwise be ambiguous - is probably the way forward
>>> longer term.
>>>
>>> Jiri's talking about this topic at LSF/MM/BPF this week I believe.
>>
>> Jiri presented a few ideas during LSFMMBPF.
>>
>> I feel the best approach is to add a set of addr-s to BTF
>> via a special decl_tag.
>> We can also consider extending KIND_FUNC.
>> The advantage that every BTF func will have one or more addrs
>> associated with it and bpf prog loading logic wouldn't need to do
>> fragile name comparison between btf and kallsyms.
>> pahole can take addrs from dwarf and optionally double check with kallsyms.
> 
> Yonghong summed it up in another email discussion, pasting it in here:
> 
>    So overall we have three options as kallsyms representation now:
>      (a) "addr module:foo:dir_a/dir_b/core.c"
>      (b) "addr module:foo"
>      (c) "addr module:foo:btf_id"
> 
>    option (a):
>      'dir_a/dir_b/core.c' needs to be encoded in BTF.
>      user space either check file path or func signature
>      to find attach_btf_id and pass to the kernel.
>      kernel can find file path in BTF and then lookup
>      kallsyms to find addr.
> 
>    option (b):
>      "addr" needs to be encoded in BTF.
>      user space checks func signature to find
>      attach_btf_id and pass to the kernel.
>      kernel can find addr in BTF and use it.
> 
>    option (c):
>      if user can decide which function to attach, e.g.,
>      through func signature, then no BTF encoding
>      is necessary. attach_btf_id is passed to the
>      kernel and search kallsyms to find the matching
>      btf_id and 'addr' will be available then.
> 
>    For option (b) and (c), user space needs to check
>    func signature to find which btf_id to use. If
>    same-name static functions having the identical
>    signatures, then user space would have a hard time
>    to differentiate. I think it should be very
>    rare same-name static functions in the kernel will have
>    identical signatures. But if we want 100% correctness,
>    we may need file path in which case option (a)
>    is preferable.

As Alexei mentioned in previous email, for such a extreme case,
if user is willing to go through extra step to check dwarf
to find and match file path, then (b) and (c) should work
perfectly as well.

> 
>    Current option (a) kallsyms format is under review.
>    option (c) also needs kallsyms change...
> 
> 
> my thoughts so far is that I like the idea of storing functions address
> in BTF (option b), because it's the easiest way
> 
> on the other hand, user would need debug info to find address for the function
> to trace.. but still just for small subset of functions that share same name
> 
> also I like Lorenz's idea of storing BTF ID in kalsyms and verifier being
> able to lookup address based on BTF ID.. seems like easier kallsyms change,
> but it would still require storing objects paths in BTF to pick up the
> correct one
> 
> cc-ing other folks
> 
> jirka
Yonghong Song May 14, 2023, 5:37 p.m. UTC | #8
On 5/12/23 7:59 PM, Yonghong Song wrote:
> 
> 
> On 5/12/23 2:54 PM, Jiri Olsa wrote:
>> On Fri, May 12, 2023 at 11:59:34AM -0700, Alexei Starovoitov wrote:
>>> On Fri, May 12, 2023 at 9:04 AM Alan Maguire 
>>> <alan.maguire@oracle.com> wrote:
>>>>
>>>> On 12/05/2023 03:51, Yafang Shao wrote:
>>>>> On Wed, May 10, 2023 at 9:03 PM Alan Maguire 
>>>>> <alan.maguire@oracle.com> wrote:
>>>>>>
>>>>>> v1.25 of pahole supports filtering out functions with multiple 
>>>>>> inconsistent
>>>>>> function prototypes or optimized-out parameters from the BTF 
>>>>>> representation.
>>>>>> These present problems because there is no additional info in BTF 
>>>>>> saying which
>>>>>> inconsistent prototype matches which function instance to help 
>>>>>> guide attachment,
>>>>>> and functions with optimized-out parameters can lead to incorrect 
>>>>>> assumptions
>>>>>> about register contents.
>>>>>>
>>>>>> So for now, filter out such functions while adding BTF 
>>>>>> representations for
>>>>>> functions that have "."-suffixes (foo.isra.0) but not 
>>>>>> optimized-out parameters.
>>>>>> This patch assumes that below linked changes land in pahole for 
>>>>>> v1.25.
>>>>>>
>>>>>> Issues with pahole filtering being too aggressive in removing 
>>>>>> functions
>>>>>> appear to be resolved now, but CI and further testing will confirm.
>>>>>>
>>>>>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>>>>>> ---
>>>>>>   scripts/pahole-flags.sh | 3 +++
>>>>>>   1 file changed, 3 insertions(+)
>>>>>>
>>>>>> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
>>>>>> index 1f1f1d397c39..728d55190d97 100755
>>>>>> --- a/scripts/pahole-flags.sh
>>>>>> +++ b/scripts/pahole-flags.sh
>>>>>> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>>>>>>          # see PAHOLE_HAS_LANG_EXCLUDE
>>>>>>          extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>>>>>>   fi
>>>>>> +if [ "${pahole_ver}" -ge "125" ]; then
>>>>>> +       extra_paholeopt="${extra_paholeopt} 
>>>>>> --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
>>>>>> +fi
>>>>>>
>>>>>>   echo ${extra_paholeopt}
>>>>>> -- 
>>>>>> 2.31.1
>>>>>>
>>>>>
>>>>> That change looks like a workaround to me.
>>>>> There may be multiple functions that have the same proto, e.g.:
>>>>>
>>>>>    $ grep -r "bpf_iter_detach_map(struct bpf_iter_aux_info \*aux)"
>>>>> kernel/bpf/ net/core/
>>>>>    kernel/bpf/map_iter.c:static void bpf_iter_detach_map(struct
>>>>> bpf_iter_aux_info *aux)
>>>>>    net/core/bpf_sk_storage.c:static void bpf_iter_detach_map(struct
>>>>> bpf_iter_aux_info *aux)
>>>>>
>>>>>    $ bpftool btf dump file /sys/kernel/btf/vmlinux   |  grep -B 2
>>>>> bpf_iter_detach_map
>>>>>    [34691] FUNC_PROTO '(anon)' ret_type_id=0 vlen=1
>>>>>    'aux' type_id=2638
>>>>>    [34692] FUNC 'bpf_iter_detach_map' type_id=34691 linkage=static
>>>>>
>>>>> We don't know which one it is in the BTF.
>>>>> However, I'm not against this change, as it can avoid some issues.
>>>>>
>>>>
>>>> In the above case, the BTF representation is consistent though.
>>>> That is, if I attach fentry progs to either of these functions
>>>> based on that BTF representation, nothing will crash.
>>>>
>>>> That's ultimately what those changes are about; ensuring
>>>> consistency in BTF representation, so when a function is in
>>>> BTF we can know the signature of the function can be safely
>>>> used by fentry for example.
>>>>
>>>> The question of being able to identify functions (as opposed
>>>> to having a consistent representation) is the next step.
>>>> Finding a way to link between kallsyms and BTF would allow us to
>>>> have multiple inconsistent functions in BTF, since we could map
>>>> from BTF -> kallsyms safely. So two functions called "foo"
>>>> with different function signatures would be okay, because
>>>> we'd know which was which in kallsyms and could attach
>>>> safely. Something like a BTF tag for the function that could
>>>> clarify that mapping - but just for cases where it would
>>>> otherwise be ambiguous - is probably the way forward
>>>> longer term.
>>>>
>>>> Jiri's talking about this topic at LSF/MM/BPF this week I believe.
>>>
>>> Jiri presented a few ideas during LSFMMBPF.
>>>
>>> I feel the best approach is to add a set of addr-s to BTF
>>> via a special decl_tag.
>>> We can also consider extending KIND_FUNC.
>>> The advantage that every BTF func will have one or more addrs
>>> associated with it and bpf prog loading logic wouldn't need to do
>>> fragile name comparison between btf and kallsyms.
>>> pahole can take addrs from dwarf and optionally double check with 
>>> kallsyms.
>>
>> Yonghong summed it up in another email discussion, pasting it in here:
>>
>>    So overall we have three options as kallsyms representation now:
>>      (a) "addr module:foo:dir_a/dir_b/core.c"
>>      (b) "addr module:foo"
>>      (c) "addr module:foo:btf_id"
>>
>>    option (a):
>>      'dir_a/dir_b/core.c' needs to be encoded in BTF.
>>      user space either check file path or func signature
>>      to find attach_btf_id and pass to the kernel.
>>      kernel can find file path in BTF and then lookup
>>      kallsyms to find addr.
>>
>>    option (b):
>>      "addr" needs to be encoded in BTF.
>>      user space checks func signature to find
>>      attach_btf_id and pass to the kernel.
>>      kernel can find addr in BTF and use it.
>>
>>    option (c):
>>      if user can decide which function to attach, e.g.,
>>      through func signature, then no BTF encoding
>>      is necessary. attach_btf_id is passed to the
>>      kernel and search kallsyms to find the matching
>>      btf_id and 'addr' will be available then.
>>
>>    For option (b) and (c), user space needs to check
>>    func signature to find which btf_id to use. If
>>    same-name static functions having the identical
>>    signatures, then user space would have a hard time
>>    to differentiate. I think it should be very
>>    rare same-name static functions in the kernel will have
>>    identical signatures. But if we want 100% correctness,
>>    we may need file path in which case option (a)
>>    is preferable.
> 
> As Alexei mentioned in previous email, for such a extreme case,
> if user is willing to go through extra step to check dwarf
> to find and match file path, then (b) and (c) should work
> perfectly as well.

Okay, it looks like this is more complex if the function signature is
the same. In such cases, current BTF dedup will merge these
functions as a single BTF func. In such cases, we could have:

    decl_tag_1   ----> dedup'ed static_func
                          ^
                          |
    decl_tag_2   ---------

For such cases, just passing btf_id of static func to kernel
won't work since the kernel won't be able to know which
decl_tag to be associated with.

(I did a simple test with vmlinux, it looks we have
  issues with decl_tag_1/decl_tag_2 -> dedup'ed static_func
  as well since only one of decl_tag survives.
  But this is a different issue.
)

So if we intend to add decl tag (addr or file_path), we
should not dedup static functions or generally any functions.

> 
>>
>>    Current option (a) kallsyms format is under review.
>>    option (c) also needs kallsyms change...
>>
>>
>> my thoughts so far is that I like the idea of storing functions address
>> in BTF (option b), because it's the easiest way
>>
>> on the other hand, user would need debug info to find address for the 
>> function
>> to trace.. but still just for small subset of functions that share 
>> same name
>>
>> also I like Lorenz's idea of storing BTF ID in kalsyms and verifier being
>> able to lookup address based on BTF ID.. seems like easier kallsyms 
>> change,
>> but it would still require storing objects paths in BTF to pick up the
>> correct one
>>
>> cc-ing other folks
>>
>> jirka
Jiri Olsa May 14, 2023, 9:49 p.m. UTC | #9
On Sun, May 14, 2023 at 10:37:08AM -0700, Yonghong Song wrote:
> 
> 
> On 5/12/23 7:59 PM, Yonghong Song wrote:
> > 
> > 
> > On 5/12/23 2:54 PM, Jiri Olsa wrote:
> > > On Fri, May 12, 2023 at 11:59:34AM -0700, Alexei Starovoitov wrote:
> > > > On Fri, May 12, 2023 at 9:04 AM Alan Maguire
> > > > <alan.maguire@oracle.com> wrote:
> > > > > 
> > > > > On 12/05/2023 03:51, Yafang Shao wrote:
> > > > > > On Wed, May 10, 2023 at 9:03 PM Alan Maguire
> > > > > > <alan.maguire@oracle.com> wrote:
> > > > > > > 
> > > > > > > v1.25 of pahole supports filtering out functions
> > > > > > > with multiple inconsistent
> > > > > > > function prototypes or optimized-out parameters from
> > > > > > > the BTF representation.
> > > > > > > These present problems because there is no
> > > > > > > additional info in BTF saying which
> > > > > > > inconsistent prototype matches which function
> > > > > > > instance to help guide attachment,
> > > > > > > and functions with optimized-out parameters can lead
> > > > > > > to incorrect assumptions
> > > > > > > about register contents.
> > > > > > > 
> > > > > > > So for now, filter out such functions while adding
> > > > > > > BTF representations for
> > > > > > > functions that have "."-suffixes (foo.isra.0) but
> > > > > > > not optimized-out parameters.
> > > > > > > This patch assumes that below linked changes land in
> > > > > > > pahole for v1.25.
> > > > > > > 
> > > > > > > Issues with pahole filtering being too aggressive in
> > > > > > > removing functions
> > > > > > > appear to be resolved now, but CI and further testing will confirm.
> > > > > > > 
> > > > > > > Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> > > > > > > ---
> > > > > > >   scripts/pahole-flags.sh | 3 +++
> > > > > > >   1 file changed, 3 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
> > > > > > > index 1f1f1d397c39..728d55190d97 100755
> > > > > > > --- a/scripts/pahole-flags.sh
> > > > > > > +++ b/scripts/pahole-flags.sh
> > > > > > > @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
> > > > > > >          # see PAHOLE_HAS_LANG_EXCLUDE
> > > > > > >          extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
> > > > > > >   fi
> > > > > > > +if [ "${pahole_ver}" -ge "125" ]; then
> > > > > > > +       extra_paholeopt="${extra_paholeopt}
> > > > > > > --skip_encoding_btf_inconsistent_proto
> > > > > > > --btf_gen_optimized"
> > > > > > > +fi
> > > > > > > 
> > > > > > >   echo ${extra_paholeopt}
> > > > > > > -- 
> > > > > > > 2.31.1
> > > > > > > 
> > > > > > 
> > > > > > That change looks like a workaround to me.
> > > > > > There may be multiple functions that have the same proto, e.g.:
> > > > > > 
> > > > > >    $ grep -r "bpf_iter_detach_map(struct bpf_iter_aux_info \*aux)"
> > > > > > kernel/bpf/ net/core/
> > > > > >    kernel/bpf/map_iter.c:static void bpf_iter_detach_map(struct
> > > > > > bpf_iter_aux_info *aux)
> > > > > >    net/core/bpf_sk_storage.c:static void bpf_iter_detach_map(struct
> > > > > > bpf_iter_aux_info *aux)
> > > > > > 
> > > > > >    $ bpftool btf dump file /sys/kernel/btf/vmlinux   |  grep -B 2
> > > > > > bpf_iter_detach_map
> > > > > >    [34691] FUNC_PROTO '(anon)' ret_type_id=0 vlen=1
> > > > > >    'aux' type_id=2638
> > > > > >    [34692] FUNC 'bpf_iter_detach_map' type_id=34691 linkage=static
> > > > > > 
> > > > > > We don't know which one it is in the BTF.
> > > > > > However, I'm not against this change, as it can avoid some issues.
> > > > > > 
> > > > > 
> > > > > In the above case, the BTF representation is consistent though.
> > > > > That is, if I attach fentry progs to either of these functions
> > > > > based on that BTF representation, nothing will crash.
> > > > > 
> > > > > That's ultimately what those changes are about; ensuring
> > > > > consistency in BTF representation, so when a function is in
> > > > > BTF we can know the signature of the function can be safely
> > > > > used by fentry for example.
> > > > > 
> > > > > The question of being able to identify functions (as opposed
> > > > > to having a consistent representation) is the next step.
> > > > > Finding a way to link between kallsyms and BTF would allow us to
> > > > > have multiple inconsistent functions in BTF, since we could map
> > > > > from BTF -> kallsyms safely. So two functions called "foo"
> > > > > with different function signatures would be okay, because
> > > > > we'd know which was which in kallsyms and could attach
> > > > > safely. Something like a BTF tag for the function that could
> > > > > clarify that mapping - but just for cases where it would
> > > > > otherwise be ambiguous - is probably the way forward
> > > > > longer term.
> > > > > 
> > > > > Jiri's talking about this topic at LSF/MM/BPF this week I believe.
> > > > 
> > > > Jiri presented a few ideas during LSFMMBPF.
> > > > 
> > > > I feel the best approach is to add a set of addr-s to BTF
> > > > via a special decl_tag.
> > > > We can also consider extending KIND_FUNC.
> > > > The advantage that every BTF func will have one or more addrs
> > > > associated with it and bpf prog loading logic wouldn't need to do
> > > > fragile name comparison between btf and kallsyms.
> > > > pahole can take addrs from dwarf and optionally double check
> > > > with kallsyms.
> > > 
> > > Yonghong summed it up in another email discussion, pasting it in here:
> > > 
> > >    So overall we have three options as kallsyms representation now:
> > >      (a) "addr module:foo:dir_a/dir_b/core.c"
> > >      (b) "addr module:foo"
> > >      (c) "addr module:foo:btf_id"
> > > 
> > >    option (a):
> > >      'dir_a/dir_b/core.c' needs to be encoded in BTF.
> > >      user space either check file path or func signature
> > >      to find attach_btf_id and pass to the kernel.
> > >      kernel can find file path in BTF and then lookup
> > >      kallsyms to find addr.
> > > 
> > >    option (b):
> > >      "addr" needs to be encoded in BTF.
> > >      user space checks func signature to find
> > >      attach_btf_id and pass to the kernel.
> > >      kernel can find addr in BTF and use it.
> > > 
> > >    option (c):
> > >      if user can decide which function to attach, e.g.,
> > >      through func signature, then no BTF encoding
> > >      is necessary. attach_btf_id is passed to the
> > >      kernel and search kallsyms to find the matching
> > >      btf_id and 'addr' will be available then.
> > > 
> > >    For option (b) and (c), user space needs to check
> > >    func signature to find which btf_id to use. If
> > >    same-name static functions having the identical
> > >    signatures, then user space would have a hard time
> > >    to differentiate. I think it should be very
> > >    rare same-name static functions in the kernel will have
> > >    identical signatures. But if we want 100% correctness,
> > >    we may need file path in which case option (a)
> > >    is preferable.
> > 
> > As Alexei mentioned in previous email, for such a extreme case,
> > if user is willing to go through extra step to check dwarf
> > to find and match file path, then (b) and (c) should work
> > perfectly as well.
> 
> Okay, it looks like this is more complex if the function signature is
> the same. In such cases, current BTF dedup will merge these
> functions as a single BTF func. In such cases, we could have:
> 
>    decl_tag_1   ----> dedup'ed static_func
>                          ^
>                          |
>    decl_tag_2   ---------
> 
> For such cases, just passing btf_id of static func to kernel
> won't work since the kernel won't be able to know which
> decl_tag to be associated with.
> 
> (I did a simple test with vmlinux, it looks we have
>  issues with decl_tag_1/decl_tag_2 -> dedup'ed static_func
>  as well since only one of decl_tag survives.
>  But this is a different issue.
> )
> 
> So if we intend to add decl tag (addr or file_path), we
> should not dedup static functions or generally any functions.

I did not think functions would be dedup-ed, they are ;-) with the
declaration tags in place we could perhaps switch it off, right?

or perhaps I can't think of all the cases we need functions dedup for,
so maybe the dedup code could check also the associated decl tag when
comparing functions

jirka
Yonghong Song May 15, 2023, 4:06 a.m. UTC | #10
On 5/14/23 2:49 PM, Jiri Olsa wrote:
> On Sun, May 14, 2023 at 10:37:08AM -0700, Yonghong Song wrote:
>>
>>
>> On 5/12/23 7:59 PM, Yonghong Song wrote:
>>>
>>>
>>> On 5/12/23 2:54 PM, Jiri Olsa wrote:
>>>> On Fri, May 12, 2023 at 11:59:34AM -0700, Alexei Starovoitov wrote:
>>>>> On Fri, May 12, 2023 at 9:04 AM Alan Maguire
>>>>> <alan.maguire@oracle.com> wrote:
>>>>>>
>>>>>> On 12/05/2023 03:51, Yafang Shao wrote:
>>>>>>> On Wed, May 10, 2023 at 9:03 PM Alan Maguire
>>>>>>> <alan.maguire@oracle.com> wrote:
>>>>>>>>
>>>>>>>> v1.25 of pahole supports filtering out functions
>>>>>>>> with multiple inconsistent
>>>>>>>> function prototypes or optimized-out parameters from
>>>>>>>> the BTF representation.
>>>>>>>> These present problems because there is no
>>>>>>>> additional info in BTF saying which
>>>>>>>> inconsistent prototype matches which function
>>>>>>>> instance to help guide attachment,
>>>>>>>> and functions with optimized-out parameters can lead
>>>>>>>> to incorrect assumptions
>>>>>>>> about register contents.
>>>>>>>>
>>>>>>>> So for now, filter out such functions while adding
>>>>>>>> BTF representations for
>>>>>>>> functions that have "."-suffixes (foo.isra.0) but
>>>>>>>> not optimized-out parameters.
>>>>>>>> This patch assumes that below linked changes land in
>>>>>>>> pahole for v1.25.
>>>>>>>>
>>>>>>>> Issues with pahole filtering being too aggressive in
>>>>>>>> removing functions
>>>>>>>> appear to be resolved now, but CI and further testing will confirm.
>>>>>>>>
>>>>>>>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>>>>>>>> ---
>>>>>>>>    scripts/pahole-flags.sh | 3 +++
>>>>>>>>    1 file changed, 3 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
>>>>>>>> index 1f1f1d397c39..728d55190d97 100755
>>>>>>>> --- a/scripts/pahole-flags.sh
>>>>>>>> +++ b/scripts/pahole-flags.sh
>>>>>>>> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>>>>>>>>           # see PAHOLE_HAS_LANG_EXCLUDE
>>>>>>>>           extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>>>>>>>>    fi
>>>>>>>> +if [ "${pahole_ver}" -ge "125" ]; then
>>>>>>>> +       extra_paholeopt="${extra_paholeopt}
>>>>>>>> --skip_encoding_btf_inconsistent_proto
>>>>>>>> --btf_gen_optimized"
>>>>>>>> +fi
>>>>>>>>
>>>>>>>>    echo ${extra_paholeopt}
>>>>>>>> -- 
>>>>>>>> 2.31.1
>>>>>>>>
>>>>>>>
>>>>>>> That change looks like a workaround to me.
>>>>>>> There may be multiple functions that have the same proto, e.g.:
>>>>>>>
>>>>>>>     $ grep -r "bpf_iter_detach_map(struct bpf_iter_aux_info \*aux)"
>>>>>>> kernel/bpf/ net/core/
>>>>>>>     kernel/bpf/map_iter.c:static void bpf_iter_detach_map(struct
>>>>>>> bpf_iter_aux_info *aux)
>>>>>>>     net/core/bpf_sk_storage.c:static void bpf_iter_detach_map(struct
>>>>>>> bpf_iter_aux_info *aux)
>>>>>>>
>>>>>>>     $ bpftool btf dump file /sys/kernel/btf/vmlinux   |  grep -B 2
>>>>>>> bpf_iter_detach_map
>>>>>>>     [34691] FUNC_PROTO '(anon)' ret_type_id=0 vlen=1
>>>>>>>     'aux' type_id=2638
>>>>>>>     [34692] FUNC 'bpf_iter_detach_map' type_id=34691 linkage=static
>>>>>>>
>>>>>>> We don't know which one it is in the BTF.
>>>>>>> However, I'm not against this change, as it can avoid some issues.
>>>>>>>
>>>>>>
>>>>>> In the above case, the BTF representation is consistent though.
>>>>>> That is, if I attach fentry progs to either of these functions
>>>>>> based on that BTF representation, nothing will crash.
>>>>>>
>>>>>> That's ultimately what those changes are about; ensuring
>>>>>> consistency in BTF representation, so when a function is in
>>>>>> BTF we can know the signature of the function can be safely
>>>>>> used by fentry for example.
>>>>>>
>>>>>> The question of being able to identify functions (as opposed
>>>>>> to having a consistent representation) is the next step.
>>>>>> Finding a way to link between kallsyms and BTF would allow us to
>>>>>> have multiple inconsistent functions in BTF, since we could map
>>>>>> from BTF -> kallsyms safely. So two functions called "foo"
>>>>>> with different function signatures would be okay, because
>>>>>> we'd know which was which in kallsyms and could attach
>>>>>> safely. Something like a BTF tag for the function that could
>>>>>> clarify that mapping - but just for cases where it would
>>>>>> otherwise be ambiguous - is probably the way forward
>>>>>> longer term.
>>>>>>
>>>>>> Jiri's talking about this topic at LSF/MM/BPF this week I believe.
>>>>>
>>>>> Jiri presented a few ideas during LSFMMBPF.
>>>>>
>>>>> I feel the best approach is to add a set of addr-s to BTF
>>>>> via a special decl_tag.
>>>>> We can also consider extending KIND_FUNC.
>>>>> The advantage that every BTF func will have one or more addrs
>>>>> associated with it and bpf prog loading logic wouldn't need to do
>>>>> fragile name comparison between btf and kallsyms.
>>>>> pahole can take addrs from dwarf and optionally double check
>>>>> with kallsyms.
>>>>
>>>> Yonghong summed it up in another email discussion, pasting it in here:
>>>>
>>>>     So overall we have three options as kallsyms representation now:
>>>>       (a) "addr module:foo:dir_a/dir_b/core.c"
>>>>       (b) "addr module:foo"
>>>>       (c) "addr module:foo:btf_id"
>>>>
>>>>     option (a):
>>>>       'dir_a/dir_b/core.c' needs to be encoded in BTF.
>>>>       user space either check file path or func signature
>>>>       to find attach_btf_id and pass to the kernel.
>>>>       kernel can find file path in BTF and then lookup
>>>>       kallsyms to find addr.
>>>>
>>>>     option (b):
>>>>       "addr" needs to be encoded in BTF.
>>>>       user space checks func signature to find
>>>>       attach_btf_id and pass to the kernel.
>>>>       kernel can find addr in BTF and use it.
>>>>
>>>>     option (c):
>>>>       if user can decide which function to attach, e.g.,
>>>>       through func signature, then no BTF encoding
>>>>       is necessary. attach_btf_id is passed to the
>>>>       kernel and search kallsyms to find the matching
>>>>       btf_id and 'addr' will be available then.
>>>>
>>>>     For option (b) and (c), user space needs to check
>>>>     func signature to find which btf_id to use. If
>>>>     same-name static functions having the identical
>>>>     signatures, then user space would have a hard time
>>>>     to differentiate. I think it should be very
>>>>     rare same-name static functions in the kernel will have
>>>>     identical signatures. But if we want 100% correctness,
>>>>     we may need file path in which case option (a)
>>>>     is preferable.
>>>
>>> As Alexei mentioned in previous email, for such a extreme case,
>>> if user is willing to go through extra step to check dwarf
>>> to find and match file path, then (b) and (c) should work
>>> perfectly as well.
>>
>> Okay, it looks like this is more complex if the function signature is
>> the same. In such cases, current BTF dedup will merge these
>> functions as a single BTF func. In such cases, we could have:
>>
>>     decl_tag_1   ----> dedup'ed static_func
>>                           ^
>>                           |
>>     decl_tag_2   ---------
>>
>> For such cases, just passing btf_id of static func to kernel
>> won't work since the kernel won't be able to know which
>> decl_tag to be associated with.
>>
>> (I did a simple test with vmlinux, it looks we have
>>   issues with decl_tag_1/decl_tag_2 -> dedup'ed static_func
>>   as well since only one of decl_tag survives.
>>   But this is a different issue.
>> )
>>
>> So if we intend to add decl tag (addr or file_path), we
>> should not dedup static functions or generally any functions.
> 
> I did not think functions would be dedup-ed, they are ;-) with the
> declaration tags in place we could perhaps switch it off, right?

That is my hope too. If with decl tag func won't be dedup'ed,
then we should be okay. In the kernel, based on attach_btf_id,
through btf, kernel can find the corresponding decl tag (file path
or addr) or through attach_btf_id itself if the btf id is
encoded in kallsym entries.

> 
> or perhaps I can't think of all the cases we need functions dedup for,
> so maybe the dedup code could check also the associated decl tag when
> comparing functions
> 
> jirka
Alan Maguire May 15, 2023, 2:53 p.m. UTC | #11
On 15/05/2023 05:06, Yonghong Song wrote:
> 
> 
> On 5/14/23 2:49 PM, Jiri Olsa wrote:
>> On Sun, May 14, 2023 at 10:37:08AM -0700, Yonghong Song wrote:
>>>
>>>
>>> On 5/12/23 7:59 PM, Yonghong Song wrote:
>>>>
>>>>
>>>> On 5/12/23 2:54 PM, Jiri Olsa wrote:
>>>>> On Fri, May 12, 2023 at 11:59:34AM -0700, Alexei Starovoitov wrote:
>>>>>> On Fri, May 12, 2023 at 9:04 AM Alan Maguire
>>>>>> <alan.maguire@oracle.com> wrote:
>>>>>>>
>>>>>>> On 12/05/2023 03:51, Yafang Shao wrote:
>>>>>>>> On Wed, May 10, 2023 at 9:03 PM Alan Maguire
>>>>>>>> <alan.maguire@oracle.com> wrote:
>>>>>>>>>
>>>>>>>>> v1.25 of pahole supports filtering out functions
>>>>>>>>> with multiple inconsistent
>>>>>>>>> function prototypes or optimized-out parameters from
>>>>>>>>> the BTF representation.
>>>>>>>>> These present problems because there is no
>>>>>>>>> additional info in BTF saying which
>>>>>>>>> inconsistent prototype matches which function
>>>>>>>>> instance to help guide attachment,
>>>>>>>>> and functions with optimized-out parameters can lead
>>>>>>>>> to incorrect assumptions
>>>>>>>>> about register contents.
>>>>>>>>>
>>>>>>>>> So for now, filter out such functions while adding
>>>>>>>>> BTF representations for
>>>>>>>>> functions that have "."-suffixes (foo.isra.0) but
>>>>>>>>> not optimized-out parameters.
>>>>>>>>> This patch assumes that below linked changes land in
>>>>>>>>> pahole for v1.25.
>>>>>>>>>
>>>>>>>>> Issues with pahole filtering being too aggressive in
>>>>>>>>> removing functions
>>>>>>>>> appear to be resolved now, but CI and further testing will
>>>>>>>>> confirm.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>>>>>>>>> ---
>>>>>>>>>    scripts/pahole-flags.sh | 3 +++
>>>>>>>>>    1 file changed, 3 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
>>>>>>>>> index 1f1f1d397c39..728d55190d97 100755
>>>>>>>>> --- a/scripts/pahole-flags.sh
>>>>>>>>> +++ b/scripts/pahole-flags.sh
>>>>>>>>> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>>>>>>>>>           # see PAHOLE_HAS_LANG_EXCLUDE
>>>>>>>>>           extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>>>>>>>>>    fi
>>>>>>>>> +if [ "${pahole_ver}" -ge "125" ]; then
>>>>>>>>> +       extra_paholeopt="${extra_paholeopt}
>>>>>>>>> --skip_encoding_btf_inconsistent_proto
>>>>>>>>> --btf_gen_optimized"
>>>>>>>>> +fi
>>>>>>>>>
>>>>>>>>>    echo ${extra_paholeopt}
>>>>>>>>> -- 
>>>>>>>>> 2.31.1
>>>>>>>>>
>>>>>>>>
>>>>>>>> That change looks like a workaround to me.
>>>>>>>> There may be multiple functions that have the same proto, e.g.:
>>>>>>>>
>>>>>>>>     $ grep -r "bpf_iter_detach_map(struct bpf_iter_aux_info \*aux)"
>>>>>>>> kernel/bpf/ net/core/
>>>>>>>>     kernel/bpf/map_iter.c:static void bpf_iter_detach_map(struct
>>>>>>>> bpf_iter_aux_info *aux)
>>>>>>>>     net/core/bpf_sk_storage.c:static void
>>>>>>>> bpf_iter_detach_map(struct
>>>>>>>> bpf_iter_aux_info *aux)
>>>>>>>>
>>>>>>>>     $ bpftool btf dump file /sys/kernel/btf/vmlinux   |  grep -B 2
>>>>>>>> bpf_iter_detach_map
>>>>>>>>     [34691] FUNC_PROTO '(anon)' ret_type_id=0 vlen=1
>>>>>>>>     'aux' type_id=2638
>>>>>>>>     [34692] FUNC 'bpf_iter_detach_map' type_id=34691 linkage=static
>>>>>>>>
>>>>>>>> We don't know which one it is in the BTF.
>>>>>>>> However, I'm not against this change, as it can avoid some issues.
>>>>>>>>
>>>>>>>
>>>>>>> In the above case, the BTF representation is consistent though.
>>>>>>> That is, if I attach fentry progs to either of these functions
>>>>>>> based on that BTF representation, nothing will crash.
>>>>>>>
>>>>>>> That's ultimately what those changes are about; ensuring
>>>>>>> consistency in BTF representation, so when a function is in
>>>>>>> BTF we can know the signature of the function can be safely
>>>>>>> used by fentry for example.
>>>>>>>
>>>>>>> The question of being able to identify functions (as opposed
>>>>>>> to having a consistent representation) is the next step.
>>>>>>> Finding a way to link between kallsyms and BTF would allow us to
>>>>>>> have multiple inconsistent functions in BTF, since we could map
>>>>>>> from BTF -> kallsyms safely. So two functions called "foo"
>>>>>>> with different function signatures would be okay, because
>>>>>>> we'd know which was which in kallsyms and could attach
>>>>>>> safely. Something like a BTF tag for the function that could
>>>>>>> clarify that mapping - but just for cases where it would
>>>>>>> otherwise be ambiguous - is probably the way forward
>>>>>>> longer term.
>>>>>>>
>>>>>>> Jiri's talking about this topic at LSF/MM/BPF this week I believe.
>>>>>>
>>>>>> Jiri presented a few ideas during LSFMMBPF.
>>>>>>
>>>>>> I feel the best approach is to add a set of addr-s to BTF
>>>>>> via a special decl_tag.
>>>>>> We can also consider extending KIND_FUNC.
>>>>>> The advantage that every BTF func will have one or more addrs
>>>>>> associated with it and bpf prog loading logic wouldn't need to do
>>>>>> fragile name comparison between btf and kallsyms.
>>>>>> pahole can take addrs from dwarf and optionally double check
>>>>>> with kallsyms.
>>>>>
>>>>> Yonghong summed it up in another email discussion, pasting it in here:
>>>>>
>>>>>     So overall we have three options as kallsyms representation now:
>>>>>       (a) "addr module:foo:dir_a/dir_b/core.c"
>>>>>       (b) "addr module:foo"
>>>>>       (c) "addr module:foo:btf_id"
>>>>>
>>>>>     option (a):
>>>>>       'dir_a/dir_b/core.c' needs to be encoded in BTF.
>>>>>       user space either check file path or func signature
>>>>>       to find attach_btf_id and pass to the kernel.
>>>>>       kernel can find file path in BTF and then lookup
>>>>>       kallsyms to find addr.
>>>>>

I like the source-centric nature of this, but the only
danger here is we might not get a 1:1 mapping between
source file location and address; consider the case
of a static inline function in a .h file which doesn't
get inlined. It could have multiple addresses associated
with the same source. For example:

static inline void __list_del_entry(struct list_head *entry)
{
	if (!__list_del_entry_valid(entry))
		return;

	__list_del(entry->prev, entry->next);
}

$ grep __list_del_entry /proc/kallsyms
ffffffff982cc5c0 t __pfx___list_del_entry
ffffffff982cc5d0 t __list_del_entry
ffffffff985b0860 t __pfx___list_del_entry
ffffffff985b0870 t __list_del_entry
ffffffff9862d800 t __pfx___list_del_entry
ffffffff9862d810 t __list_del_entry
ffffffff987d3dd0 t __pfx___list_del_entry
ffffffff987d3de0 t __list_del_entry
ffffffff987f37a0 T __pfx___list_del_entry_valid
ffffffff987f37b0 T __list_del_entry_valid
ffffffff989fdd10 t __pfx___list_del_entry
ffffffff989fdd20 t __list_del_entry
ffffffff99baf08c r __ksymtab___list_del_entry_valid
ffffffffc12da2e0 t __list_del_entry	[bnep]
ffffffffc12da2d0 t __pfx___list_del_entry	[bnep]
ffffffffc092d6b0 t __list_del_entry	[videobuf2_common]
ffffffffc092d6a0 t __pfx___list_del_entry	[videobuf2_common]

>>>>>     option (b):
>>>>>       "addr" needs to be encoded in BTF.
>>>>>       user space checks func signature to find
>>>>>       attach_btf_id and pass to the kernel.
>>>>>       kernel can find addr in BTF and use it.
>>>>>

This seems like the safest option to me. Ideally we wouldn't
need such information for every function - only ones with
multiple sites and ambiguous signatures - but the problem
is a function could have the same name in a kernel and
a module too. So it seems like we're stuck with providing
additional info to clarify which signature goes with which
function for each static function.

>>>>>     option (c):
>>>>>       if user can decide which function to attach, e.g.,
>>>>>       through func signature, then no BTF encoding
>>>>>       is necessary. attach_btf_id is passed to the
>>>>>       kernel and search kallsyms to find the matching
>>>>>       btf_id and 'addr' will be available then.
>>>>>
>>>>>     For option (b) and (c), user space needs to check
>>>>>     func signature to find which btf_id to use. If
>>>>>     same-name static functions having the identical
>>>>>     signatures, then user space would have a hard time
>>>>>     to differentiate. I think it should be very
>>>>>     rare same-name static functions in the kernel will have
>>>>>     identical signatures. But if we want 100% correctness,
>>>>>     we may need file path in which case option (a)
>>>>>     is preferable.
>>>>
>>>> As Alexei mentioned in previous email, for such a extreme case,
>>>> if user is willing to go through extra step to check dwarf
>>>> to find and match file path, then (b) and (c) should work
>>>> perfectly as well.
>>>
>>> Okay, it looks like this is more complex if the function signature is
>>> the same. In such cases, current BTF dedup will merge these
>>> functions as a single BTF func. In such cases, we could have:
>>>
>>>     decl_tag_1   ----> dedup'ed static_func
>>>                           ^
>>>                           |
>>>     decl_tag_2   ---------
>>>
>>> For such cases, just passing btf_id of static func to kernel
>>> won't work since the kernel won't be able to know which
>>> decl_tag to be associated with.
>>>
>>> (I did a simple test with vmlinux, it looks we have
>>>   issues with decl_tag_1/decl_tag_2 -> dedup'ed static_func
>>>   as well since only one of decl_tag survives.
>>>   But this is a different issue.
>>> )
>>>
>>> So if we intend to add decl tag (addr or file_path), we
>>> should not dedup static functions or generally any functions.
>>
>> I did not think functions would be dedup-ed, they are ;-) with the
>> declaration tags in place we could perhaps switch it off, right?
> 
> That is my hope too. If with decl tag func won't be dedup'ed,
> then we should be okay. In the kernel, based on attach_btf_id,
> through btf, kernel can find the corresponding decl tag (file path
> or addr) or through attach_btf_id itself if the btf id is
> encoded in kallsym entries.
> 
>>
>> or perhaps I can't think of all the cases we need functions dedup for,
>> so maybe the dedup code could check also the associated decl tag when
>> comparing functions
>>

Would using the BTF decl tag id instead of the function BTF id to
guide attachment be an option? That way if multiple functions shared
the same signature, we could still get the info to figure out which to
attach to from the decl tag, and we wouldn't need to mess with dedup.
I'm probably missing something which makes that unworkable, but just a
thought. Thanks!

Alan
Yonghong Song May 15, 2023, 7:33 p.m. UTC | #12
On 5/15/23 7:53 AM, Alan Maguire wrote:
> On 15/05/2023 05:06, Yonghong Song wrote:
>>
>>
>> On 5/14/23 2:49 PM, Jiri Olsa wrote:
>>> On Sun, May 14, 2023 at 10:37:08AM -0700, Yonghong Song wrote:
>>>>
>>>>
>>>> On 5/12/23 7:59 PM, Yonghong Song wrote:
>>>>>
>>>>>
>>>>> On 5/12/23 2:54 PM, Jiri Olsa wrote:
>>>>>> On Fri, May 12, 2023 at 11:59:34AM -0700, Alexei Starovoitov wrote:
>>>>>>> On Fri, May 12, 2023 at 9:04 AM Alan Maguire
>>>>>>> <alan.maguire@oracle.com> wrote:
>>>>>>>>
>>>>>>>> On 12/05/2023 03:51, Yafang Shao wrote:
>>>>>>>>> On Wed, May 10, 2023 at 9:03 PM Alan Maguire
>>>>>>>>> <alan.maguire@oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>> v1.25 of pahole supports filtering out functions
>>>>>>>>>> with multiple inconsistent
>>>>>>>>>> function prototypes or optimized-out parameters from
>>>>>>>>>> the BTF representation.
>>>>>>>>>> These present problems because there is no
>>>>>>>>>> additional info in BTF saying which
>>>>>>>>>> inconsistent prototype matches which function
>>>>>>>>>> instance to help guide attachment,
>>>>>>>>>> and functions with optimized-out parameters can lead
>>>>>>>>>> to incorrect assumptions
>>>>>>>>>> about register contents.
>>>>>>>>>>
>>>>>>>>>> So for now, filter out such functions while adding
>>>>>>>>>> BTF representations for
>>>>>>>>>> functions that have "."-suffixes (foo.isra.0) but
>>>>>>>>>> not optimized-out parameters.
>>>>>>>>>> This patch assumes that below linked changes land in
>>>>>>>>>> pahole for v1.25.
>>>>>>>>>>
>>>>>>>>>> Issues with pahole filtering being too aggressive in
>>>>>>>>>> removing functions
>>>>>>>>>> appear to be resolved now, but CI and further testing will
>>>>>>>>>> confirm.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>>>>>>>>>> ---
>>>>>>>>>>     scripts/pahole-flags.sh | 3 +++
>>>>>>>>>>     1 file changed, 3 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
>>>>>>>>>> index 1f1f1d397c39..728d55190d97 100755
>>>>>>>>>> --- a/scripts/pahole-flags.sh
>>>>>>>>>> +++ b/scripts/pahole-flags.sh
>>>>>>>>>> @@ -23,5 +23,8 @@ if [ "${pahole_ver}" -ge "124" ]; then
>>>>>>>>>>            # see PAHOLE_HAS_LANG_EXCLUDE
>>>>>>>>>>            extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
>>>>>>>>>>     fi
>>>>>>>>>> +if [ "${pahole_ver}" -ge "125" ]; then
>>>>>>>>>> +       extra_paholeopt="${extra_paholeopt}
>>>>>>>>>> --skip_encoding_btf_inconsistent_proto
>>>>>>>>>> --btf_gen_optimized"
>>>>>>>>>> +fi
>>>>>>>>>>
>>>>>>>>>>     echo ${extra_paholeopt}
>>>>>>>>>> -- 
>>>>>>>>>> 2.31.1
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> That change looks like a workaround to me.
>>>>>>>>> There may be multiple functions that have the same proto, e.g.:
>>>>>>>>>
>>>>>>>>>      $ grep -r "bpf_iter_detach_map(struct bpf_iter_aux_info \*aux)"
>>>>>>>>> kernel/bpf/ net/core/
>>>>>>>>>      kernel/bpf/map_iter.c:static void bpf_iter_detach_map(struct
>>>>>>>>> bpf_iter_aux_info *aux)
>>>>>>>>>      net/core/bpf_sk_storage.c:static void
>>>>>>>>> bpf_iter_detach_map(struct
>>>>>>>>> bpf_iter_aux_info *aux)
>>>>>>>>>
>>>>>>>>>      $ bpftool btf dump file /sys/kernel/btf/vmlinux   |  grep -B 2
>>>>>>>>> bpf_iter_detach_map
>>>>>>>>>      [34691] FUNC_PROTO '(anon)' ret_type_id=0 vlen=1
>>>>>>>>>      'aux' type_id=2638
>>>>>>>>>      [34692] FUNC 'bpf_iter_detach_map' type_id=34691 linkage=static
>>>>>>>>>
>>>>>>>>> We don't know which one it is in the BTF.
>>>>>>>>> However, I'm not against this change, as it can avoid some issues.
>>>>>>>>>
>>>>>>>>
>>>>>>>> In the above case, the BTF representation is consistent though.
>>>>>>>> That is, if I attach fentry progs to either of these functions
>>>>>>>> based on that BTF representation, nothing will crash.
>>>>>>>>
>>>>>>>> That's ultimately what those changes are about; ensuring
>>>>>>>> consistency in BTF representation, so when a function is in
>>>>>>>> BTF we can know the signature of the function can be safely
>>>>>>>> used by fentry for example.
>>>>>>>>
>>>>>>>> The question of being able to identify functions (as opposed
>>>>>>>> to having a consistent representation) is the next step.
>>>>>>>> Finding a way to link between kallsyms and BTF would allow us to
>>>>>>>> have multiple inconsistent functions in BTF, since we could map
>>>>>>>> from BTF -> kallsyms safely. So two functions called "foo"
>>>>>>>> with different function signatures would be okay, because
>>>>>>>> we'd know which was which in kallsyms and could attach
>>>>>>>> safely. Something like a BTF tag for the function that could
>>>>>>>> clarify that mapping - but just for cases where it would
>>>>>>>> otherwise be ambiguous - is probably the way forward
>>>>>>>> longer term.
>>>>>>>>
>>>>>>>> Jiri's talking about this topic at LSF/MM/BPF this week I believe.
>>>>>>>
>>>>>>> Jiri presented a few ideas during LSFMMBPF.
>>>>>>>
>>>>>>> I feel the best approach is to add a set of addr-s to BTF
>>>>>>> via a special decl_tag.
>>>>>>> We can also consider extending KIND_FUNC.
>>>>>>> The advantage that every BTF func will have one or more addrs
>>>>>>> associated with it and bpf prog loading logic wouldn't need to do
>>>>>>> fragile name comparison between btf and kallsyms.
>>>>>>> pahole can take addrs from dwarf and optionally double check
>>>>>>> with kallsyms.
>>>>>>
>>>>>> Yonghong summed it up in another email discussion, pasting it in here:
>>>>>>
>>>>>>      So overall we have three options as kallsyms representation now:
>>>>>>        (a) "addr module:foo:dir_a/dir_b/core.c"
>>>>>>        (b) "addr module:foo"
>>>>>>        (c) "addr module:foo:btf_id"
>>>>>>
>>>>>>      option (a):
>>>>>>        'dir_a/dir_b/core.c' needs to be encoded in BTF.
>>>>>>        user space either check file path or func signature
>>>>>>        to find attach_btf_id and pass to the kernel.
>>>>>>        kernel can find file path in BTF and then lookup
>>>>>>        kallsyms to find addr.
>>>>>>
> 
> I like the source-centric nature of this, but the only
> danger here is we might not get a 1:1 mapping between
> source file location and address; consider the case
> of a static inline function in a .h file which doesn't
> get inlined. It could have multiple addresses associated
> with the same source. For example:
> 
> static inline void __list_del_entry(struct list_head *entry)
> {
> 	if (!__list_del_entry_valid(entry))
> 		return;
> 
> 	__list_del(entry->prev, entry->next);
> }
> 
> $ grep __list_del_entry /proc/kallsyms
> ffffffff982cc5c0 t __pfx___list_del_entry
> ffffffff982cc5d0 t __list_del_entry
> ffffffff985b0860 t __pfx___list_del_entry
> ffffffff985b0870 t __list_del_entry
> ffffffff9862d800 t __pfx___list_del_entry
> ffffffff9862d810 t __list_del_entry
> ffffffff987d3dd0 t __pfx___list_del_entry
> ffffffff987d3de0 t __list_del_entry
> ffffffff987f37a0 T __pfx___list_del_entry_valid
> ffffffff987f37b0 T __list_del_entry_valid
> ffffffff989fdd10 t __pfx___list_del_entry
> ffffffff989fdd20 t __list_del_entry
> ffffffff99baf08c r __ksymtab___list_del_entry_valid
> ffffffffc12da2e0 t __list_del_entry	[bnep]
> ffffffffc12da2d0 t __pfx___list_del_entry	[bnep]
> ffffffffc092d6b0 t __list_del_entry	[videobuf2_common]
> ffffffffc092d6a0 t __pfx___list_del_entry	[videobuf2_common]

Until now, we are okay with static inline function carried
into .o file since all inline functions are marked as notrace
(fentry cannot attach to it.)

However, now we have
https://lore.kernel.org/live-patching/20230502164102.1a51cdb4@gandalf.local.home/T/#u

If the patch is merged, then the above inline function
in the header survived in .o file indeed a problem.

Typically users won't trace inline functions in
the header file. But for completeness, agree that
file path may not fully reliable.

> 
>>>>>>      option (b):
>>>>>>        "addr" needs to be encoded in BTF.
>>>>>>        user space checks func signature to find
>>>>>>        attach_btf_id and pass to the kernel.
>>>>>>        kernel can find addr in BTF and use it.
>>>>>>
> 
> This seems like the safest option to me. Ideally we wouldn't
> need such information for every function - only ones with
> multiple sites and ambiguous signatures - but the problem
> is a function could have the same name in a kernel and
> a module too. So it seems like we're stuck with providing
> additional info to clarify which signature goes with which
> function for each static function.

When passing attach_btf_id to kernel, we have attach_btf_obj_fd
as well, which should differentiate whether it is kernel or which module 
it is.

> 
>>>>>>      option (c):
>>>>>>        if user can decide which function to attach, e.g.,
>>>>>>        through func signature, then no BTF encoding
>>>>>>        is necessary. attach_btf_id is passed to the
>>>>>>        kernel and search kallsyms to find the matching
>>>>>>        btf_id and 'addr' will be available then.
>>>>>>
>>>>>>      For option (b) and (c), user space needs to check
>>>>>>      func signature to find which btf_id to use. If
>>>>>>      same-name static functions having the identical
>>>>>>      signatures, then user space would have a hard time
>>>>>>      to differentiate. I think it should be very
>>>>>>      rare same-name static functions in the kernel will have
>>>>>>      identical signatures. But if we want 100% correctness,
>>>>>>      we may need file path in which case option (a)
>>>>>>      is preferable.
>>>>>
>>>>> As Alexei mentioned in previous email, for such a extreme case,
>>>>> if user is willing to go through extra step to check dwarf
>>>>> to find and match file path, then (b) and (c) should work
>>>>> perfectly as well.
>>>>
>>>> Okay, it looks like this is more complex if the function signature is
>>>> the same. In such cases, current BTF dedup will merge these
>>>> functions as a single BTF func. In such cases, we could have:
>>>>
>>>>      decl_tag_1   ----> dedup'ed static_func
>>>>                            ^
>>>>                            |
>>>>      decl_tag_2   ---------
>>>>
>>>> For such cases, just passing btf_id of static func to kernel
>>>> won't work since the kernel won't be able to know which
>>>> decl_tag to be associated with.
>>>>
>>>> (I did a simple test with vmlinux, it looks we have
>>>>    issues with decl_tag_1/decl_tag_2 -> dedup'ed static_func
>>>>    as well since only one of decl_tag survives.
>>>>    But this is a different issue.
>>>> )
>>>>
>>>> So if we intend to add decl tag (addr or file_path), we
>>>> should not dedup static functions or generally any functions.
>>>
>>> I did not think functions would be dedup-ed, they are ;-) with the
>>> declaration tags in place we could perhaps switch it off, right?
>>
>> That is my hope too. If with decl tag func won't be dedup'ed,
>> then we should be okay. In the kernel, based on attach_btf_id,
>> through btf, kernel can find the corresponding decl tag (file path
>> or addr) or through attach_btf_id itself if the btf id is
>> encoded in kallsym entries.
>>
>>>
>>> or perhaps I can't think of all the cases we need functions dedup for,
>>> so maybe the dedup code could check also the associated decl tag when
>>> comparing functions
>>>
> 
> Would using the BTF decl tag id instead of the function BTF id to
> guide attachment be an option? That way if multiple functions shared
> the same signature, we could still get the info to figure out which to
> attach to from the decl tag, and we wouldn't need to mess with dedup.
> I'm probably missing something which makes that unworkable, but just a
> thought. Thanks!

The issue is due to current bpf syscall UAPI. it passed attach_btf_id
and attach_btf_obj_fd to the kernel. If we change UAPI to pass
decl_tag_id instead of attach_btf_id to the kernel, we should be
okay, but this requires a bpf syscall UAPI change. Maybe this works too.

> 
> Alan
diff mbox series

Patch

diff --git a/scripts/pahole-flags.sh b/scripts/pahole-flags.sh
index 1f1f1d397c39..728d55190d97 100755
--- a/scripts/pahole-flags.sh
+++ b/scripts/pahole-flags.sh
@@ -23,5 +23,8 @@  if [ "${pahole_ver}" -ge "124" ]; then
 	# see PAHOLE_HAS_LANG_EXCLUDE
 	extra_paholeopt="${extra_paholeopt} --lang_exclude=rust"
 fi
+if [ "${pahole_ver}" -ge "125" ]; then
+	extra_paholeopt="${extra_paholeopt} --skip_encoding_btf_inconsistent_proto --btf_gen_optimized"
+fi
 
 echo ${extra_paholeopt}