
[v14,00/19] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph

Message ID: 172615368656.133222.2336770908714920670.stgit@devnote2

Masami Hiramatsu (Google) Sept. 12, 2024, 3:08 p.m. UTC
Hi,

Here is the 14th version of the series to re-implement fprobe on the
function-graph tracer. The previous version is here:

https://lore.kernel.org/all/172398527264.293426.2050093948411376857.stgit@devnote2/

This version is based on trace-v6.11-rc6.

I dropped the already-committed bugfix patch, and added RISC-V support for the
ftrace_regs APIs (ftrace_regs_get_return_address() and ftrace_partial_regs()).

Overview
--------
This series rewrites fprobe to run on top of the function-graph tracer.
The purposes of this change are:

 1) Remove fprobe's dependency on rethook so that we can reduce
   the duplicated return-hook code and shadow stacks.

 2) Make 'ftrace_regs' the common trace interface for the function
   boundary.

1) Currently we have 2 (or 3) different function return hook
 implementations: the function-graph tracer and rethook (and the
 legacy kretprobe). Since this is redundant and doubles the
 maintenance cost, I would like to unify them. From the user's
 viewpoint, the function-graph tracer is very useful for grasping
 the execution path. For this purpose, it is hard to use rethook in
 the function-graph tracer, but the opposite is possible. (Strictly
 speaking, kretprobe cannot use it because it requires 'pt_regs'
 for historical reasons.)

2) Currently fprobe provides 'pt_regs' to its handlers, but that is
 wrong for function entry and exit. Moreover, depending on the
 architecture, there is no way to accurately reproduce 'pt_regs'
 outside of interrupt or exception handlers. This means fprobe should
 not use 'pt_regs' because it does not use such exceptions.
 (Conversely, kprobe should use 'pt_regs' because it is an abstract
  interface to the software breakpoint exception.)

This series changes fprobe to use the function-graph tracer for tracing
function entry and exit, instead of a mixture of ftrace and rethook.
Unlike rethook, which is a per-task list of system-wide allocated
nodes, the function graph's ret_stack is a per-task shadow stack.
Thus it does not need 'nr_maxactive' (the number of pre-allocated
nodes) to be set.
Also, the handlers will get 'ftrace_regs' instead of 'pt_regs'.
Since eBPF multi_kprobe/multi_kretprobe events still use 'pt_regs' as
their register interface, this series converts 'ftrace_regs' to
'pt_regs' for them. Of course this conversion makes an incomplete
'pt_regs', so users must access only the registers used for function
parameters or return values.

Design
------
Instead of using ftrace's function entry hook directly, the new fprobe
is built on top of the function-graph's entry and return callbacks
with 'ftrace_regs'.

Since fprobe requires access to 'ftrace_regs', the architecture
must support CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS and
CONFIG_HAVE_FTRACE_GRAPH_FUNC, which enable calling the function-graph
entry callback with 'ftrace_regs', and also
CONFIG_HAVE_FUNCTION_GRAPH_FREGS, which passes 'ftrace_regs' to
return_to_handler.
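For reference, an architecture opting in ends up selecting the options named above in its Kconfig. This is only an illustrative fragment with a made-up arch name; the exact select list differs per architecture and is what the per-arch patches in this series add:

```kconfig
config EXAMPLE_ARCH
	bool "Example architecture"
	# Entry side: ftrace can call the fgraph entry callback with ftrace_regs
	select HAVE_DYNAMIC_FTRACE_WITH_ARGS
	select HAVE_FTRACE_GRAPH_FUNC
	# Exit side: return_to_handler receives ftrace_regs
	select HAVE_FUNCTION_GRAPH_FREGS
```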

All fprobes share a single function-graph ops (which means they share
a common ftrace filter), similar to kprobe-on-ftrace. This needs
another layer to find the corresponding fprobe in the common
function-graph callbacks, but it has much better scalability, since
the number of registerable function-graph ops is limited.

In the entry callback, the fprobe runs its entry_handler and saves the
address of 'fprobe' on the function-graph's shadow stack as data. The
return callback decodes the data to get the 'fprobe' address, and runs
the exit_handler.

Fprobe introduces two hash tables: one for the entry callback, which
looks up the fprobes related to the function address passed to the
entry callback; the other for the return callback, which checks
whether the given 'fprobe' data structure pointer is still valid.
Note that it is possible to unregister an fprobe before its return
callback runs, so the address must be validated before it is used in
the return callback.

Download
--------
This series can be applied against the ftrace/for-next branch of the
linux-trace tree.

It can also be found in the branch below:

https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=topic/fprobe-on-fgraph

Thank you,

---

Masami Hiramatsu (Google) (19):
      tracing: Add a comment about ftrace_regs definition
      tracing: Rename ftrace_regs_return_value to ftrace_regs_get_return_value
      function_graph: Pass ftrace_regs to entryfunc
      function_graph: Replace fgraph_ret_regs with ftrace_regs
      function_graph: Pass ftrace_regs to retfunc
      fprobe: Use ftrace_regs in fprobe entry handler
      fprobe: Use ftrace_regs in fprobe exit handler
      tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
      tracing: Add ftrace_fill_perf_regs() for perf event
      tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
      bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
      ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC
      fprobe: Rewrite fprobe on function-graph tracer
      tracing: Fix function timing profiler to initialize hashtable
      tracing/fprobe: Remove nr_maxactive from fprobe
      selftests: ftrace: Remove obsolete maxactive syntax check
      selftests/ftrace: Add a test case for repeating register/unregister fprobe
      Documentation: probes: Update fprobe on function-graph tracer
      fgraph: Skip recording calltime/rettime if it is not needed


 Documentation/trace/fprobe.rst                     |   42 +
 arch/arm64/Kconfig                                 |    2 
 arch/arm64/include/asm/ftrace.h                    |   47 +
 arch/arm64/kernel/asm-offsets.c                    |   12 
 arch/arm64/kernel/entry-ftrace.S                   |   32 +
 arch/arm64/kernel/ftrace.c                         |   20 +
 arch/loongarch/Kconfig                             |    4 
 arch/loongarch/include/asm/ftrace.h                |   32 -
 arch/loongarch/kernel/asm-offsets.c                |   12 
 arch/loongarch/kernel/ftrace_dyn.c                 |   10 
 arch/loongarch/kernel/mcount.S                     |   17 -
 arch/loongarch/kernel/mcount_dyn.S                 |   14 
 arch/powerpc/Kconfig                               |    1 
 arch/powerpc/include/asm/ftrace.h                  |   15 
 arch/powerpc/kernel/trace/ftrace.c                 |    2 
 arch/powerpc/kernel/trace/ftrace_64_pg.c           |   10 
 arch/riscv/Kconfig                                 |    3 
 arch/riscv/include/asm/ftrace.h                    |   43 +
 arch/riscv/kernel/ftrace.c                         |   17 +
 arch/riscv/kernel/mcount.S                         |   24 -
 arch/s390/Kconfig                                  |    3 
 arch/s390/include/asm/ftrace.h                     |   39 +
 arch/s390/kernel/asm-offsets.c                     |    6 
 arch/s390/kernel/mcount.S                          |    9 
 arch/x86/Kconfig                                   |    4 
 arch/x86/include/asm/ftrace.h                      |   37 +
 arch/x86/kernel/ftrace.c                           |   50 +-
 arch/x86/kernel/ftrace_32.S                        |   15 
 arch/x86/kernel/ftrace_64.S                        |   17 -
 include/linux/fprobe.h                             |   57 +-
 include/linux/ftrace.h                             |  136 ++++
 kernel/trace/Kconfig                               |   23 +
 kernel/trace/bpf_trace.c                           |   19 -
 kernel/trace/fgraph.c                              |   96 ++-
 kernel/trace/fprobe.c                              |  638 ++++++++++++++------
 kernel/trace/ftrace.c                              |   10 
 kernel/trace/trace.h                               |    6 
 kernel/trace/trace_fprobe.c                        |  147 ++---
 kernel/trace/trace_functions_graph.c               |   10 
 kernel/trace/trace_irqsoff.c                       |    6 
 kernel/trace/trace_probe_tmpl.h                    |    2 
 kernel/trace/trace_sched_wakeup.c                  |    6 
 kernel/trace/trace_selftest.c                      |   11 
 lib/test_fprobe.c                                  |   51 --
 samples/fprobe/fprobe_example.c                    |    4 
 .../test.d/dynevent/add_remove_fprobe_repeat.tc    |   19 +
 .../ftrace/test.d/dynevent/fprobe_syntax_errors.tc |    4 
 47 files changed, 1167 insertions(+), 617 deletions(-)
 create mode 100644 tools/testing/selftests/ftrace/test.d/dynevent/add_remove_fprobe_repeat.tc

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>

Comments

Andrii Nakryiko Sept. 12, 2024, 6:41 p.m. UTC | #1
+ BPF ML

On Thu, Sep 12, 2024 at 8:35 AM <bot+bpf-ci@kernel.org> wrote:
>
> Dear patch submitter,
>
> CI has tested the following submission:
> Status:     FAILURE
> Name:       [v14,00/19] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph
> Patchwork:  https://patchwork.kernel.org/project/netdevbpf/list/?series=889822&state=*
> Matrix:     https://github.com/kernel-patches/bpf/actions/runs/10833792984
>
> Failed jobs:
> test_progs-aarch64-gcc: https://github.com/kernel-patches/bpf/actions/runs/10833792984/job/30061791397
> test_progs_no_alu32-aarch64-gcc: https://github.com/kernel-patches/bpf/actions/runs/10833792984/job/30061791836
> test_progs-s390x-gcc: https://github.com/kernel-patches/bpf/actions/runs/10833792984/job/30061757062
> test_progs_no_alu32-s390x-gcc: https://github.com/kernel-patches/bpf/actions/runs/10833792984/job/30061757809
>
> First test_progs failure (test_progs-aarch64-gcc):
> #132 kprobe_multi_testmod_test
> serial_test_kprobe_multi_testmod_test:PASS:load_kallsyms_local 0 nsec
> #132/1 kprobe_multi_testmod_test/testmod_attach_api_syms
> test_testmod_attach_api:PASS:fentry_raw_skel_load 0 nsec
> trigger_module_test_read:PASS:testmod_file_open 0 nsec
> test_testmod_attach_api:PASS:trigger_read 0 nsec
> kprobe_multi_testmod_check:FAIL:kprobe_test1_result unexpected kprobe_test1_result: actual 0 != expected 1
> kprobe_multi_testmod_check:FAIL:kprobe_test2_result unexpected kprobe_test2_result: actual 0 != expected 1
> kprobe_multi_testmod_check:FAIL:kprobe_test3_result unexpected kprobe_test3_result: actual 0 != expected 1
> kprobe_multi_testmod_check:FAIL:kretprobe_test1_result unexpected kretprobe_test1_result: actual 0 != expected 1
> kprobe_multi_testmod_check:FAIL:kretprobe_test2_result unexpected kretprobe_test2_result: actual 0 != expected 1
> kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1
> #132/2 kprobe_multi_testmod_test/testmod_attach_api_addrs
> test_testmod_attach_api_addrs:PASS:ksym_get_addr_local 0 nsec
> test_testmod_attach_api_addrs:PASS:ksym_get_addr_local 0 nsec
> test_testmod_attach_api_addrs:PASS:ksym_get_addr_local 0 nsec
> test_testmod_attach_api:PASS:fentry_raw_skel_load 0 nsec
> trigger_module_test_read:PASS:testmod_file_open 0 nsec
> test_testmod_attach_api:PASS:trigger_read 0 nsec
> kprobe_multi_testmod_check:FAIL:kprobe_test1_result unexpected kprobe_test1_result: actual 0 != expected 1
> kprobe_multi_testmod_check:FAIL:kprobe_test2_result unexpected kprobe_test2_result: actual 0 != expected 1
> kprobe_multi_testmod_check:FAIL:kprobe_test3_result unexpected kprobe_test3_result: actual 0 != expected 1
> kprobe_multi_testmod_check:FAIL:kretprobe_test1_result unexpected kretprobe_test1_result: actual 0 != expected 1
> kprobe_multi_testmod_check:FAIL:kretprobe_test2_result unexpected kretprobe_test2_result: actual 0 != expected 1
> kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1
>

Seems like this selftest is still broken. Please let me know if you
need help with building and running BPF selftests to reproduce this
locally.

>
> Please note: this email is coming from an unmonitored mailbox. If you have
> questions or feedback, please reach out to the Meta Kernel CI team at
> kernel-ci@meta.com.
Masami Hiramatsu (Google) Sept. 12, 2024, 11:54 p.m. UTC | #2
On Thu, 12 Sep 2024 11:41:17 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> + BPF ML
> 
> On Thu, Sep 12, 2024 at 8:35 AM <bot+bpf-ci@kernel.org> wrote:
> > [...]
> 
> Seems like this selftest is still broken. Please let me know if you
> need help with building and running BPF selftests to reproduce this
> locally.

Thanks, that would be helpful. Also, I would like to know: is there any
debug mode (e.g., a way to print more debug logs)?

Thank you,

Andrii Nakryiko Sept. 13, 2024, 1:55 a.m. UTC | #3
On Thu, Sep 12, 2024 at 4:54 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 12 Sep 2024 11:41:17 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > [...]
> >
> > Seems like this selftest is still broken. Please let me know if you
> > need help with building and running BPF selftests to reproduce this
> > locally.
>
> Thanks, It will be helpful. Also I would like to know, is there any
> debug mode (like print more debug logs)?

So first of all, the assumption is that you will build a recent
kernel with the Kconfig values from
tools/testing/selftests/bpf/config, so make sure you append those to
your kernel config. Then build the kernel; the BPF selftests' Makefile
tries to find the latest built kernel (according to KBUILD_OUTPUT/O/etc.).

Now to building BPF selftests:

$ cd tools/testing/selftests/bpf
$ make -j$(nproc)

You'll need decently recent Clang and a few dependent packages. At
least elfutils-devel, libzstd-devel, but there might be a few more
which I never can remember.

Once everything is built, you can run the failing test with

$ sudo ./test_progs -t kprobe_multi_testmod_test -v

The source code for user space part for that test is in
prog_tests/kprobe_multi_testmod_test.c and BPF-side is in
progs/kprobe_multi.c.

Taking failing output from the test:

> > > kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1

kretprobe_test3_result is a sort of identifier for a test condition;
you can find the corresponding line in the user-space .c file by
grepping for it:

ASSERT_EQ(skel->bss->kretprobe_testmod_test3_result, 1,
"kretprobe_test3_result");

Most probably the problem is in:

__u64 addr = bpf_get_func_ip(ctx);

which is then checked as

if ((const void *) addr == &bpf_testmod_fentry_test3)


With your patches this doesn't match anymore.


Hopefully the above gives you some pointers, let me know if you run
into any problems.

>
> Thank you,
>
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
Masami Hiramatsu (Google) Sept. 13, 2024, 8:59 a.m. UTC | #4
On Thu, 12 Sep 2024 18:55:40 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Thu, Sep 12, 2024 at 4:54 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 12 Sep 2024 11:41:17 -0700
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > > [...]
> > > >
> > >
> > > Seems like this selftest is still broken. Please let me know if you
> > > need help with building and running BPF selftests to reproduce this
> > > locally.
> >
> > Thanks, It will be helpful. Also I would like to know, is there any
> > debug mode (like print more debug logs)?
> 
> So first of all, the assumption is that you will build a most recent
> kernel with Kconfig values set from
> tools/testing/selftests/bpf/config, so make sure you  append that to
> your kernel config. Then build the kernel, BPF selftests' Makefile
> tries to find latest built kernel (according to KBUILD_OUTPUT/O/etc).
> 
> Now to building BPF selftests:
> 
> $ cd tools/testing/selftests/bpf
> $ make -j$(nproc)
> 
> You'll need decently recent Clang and a few dependent packages. At
> least elfutils-devel, libzstd-devel, but there might be a few more
> which I never can remember.
> 
> Once everything is built, you can run the failing test with
> 
> $ sudo ./test_progs -t kprobe_multi_testmod_test -v
> 
> The source code for user space part for that test is in
> prog_tests/kprobe_multi_testmod_test.c and BPF-side is in
> progs/kprobe_multi.c.

Thanks for the information!

> 
> Taking failing output from the test:
> 
> > > > kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1
> 
> kretprobe_test3_result is a sort of identifier for a test condition,
> you can find a corresponding line in user space .c file grepping for
> that:
> 
> ASSERT_EQ(skel->bss->kretprobe_testmod_test3_result, 1,
> "kretprobe_test3_result");
> 
> Most probably the problem is in:
> 
> __u64 addr = bpf_get_func_ip(ctx);

Yeah, and as I replied in another thread, the problem is
that the ftrace entry_ip is not the symbol ip.

We have the ftrace_call_adjust() arch function for one
direction (symbol ip to ftrace entry ip), but what we need
here is the reverse translation (ftrace entry ip back to symbol ip).

The easiest way is to use kallsyms to find it, but this is
a bit costly (though it just adds several levels to the bsearch).
Other possible options are:
 - Change bpf_kprobe_multi_addrs_cmp() to accept a range
   of addresses: [sym_addr, sym_addr + offset) returns true,
   so bpf_kprobe_multi_cookie() can find the entry address.
   The offset depends on the arch, but 16 is enough.

 - Change bpf_kprobe_multi_addrs_cmp() to call
   ftrace_call_adjust() before comparing. This may add some
   cost but finds the actual entry address.

 - (Cached method) When making link->addrs, also make a
   link->adj_addrs array, where adj_addrs[i] == ftrace_call_adjust(addrs[i]).

Let me try the 3rd one. It may consume more memory but is the
fastest solution.

Thank you,

> 
> which is then checked as
> 
> if ((const void *) addr == &bpf_testmod_fentry_test3)
> 
> 
> With your patches this doesn't match anymore.

It actually enables fprobe on those architectures, so
without my series, those tests should be skipped.
Thank you,

> 
> 
> Hopefully the above gives you some pointers, let me know if you run
> into any problems.
> 
> >
> > Thank you,
> >
> >
> >
> > --
> > Masami Hiramatsu (Google) <mhiramat@kernel.org>
Masami Hiramatsu (Google) Sept. 13, 2024, 12:45 p.m. UTC | #5
On Fri, 13 Sep 2024 17:59:35 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> > 
> > Taking failing output from the test:
> > 
> > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1
> > 
> > kretprobe_test3_result is a sort of identifier for a test condition,
> > you can find a corresponding line in user space .c file grepping for
> > that:
> > 
> > ASSERT_EQ(skel->bss->kretprobe_testmod_test3_result, 1,
> > "kretprobe_test3_result");
> > 
> > Most probably the problem is in:
> > 
> > __u64 addr = bpf_get_func_ip(ctx);
> 
> Yeah, and as I replyed to another thread, the problem is
> that the ftrace entry_ip is not symbol ip.
> 
> We have ftrace_call_adjust() arch function for reverse
> direction (symbol ip to ftrace entry ip) but what we need
> here is the reverse translate function (ftrace entry to symbol)
> 
> The easiest way is to use kallsyms to find it, but this is
> a bit costful (but it just increase bsearch in several levels).
> Other possible options are
> 
>  - Change bpf_kprobe_multi_addrs_cmp() to accept a range
>    of address. [sym_addr, sym_addr + offset) returns true,
>    bpf_kprobe_multi_cookie() can find the entry address.
>    The offset depends on arch, but 16 is enough.

Oops, no; this bpf_kprobe_multi_cookie() is used only for storing
test data, not for solving the general problem. (I saw kprobe_multi_check().)

So solving the problem is more costly: we need to put a more arch-
dependent implementation in bpf_trace, and make sure it does the
reverse translation of ftrace_call_adjust(). (This means that if you
expand arch support, you need to add such an implementation.)
Masami Hiramatsu (Google) Sept. 13, 2024, 1:49 p.m. UTC | #6
On Fri, 13 Sep 2024 21:45:15 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> On Fri, 13 Sep 2024 17:59:35 +0900
> Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> 
> > > 
> > > Taking failing output from the test:
> > > 
> > > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1
> > > 
> > > kretprobe_test3_result is a sort of identifier for a test condition,
> > > you can find a corresponding line in user space .c file grepping for
> > > that:
> > > 
> > > ASSERT_EQ(skel->bss->kretprobe_testmod_test3_result, 1,
> > > "kretprobe_test3_result");
> > > 
> > > Most probably the problem is in:
> > > 
> > > __u64 addr = bpf_get_func_ip(ctx);
> > 
> > Yeah, and as I replyed to another thread, the problem is
> > that the ftrace entry_ip is not symbol ip.
> > 
> > We have ftrace_call_adjust() arch function for reverse
> > direction (symbol ip to ftrace entry ip) but what we need
> > here is the reverse translate function (ftrace entry to symbol)
> > 
> > The easiest way is to use kallsyms to find it, but this is
> > a bit costful (but it just increase bsearch in several levels).
> > Other possible options are
> > 
> >  - Change bpf_kprobe_multi_addrs_cmp() to accept a range
> >    of address. [sym_addr, sym_addr + offset) returns true,
> >    bpf_kprobe_multi_cookie() can find the entry address.
> >    The offset depends on arch, but 16 is enough.
> 
> Oops. no, this bpf_kprobe_multi_cookie() is used only for storing
> test data. Not a general problem solving. (I saw kprobe_multi_check())
> 
> So solving problem is much costly, we need to put more arch-
> dependent in bpf_trace, and make sure it does reverse translation
> of ftrace_call_adjust(). (this means if you expand arch support,
> you need to add such implementation)

OK, can you try this one?


From 81bc599911507215aa9faa1077a601880cbd654a Mon Sep 17 00:00:00 2001
From: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Date: Fri, 13 Sep 2024 21:43:46 +0900
Subject: [PATCH] bpf: Add get_entry_ip() for arm64

Add get_entry_ip() implementation for arm64. This is based on
the information in ftrace_call_adjust() for arm64.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/bpf_trace.c | 64 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index deb629f4a510..b0cf6e5b8965 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1066,6 +1066,70 @@ static unsigned long get_entry_ip(unsigned long fentry_ip)
 		fentry_ip -= ENDBR_INSN_SIZE;
 	return fentry_ip;
 }
+#elif defined(CONFIG_ARM64)
+#include <asm/insn.h>
+
+static unsigned long get_entry_ip(unsigned long fentry_ip)
+{
+	u32 insn;
+
+	/*
+	 * When using patchable-function-entry without pre-function NOPs, the
+	 * ftrace entry is the address of the first NOP after the function entry point.
+	 *
+	 * The compiler has either generated:
+	 *
+	 * func+00:	func:	NOP		// To be patched to MOV X9, LR
+	 * func+04:		NOP		// To be patched to BL <caller>
+	 *
+	 * Or:
+	 *
+	 * func-04:		BTI	C
+	 * func+00:	func:	NOP		// To be patched to MOV X9, LR
+	 * func+04:		NOP		// To be patched to BL <caller>
+	 *
+	 * The fentry_ip is the address of `BL <caller>` which is at `func + 4`
+	 * bytes in either case.
+	 */
+	if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
+		return fentry_ip - AARCH64_INSN_SIZE;
+
+	/*
+	 * When using patchable-function-entry with pre-function NOPs, the
+	 * BTI placement is a bit different.
+	 *
+	 * func+00:	func:	NOP		// To be patched to MOV X9, LR
+	 * func+04:		NOP		// To be patched to BL <caller>
+	 *
+	 * Or:
+	 *
+	 * func+00:	func:	BTI	C
+	 * func+04:		NOP		// To be patched to MOV X9, LR
+	 * func+08:		NOP		// To be patched to BL <caller>
+	 *
+	 * The fentry_ip is the address of `BL <caller>` which is at either
+	 * `func + 4` or `func + 8`, depending on whether there is a BTI.
+	 */
+
+	/* If there is no BTI, the func address should be one instruction before. */
+	if (!IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
+		return fentry_ip - AARCH64_INSN_SIZE;
+
+	/*
+	 * We want to be extra safe in case the entry ip is on the page edge,
+	 * but otherwise we need to avoid get_kernel_nofault()'s overhead.
+	 */
+	if ((fentry_ip & ~PAGE_MASK) < AARCH64_INSN_SIZE * 2) {
+		if (get_kernel_nofault(insn, (u32 *)(fentry_ip - AARCH64_INSN_SIZE * 2)))
+			return fentry_ip - AARCH64_INSN_SIZE;
+	} else {
+		insn = *(u32 *)(fentry_ip - AARCH64_INSN_SIZE * 2);
+	}
+
+	if (aarch64_insn_is_bti(le32_to_cpu((__le32)insn)))
+		return fentry_ip - AARCH64_INSN_SIZE * 2;
+
+	return fentry_ip - AARCH64_INSN_SIZE;
+}
 #else
 #define get_entry_ip(fentry_ip) fentry_ip
 #endif
Andrii Nakryiko Sept. 13, 2024, 9:16 p.m. UTC | #7
On Fri, Sep 13, 2024 at 1:59 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 12 Sep 2024 18:55:40 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Thu, Sep 12, 2024 at 4:54 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > On Thu, 12 Sep 2024 11:41:17 -0700
> > > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >
> > > > + BPF ML
> > > >
> > > > On Thu, Sep 12, 2024 at 8:35 AM <bot+bpf-ci@kernel.org> wrote:
> > > > >
> > > > > Dear patch submitter,
> > > > >
> > > > > CI has tested the following submission:
> > > > > Status:     FAILURE
> > > > > Name:       [v14,00/19] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph
> > > > > Patchwork:  https://patchwork.kernel.org/project/netdevbpf/list/?series=889822&state=*
> > > > > Matrix:     https://github.com/kernel-patches/bpf/actions/runs/10833792984
> > > > >
> > > > > Failed jobs:
> > > > > test_progs-aarch64-gcc: https://github.com/kernel-patches/bpf/actions/runs/10833792984/job/30061791397
> > > > > test_progs_no_alu32-aarch64-gcc: https://github.com/kernel-patches/bpf/actions/runs/10833792984/job/30061791836
> > > > > test_progs-s390x-gcc: https://github.com/kernel-patches/bpf/actions/runs/10833792984/job/30061757062
> > > > > test_progs_no_alu32-s390x-gcc: https://github.com/kernel-patches/bpf/actions/runs/10833792984/job/30061757809
> > > > >
> > > > > First test_progs failure (test_progs-aarch64-gcc):
> > > > > #132 kprobe_multi_testmod_test
> > > > > serial_test_kprobe_multi_testmod_test:PASS:load_kallsyms_local 0 nsec
> > > > > #132/1 kprobe_multi_testmod_test/testmod_attach_api_syms
> > > > > test_testmod_attach_api:PASS:fentry_raw_skel_load 0 nsec
> > > > > trigger_module_test_read:PASS:testmod_file_open 0 nsec
> > > > > test_testmod_attach_api:PASS:trigger_read 0 nsec
> > > > > kprobe_multi_testmod_check:FAIL:kprobe_test1_result unexpected kprobe_test1_result: actual 0 != expected 1
> > > > > kprobe_multi_testmod_check:FAIL:kprobe_test2_result unexpected kprobe_test2_result: actual 0 != expected 1
> > > > > kprobe_multi_testmod_check:FAIL:kprobe_test3_result unexpected kprobe_test3_result: actual 0 != expected 1
> > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test1_result unexpected kretprobe_test1_result: actual 0 != expected 1
> > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test2_result unexpected kretprobe_test2_result: actual 0 != expected 1
> > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1
> > > > > #132/2 kprobe_multi_testmod_test/testmod_attach_api_addrs
> > > > > test_testmod_attach_api_addrs:PASS:ksym_get_addr_local 0 nsec
> > > > > test_testmod_attach_api_addrs:PASS:ksym_get_addr_local 0 nsec
> > > > > test_testmod_attach_api_addrs:PASS:ksym_get_addr_local 0 nsec
> > > > > test_testmod_attach_api:PASS:fentry_raw_skel_load 0 nsec
> > > > > trigger_module_test_read:PASS:testmod_file_open 0 nsec
> > > > > test_testmod_attach_api:PASS:trigger_read 0 nsec
> > > > > kprobe_multi_testmod_check:FAIL:kprobe_test1_result unexpected kprobe_test1_result: actual 0 != expected 1
> > > > > kprobe_multi_testmod_check:FAIL:kprobe_test2_result unexpected kprobe_test2_result: actual 0 != expected 1
> > > > > kprobe_multi_testmod_check:FAIL:kprobe_test3_result unexpected kprobe_test3_result: actual 0 != expected 1
> > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test1_result unexpected kretprobe_test1_result: actual 0 != expected 1
> > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test2_result unexpected kretprobe_test2_result: actual 0 != expected 1
> > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1
> > > > >
> > > >
> > > > Seems like this selftest is still broken. Please let me know if you
> > > > need help with building and running BPF selftests to reproduce this
> > > > locally.
> > >
> > > Thanks, It will be helpful. Also I would like to know, is there any
> > > debug mode (like print more debug logs)?
> >
> > So first of all, the assumption is that you will build a most recent
> > kernel with Kconfig values set from
> > tools/testing/selftests/bpf/config, so make sure you  append that to
> > your kernel config. Then build the kernel, BPF selftests' Makefile
> > tries to find latest built kernel (according to KBUILD_OUTPUT/O/etc).
> >
> > Now to building BPF selftests:
> >
> > $ cd tools/testing/selftests/bpf
> > $ make -j$(nproc)
> >
> > You'll need decently recent Clang and a few dependent packages. At
> > least elfutils-devel, libzstd-devel, but there might be a few more
> > which I never can remember.
> >
> > Once everything is built, you can run the failing test with
> >
> > $ sudo ./test_progs -t kprobe_multi_testmod_test -v
> >
> > The source code for user space part for that test is in
> > prog_tests/kprobe_multi_testmod_test.c and BPF-side is in
> > progs/kprobe_multi.c.
>
> Thanks for the information!
>
> >
> > Taking failing output from the test:
> >
> > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1
> >
> > kretprobe_test3_result is a sort of identifier for a test condition,
> > you can find a corresponding line in user space .c file grepping for
> > that:
> >
> > ASSERT_EQ(skel->bss->kretprobe_testmod_test3_result, 1,
> > "kretprobe_test3_result");
> >
> > Most probably the problem is in:
> >
> > __u64 addr = bpf_get_func_ip(ctx);
>
> Yeah, and as I replied in another thread, the problem is
> that the ftrace entry_ip is not the symbol ip.
>
> We have the ftrace_call_adjust() arch function for the opposite
> direction (symbol ip to ftrace entry ip), but what we need
> here is the reverse translation (ftrace entry ip to symbol ip).
>
> The easiest way is to use kallsyms to find it, but this is
> a bit costly (though it only adds several levels to a bsearch).
> Other possible options are:
>
>  - Change bpf_kprobe_multi_addrs_cmp() to accept a range
>    of addresses: [sym_addr, sym_addr + offset) returns true,
>    so bpf_kprobe_multi_cookie() can find the entry address.
>    The offset depends on the arch, but 16 is enough.

This feels quite sloppy, tbh...

>
>  - Change bpf_kprobe_multi_addrs_cmp() to call
>    ftrace_call_adjust() before comparing. This costs more
>    but finds the actual entry address.

too expensive, unnecessary runtime overhead

>
>  - (Cached method) when making link->addrs, make a link->adj_addrs
>    array too, where adj_addrs[i] == ftrace_call_adjust(addrs[i]).
>
> Let me try the 3rd one. It may consume more memory but it is
> the fastest solution.

I like the third one as well, but I'm not following why we'd need more memory.

We can store adjusted addresses in existing link->addrs, and we'll
need to adjust them to originals (potentially expensively) only in
bpf_kprobe_multi_link_fill_link_info(). But that's not a hot path, so
it doesn't matter.
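The cached-adjustment idea discussed above can be sketched in plain userspace C. Everything here is illustrative only: `ftrace_call_adjust()` is mocked with a fixed offset, and the `adj_addrs`/`cookies` arrays stand in for the link's bookkeeping; this is not the eventual kernel code.

```c
#include <assert.h>
#include <stdlib.h>

#define NPROBES       3
#define FENTRY_OFFSET 4	/* mocked arch shift; the real value comes from
				 * the arch's ftrace_call_adjust() */

/* Symbol addresses and per-probe cookies, as if from the link. */
static unsigned long addrs[NPROBES]   = { 0x1000, 0x2000, 0x3000 };
static unsigned long cookies[NPROBES] = { 11, 22, 33 };
static unsigned long adj_addrs[NPROBES];

static unsigned long mock_ftrace_call_adjust(unsigned long sym_addr)
{
	return sym_addr + FENTRY_OFFSET;
}

static int cmp_addr(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return (x > y) - (x < y);
}

/* Attach time: build the cache once. addrs is already sorted and the
 * adjustment is monotonic, so adj_addrs stays sorted for bsearch. */
static void build_adj_addrs(void)
{
	for (int i = 0; i < NPROBES; i++)
		adj_addrs[i] = mock_ftrace_call_adjust(addrs[i]);
}

/* Probe hit: exact-match bsearch on the cached adjusted addresses;
 * no per-hit reverse translation needed. Returns -1 if not found. */
static long lookup_cookie(unsigned long fentry_ip)
{
	unsigned long *p = bsearch(&fentry_ip, adj_addrs, NPROBES,
				   sizeof(adj_addrs[0]), cmp_addr);

	return p ? (long)cookies[p - adj_addrs] : -1;
}
```

Keeping the original `addrs` alongside `adj_addrs` is what costs the extra memory mentioned in the thread; adjusting `link->addrs` in place avoids that at the price of a reverse adjustment in the cold `fill_link_info()` path.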

>
> Thank you,
>
> >
> > which is then checked as
> >
> > if ((const void *) addr == &bpf_testmod_fentry_test3)
> >
> >
> > With your patches this doesn't match anymore.
>
> It actually enables the fprobe on those architectures, so
> without my series, those tests should be skipped.

Ok, cool, still, let's fix the issue.

Andrii Nakryiko Sept. 13, 2024, 9:23 p.m. UTC | #8
On Fri, Sep 13, 2024 at 6:50 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Fri, 13 Sep 2024 21:45:15 +0900
> Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
>
> > On Fri, 13 Sep 2024 17:59:35 +0900
> > Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> >
> > > >
> > > > Taking failing output from the test:
> > > >
> > > > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1
> > > >
> > > > kretprobe_test3_result is a sort of identifier for a test condition,
> > > > you can find a corresponding line in user space .c file grepping for
> > > > that:
> > > >
> > > > ASSERT_EQ(skel->bss->kretprobe_testmod_test3_result, 1,
> > > > "kretprobe_test3_result");
> > > >
> > > > Most probably the problem is in:
> > > >
> > > > __u64 addr = bpf_get_func_ip(ctx);
> > >
> > > Yeah, and as I replyed to another thread, the problem is
> > > that the ftrace entry_ip is not symbol ip.
> > >
> > > We have ftrace_call_adjust() arch function for reverse
> > > direction (symbol ip to ftrace entry ip) but what we need
> > > here is the reverse translate function (ftrace entry to symbol)
> > >
> > > The easiest way is to use kallsyms to find it, but this is
> > > a bit costful (but it just increase bsearch in several levels).
> > > Other possible options are
> > >
> > >  - Change bpf_kprobe_multi_addrs_cmp() to accept a range
> > >    of address. [sym_addr, sym_addr + offset) returns true,
> > >    bpf_kprobe_multi_cookie() can find the entry address.
> > >    The offset depends on arch, but 16 is enough.
> >
> > Oops. no, this bpf_kprobe_multi_cookie() is used only for storing
> > test data. Not a general problem solving. (I saw kprobe_multi_check())
> >
> > So solving problem is much costly, we need to put more arch-
> > dependent in bpf_trace, and make sure it does reverse translation
> > of ftrace_call_adjust(). (this means if you expand arch support,
> > you need to add such implementation)
>
> OK, can you try this one?
>

I'm running out of time today, so I won't have time to try this, sorry.

But see my last reply, I think adjusting link->addrs once before
attachment is the way to go. It gives us fast and simple lookups at
runtime.

In my last reply I assumed that we won't need to keep a copy of
original addrs (because we can dynamically adjust for
bpf_kprobe_multi_link_fill_link_info()), but I now realize that we do
need that for bpf_get_func_ip() anyways.

Still, I'd rather have an extra link->adj_addrs with a copy and do a
quick and simple lookup at runtime. So I suggest going with that. At
the very worst case it's a few kilobytes of memory for thousands of
attached functions, no big deal, IMO.

It's vastly better than maintaining arch-specific reverse of
ftrace_call_adjust(), isn't it?

Jiri, any opinion here?

Masami Hiramatsu (Google) Sept. 14, 2024, 1:58 a.m. UTC | #9
On Fri, 13 Sep 2024 14:16:47 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> >
> >  - Change bpf_kprobe_multi_addrs_cmp() to call
> >    ftrace_call_adjust() before comparing. This may take a
> >    cost but find actual entry address.
> 
> too expensive, unnecessary runtime overhead

Indeed.

> >
> >  - (Cached method) when making link->addrs, make a link->adj_addrs
> >    array too, where adj_addrs[i] == ftrace_call_adjust(addrs[i]).
> >
> > Let me try the 3rd one. It may consume more memory but the
> > fastest solution.
> 
> I like the third one as well, but I'm not following why we'd need more memory.
> 
> We can store adjusted addresses in existing link->addrs, and we'll
> need to adjust them to originals (potentially expensively) only in
> bpf_kprobe_multi_link_fill_link_info(). But that's not a hot path, so
> it doesn't matter.

OK, that works well, but let me check whether it has any side effects.

BTW, I worry about what users expect from `bpf_get_func_ip()`.

 * u64 bpf_get_func_ip(void *ctx)
 * 	Description
 * 		Get address of the traced function (for tracing and kprobe programs).
 *
 * 		When called for kprobe program attached as uprobe it returns
 * 		probe address for both entry and return uprobe.
 *
 * 	Return
 * 		Address of the traced function for kprobe.
 * 		0 for kprobes placed within the function (not at the entry).
 * 		Address of the probe for uprobe and return uprobe.

This seems to expect the function address, not the ftrace entry
address. If users expect it to return the actual function entry
address, we need to change `get_entry_ip()` (*) as in the patch I sent.

If they can accept an address slightly shifted from the entry address,
I think we can change link->addrs.

Jiri, and other BPF users, what do you expect and what do you want to
have? (What is the reason for using this API?)

Thank you,

(*) bpf_get_func_ip() seems to read `run_ctx->entry_ip`, which is
set by `get_entry_ip(fentry_ip)`.

Masami Hiramatsu (Google) Sept. 14, 2024, 2:10 a.m. UTC | #10
On Fri, 13 Sep 2024 14:23:38 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Fri, Sep 13, 2024 at 6:50 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Fri, 13 Sep 2024 21:45:15 +0900
> > Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> >
> > > On Fri, 13 Sep 2024 17:59:35 +0900
> > > Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> > >
> > > > >
> > > > > Taking failing output from the test:
> > > > >
> > > > > > > > kprobe_multi_testmod_check:FAIL:kretprobe_test3_result unexpected kretprobe_test3_result: actual 0 != expected 1
> > > > >
> > > > > kretprobe_test3_result is a sort of identifier for a test condition,
> > > > > you can find a corresponding line in user space .c file grepping for
> > > > > that:
> > > > >
> > > > > ASSERT_EQ(skel->bss->kretprobe_testmod_test3_result, 1,
> > > > > "kretprobe_test3_result");
> > > > >
> > > > > Most probably the problem is in:
> > > > >
> > > > > __u64 addr = bpf_get_func_ip(ctx);
> > > >
> > > > Yeah, and as I replyed to another thread, the problem is
> > > > that the ftrace entry_ip is not symbol ip.
> > > >
> > > > We have ftrace_call_adjust() arch function for reverse
> > > > direction (symbol ip to ftrace entry ip) but what we need
> > > > here is the reverse translate function (ftrace entry to symbol)
> > > >
> > > > The easiest way is to use kallsyms to find it, but this is
> > > > a bit costful (but it just increase bsearch in several levels).
> > > > Other possible options are
> > > >
> > > >  - Change bpf_kprobe_multi_addrs_cmp() to accept a range
> > > >    of address. [sym_addr, sym_addr + offset) returns true,
> > > >    bpf_kprobe_multi_cookie() can find the entry address.
> > > >    The offset depends on arch, but 16 is enough.
> > >
> > > Oops. no, this bpf_kprobe_multi_cookie() is used only for storing
> > > test data. Not a general problem solving. (I saw kprobe_multi_check())
> > >
> > > So solving problem is much costly, we need to put more arch-
> > > dependent in bpf_trace, and make sure it does reverse translation
> > > of ftrace_call_adjust(). (this means if you expand arch support,
> > > you need to add such implementation)
> >
> > OK, can you try this one?
> >
> 
> I'm running out of time today, so I won't have time to try this, sorry.
> 
> But see my last reply, I think adjusting link->addrs once before
> attachment is the way to go. It gives us fast and simple lookups at
> runtime.
> 
> In my last reply I assumed that we won't need to keep a copy of
> original addrs (because we can dynamically adjust for
> bpf_kprobe_multi_link_fill_link_info()), but I now realize that we do
> need that for bpf_get_func_ip() anyways.

Yes, that's the point.

> 
> Still, I'd rather have an extra link->adj_addrs with a copy and do a
> quick and simple lookup at runtime. So I suggest going with that. At
> the very worst case it's a few kilobytes of memory for thousands of
> attached functions, no big deal, IMO.

But if you look carefully at the code below, it should be faster than
looking up `link->adj_addrs`, since most of the conditions are
build-time constants. (Only when the kernel enables BTI and the
function entry (+8 bytes) is on a page boundary do we call
get_kernel_nofault(), and that is a very rare case.)

My only concern about the code below is that it is
architecture-dependent. It should be provided by arch/arm64/kernel/ftrace.c.
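To illustrate why most of this collapses at build time, here is a hedged userspace model of the same decision logic. The config macros and the flat instruction array are stand-ins (the kernel uses IS_ENABLED(), which the compiler folds to constants, and get_kernel_nofault() near page edges); this is not the actual implementation.

```c
#include <assert.h>
#include <stdint.h>

#define AARCH64_INSN_SIZE 4
/* Mocked build-time switches; in the kernel these are IS_ENABLED()
 * checks that the compiler folds away, so the lookup is branch-cheap. */
#define HAS_CALL_OPS   1
#define HAS_BTI_KERNEL 1

#define INSN_BTI_C 0xd503245fu	/* "bti c" encoding */
#define INSN_NOP   0xd503201fu	/* "nop" encoding */

/* Userspace model: code memory is an array of 32-bit instructions and
 * an "address" is a byte offset into it. fentry_ip points at the slot
 * that ftrace patches to BL <caller>. */
static unsigned long model_get_entry_ip(const uint32_t *mem,
					unsigned long fentry_ip)
{
	/* Without pre-function NOPs, or without BTI, func is always
	 * exactly one instruction before the BL slot. */
	if (!HAS_CALL_OPS || !HAS_BTI_KERNEL)
		return fentry_ip - AARCH64_INSN_SIZE;

	/* With BTI enabled, look two instructions back: a BTI C there
	 * means the function starts there; otherwise only the MOV-slot
	 * NOP precedes us. */
	if (mem[fentry_ip / AARCH64_INSN_SIZE - 2] == INSN_BTI_C)
		return fentry_ip - 2 * AARCH64_INSN_SIZE;

	return fentry_ip - AARCH64_INSN_SIZE;
}
```

In the compiled kernel code, the first two conditions disappear entirely, leaving a single load and compare in the BTI case.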

> 
> It's vastly better than maintaining arch-specific reverse of
> ftrace_call_adjust(), isn't it?

Yes, it should be (and the x86_64 ENDBR part too).

> 
> Jiri, any opinion here?


Thank you,

> 
> >
> > From 81bc599911507215aa9faa1077a601880cbd654a Mon Sep 17 00:00:00 2001
> > From: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
> > Date: Fri, 13 Sep 2024 21:43:46 +0900
> > Subject: [PATCH] bpf: Add get_entry_ip() for arm64
> >
> > Add get_entry_ip() implementation for arm64. This is based on
> > the information in ftrace_call_adjust() for arm64.
> >
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> >  kernel/trace/bpf_trace.c | 64 ++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 64 insertions(+)
> >
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index deb629f4a510..b0cf6e5b8965 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1066,6 +1066,70 @@ static unsigned long get_entry_ip(unsigned long fentry_ip)
> >                 fentry_ip -= ENDBR_INSN_SIZE;
> >         return fentry_ip;
> >  }
> > +#elif defined (CONFIG_ARM64)
> > +#include <asm/insn.h>
> > +
> > +static unsigned long get_entry_ip(unsigned long fentry_ip)
> > +{
> > +       u32 insn;
> > +
> > +       /*
> > +        * When using patchable-function-entry without pre-function NOPS, ftrace
> > +        * entry is the address of the first NOP after the function entry point.
> > +        *
> > +        * The compiler has either generated:
> > +        *
> > +        * func+00:     func:   NOP             // To be patched to MOV X9, LR
> > +        * func+04:             NOP             // To be patched to BL <caller>
> > +        *
> > +        * Or:
> > +        *
> > +        * func-04:             BTI     C
> > +        * func+00:     func:   NOP             // To be patched to MOV X9, LR
> > +        * func+04:             NOP             // To be patched to BL <caller>
> > +        *
> > +        * The fentry_ip is the address of `BL <caller>` which is at `func + 4`
> > +        * bytes in either case.
> > +        */
> > +       if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
> > +               return fentry_ip - AARCH64_INSN_SIZE;
> > +
> > +       /*
> > +        * When using patchable-function-entry with pre-function NOPs, BTI is
> > +        * a bit different.
> > +        *
> > +        * func+00:     func:   NOP             // To be patched to MOV X9, LR
> > +        * func+04:             NOP             // To be patched to BL <caller>
> > +        *
> > +        * Or:
> > +        *
> > +        * func+00:     func:   BTI     C
> > +        * func+04:             NOP             // To be patched to MOV X9, LR
> > +        * func+08:             NOP             // To be patched to BL <caller>
> > +        *
> > +        * The fentry_ip is the address of `BL <caller>` which is at either
> > +        * `func + 4` or `func + 8` depends on whether there is a BTI.
> > +        */
> > +
> > +       /* If there is no BTI, the func address should be one instruction before. */
> > +       if (!IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
> > +               return fentry_ip - AARCH64_INSN_SIZE;
> > +
> > +       /* We want to be extra safe in case entry ip is on the page edge,
> > +        * but otherwise we need to avoid get_kernel_nofault()'s overhead.
> > +        */
> > +       if ((fentry_ip & ~PAGE_MASK) < AARCH64_INSN_SIZE * 2) {
> > +               if (get_kernel_nofault(insn, (u32 *)(fentry_ip - AARCH64_INSN_SIZE * 2)))
> > +                       return fentry_ip - AARCH64_INSN_SIZE;
> > +       } else {
> > +               insn = *(u32 *)(fentry_ip - AARCH64_INSN_SIZE * 2);
> > +       }
> > +
> > +       if (aarch64_insn_is_bti(le32_to_cpu((__le32)insn)))
> > +               return fentry_ip - AARCH64_INSN_SIZE * 2;
> > +
> > +       return fentry_ip - AARCH64_INSN_SIZE;
> > +}
> >  #else
> >  #define get_entry_ip(fentry_ip) fentry_ip
> >  #endif
> > --
> > 2.34.1
> >
> >
> >
> > --
> > Masami Hiramatsu (Google) <mhiramat@kernel.org>