mbox series

[v7,00/10] fprobe: Introduce fprobe function entry/exit probe

Message ID 164360522462.65877.1891020292202285106.stgit@devnote2 (mailing list archive)
Headers show
Series fprobe: Introduce fprobe function entry/exit probe | expand

Message

Masami Hiramatsu (Google) Jan. 31, 2022, 5 a.m. UTC
Hi,

Here is the 7th version of fprobe. This version fixes unregister_fprobe()
ensures that exit_handler is not called after returning from the
unregister_fprobe(), and fixes some comments and documents.

The previous version is here[1];

[1] https://lore.kernel.org/all/164338031590.2429999.6203979005944292576.stgit@devnote2/T/#u

This series introduces the fprobe, the function entry/exit probe
with multiple probe point support. This also introduces the rethook
for hooking function return as same as the kretprobe does. This
abstraction will help us to generalize the fgraph tracer,
because we can just switch to it from the rethook in fprobe,
depending on the kernel configuration.

The patch [1/10] is from Jiri's series[2].

[2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u

And the patch [9/10] adds the FPROBE_FL_KPROBE_SHARED flag for the case
if user wants to share the same code (or share a same resource) on the
fprobe and the kprobes.

Thank you,

---

Jiri Olsa (1):
      ftrace: Add ftrace_set_filter_ips function

Masami Hiramatsu (9):
      fprobe: Add ftrace based probe APIs
      rethook: Add a generic return hook
      rethook: x86: Add rethook x86 implementation
      ARM: rethook: Add rethook arm implementation
      arm64: rethook: Add arm64 rethook implementation
      fprobe: Add exit_handler support
      fprobe: Add sample program for fprobe
      fprobe: Introduce FPROBE_FL_KPROBE_SHARED flag for fprobe
      docs: fprobe: Add fprobe description to ftrace-use.rst


 Documentation/trace/fprobe.rst                |  171 +++++++++++++
 Documentation/trace/index.rst                 |    1 
 arch/arm/Kconfig                              |    1 
 arch/arm/include/asm/stacktrace.h             |    4 
 arch/arm/kernel/stacktrace.c                  |    6 
 arch/arm/probes/Makefile                      |    1 
 arch/arm/probes/rethook.c                     |   71 +++++
 arch/arm64/Kconfig                            |    1 
 arch/arm64/include/asm/stacktrace.h           |    2 
 arch/arm64/kernel/probes/Makefile             |    1 
 arch/arm64/kernel/probes/rethook.c            |   25 ++
 arch/arm64/kernel/probes/rethook_trampoline.S |   87 ++++++
 arch/arm64/kernel/stacktrace.c                |    7 -
 arch/x86/Kconfig                              |    1 
 arch/x86/include/asm/unwind.h                 |    8 +
 arch/x86/kernel/Makefile                      |    1 
 arch/x86/kernel/kprobes/common.h              |    1 
 arch/x86/kernel/rethook.c                     |  115 ++++++++
 include/linux/fprobe.h                        |   97 +++++++
 include/linux/ftrace.h                        |    3 
 include/linux/kprobes.h                       |    3 
 include/linux/rethook.h                       |  100 +++++++
 include/linux/sched.h                         |    3 
 kernel/exit.c                                 |    2 
 kernel/fork.c                                 |    3 
 kernel/trace/Kconfig                          |   26 ++
 kernel/trace/Makefile                         |    2 
 kernel/trace/fprobe.c                         |  341 +++++++++++++++++++++++++
 kernel/trace/ftrace.c                         |   58 ++++
 kernel/trace/rethook.c                        |  313 +++++++++++++++++++++++
 samples/Kconfig                               |    7 +
 samples/Makefile                              |    1 
 samples/fprobe/Makefile                       |    3 
 samples/fprobe/fprobe_example.c               |  120 +++++++++
 34 files changed, 1572 insertions(+), 14 deletions(-)
 create mode 100644 Documentation/trace/fprobe.rst
 create mode 100644 arch/arm/probes/rethook.c
 create mode 100644 arch/arm64/kernel/probes/rethook.c
 create mode 100644 arch/arm64/kernel/probes/rethook_trampoline.S
 create mode 100644 arch/x86/kernel/rethook.c
 create mode 100644 include/linux/fprobe.h
 create mode 100644 include/linux/rethook.h
 create mode 100644 kernel/trace/fprobe.c
 create mode 100644 kernel/trace/rethook.c
 create mode 100644 samples/fprobe/Makefile
 create mode 100644 samples/fprobe/fprobe_example.c

--
Masami Hiramatsu (Linaro) <mhiramat@kernel.org>

Comments

Masami Hiramatsu (Google) Jan. 31, 2022, 9:36 a.m. UTC | #1
Hi Jiri,

On Mon, 31 Jan 2022 14:00:24 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> Hi,
> 
> Here is the 7th version of fprobe. This version fixes unregister_fprobe()
> ensures that exit_handler is not called after returning from the
> unregister_fprobe(), and fixes some comments and documents.
> 
> The previous version is here[1];
> 
> [1] https://lore.kernel.org/all/164338031590.2429999.6203979005944292576.stgit@devnote2/T/#u
> 
> This series introduces the fprobe, the function entry/exit probe
> with multiple probe point support. This also introduces the rethook
> for hooking function return as same as the kretprobe does. This
> abstraction will help us to generalize the fgraph tracer,
> because we can just switch to it from the rethook in fprobe,
> depending on the kernel configuration.
> 
> The patch [1/10] is from Jiri's series[2].
> 
> [2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
> 
> And the patch [9/10] adds the FPROBE_FL_KPROBE_SHARED flag for the case
> if user wants to share the same code (or share a same resource) on the
> fprobe and the kprobes.

If you want to work on this series, I pushed my working branch on here;

https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/ kprobes/fprobe

Thank you,

> 
> Thank you,
> 
> ---
> 
> Jiri Olsa (1):
>       ftrace: Add ftrace_set_filter_ips function
> 
> Masami Hiramatsu (9):
>       fprobe: Add ftrace based probe APIs
>       rethook: Add a generic return hook
>       rethook: x86: Add rethook x86 implementation
>       ARM: rethook: Add rethook arm implementation
>       arm64: rethook: Add arm64 rethook implementation
>       fprobe: Add exit_handler support
>       fprobe: Add sample program for fprobe
>       fprobe: Introduce FPROBE_FL_KPROBE_SHARED flag for fprobe
>       docs: fprobe: Add fprobe description to ftrace-use.rst
> 
> 
>  Documentation/trace/fprobe.rst                |  171 +++++++++++++
>  Documentation/trace/index.rst                 |    1 
>  arch/arm/Kconfig                              |    1 
>  arch/arm/include/asm/stacktrace.h             |    4 
>  arch/arm/kernel/stacktrace.c                  |    6 
>  arch/arm/probes/Makefile                      |    1 
>  arch/arm/probes/rethook.c                     |   71 +++++
>  arch/arm64/Kconfig                            |    1 
>  arch/arm64/include/asm/stacktrace.h           |    2 
>  arch/arm64/kernel/probes/Makefile             |    1 
>  arch/arm64/kernel/probes/rethook.c            |   25 ++
>  arch/arm64/kernel/probes/rethook_trampoline.S |   87 ++++++
>  arch/arm64/kernel/stacktrace.c                |    7 -
>  arch/x86/Kconfig                              |    1 
>  arch/x86/include/asm/unwind.h                 |    8 +
>  arch/x86/kernel/Makefile                      |    1 
>  arch/x86/kernel/kprobes/common.h              |    1 
>  arch/x86/kernel/rethook.c                     |  115 ++++++++
>  include/linux/fprobe.h                        |   97 +++++++
>  include/linux/ftrace.h                        |    3 
>  include/linux/kprobes.h                       |    3 
>  include/linux/rethook.h                       |  100 +++++++
>  include/linux/sched.h                         |    3 
>  kernel/exit.c                                 |    2 
>  kernel/fork.c                                 |    3 
>  kernel/trace/Kconfig                          |   26 ++
>  kernel/trace/Makefile                         |    2 
>  kernel/trace/fprobe.c                         |  341 +++++++++++++++++++++++++
>  kernel/trace/ftrace.c                         |   58 ++++
>  kernel/trace/rethook.c                        |  313 +++++++++++++++++++++++
>  samples/Kconfig                               |    7 +
>  samples/Makefile                              |    1 
>  samples/fprobe/Makefile                       |    3 
>  samples/fprobe/fprobe_example.c               |  120 +++++++++
>  34 files changed, 1572 insertions(+), 14 deletions(-)
>  create mode 100644 Documentation/trace/fprobe.rst
>  create mode 100644 arch/arm/probes/rethook.c
>  create mode 100644 arch/arm64/kernel/probes/rethook.c
>  create mode 100644 arch/arm64/kernel/probes/rethook_trampoline.S
>  create mode 100644 arch/x86/kernel/rethook.c
>  create mode 100644 include/linux/fprobe.h
>  create mode 100644 include/linux/rethook.h
>  create mode 100644 kernel/trace/fprobe.c
>  create mode 100644 kernel/trace/rethook.c
>  create mode 100644 samples/fprobe/Makefile
>  create mode 100644 samples/fprobe/fprobe_example.c
> 
> --
> Masami Hiramatsu (Linaro) <mhiramat@kernel.org>
Jiri Olsa Jan. 31, 2022, 11:45 a.m. UTC | #2
On Mon, Jan 31, 2022 at 06:36:42PM +0900, Masami Hiramatsu wrote:
> Hi Jiri,
> 
> On Mon, 31 Jan 2022 14:00:24 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> > Hi,
> > 
> > Here is the 7th version of fprobe. This version fixes unregister_fprobe()
> > ensures that exit_handler is not called after returning from the
> > unregister_fprobe(), and fixes some comments and documents.
> > 
> > The previous version is here[1];
> > 
> > [1] https://lore.kernel.org/all/164338031590.2429999.6203979005944292576.stgit@devnote2/T/#u
> > 
> > This series introduces the fprobe, the function entry/exit probe
> > with multiple probe point support. This also introduces the rethook
> > for hooking function return as same as the kretprobe does. This
> > abstraction will help us to generalize the fgraph tracer,
> > because we can just switch to it from the rethook in fprobe,
> > depending on the kernel configuration.
> > 
> > The patch [1/10] is from Jiri's series[2].
> > 
> > [2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
> > 
> > And the patch [9/10] adds the FPROBE_FL_KPROBE_SHARED flag for the case
> > if user wants to share the same code (or share a same resource) on the
> > fprobe and the kprobes.
> 
> If you want to work on this series, I pushed my working branch on here;
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/ kprobes/fprobe

great, I was going to ask for that ;-) thanks

jirka

> 
> Thank you,
> 
> > 
> > Thank you,
> > 
> > ---
> > 
> > Jiri Olsa (1):
> >       ftrace: Add ftrace_set_filter_ips function
> > 
> > Masami Hiramatsu (9):
> >       fprobe: Add ftrace based probe APIs
> >       rethook: Add a generic return hook
> >       rethook: x86: Add rethook x86 implementation
> >       ARM: rethook: Add rethook arm implementation
> >       arm64: rethook: Add arm64 rethook implementation
> >       fprobe: Add exit_handler support
> >       fprobe: Add sample program for fprobe
> >       fprobe: Introduce FPROBE_FL_KPROBE_SHARED flag for fprobe
> >       docs: fprobe: Add fprobe description to ftrace-use.rst
> > 
> > 
> >  Documentation/trace/fprobe.rst                |  171 +++++++++++++
> >  Documentation/trace/index.rst                 |    1 
> >  arch/arm/Kconfig                              |    1 
> >  arch/arm/include/asm/stacktrace.h             |    4 
> >  arch/arm/kernel/stacktrace.c                  |    6 
> >  arch/arm/probes/Makefile                      |    1 
> >  arch/arm/probes/rethook.c                     |   71 +++++
> >  arch/arm64/Kconfig                            |    1 
> >  arch/arm64/include/asm/stacktrace.h           |    2 
> >  arch/arm64/kernel/probes/Makefile             |    1 
> >  arch/arm64/kernel/probes/rethook.c            |   25 ++
> >  arch/arm64/kernel/probes/rethook_trampoline.S |   87 ++++++
> >  arch/arm64/kernel/stacktrace.c                |    7 -
> >  arch/x86/Kconfig                              |    1 
> >  arch/x86/include/asm/unwind.h                 |    8 +
> >  arch/x86/kernel/Makefile                      |    1 
> >  arch/x86/kernel/kprobes/common.h              |    1 
> >  arch/x86/kernel/rethook.c                     |  115 ++++++++
> >  include/linux/fprobe.h                        |   97 +++++++
> >  include/linux/ftrace.h                        |    3 
> >  include/linux/kprobes.h                       |    3 
> >  include/linux/rethook.h                       |  100 +++++++
> >  include/linux/sched.h                         |    3 
> >  kernel/exit.c                                 |    2 
> >  kernel/fork.c                                 |    3 
> >  kernel/trace/Kconfig                          |   26 ++
> >  kernel/trace/Makefile                         |    2 
> >  kernel/trace/fprobe.c                         |  341 +++++++++++++++++++++++++
> >  kernel/trace/ftrace.c                         |   58 ++++
> >  kernel/trace/rethook.c                        |  313 +++++++++++++++++++++++
> >  samples/Kconfig                               |    7 +
> >  samples/Makefile                              |    1 
> >  samples/fprobe/Makefile                       |    3 
> >  samples/fprobe/fprobe_example.c               |  120 +++++++++
> >  34 files changed, 1572 insertions(+), 14 deletions(-)
> >  create mode 100644 Documentation/trace/fprobe.rst
> >  create mode 100644 arch/arm/probes/rethook.c
> >  create mode 100644 arch/arm64/kernel/probes/rethook.c
> >  create mode 100644 arch/arm64/kernel/probes/rethook_trampoline.S
> >  create mode 100644 arch/x86/kernel/rethook.c
> >  create mode 100644 include/linux/fprobe.h
> >  create mode 100644 include/linux/rethook.h
> >  create mode 100644 kernel/trace/fprobe.c
> >  create mode 100644 kernel/trace/rethook.c
> >  create mode 100644 samples/fprobe/Makefile
> >  create mode 100644 samples/fprobe/fprobe_example.c
> > 
> > --
> > Masami Hiramatsu (Linaro) <mhiramat@kernel.org>
> 
> 
> -- 
> Masami Hiramatsu <mhiramat@kernel.org>
>
Jiri Olsa Feb. 2, 2022, 12:02 a.m. UTC | #3
On Mon, Jan 31, 2022 at 02:00:24PM +0900, Masami Hiramatsu wrote:
> Hi,
> 
> Here is the 7th version of fprobe. This version fixes unregister_fprobe()
> ensures that exit_handler is not called after returning from the
> unregister_fprobe(), and fixes some comments and documents.
> 
> The previous version is here[1];
> 
> [1] https://lore.kernel.org/all/164338031590.2429999.6203979005944292576.stgit@devnote2/T/#u
> 
> This series introduces the fprobe, the function entry/exit probe
> with multiple probe point support. This also introduces the rethook
> for hooking function return as same as the kretprobe does. This
> abstraction will help us to generalize the fgraph tracer,
> because we can just switch to it from the rethook in fprobe,
> depending on the kernel configuration.
> 
> The patch [1/10] is from Jiri's series[2].
> 
> [2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
> 
> And the patch [9/10] adds the FPROBE_FL_KPROBE_SHARED flag for the case
> if user wants to share the same code (or share a same resource) on the
> fprobe and the kprobes.

hi,
it works fine for bpf selftests, but when I use it through bpftrace
to attach more probes with:

  # ./src/bpftrace -e 'kprobe:ksys_* { }'
  Attaching 27 probes

I'm getting stalls like:

krava33 login: [  988.574069] INFO: task bpftrace:4137 blocked for more than 122 seconds.
[  988.577577]       Not tainted 5.16.0+ #89
[  988.580173] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  988.585538] task:bpftrace        state:D stack:    0 pid: 4137 ppid:  4123 flags:0x00004004
[  988.589869] Call Trace:
[  988.591312]  <TASK>
[  988.592577]  __schedule+0x3a8/0xd30
[  988.594469]  ? wait_for_completion+0x84/0x110
[  988.596753]  schedule+0x4e/0xc0
[  988.598480]  schedule_timeout+0xed/0x130
[  988.600524]  ? rcu_read_lock_sched_held+0x12/0x70
[  988.602901]  ? lock_release+0x253/0x4a0
[  988.604935]  ? lock_acquired+0x1b7/0x410
[  988.607041]  ? trace_hardirqs_on+0x1b/0xe0
[  988.609202]  wait_for_completion+0xae/0x110
[  988.613762]  __wait_rcu_gp+0x127/0x130
[  988.615787]  synchronize_rcu_tasks_generic+0x46/0xa0
[  988.618329]  ? call_rcu_tasks+0x20/0x20
[  988.620600]  ? rcu_tasks_pregp_step+0x10/0x10
[  988.623232]  ftrace_shutdown.part.0+0x174/0x210
[  988.625820]  unregister_ftrace_function+0x37/0x60
[  988.628480]  unregister_fprobe+0x2d/0x50
[  988.630928]  bpf_link_free+0x4e/0x70
[  988.633126]  bpf_link_release+0x11/0x20
[  988.635249]  __fput+0xae/0x270
[  988.637022]  task_work_run+0x5c/0xa0
[  988.639016]  exit_to_user_mode_prepare+0x251/0x260
[  988.641294]  syscall_exit_to_user_mode+0x16/0x50
[  988.646249]  do_syscall_64+0x48/0x90
[  988.648218]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  988.650787] RIP: 0033:0x7f9079e95fbb
[  988.652761] RSP: 002b:00007ffd474fa3b0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[  988.656718] RAX: 0000000000000000 RBX: 00000000011bf8d0 RCX: 00007f9079e95fbb
[  988.660110] RDX: 0000000000000000 RSI: 00007ffd474fa3b0 RDI: 0000000000000019
[  988.663512] RBP: 00007ffd474faaf0 R08: 0000000000000000 R09: 000000000000001a
[  988.666673] R10: 0000000000000064 R11: 0000000000000293 R12: 0000000000000001
[  988.669770] R13: 00000000004a19a1 R14: 00007f9083428c00 R15: 00000000008c02d8
[  988.672601]  </TASK>
[  988.675763] INFO: lockdep is turned off.

I have't investigated yet, any idea?

thanks,
jirka
Andrii Nakryiko Feb. 2, 2022, 12:09 a.m. UTC | #4
On Tue, Feb 1, 2022 at 4:02 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Mon, Jan 31, 2022 at 02:00:24PM +0900, Masami Hiramatsu wrote:
> > Hi,
> >
> > Here is the 7th version of fprobe. This version fixes unregister_fprobe()
> > ensures that exit_handler is not called after returning from the
> > unregister_fprobe(), and fixes some comments and documents.
> >
> > The previous version is here[1];
> >
> > [1] https://lore.kernel.org/all/164338031590.2429999.6203979005944292576.stgit@devnote2/T/#u
> >
> > This series introduces the fprobe, the function entry/exit probe
> > with multiple probe point support. This also introduces the rethook
> > for hooking function return as same as the kretprobe does. This
> > abstraction will help us to generalize the fgraph tracer,
> > because we can just switch to it from the rethook in fprobe,
> > depending on the kernel configuration.
> >
> > The patch [1/10] is from Jiri's series[2].
> >
> > [2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
> >
> > And the patch [9/10] adds the FPROBE_FL_KPROBE_SHARED flag for the case
> > if user wants to share the same code (or share a same resource) on the
> > fprobe and the kprobes.
>
> hi,
> it works fine for bpf selftests, but when I use it through bpftrace
> to attach more probes with:
>
>   # ./src/bpftrace -e 'kprobe:ksys_* { }'
>   Attaching 27 probes
>
> I'm getting stalls like:
>
> krava33 login: [  988.574069] INFO: task bpftrace:4137 blocked for more than 122 seconds.
> [  988.577577]       Not tainted 5.16.0+ #89
> [  988.580173] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  988.585538] task:bpftrace        state:D stack:    0 pid: 4137 ppid:  4123 flags:0x00004004
> [  988.589869] Call Trace:
> [  988.591312]  <TASK>
> [  988.592577]  __schedule+0x3a8/0xd30
> [  988.594469]  ? wait_for_completion+0x84/0x110
> [  988.596753]  schedule+0x4e/0xc0
> [  988.598480]  schedule_timeout+0xed/0x130
> [  988.600524]  ? rcu_read_lock_sched_held+0x12/0x70
> [  988.602901]  ? lock_release+0x253/0x4a0
> [  988.604935]  ? lock_acquired+0x1b7/0x410
> [  988.607041]  ? trace_hardirqs_on+0x1b/0xe0
> [  988.609202]  wait_for_completion+0xae/0x110
> [  988.613762]  __wait_rcu_gp+0x127/0x130
> [  988.615787]  synchronize_rcu_tasks_generic+0x46/0xa0
> [  988.618329]  ? call_rcu_tasks+0x20/0x20
> [  988.620600]  ? rcu_tasks_pregp_step+0x10/0x10
> [  988.623232]  ftrace_shutdown.part.0+0x174/0x210
> [  988.625820]  unregister_ftrace_function+0x37/0x60
> [  988.628480]  unregister_fprobe+0x2d/0x50
> [  988.630928]  bpf_link_free+0x4e/0x70
> [  988.633126]  bpf_link_release+0x11/0x20
> [  988.635249]  __fput+0xae/0x270
> [  988.637022]  task_work_run+0x5c/0xa0
> [  988.639016]  exit_to_user_mode_prepare+0x251/0x260
> [  988.641294]  syscall_exit_to_user_mode+0x16/0x50
> [  988.646249]  do_syscall_64+0x48/0x90
> [  988.648218]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  988.650787] RIP: 0033:0x7f9079e95fbb
> [  988.652761] RSP: 002b:00007ffd474fa3b0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
> [  988.656718] RAX: 0000000000000000 RBX: 00000000011bf8d0 RCX: 00007f9079e95fbb
> [  988.660110] RDX: 0000000000000000 RSI: 00007ffd474fa3b0 RDI: 0000000000000019
> [  988.663512] RBP: 00007ffd474faaf0 R08: 0000000000000000 R09: 000000000000001a
> [  988.666673] R10: 0000000000000064 R11: 0000000000000293 R12: 0000000000000001
> [  988.669770] R13: 00000000004a19a1 R14: 00007f9083428c00 R15: 00000000008c02d8
> [  988.672601]  </TASK>
> [  988.675763] INFO: lockdep is turned off.
>
> I have't investigated yet, any idea?
>

Do you happen to have a CPU count that's not a power of 2? Check if
you have [0] in your tree, it might be that.

  [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a773abf72eb0cac008743891068ca6edecc44683

> thanks,
> jirka
>
Masami Hiramatsu (Google) Feb. 2, 2022, 7:29 a.m. UTC | #5
Hi Jiri,

On Wed, 2 Feb 2022 01:02:43 +0100
Jiri Olsa <jolsa@redhat.com> wrote:

> On Mon, Jan 31, 2022 at 02:00:24PM +0900, Masami Hiramatsu wrote:
> > Hi,
> > 
> > Here is the 7th version of fprobe. This version fixes unregister_fprobe()
> > ensures that exit_handler is not called after returning from the
> > unregister_fprobe(), and fixes some comments and documents.
> > 
> > The previous version is here[1];
> > 
> > [1] https://lore.kernel.org/all/164338031590.2429999.6203979005944292576.stgit@devnote2/T/#u
> > 
> > This series introduces the fprobe, the function entry/exit probe
> > with multiple probe point support. This also introduces the rethook
> > for hooking function return as same as the kretprobe does. This
> > abstraction will help us to generalize the fgraph tracer,
> > because we can just switch to it from the rethook in fprobe,
> > depending on the kernel configuration.
> > 
> > The patch [1/10] is from Jiri's series[2].
> > 
> > [2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
> > 
> > And the patch [9/10] adds the FPROBE_FL_KPROBE_SHARED flag for the case
> > if user wants to share the same code (or share a same resource) on the
> > fprobe and the kprobes.
> 
> hi,
> it works fine for bpf selftests, but when I use it through bpftrace
> to attach more probes with:
> 
>   # ./src/bpftrace -e 'kprobe:ksys_* { }'
>   Attaching 27 probes
> 
> I'm getting stalls like:
> 
> krava33 login: [  988.574069] INFO: task bpftrace:4137 blocked for more than 122 seconds.
> [  988.577577]       Not tainted 5.16.0+ #89
> [  988.580173] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  988.585538] task:bpftrace        state:D stack:    0 pid: 4137 ppid:  4123 flags:0x00004004
> [  988.589869] Call Trace:
> [  988.591312]  <TASK>
> [  988.592577]  __schedule+0x3a8/0xd30
> [  988.594469]  ? wait_for_completion+0x84/0x110
> [  988.596753]  schedule+0x4e/0xc0
> [  988.598480]  schedule_timeout+0xed/0x130
> [  988.600524]  ? rcu_read_lock_sched_held+0x12/0x70
> [  988.602901]  ? lock_release+0x253/0x4a0
> [  988.604935]  ? lock_acquired+0x1b7/0x410
> [  988.607041]  ? trace_hardirqs_on+0x1b/0xe0
> [  988.609202]  wait_for_completion+0xae/0x110
> [  988.613762]  __wait_rcu_gp+0x127/0x130
> [  988.615787]  synchronize_rcu_tasks_generic+0x46/0xa0
> [  988.618329]  ? call_rcu_tasks+0x20/0x20
> [  988.620600]  ? rcu_tasks_pregp_step+0x10/0x10
> [  988.623232]  ftrace_shutdown.part.0+0x174/0x210
> [  988.625820]  unregister_ftrace_function+0x37/0x60
> [  988.628480]  unregister_fprobe+0x2d/0x50
> [  988.630928]  bpf_link_free+0x4e/0x70
> [  988.633126]  bpf_link_release+0x11/0x20
> [  988.635249]  __fput+0xae/0x270
> [  988.637022]  task_work_run+0x5c/0xa0
> [  988.639016]  exit_to_user_mode_prepare+0x251/0x260
> [  988.641294]  syscall_exit_to_user_mode+0x16/0x50
> [  988.646249]  do_syscall_64+0x48/0x90
> [  988.648218]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  988.650787] RIP: 0033:0x7f9079e95fbb
> [  988.652761] RSP: 002b:00007ffd474fa3b0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
> [  988.656718] RAX: 0000000000000000 RBX: 00000000011bf8d0 RCX: 00007f9079e95fbb
> [  988.660110] RDX: 0000000000000000 RSI: 00007ffd474fa3b0 RDI: 0000000000000019
> [  988.663512] RBP: 00007ffd474faaf0 R08: 0000000000000000 R09: 000000000000001a
> [  988.666673] R10: 0000000000000064 R11: 0000000000000293 R12: 0000000000000001
> [  988.669770] R13: 00000000004a19a1 R14: 00007f9083428c00 R15: 00000000008c02d8
> [  988.672601]  </TASK>
> [  988.675763] INFO: lockdep is turned off.
> 
> I have't investigated yet, any idea?

Hmm, no, as far as I tested with my example module, it works well as below;

 # insmod fprobe_example.ko symbol='ksys_*' && ls && sleep 1 && rmmod  fprobe_example.ko 
[  125.820113] fprobe_init: Planted fprobe at ksys_*
[  125.823153] sample_entry_handler: Enter <ksys_write+0x0/0xf0> ip = 0x000000008d8da91f
[  125.824247]                          fprobe_handler.part.0+0xb1/0x150
[  125.825024]                          fprobe_handler+0x1e/0x20
[  125.825799]                          0xffffffffa000e0e3
[  125.826540]                          ksys_write+0x5/0xf0
[  125.827344]                          do_syscall_64+0x3b/0x90
[  125.828144]                          entry_SYSCALL_64_after_hwframe+0x44/0xae
fprobe_example.ko
[  125.829178] sample_exit_handler: Return from <ksys_write+0x0/0xf0> ip = 0x000000008d8da91f to rip = 0x00000000be5e197e (__x64_sys_write+0x1a/0x20)
[  125.830707]                          fprobe_exit_handler+0x29/0x30
[  125.831415]                          rethook_trampoline_handler+0x99/0x140
[  125.832259]                          arch_rethook_trampoline_callback+0x3f/0x50
[  125.833110]                          arch_rethook_trampoline+0x2f/0x50
[  125.833803]                          __x64_sys_write+0x1a/0x20
[  125.834448]                          do_syscall_64+0x3b/0x90
[  125.835055]                          entry_SYSCALL_64_after_hwframe+0x44/0xae
[  126.878825] fprobe_exit: fprobe at ksys_* unregistered
#

Even with NR_CPUS=3, it didn't cause the stall. But maybe you'd better test
with Paul's fix as Andrii pointed.

Thank you,




> 
> thanks,
> jirka
>
Jiri Olsa Feb. 2, 2022, 9:03 a.m. UTC | #6
On Tue, Feb 01, 2022 at 04:09:05PM -0800, Andrii Nakryiko wrote:
> On Tue, Feb 1, 2022 at 4:02 PM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Mon, Jan 31, 2022 at 02:00:24PM +0900, Masami Hiramatsu wrote:
> > > Hi,
> > >
> > > Here is the 7th version of fprobe. This version fixes unregister_fprobe()
> > > ensures that exit_handler is not called after returning from the
> > > unregister_fprobe(), and fixes some comments and documents.
> > >
> > > The previous version is here[1];
> > >
> > > [1] https://lore.kernel.org/all/164338031590.2429999.6203979005944292576.stgit@devnote2/T/#u
> > >
> > > This series introduces the fprobe, the function entry/exit probe
> > > with multiple probe point support. This also introduces the rethook
> > > for hooking function return as same as the kretprobe does. This
> > > abstraction will help us to generalize the fgraph tracer,
> > > because we can just switch to it from the rethook in fprobe,
> > > depending on the kernel configuration.
> > >
> > > The patch [1/10] is from Jiri's series[2].
> > >
> > > [2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
> > >
> > > And the patch [9/10] adds the FPROBE_FL_KPROBE_SHARED flag for the case
> > > if user wants to share the same code (or share a same resource) on the
> > > fprobe and the kprobes.
> >
> > hi,
> > it works fine for bpf selftests, but when I use it through bpftrace
> > to attach more probes with:
> >
> >   # ./src/bpftrace -e 'kprobe:ksys_* { }'
> >   Attaching 27 probes
> >
> > I'm getting stalls like:
> >
> > krava33 login: [  988.574069] INFO: task bpftrace:4137 blocked for more than 122 seconds.
> > [  988.577577]       Not tainted 5.16.0+ #89
> > [  988.580173] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  988.585538] task:bpftrace        state:D stack:    0 pid: 4137 ppid:  4123 flags:0x00004004
> > [  988.589869] Call Trace:
> > [  988.591312]  <TASK>
> > [  988.592577]  __schedule+0x3a8/0xd30
> > [  988.594469]  ? wait_for_completion+0x84/0x110
> > [  988.596753]  schedule+0x4e/0xc0
> > [  988.598480]  schedule_timeout+0xed/0x130
> > [  988.600524]  ? rcu_read_lock_sched_held+0x12/0x70
> > [  988.602901]  ? lock_release+0x253/0x4a0
> > [  988.604935]  ? lock_acquired+0x1b7/0x410
> > [  988.607041]  ? trace_hardirqs_on+0x1b/0xe0
> > [  988.609202]  wait_for_completion+0xae/0x110
> > [  988.613762]  __wait_rcu_gp+0x127/0x130
> > [  988.615787]  synchronize_rcu_tasks_generic+0x46/0xa0
> > [  988.618329]  ? call_rcu_tasks+0x20/0x20
> > [  988.620600]  ? rcu_tasks_pregp_step+0x10/0x10
> > [  988.623232]  ftrace_shutdown.part.0+0x174/0x210
> > [  988.625820]  unregister_ftrace_function+0x37/0x60
> > [  988.628480]  unregister_fprobe+0x2d/0x50
> > [  988.630928]  bpf_link_free+0x4e/0x70
> > [  988.633126]  bpf_link_release+0x11/0x20
> > [  988.635249]  __fput+0xae/0x270
> > [  988.637022]  task_work_run+0x5c/0xa0
> > [  988.639016]  exit_to_user_mode_prepare+0x251/0x260
> > [  988.641294]  syscall_exit_to_user_mode+0x16/0x50
> > [  988.646249]  do_syscall_64+0x48/0x90
> > [  988.648218]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [  988.650787] RIP: 0033:0x7f9079e95fbb
> > [  988.652761] RSP: 002b:00007ffd474fa3b0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
> > [  988.656718] RAX: 0000000000000000 RBX: 00000000011bf8d0 RCX: 00007f9079e95fbb
> > [  988.660110] RDX: 0000000000000000 RSI: 00007ffd474fa3b0 RDI: 0000000000000019
> > [  988.663512] RBP: 00007ffd474faaf0 R08: 0000000000000000 R09: 000000000000001a
> > [  988.666673] R10: 0000000000000064 R11: 0000000000000293 R12: 0000000000000001
> > [  988.669770] R13: 00000000004a19a1 R14: 00007f9083428c00 R15: 00000000008c02d8
> > [  988.672601]  </TASK>
> > [  988.675763] INFO: lockdep is turned off.
> >
> > I have't investigated yet, any idea?
> >
> 
> Do you happen to have a CPU count that's not a power of 2? Check if
> you have [0] in your tree, it might be that.
> 
>   [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a773abf72eb0cac008743891068ca6edecc44683

yes, that helped, thanks

jirka