Message ID | 169139091575.324433.13168120610633669432.stgit@devnote2 (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs | expand |
On Mon, Aug 7, 2023 at 8:48 AM Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS > instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe > on arm64. This patch lets fprobe code build on configs WITH_ARGS and !WITH_REGS but fprobe wouldn't run on these builds because fprobe still registers to ftrace with FTRACE_OPS_FL_SAVE_REGS, which would fail on !WITH_REGS. Shouldn't we also let the fprobe_init callers decide whether they want REGS or not ? > static int > kprobe_multi_link_handler(struct fprobe *fp, unsigned long fentry_ip, > - unsigned long ret_ip, struct pt_regs *regs, > + unsigned long ret_ip, struct ftrace_regs *fregs, > void *data) > { > struct bpf_kprobe_multi_link *link; > + struct pt_regs *regs = ftrace_get_regs(fregs); > + > + if (!regs) > + return 0; (with the above comment addressed) this means that BPF multi_kprobe would successfully attach on builds WITH_ARGS but the programs would never actually run because here regs would be 0. This is a confusing failure mode for the user. I think that if multi_kprobe won't work (because we don't have a pt_regs conversion path yet), the user should be notified at attachment time that they won't be getting any events. That's why I think kprobe_multi should inform fprobe_init that it wants FTRACE_OPS_FL_SAVE_REGS and fail if that's not possible (no trampoline for it for example)
Hi Florent, On Wed, 9 Aug 2023 12:28:38 +0200 Florent Revest <revest@chromium.org> wrote: > On Mon, Aug 7, 2023 at 8:48 AM Masami Hiramatsu (Google) > <mhiramat@kernel.org> wrote: > > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS > > instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe > > on arm64. > > This patch lets fprobe code build on configs WITH_ARGS and !WITH_REGS > but fprobe wouldn't run on these builds because fprobe still registers > to ftrace with FTRACE_OPS_FL_SAVE_REGS, which would fail on > !WITH_REGS. Shouldn't we also let the fprobe_init callers decide > whether they want REGS or not ? Ah, I think you meant FPROBE_EVENTS? Yes I forgot to add the dependency on it. But fprobe itself can work because fprobe just pass the ftrace_regs to the handlers. (Note that exit callback may not work until next patch) > > > static int > > kprobe_multi_link_handler(struct fprobe *fp, unsigned long fentry_ip, > > - unsigned long ret_ip, struct pt_regs *regs, > > + unsigned long ret_ip, struct ftrace_regs *fregs, > > void *data) > > { > > struct bpf_kprobe_multi_link *link; > > + struct pt_regs *regs = ftrace_get_regs(fregs); > > + > > + if (!regs) > > + return 0; > > (with the above comment addressed) this means that BPF multi_kprobe > would successfully attach on builds WITH_ARGS but the programs would > never actually run because here regs would be 0. This is a confusing > failure mode for the user. I think that if multi_kprobe won't work > (because we don't have a pt_regs conversion path yet), the user should > be notified at attachment time that they won't be getting any events. Yes, so I changed it will not be compiled in that case. @@ -2460,7 +2460,7 @@ static int __init bpf_event_init(void) fs_initcall(bpf_event_init); #endif /* CONFIG_MODULES */ -#ifdef CONFIG_FPROBE +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS struct bpf_kprobe_multi_link { struct bpf_link link; struct fprobe fp; > That's why I think kprobe_multi should inform fprobe_init that it > wants FTRACE_OPS_FL_SAVE_REGS and fail if that's not possible (no > trampoline for it for example) Yeah, that's another possibility, but in the previous thread we discussed and agreed to introduce the ftrace_partial_regs() which will copy the partial registers from ftrace_regs to pt_regs. Thank you,
On Wed, Aug 9, 2023 at 4:10 PM Masami Hiramatsu <mhiramat@kernel.org> wrote: > > Hi Florent, > > On Wed, 9 Aug 2023 12:28:38 +0200 > Florent Revest <revest@chromium.org> wrote: > > > On Mon, Aug 7, 2023 at 8:48 AM Masami Hiramatsu (Google) > > <mhiramat@kernel.org> wrote: > > > > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > > > This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS > > > instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe > > > on arm64. > > > > This patch lets fprobe code build on configs WITH_ARGS and !WITH_REGS > > but fprobe wouldn't run on these builds because fprobe still registers > > to ftrace with FTRACE_OPS_FL_SAVE_REGS, which would fail on > > !WITH_REGS. Shouldn't we also let the fprobe_init callers decide > > whether they want REGS or not ? > > Ah, I think you meant FPROBE_EVENTS? Yes I forgot to add the dependency > on it. But fprobe itself can work because fprobe just pass the ftrace_regs > to the handlers. (Note that exit callback may not work until next patch) No, I mean that fprobe still registers its ftrace ops with the FTRACE_OPS_FL_SAVE_REGS flag: https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/tree/kernel/trace/fprobe.c?h=topic/fprobe-ftrace-regs&id=2ca022b2753ae0d2a2513c95f7ed5b5b727fb2c4#n185 Which means that __register_ftrace_function will return -EINVAL on builds !WITH_REGS: https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/tree/kernel/trace/ftrace.c?h=topic/fprobe-ftrace-regs&id=2ca022b2753ae0d2a2513c95f7ed5b5b727fb2c4#n338 As documented here: https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/tree/include/linux/ftrace.h?h=topic/fprobe-ftrace-regs&id=2ca022b2753ae0d2a2513c95f7ed5b5b727fb2c4#n188 There are two parts to using sparse pt_regs. One is "static": having WITH_ARGS in the config, the second one is "dynamic": a ftrace ops needs to specify that it doesn't want to go through the ftrace trampoline that saves a full pt_regs, by not giving FTRACE_OPS_FL_SAVE_REGS. If we want fprobe to work on builds !WITH_REGS then we should both remove Kconfig dependencies to WITH_REGS (like you've done) but also stop passing this ftrace ops flag. > > > > > static int > > > kprobe_multi_link_handler(struct fprobe *fp, unsigned long fentry_ip, > > > - unsigned long ret_ip, struct pt_regs *regs, > > > + unsigned long ret_ip, struct ftrace_regs *fregs, > > > void *data) > > > { > > > struct bpf_kprobe_multi_link *link; > > > + struct pt_regs *regs = ftrace_get_regs(fregs); > > > + > > > + if (!regs) > > > + return 0; > > > > (with the above comment addressed) this means that BPF multi_kprobe > > would successfully attach on builds WITH_ARGS but the programs would > > never actually run because here regs would be 0. This is a confusing > > failure mode for the user. I think that if multi_kprobe won't work > > (because we don't have a pt_regs conversion path yet), the user should > > be notified at attachment time that they won't be getting any events. > > Yes, so I changed it will not be compiled in that case. Ah ok I missed the CONFIG_DYNAMIC_FTRACE_WITH_REGS guard that you keep between patch 1 and 6 to avoid this case. (after patch 6, it's no longer an issue) then that's fine.
On Wed, Aug 9, 2023 at 6:09 PM Florent Revest <revest@chromium.org> wrote: > > On Wed, Aug 9, 2023 at 4:10 PM Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > Hi Florent, > > > > On Wed, 9 Aug 2023 12:28:38 +0200 > > Florent Revest <revest@chromium.org> wrote: > > > > > On Mon, Aug 7, 2023 at 8:48 AM Masami Hiramatsu (Google) > > > <mhiramat@kernel.org> wrote: > > > > > > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > > > > > This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS > > > > instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe > > > > on arm64. > > > > > > This patch lets fprobe code build on configs WITH_ARGS and !WITH_REGS > > > but fprobe wouldn't run on these builds because fprobe still registers > > > to ftrace with FTRACE_OPS_FL_SAVE_REGS, which would fail on > > > !WITH_REGS. Shouldn't we also let the fprobe_init callers decide > > > whether they want REGS or not ? > > > > Ah, I think you meant FPROBE_EVENTS? Yes I forgot to add the dependency > > on it. But fprobe itself can work because fprobe just pass the ftrace_regs > > to the handlers. (Note that exit callback may not work until next patch) > > No, I mean that fprobe still registers its ftrace ops with the > FTRACE_OPS_FL_SAVE_REGS flag: > > https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/tree/kernel/trace/fprobe.c?h=topic/fprobe-ftrace-regs&id=2ca022b2753ae0d2a2513c95f7ed5b5b727fb2c4#n185 > > Which means that __register_ftrace_function will return -EINVAL on > builds !WITH_REGS: > > https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/tree/kernel/trace/ftrace.c?h=topic/fprobe-ftrace-regs&id=2ca022b2753ae0d2a2513c95f7ed5b5b727fb2c4#n338 > > As documented here: > > https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/tree/include/linux/ftrace.h?h=topic/fprobe-ftrace-regs&id=2ca022b2753ae0d2a2513c95f7ed5b5b727fb2c4#n188 > > There are two parts to using sparse pt_regs. One is "static": having > WITH_ARGS in the config, the second one is "dynamic": a ftrace ops > needs to specify that it doesn't want to go through the ftrace > trampoline that saves a full pt_regs, by not giving > FTRACE_OPS_FL_SAVE_REGS. If we want fprobe to work on builds > !WITH_REGS then we should both remove Kconfig dependencies to > WITH_REGS (like you've done) but also stop passing this ftrace ops > flag. Said in a different way: there are arches that support both WITH_ARGS and WITH_REGS (like x86 actually). They have two ftrace trampolines compiled in: ftrace_caller and ftrace_regs_caller, one for each usecase. If you register to ftrace with the FTRACE_OPS_FL_SAVE_REGS flag you are telling it that what you want is a pt_regs. If you are trying to move away from pt_regs and support ftrace_regs in the more general case (meaning, in the case where it can contain a sparse pt_regs) then you should stop passing that flag so you go through the lighter, faster trampoline and test your code in the circumstances where ftrace_regs isn't just a regular pt_regs but an actually sparse or light data structure. I hope that makes my thoughts clearer? It's a hairy topic ahah
On Wed, 9 Aug 2023 18:17:47 +0200 Florent Revest <revest@chromium.org> wrote: > On Wed, Aug 9, 2023 at 6:09 PM Florent Revest <revest@chromium.org> wrote: > > > > On Wed, Aug 9, 2023 at 4:10 PM Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > > Hi Florent, > > > > > > On Wed, 9 Aug 2023 12:28:38 +0200 > > > Florent Revest <revest@chromium.org> wrote: > > > > > > > On Mon, Aug 7, 2023 at 8:48 AM Masami Hiramatsu (Google) > > > > <mhiramat@kernel.org> wrote: > > > > > > > > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > > > > > > > This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS > > > > > instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe > > > > > on arm64. > > > > > > > > This patch lets fprobe code build on configs WITH_ARGS and !WITH_REGS > > > > but fprobe wouldn't run on these builds because fprobe still registers > > > > to ftrace with FTRACE_OPS_FL_SAVE_REGS, which would fail on > > > > !WITH_REGS. Shouldn't we also let the fprobe_init callers decide > > > > whether they want REGS or not ? > > > > > > Ah, I think you meant FPROBE_EVENTS? Yes I forgot to add the dependency > > > on it. But fprobe itself can work because fprobe just pass the ftrace_regs > > > to the handlers. (Note that exit callback may not work until next patch) > > > > No, I mean that fprobe still registers its ftrace ops with the > > FTRACE_OPS_FL_SAVE_REGS flag: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/tree/kernel/trace/fprobe.c?h=topic/fprobe-ftrace-regs&id=2ca022b2753ae0d2a2513c95f7ed5b5b727fb2c4#n185 > > > > Which means that __register_ftrace_function will return -EINVAL on > > builds !WITH_REGS: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/tree/kernel/trace/ftrace.c?h=topic/fprobe-ftrace-regs&id=2ca022b2753ae0d2a2513c95f7ed5b5b727fb2c4#n338 > > > > As documented here: > > > > https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/tree/include/linux/ftrace.h?h=topic/fprobe-ftrace-regs&id=2ca022b2753ae0d2a2513c95f7ed5b5b727fb2c4#n188 > > > > There are two parts to using sparse pt_regs. One is "static": having > > WITH_ARGS in the config, the second one is "dynamic": a ftrace ops > > needs to specify that it doesn't want to go through the ftrace > > trampoline that saves a full pt_regs, by not giving > > FTRACE_OPS_FL_SAVE_REGS. If we want fprobe to work on builds > > !WITH_REGS then we should both remove Kconfig dependencies to > > WITH_REGS (like you've done) but also stop passing this ftrace ops > > flag. > > Said in a different way: there are arches that support both WITH_ARGS > and WITH_REGS (like x86 actually). They have two ftrace trampolines > compiled in: ftrace_caller and ftrace_regs_caller, one for each > usecase. If you register to ftrace with the FTRACE_OPS_FL_SAVE_REGS > flag you are telling it that what you want is a pt_regs. If you are > trying to move away from pt_regs and support ftrace_regs in the more > general case (meaning, in the case where it can contain a sparse > pt_regs) then you should stop passing that flag so you go through the > lighter, faster trampoline and test your code in the circumstances > where ftrace_regs isn't just a regular pt_regs but an actually sparse > or light data structure. > > I hope that makes my thoughts clearer? It's a hairy topic ahah Ah, I see your point. static void fprobe_init(struct fprobe *fp) { fp->nmissed = 0; if (fprobe_shared_with_kprobes(fp)) fp->ops.func = fprobe_kprobe_handler; else fp->ops.func = fprobe_handler; fp->ops.flags |= FTRACE_OPS_FL_SAVE_REGS; <---- This flag! } So it should be FTRACE_OPS_FL_ARGS. Let me fix that. Thank you!
On Thu, 10 Aug 2023 07:13:30 +0900 Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > > > I hope that makes my thoughts clearer? It's a hairy topic ahah > > Ah, I see your point. > > static void fprobe_init(struct fprobe *fp) > { > fp->nmissed = 0; > if (fprobe_shared_with_kprobes(fp)) > fp->ops.func = fprobe_kprobe_handler; > else > fp->ops.func = fprobe_handler; > fp->ops.flags |= FTRACE_OPS_FL_SAVE_REGS; <---- This flag! > } > > So it should be FTRACE_OPS_FL_ARGS. Let me fix that. Yes, this was the concern that I was bringing up, where I did not see an advantage of fprobes over kprobes using ftrace, because they both were saving all registers. -- Steve
diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h index 3e03758151f4..36c0595f7b93 100644 --- a/include/linux/fprobe.h +++ b/include/linux/fprobe.h @@ -35,7 +35,7 @@ struct fprobe { int nr_maxactive; int (*entry_handler)(struct fprobe *fp, unsigned long entry_ip, - unsigned long ret_ip, struct pt_regs *regs, + unsigned long ret_ip, struct ftrace_regs *regs, void *entry_data); void (*exit_handler)(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index 61c541c36596..976fd594b446 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -287,7 +287,7 @@ config DYNAMIC_FTRACE_WITH_ARGS config FPROBE bool "Kernel Function Probe (fprobe)" depends on FUNCTION_TRACER - depends on DYNAMIC_FTRACE_WITH_REGS + depends on DYNAMIC_FTRACE_WITH_REGS || DYNAMIC_FTRACE_WITH_ARGS depends on HAVE_RETHOOK select RETHOOK default n @@ -672,6 +672,7 @@ config FPROBE_EVENTS select TRACING select PROBE_EVENTS select DYNAMIC_EVENTS + depends on DYNAMIC_FTRACE_WITH_REGS default y help This allows user to add tracing events on the function entry and diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 5f2dcabad202..51573eaa04c4 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -2460,7 +2460,7 @@ static int __init bpf_event_init(void) fs_initcall(bpf_event_init); #endif /* CONFIG_MODULES */ -#ifdef CONFIG_FPROBE +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS struct bpf_kprobe_multi_link { struct bpf_link link; struct fprobe fp; @@ -2652,10 +2652,14 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link, static int kprobe_multi_link_handler(struct fprobe *fp, unsigned long fentry_ip, - unsigned long ret_ip, struct pt_regs *regs, + unsigned long ret_ip, struct ftrace_regs *fregs, void *data) { struct bpf_kprobe_multi_link *link; + struct pt_regs *regs = ftrace_get_regs(fregs); + + if (!regs) + return 0; link = container_of(fp, struct bpf_kprobe_multi_link, fp); kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), regs); @@ -2910,7 +2914,7 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr kvfree(cookies); return err; } -#else /* !CONFIG_FPROBE */ +#else /* !CONFIG_DYNAMIC_FTRACE_WITH_REGS */ int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) { return -EOPNOTSUPP; diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c index 3b21f4063258..15a2aef92733 100644 --- a/kernel/trace/fprobe.c +++ b/kernel/trace/fprobe.c @@ -46,7 +46,7 @@ static inline void __fprobe_handler(unsigned long ip, unsigned long parent_ip, } if (fp->entry_handler) - ret = fp->entry_handler(fp, ip, parent_ip, ftrace_get_regs(fregs), entry_data); + ret = fp->entry_handler(fp, ip, parent_ip, fregs, entry_data); /* If entry_handler returns !0, nmissed is not counted. */ if (rh) { diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c index dfe2e546acdc..4d3ae79f036e 100644 --- a/kernel/trace/trace_fprobe.c +++ b/kernel/trace/trace_fprobe.c @@ -320,12 +320,16 @@ NOKPROBE_SYMBOL(fexit_perf_func); #endif /* CONFIG_PERF_EVENTS */ static int fentry_dispatcher(struct fprobe *fp, unsigned long entry_ip, - unsigned long ret_ip, struct pt_regs *regs, + unsigned long ret_ip, struct ftrace_regs *fregs, void *entry_data) { struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp); + struct pt_regs *regs = ftrace_get_regs(fregs); int ret = 0; + if (!regs) + return 0; + if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE)) fentry_trace_func(tf, entry_ip, regs); #ifdef CONFIG_PERF_EVENTS diff --git a/lib/test_fprobe.c b/lib/test_fprobe.c index 24de0e5ff859..ff607babba18 100644 --- a/lib/test_fprobe.c +++ b/lib/test_fprobe.c @@ -40,7 +40,7 @@ static noinline u32 fprobe_selftest_nest_target(u32 value, u32 (*nest)(u32)) static notrace int fp_entry_handler(struct fprobe *fp, unsigned long ip, unsigned long ret_ip, - struct pt_regs *regs, void *data) + struct ftrace_regs *fregs, void *data) { KUNIT_EXPECT_FALSE(current_test, preemptible()); /* This can be called on the fprobe_selftest_target and the fprobe_selftest_target2 */ @@ -81,7 +81,7 @@ static notrace void fp_exit_handler(struct fprobe *fp, unsigned long ip, static notrace int nest_entry_handler(struct fprobe *fp, unsigned long ip, unsigned long ret_ip, - struct pt_regs *regs, void *data) + struct ftrace_regs *fregs, void *data) { KUNIT_EXPECT_FALSE(current_test, preemptible()); return 0; diff --git a/samples/fprobe/fprobe_example.c b/samples/fprobe/fprobe_example.c index 64e715e7ed11..1545a1aac616 100644 --- a/samples/fprobe/fprobe_example.c +++ b/samples/fprobe/fprobe_example.c @@ -50,7 +50,7 @@ static void show_backtrace(void) static int sample_entry_handler(struct fprobe *fp, unsigned long ip, unsigned long ret_ip, - struct pt_regs *regs, void *data) + struct ftrace_regs *fregs, void *data) { if (use_trace) /*