Message ID | 20191218140622.57bbaca5@xhacker.debian (mailing list archive)
---|---
State | New, archived
Series | [v6] arm64: implement KPROBES_ON_FTRACE
On Wed, 18 Dec 2019 06:21:35 +0000
Jisheng Zhang <Jisheng.Zhang@synaptics.com> wrote:

> KPROBES_ON_FTRACE avoids much of the overhead of regular kprobes as it
> eliminates the need for a trap, as well as the need to emulate or
> single-step instructions.
>
> Tested on the berlin arm64 platform.
>
> ~ # mount -t debugfs debugfs /sys/kernel/debug/
> ~ # cd /sys/kernel/debug/
> /sys/kernel/debug # echo 'p _do_fork' > tracing/kprobe_events
>
> before the patch:
>
> /sys/kernel/debug # cat kprobes/list
> ffffff801009fe28  k  _do_fork+0x0    [DISABLED]
>
> after the patch:
>
> /sys/kernel/debug # cat kprobes/list
> ffffff801009ff54  k  _do_fork+0x4    [DISABLED][FTRACE]

BTW, it seems this automatically changes the offset without the user's
intention or any warning. How would you handle it if the user then passes
a new probe at _do_fork+0x4?

IOW, the question remains for anyone who really wants to probe _do_fork+0:
if kprobes modifies the offset automatically, no one can do that anymore.
This can happen when the user wants to record the LR or SP value at the
function call for debugging. If kprobes always modifies the offset, we lose
the way to do that.

Could you remove the functions below for the moment?

> +kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
> +{
> +        unsigned long addr = kallsyms_lookup_name(name);
> +
> +        if (addr && !offset) {
> +                unsigned long faddr;
> +                /*
> +                 * With -fpatchable-function-entry=2, the first 4 bytes are
> +                 * the LR saver, followed by the actual call insn. So the
> +                 * ftrace location is always at a 4-byte offset.
> +                 */
> +                faddr = ftrace_location_range(addr,
> +                                              addr + AARCH64_INSN_SIZE);
> +                if (faddr)
> +                        return (kprobe_opcode_t *)faddr;
> +        }
> +        return (kprobe_opcode_t *)addr;
> +}
> +
> +bool arch_kprobe_on_func_entry(unsigned long offset)
> +{
> +        return offset <= AARCH64_INSN_SIZE;
> +}

Without this automatic change, we can still change the offset in an upper
layer.

Thank you,
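The user Masami has in mind corresponds to something like the following
minimal kprobes module, which explicitly asks for _do_fork+0 and reads the
caller's LR and SP in its pre-handler. This is a hypothetical sketch for
illustration only (the module and handler names are made up, not part of
this thread); with the automatic adjustment under discussion, such a probe
would be silently moved to _do_fork+4 and handled via the ftrace trampoline
rather than a breakpoint at the entry instruction.

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/kprobes.h>

static struct kprobe kp = {
        .symbol_name    = "_do_fork",
        .offset         = 0,            /* explicitly the function entry */
};

/* Runs before the probed instruction; regs reflect the call-site state. */
static int entry_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
        pr_info("%s+0x%x hit: LR=%pS SP=0x%llx\n",
                p->symbol_name, p->offset,
                (void *)regs->regs[30], regs->sp);
        return 0;
}

static int __init probe_entry_init(void)
{
        kp.pre_handler = entry_pre_handler;
        return register_kprobe(&kp);
}

static void __exit probe_entry_exit(void)
{
        unregister_kprobe(&kp);
}

module_init(probe_entry_init);
module_exit(probe_entry_exit);
MODULE_LICENSE("GPL");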
Hi Masami,

On Wed, 18 Dec 2019 22:25:50 +0900 Masami Hiramatsu wrote:

>
> On Wed, 18 Dec 2019 06:21:35 +0000
> Jisheng Zhang <Jisheng.Zhang@synaptics.com> wrote:
>
> > KPROBES_ON_FTRACE avoids much of the overhead of regular kprobes as it
> > eliminates the need for a trap, as well as the need to emulate or
> > single-step instructions.
> >
> > Tested on the berlin arm64 platform.
> >
> > ~ # mount -t debugfs debugfs /sys/kernel/debug/
> > ~ # cd /sys/kernel/debug/
> > /sys/kernel/debug # echo 'p _do_fork' > tracing/kprobe_events
> >
> > before the patch:
> >
> > /sys/kernel/debug # cat kprobes/list
> > ffffff801009fe28  k  _do_fork+0x0    [DISABLED]
> >
> > after the patch:
> >
> > /sys/kernel/debug # cat kprobes/list
> > ffffff801009ff54  k  _do_fork+0x4    [DISABLED][FTRACE]
>
> BTW, it seems this automatically changes the offset without the user's
> intention or any warning. How would you handle it if the user then passes
> a new probe at _do_fork+0x4?

In the current implementation, two probes end up at the same address,
_do_fork+0x4.

> IOW, the question remains for anyone who really wants to probe _do_fork+0:
> if kprobes modifies the offset automatically, no one can do that anymore.
> This can happen when the user wants to record the LR or SP value at the
> function call for debugging. If kprobes always modifies the offset, we
> lose the way to do that.

arm64's DYNAMIC_FTRACE_WITH_REGS implementation uses GCC's
-fpatchable-function-entry=2 option to insert two nops at the function
entry. When the function is traced, the first nop is rewritten to the LR
saver and the second to "bl <ftrace-entry>"; commit 3b23e4991fb6 ("arm64:
implement ftrace with regs") explains the mechanism. So on arm64 (in fact
on any arch using -fpatchable-function-entry), when DYNAMIC_FTRACE_WITH_REGS
is enabled, the ftrace location is always at a 4-byte offset.

I think when a user wants to register a kprobe on _do_fork+0, what they
really want is to kprobe the _do_fork+4 produced by
-fpatchable-function-entry patching.

PS: per my understanding, powerpc's KPROBES_ON_FTRACE also introduces an
extra automatic offset due to its implementation.

> Could you remove the functions below for the moment?
>
> > +kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
> > +{
> > +        unsigned long addr = kallsyms_lookup_name(name);
> > +
> > +        if (addr && !offset) {
> > +                unsigned long faddr;
> > +                /*
> > +                 * With -fpatchable-function-entry=2, the first 4 bytes are
> > +                 * the LR saver, followed by the actual call insn. So the
> > +                 * ftrace location is always at a 4-byte offset.
> > +                 */
> > +                faddr = ftrace_location_range(addr,
> > +                                              addr + AARCH64_INSN_SIZE);
> > +                if (faddr)
> > +                        return (kprobe_opcode_t *)faddr;
> > +        }
> > +        return (kprobe_opcode_t *)addr;
> > +}
> > +
> > +bool arch_kprobe_on_func_entry(unsigned long offset)
> > +{
> > +        return offset <= AARCH64_INSN_SIZE;
> > +}
>
> Without this automatic change, we can still change the offset in an upper
> layer.

If we remove the two functions, a kprobe on _do_fork can't ride on the
ftrace infrastructure, but a kprobe on _do_fork+4 can. I'm not sure this is
reasonable: every kprobes user on arm64 would need to remember to pass an
extra +4 offset to make use of KPROBES_ON_FTRACE. Could we hide the "+4"
instead?

Thanks
Hi Jisheng,

On Mon, 23 Dec 2019 07:47:24 +0000
Jisheng Zhang <Jisheng.Zhang@synaptics.com> wrote:

> Hi Masami,
>
> On Wed, 18 Dec 2019 22:25:50 +0900 Masami Hiramatsu wrote:
>
> > On Wed, 18 Dec 2019 06:21:35 +0000
> > Jisheng Zhang <Jisheng.Zhang@synaptics.com> wrote:
> >
> > > KPROBES_ON_FTRACE avoids much of the overhead of regular kprobes as it
> > > eliminates the need for a trap, as well as the need to emulate or
> > > single-step instructions.
> > >
> > > Tested on the berlin arm64 platform.
> > >
> > > ~ # mount -t debugfs debugfs /sys/kernel/debug/
> > > ~ # cd /sys/kernel/debug/
> > > /sys/kernel/debug # echo 'p _do_fork' > tracing/kprobe_events
> > >
> > > before the patch:
> > >
> > > /sys/kernel/debug # cat kprobes/list
> > > ffffff801009fe28  k  _do_fork+0x0    [DISABLED]
> > >
> > > after the patch:
> > >
> > > /sys/kernel/debug # cat kprobes/list
> > > ffffff801009ff54  k  _do_fork+0x4    [DISABLED][FTRACE]
> >
> > BTW, it seems this automatically changes the offset without the user's
> > intention or any warning. How would you handle it if the user then
> > passes a new probe at _do_fork+0x4?
>
> In the current implementation, two probes end up at the same address,
> _do_fork+0x4.

OK, that is my point.

> > IOW, the question remains for anyone who really wants to probe
> > _do_fork+0: if kprobes modifies the offset automatically, no one can do
> > that anymore. This can happen when the user wants to record the LR or
> > SP value at the function call for debugging. If kprobes always modifies
> > the offset, we lose the way to do that.
>
> arm64's DYNAMIC_FTRACE_WITH_REGS implementation uses GCC's
> -fpatchable-function-entry=2 option to insert two nops at the function
> entry. When the function is traced, the first nop is rewritten to the LR
> saver and the second to "bl <ftrace-entry>"; commit 3b23e4991fb6 ("arm64:
> implement ftrace with regs") explains the mechanism.

So both of the instructions at func+0 and func+4 are replaced.

> So on arm64 (in fact on any arch using -fpatchable-function-entry), when
> DYNAMIC_FTRACE_WITH_REGS is enabled, the ftrace location is always at a
> 4-byte offset.
>
> I think when a user wants to register a kprobe on _do_fork+0, what they
> really want is to kprobe the _do_fork+4 produced by
> -fpatchable-function-entry patching.

OK, in this case, kprobes should treat the first 2 instructions as a
single virtual instruction. This means:

- kprobes can probe func+0, but not func+4 if the function is ftraced
  (-EILSEQ must be returned).
- both debugfs/kprobes/list and tracefs/kprobe_events should show the
  probed address as func+0, not func+4.

Then, users can use kprobes as if there were one big (8-byte) instruction
at the entry of the function. Since probing func+4 is rejected and the
call-site LR/SP is restored in ftrace, there should be no contradiction.
It should work as if we put a breakpoint on func+0.

> PS: per my understanding, powerpc's KPROBES_ON_FTRACE also introduces an
> extra automatic offset due to its implementation.

Uh, that is also ugly... it must be fixed.

> > Could you remove the functions below for the moment?
> >
> > > +kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
> > > +{
> > > +        unsigned long addr = kallsyms_lookup_name(name);
> > > +
> > > +        if (addr && !offset) {
> > > +                unsigned long faddr;
> > > +                /*
> > > +                 * With -fpatchable-function-entry=2, the first 4 bytes
> > > +                 * are the LR saver, followed by the actual call insn.
> > > +                 * So the ftrace location is always at a 4-byte offset.
> > > +                 */
> > > +                faddr = ftrace_location_range(addr,
> > > +                                              addr + AARCH64_INSN_SIZE);
> > > +                if (faddr)
> > > +                        return (kprobe_opcode_t *)faddr;
> > > +        }
> > > +        return (kprobe_opcode_t *)addr;
> > > +}
> > > +
> > > +bool arch_kprobe_on_func_entry(unsigned long offset)
> > > +{
> > > +        return offset <= AARCH64_INSN_SIZE;
> > > +}
> >
> > Without this automatic change, we can still change the offset in an
> > upper layer.
>
> If we remove the two functions, a kprobe on _do_fork can't ride on the
> ftrace infrastructure, but a kprobe on _do_fork+4 can. I'm not sure this
> is reasonable: every kprobes user on arm64 would need to remember to pass
> an extra +4 offset to make use of KPROBES_ON_FTRACE. Could we hide the
> "+4" instead?

Yes. OK, as I said above, please hide the +4. We will then see the virtual
"call" instruction (= "mov x9, lr; bl <addr>") at the entry of an ftraced
function. :)

Thank you,
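For reference, the func+4 rejection Masami describes could take roughly the
shape sketched below. This is an illustrative sketch only, not code from
this series: it assumes arm64 would override the __weak
arch_check_ftrace_location() helper in kernel/kprobes.c, and it deliberately
omits the separate plumbing needed to keep reporting the probe at func+0
while installing the ftrace filter at func+4.

#include <linux/errno.h>
#include <linux/ftrace.h>
#include <linux/kprobes.h>
#include <asm/insn.h>

int arch_check_ftrace_location(struct kprobe *p)
{
        unsigned long addr = (unsigned long)p->addr;

        /*
         * On a traced function, func+0 holds the LR saver and func+4 the
         * "bl <ftrace-entry>"; ftrace_location() reports the latter.
         */
        if (ftrace_location(addr) == addr)
                /* func+4 is the middle of the virtual call insn: refuse it. */
                return -EILSEQ;

        if (ftrace_location_range(addr, addr + AARCH64_INSN_SIZE)) {
                /*
                 * func+0 of a traced function: take the ftrace path, but
                 * keep p->addr (and hence kprobes/list output) at func+0.
                 */
                p->flags |= KPROBE_FLAG_FTRACE;
        }

        return 0;
}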
Hi,

On Tue, 24 Dec 2019 19:09:16 +0900 Masami Hiramatsu wrote:

>
> Hi Jisheng,
>
> On Mon, 23 Dec 2019 07:47:24 +0000
> Jisheng Zhang <Jisheng.Zhang@synaptics.com> wrote:
>
> > Hi Masami,
> >
> > On Wed, 18 Dec 2019 22:25:50 +0900 Masami Hiramatsu wrote:
> >
> > > On Wed, 18 Dec 2019 06:21:35 +0000
> > > Jisheng Zhang <Jisheng.Zhang@synaptics.com> wrote:
> > >
> > > > KPROBES_ON_FTRACE avoids much of the overhead of regular kprobes as
> > > > it eliminates the need for a trap, as well as the need to emulate
> > > > or single-step instructions.
> > > >
> > > > Tested on the berlin arm64 platform.
> > > >
> > > > ~ # mount -t debugfs debugfs /sys/kernel/debug/
> > > > ~ # cd /sys/kernel/debug/
> > > > /sys/kernel/debug # echo 'p _do_fork' > tracing/kprobe_events
> > > >
> > > > before the patch:
> > > >
> > > > /sys/kernel/debug # cat kprobes/list
> > > > ffffff801009fe28  k  _do_fork+0x0    [DISABLED]
> > > >
> > > > after the patch:
> > > >
> > > > /sys/kernel/debug # cat kprobes/list
> > > > ffffff801009ff54  k  _do_fork+0x4    [DISABLED][FTRACE]
> > >
> > > BTW, it seems this automatically changes the offset without the
> > > user's intention or any warning. How would you handle it if the user
> > > then passes a new probe at _do_fork+0x4?
> >
> > In the current implementation, two probes end up at the same address,
> > _do_fork+0x4.
>
> OK, that is my point.
>
> > > IOW, the question remains for anyone who really wants to probe
> > > _do_fork+0: if kprobes modifies the offset automatically, no one can
> > > do that anymore. This can happen when the user wants to record the
> > > LR or SP value at the function call for debugging. If kprobes always
> > > modifies the offset, we lose the way to do that.
> >
> > arm64's DYNAMIC_FTRACE_WITH_REGS implementation uses GCC's
> > -fpatchable-function-entry=2 option to insert two nops at the function
> > entry. When the function is traced, the first nop is rewritten to the
> > LR saver and the second to "bl <ftrace-entry>"; commit 3b23e4991fb6
> > ("arm64: implement ftrace with regs") explains the mechanism.
>
> So both of the instructions at func+0 and func+4 are replaced.
>
> > So on arm64 (in fact on any arch using -fpatchable-function-entry),
> > when DYNAMIC_FTRACE_WITH_REGS is enabled, the ftrace location is
> > always at a 4-byte offset.
> >
> > I think when a user wants to register a kprobe on _do_fork+0, what
> > they really want is to kprobe the _do_fork+4 produced by
> > -fpatchable-function-entry patching.
>
> OK, in this case, kprobes should treat the first 2 instructions as a
> single virtual instruction. This means:
>
> - kprobes can probe func+0, but not func+4 if the function is ftraced
>   (-EILSEQ must be returned).
> - both debugfs/kprobes/list and tracefs/kprobe_events should show the
>   probed address as func+0, not func+4.
>
> Then, users can use kprobes as if there were one big (8-byte) instruction
> at the entry of the function. Since probing func+4 is rejected and the
> call-site LR/SP is restored in ftrace, there should be no contradiction.
> It should work as if we put a breakpoint on func+0.

From https://lkml.org/lkml/2019/6/18/648, Naveen tried to allow probing on
any ftrace address; is that acceptable on arm64 as well? I will post
patches for this purpose.

Thanks for your review
diff --git a/Documentation/features/debug/kprobes-on-ftrace/arch-support.txt b/Documentation/features/debug/kprobes-on-ftrace/arch-support.txt
index 4fae0464ddff..f9dd9dd91e0c 100644
--- a/Documentation/features/debug/kprobes-on-ftrace/arch-support.txt
+++ b/Documentation/features/debug/kprobes-on-ftrace/arch-support.txt
@@ -9,7 +9,7 @@
 | alpha: | TODO |
 | arc: | TODO |
 | arm: | TODO |
-| arm64: | TODO |
+| arm64: | ok |
 | c6x: | TODO |
 | csky: | TODO |
 | h8300: | TODO |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b1b4476ddb83..92b9882889ac 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -166,6 +166,7 @@ config ARM64
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_KPROBES
+	select HAVE_KPROBES_ON_FTRACE
 	select HAVE_KRETPROBES
 	select HAVE_GENERIC_VDSO
 	select IOMMU_DMA if IOMMU_SUPPORT
diff --git a/arch/arm64/kernel/probes/Makefile b/arch/arm64/kernel/probes/Makefile
index 8e4be92e25b1..4020cfc66564 100644
--- a/arch/arm64/kernel/probes/Makefile
+++ b/arch/arm64/kernel/probes/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_KPROBES) += kprobes.o decode-insn.o \
 				simulate-insn.o
 obj-$(CONFIG_UPROBES) += uprobes.o decode-insn.o \
 				simulate-insn.o
+obj-$(CONFIG_KPROBES_ON_FTRACE) += ftrace.o
diff --git a/arch/arm64/kernel/probes/ftrace.c b/arch/arm64/kernel/probes/ftrace.c
new file mode 100644
index 000000000000..9f80905f02fa
--- /dev/null
+++ b/arch/arm64/kernel/probes/ftrace.c
@@ -0,0 +1,83 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Dynamic Ftrace based Kprobes Optimization
+ *
+ * Copyright (C) Hitachi Ltd., 2012
+ * Copyright (C) 2019 Jisheng Zhang <jszhang@kernel.org>
+ *                    Synaptics Incorporated
+ */
+
+#include <linux/kprobes.h>
+
+/* Ftrace callback handler for kprobes -- called with preemption disabled */
+void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
+			   struct ftrace_ops *ops, struct pt_regs *regs)
+{
+	struct kprobe *p;
+	struct kprobe_ctlblk *kcb;
+
+	/* Preempt is disabled by ftrace */
+	p = get_kprobe((kprobe_opcode_t *)ip);
+	if (unlikely(!p) || kprobe_disabled(p))
+		return;
+
+	kcb = get_kprobe_ctlblk();
+	if (kprobe_running()) {
+		kprobes_inc_nmissed_count(p);
+	} else {
+		unsigned long orig_ip = instruction_pointer(regs);
+
+		instruction_pointer_set(regs, ip);
+		__this_cpu_write(current_kprobe, p);
+		kcb->kprobe_status = KPROBE_HIT_ACTIVE;
+		if (!p->pre_handler || !p->pre_handler(p, regs)) {
+			/*
+			 * Emulate singlestep (and also recover regs->pc)
+			 * as if there is a nop
+			 */
+			instruction_pointer_set(regs,
+				(unsigned long)p->addr + MCOUNT_INSN_SIZE);
+			if (unlikely(p->post_handler)) {
+				kcb->kprobe_status = KPROBE_HIT_SSDONE;
+				p->post_handler(p, regs, 0);
+			}
+			instruction_pointer_set(regs, orig_ip);
+		}
+		/*
+		 * If pre_handler returns !0, it changes regs->pc. We have to
+		 * skip emulating post_handler.
+		 */
+		__this_cpu_write(current_kprobe, NULL);
+	}
+}
+NOKPROBE_SYMBOL(kprobe_ftrace_handler);
+
+kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
+{
+	unsigned long addr = kallsyms_lookup_name(name);
+
+	if (addr && !offset) {
+		unsigned long faddr;
+		/*
+		 * With -fpatchable-function-entry=2, the first 4 bytes are the
+		 * LR saver, followed by the actual call insn. So the ftrace
+		 * location is always at a 4-byte offset.
+		 */
+		faddr = ftrace_location_range(addr,
+					      addr + AARCH64_INSN_SIZE);
+		if (faddr)
+			return (kprobe_opcode_t *)faddr;
+	}
+	return (kprobe_opcode_t *)addr;
+}
+
+bool arch_kprobe_on_func_entry(unsigned long offset)
+{
+	return offset <= AARCH64_INSN_SIZE;
+}
+
+int arch_prepare_kprobe_ftrace(struct kprobe *p)
+{
+	p->ainsn.api.insn = NULL;
+	return 0;
+}
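One way to exercise the ftrace-based path end to end, including the
single-step emulation around post_handler in kprobe_ftrace_handler() above,
is a small test module along the following lines. The module is a
hypothetical sketch for illustration (it is not part of this series, and
the module/handler names are made up); in the post-handler, the PC is
expected to read as p->addr + MCOUNT_INSN_SIZE while the emulation is in
effect.

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/kprobes.h>

static struct kprobe kp = {
        .symbol_name = "_do_fork",      /* resolved via kprobe_lookup_name() */
};

static int test_pre(struct kprobe *p, struct pt_regs *regs)
{
        /* With KPROBES_ON_FTRACE, pc has been set to the probed address. */
        pr_info("pre:  pc=%pS\n", (void *)instruction_pointer(regs));
        return 0;       /* let the emulated single-step run */
}

static void test_post(struct kprobe *p, struct pt_regs *regs,
                      unsigned long flags)
{
        /* Here pc should appear advanced past the probed instruction. */
        pr_info("post: pc=%pS\n", (void *)instruction_pointer(regs));
}

static int __init kof_test_init(void)
{
        kp.pre_handler = test_pre;
        kp.post_handler = test_post;
        return register_kprobe(&kp);
}

static void __exit kof_test_exit(void)
{
        unregister_kprobe(&kp);
}

module_init(kof_test_init);
module_exit(kof_test_exit);
MODULE_LICENSE("GPL");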