
[v6,3/6] arm64: Kprobes with single stepping support

Message ID 555F5011.90808@redhat.com (mailing list archive)
State New, archived

Commit Message

William Cohen May 22, 2015, 3:49 p.m. UTC
On 05/22/2015 07:00 AM, Catalin Marinas wrote:
> On Thu, May 21, 2015 at 12:44:45AM -0400, David Long wrote:
>> On 05/20/15 12:39, Catalin Marinas wrote:
>>> On Mon, Apr 20, 2015 at 04:19:44PM -0400, David Long wrote:
>>>> Add support for basic kernel probes(kprobes) and jump probes
>>>> (jprobes) for ARM64.
>>>>
>>>> Kprobes utilizes software breakpoint and single step debug
>>>> exceptions supported on ARM v8.
>>>>
>>>> A software breakpoint is placed at the probe address to trap the
>>>> kernel execution into the kprobe handler.
>>>>
>>>> ARM v8 supports enabling single stepping before the break exception
>>>> return (ERET), with next PC in exception return address (ELR_EL1). The
>>>> kprobe handler prepares an executable memory slot for out-of-line
>>>> execution with a copy of the original instruction being probed, and
>>>> enables single stepping. The PC is set to the out-of-line slot address
>>>> before the ERET. With this scheme, the instruction is executed with the
>>>> exact same register context except for the PC (and DAIF) registers.
>>>
>>> I wonder whether it would be simpler to use another software breakpoint
>>> after the out of line instruction copy. You won't run the instructions
>>> that change the PC anyway.
>>
>> We put quite a bit of work into making single-step work.  I don't see any
>> obvious advantage to trying to switch to a software breakpoint. Both are
>> debug exceptions but SS does leave open the possibility of maybe eventually
>> running some instructions that do change the PC.
> 
> I'm not trying to get this re-written, just as a potential workaround
> for the unexpected single-step error reported but I need to read some
> more before I understand it properly (and I can see patches already for
> fixing this).
> 
>>> Since an unconditional branch instruction within the kernel address
>>> space can reach any point in the kernel (and modules), could we go a
>>> step further and avoid the software breakpoint altogether, just generate
>>> a branch instruction to the original location (after the software
>>> breakpoint)?
>>
>> Wouldn't a branch instruction have to make use of a register in order to
>> span the whole address space?  How could you do that and have all the
>> registers unmolested when you land back after the original probe point?  The
>> thing that really kills this though is the fact we need to be able to run
>> the pre and post functions before and *after* the XOL stepping.
> 
> A "b #imm" would be able to cover a wide range (+/-128MB), but, as you
> said, it doesn't help with the post function call.
> 
> Any plans to post an updated version with the "unexpected single-step
> error" fixed?
> 

Hi Catalin,

The only place the "unexpected single-step error" has been observed is in the arm64 kretprobe handler code calling kprobed functions.  Experiments with kprobed functions being called from kprobe handlers showed that that situation was handled appropriately.

There is a proposed fix for the trampoline issue in the attached patch.  It is modeled after the way x86 handles kretprobes: the trampoline directly saves and restores the registers and uses a normal call to the kretprobe handler.  It operates similarly to what you are suggesting above, but only for the special case of kretprobes.

-Will Cohen

Comments

Catalin Marinas May 22, 2015, 4:54 p.m. UTC | #1
On Fri, May 22, 2015 at 11:49:37AM -0400, William Cohen wrote:
> On 05/22/2015 07:00 AM, Catalin Marinas wrote:
> > Any plans to post an updated version with the "unexpected single-step
> > error" fixed?
> 
> The only place the "unexpected single-step error" has
> been observed is in the arm64 kretprobe handler code calling kprobed
> functions.  Experiments with kprobed functions being called from
> kprobe handlers showed that that situation was handled appropriately.
> 
> There is a proposed fix for the trampoline issue in the attached
> patch.  It is modeled after the way x86 handles kretprobes: the
> trampoline directly saves and restores the registers and uses a
> normal call to the kretprobe handler.  It operates similarly to what
> you are suggesting above, but only for the special case of
> kretprobes.

Thanks. I guess David will post a v7 series with this patch included and
other comments addressed.

BTW, I'll be on holiday for a week, back on the 1st of June. Hopefully
this series gets some more reviews by then ;)
David Long May 22, 2015, 4:57 p.m. UTC | #2
On 05/22/15 12:54, Catalin Marinas wrote:
> On Fri, May 22, 2015 at 11:49:37AM -0400, William Cohen wrote:
>> On 05/22/2015 07:00 AM, Catalin Marinas wrote:
>>> Any plans to post an updated version with the "unexpected single-step
>>> error" fixed?
>>
>> The only place the "unexpected single-step error" has
>> been observed is in the arm64 kretprobe handler code calling kprobed
>> functions.  Experiments with kprobed functions being called from
>> kprobe handlers showed that that situation was handled appropriately.
>>
>> There is a proposed fix for the trampoline issue in the attached
>> patch.  It is modeled after the way x86 handles kretprobes: the
>> trampoline directly saves and restores the registers and uses a
>> normal call to the kretprobe handler.  It operates similarly to what
>> you are suggesting above, but only for the special case of
>> kretprobes.
>
> Thanks. I guess David will post a v7 series with this patch included and
> other comments addressed.
>
> BTW, I'll be on holiday for a week, back on the 1st of June. Hopefully
> this series gets some more reviews by then ;)
>

Yes, the v7 patch is in the works, with Will Cohen's trampoline fixes as 
well as your feedback.

-fl

Patch

From 3ee0894279693822132fa43cf5a0eefe898955d9 Mon Sep 17 00:00:00 2001
From: William Cohen <wcohen@redhat.com>
Date: Tue, 19 May 2015 10:03:30 -0400
Subject: [PATCH] Avoid using kprobe in the arm64 kretprobe trampoline

The aarch64 exception handling does not allow nesting of debug exceptions.
Using a kprobe in the kretprobe trampoline can cause problems if any of
the functions used by the kretprobe handler are instrumented with kprobes.
This problem was observed with the systemtap functioncallcount.stp
test when it instrumented the kfree function.  The kretprobe handler
can call kfree under some conditions.

Instead of using a kprobe to trigger an exception to save the processor state
and calling the kretprobe handler from the kprobe, the trampoline can
directly save the register state and call the kretprobe handler.  This
avoids the possible nested debug exceptions and reduces the overhead of
handling a kretprobe.  The arm64 code was modeled on the x86 code.

Signed-off-by: William Cohen <wcohen@redhat.com>
---
 arch/arm64/kernel/kprobes-arm64.h | 43 +++++++++++++++++++++++++++++++++
 arch/arm64/kernel/kprobes.c       | 51 +++++++++++++++------------------------
 2 files changed, 63 insertions(+), 31 deletions(-)

diff --git a/arch/arm64/kernel/kprobes-arm64.h b/arch/arm64/kernel/kprobes-arm64.h
index ff8a55f..46dab24 100644
--- a/arch/arm64/kernel/kprobes-arm64.h
+++ b/arch/arm64/kernel/kprobes-arm64.h
@@ -27,4 +27,47 @@  extern kprobes_pstate_check_t * const kprobe_condition_checks[16];
 enum kprobe_insn __kprobes
 arm_kprobe_decode_insn(kprobe_opcode_t insn, struct arch_specific_insn *asi);
 
+#define SAVE_REGS_STRING\
+	"	stp x0, x1, [sp, #16 * 0]\n"	\
+	"	stp x2, x3, [sp, #16 * 1]\n"	\
+	"	stp x4, x5, [sp, #16 * 2]\n"	\
+	"	stp x6, x7, [sp, #16 * 3]\n"	\
+	"	stp x8, x9, [sp, #16 * 4]\n"	\
+	"	stp x10, x11, [sp, #16 * 5]\n"	\
+	"	stp x12, x13, [sp, #16 * 6]\n"	\
+	"	stp x14, x15, [sp, #16 * 7]\n"	\
+	"	stp x16, x17, [sp, #16 * 8]\n"	\
+	"	stp x18, x19, [sp, #16 * 9]\n"	\
+	"	stp x20, x21, [sp, #16 * 10]\n"	\
+	"	stp x22, x23, [sp, #16 * 11]\n"	\
+	"	stp x24, x25, [sp, #16 * 12]\n"	\
+	"	stp x26, x27, [sp, #16 * 13]\n"	\
+	"	stp x28, x29, [sp, #16 * 14]\n"	\
+	"	str x30,   [sp, #16 * 15]\n"    \
+	"	mrs x0, nzcv\n"			\
+	"	str x0, [sp, #8 * 33]\n"
+	
+
+#define RESTORE_REGS_STRING\
+	"	ldr x0, [sp, #8 * 33]\n"	\
+	"	msr nzcv, x0\n"			\
+	"	ldp x0, x1, [sp, #16 * 0]\n"	\
+	"	ldp x2, x3, [sp, #16 * 1]\n"	\
+	"	ldp x4, x5, [sp, #16 * 2]\n"	\
+	"	ldp x6, x7, [sp, #16 * 3]\n"	\
+	"	ldp x8, x9, [sp, #16 * 4]\n"	\
+	"	ldp x10, x11, [sp, #16 * 5]\n"	\
+	"	ldp x12, x13, [sp, #16 * 6]\n"	\
+	"	ldp x14, x15, [sp, #16 * 7]\n"	\
+	"	ldp x16, x17, [sp, #16 * 8]\n"	\
+	"	ldp x18, x19, [sp, #16 * 9]\n"	\
+	"	ldp x20, x21, [sp, #16 * 10]\n"	\
+	"	ldp x22, x23, [sp, #16 * 11]\n"	\
+	"	ldp x24, x25, [sp, #16 * 12]\n"	\
+	"	ldp x26, x27, [sp, #16 * 13]\n"	\
+	"	ldp x28, x29, [sp, #16 * 14]\n"	\
+	"	ldr x30,   [sp, #16 * 15]\n"	\
+
+
+
 #endif /* _ARM_KERNEL_KPROBES_ARM64_H */
diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
index d2aa4bc..af55e37 100644
--- a/arch/arm64/kernel/kprobes.c
+++ b/arch/arm64/kernel/kprobes.c
@@ -565,32 +565,27 @@  int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
 }
 
 /*
- * Kretprobes: kernel return probes handling
- *
- * AArch64 mode does not support popping the PC value from the
- * stack like on ARM 32-bit (ldmia {..,pc}), so atleast one
- * register need to be used to achieve branching/return.
- * It means return probes cannot return back to the original
- * return address directly without modifying the register context.
- *
- * So like other architectures, we prepare a global routine
- * with NOPs, which serve as trampoline address that hack away the
- * function return, with the exact register context.
- * Placing a kprobe on trampoline routine entry will trap again to
- * execute return probe handlers and restore original return address
- * in ELR_EL1, this way saved pt_regs still hold the original
- * register values to be carried back to the caller.
+ * When a retprobed function returns, this code saves registers and
+ * calls trampoline_probe_handler(), which in turn calls the kretprobe's handler.
  */
-static void __used kretprobe_trampoline_holder(void)
+static void __used __kprobes kretprobe_trampoline_holder(void)
 {
 	asm volatile (".global kretprobe_trampoline\n"
 			"kretprobe_trampoline:\n"
-			"NOP\n\t"
-			"NOP\n\t");
+		        "sub sp, sp, %0\n"
+			SAVE_REGS_STRING
+			"mov x0, sp\n"
+			"bl trampoline_probe_handler\n"
+			/* Replace trampoline address in lr with actual
+			   orig_ret_addr return address. */
+			"str x0, [sp, #16 * 15]\n"
+			RESTORE_REGS_STRING
+		        "add sp, sp, %0\n"
+			"ret\n"
+		      : : "I"(sizeof(struct pt_regs)) : "memory");
 }
 
-static int __kprobes
-trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
+static void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
 {
 	struct kretprobe_instance *ri = NULL;
 	struct hlist_head *head, empty_rp;
@@ -651,7 +646,7 @@  trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 	}
 
 	/* return 1 so that post handlers not called */
-	return 1;
+	return (void *) orig_ret_addr;
 }
 
 void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
@@ -663,18 +658,12 @@  void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 	regs->regs[30] = (long)&kretprobe_trampoline;
 }
 
-static struct kprobe trampoline = {
-	.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
-	.pre_handler = trampoline_probe_handler
-};
-
-int __kprobes arch_trampoline_kprobe(struct kprobe *p)
+int __init arch_init_kprobes(void)
 {
-	return p->addr == (kprobe_opcode_t *) &kretprobe_trampoline;
+	return 0;
 }
 
-int __init arch_init_kprobes(void)
+int arch_trampoline_kprobe(struct kprobe *p)
 {
-	/* register trampoline for kret probe */
-	return register_kprobe(&trampoline);
+	return 0;
 }
-- 
1.8.3.1