| Message ID | 1458549470-124791-2-git-send-email-hekuang@huawei.com (mailing list archive) |
|---|---|
| State | New, archived |
On 21/03/2016:08:37:50 AM, He Kuang wrote:
> On arm64, the watchpoint handler enables single-step to bypass the next
> instruction so that the watchpoint is not entered recursively. If an irq
> is triggered right after the watchpoint, a single-step will be wrongly
> triggered in the irq handler, which causes the watchpoint address not to
> be stepped over, and the system hangs.

Does patch [1] resolve this issue as well? I hope it does. Patch [1] has
still not been sent for review. Your test result will be helpful.

~Pratyush

[1] https://github.com/pratyushanand/linux/commit/7623c8099ac22eaa00e7e0f52430f7a4bd154652
On 2016/3/21 18:24, Pratyush Anand wrote:
> On 21/03/2016:08:37:50 AM, He Kuang wrote:
>> On arm64, the watchpoint handler enables single-step to bypass the next
>> instruction so that the watchpoint is not entered recursively. If an irq
>> is triggered right after the watchpoint, a single-step will be wrongly
>> triggered in the irq handler, which causes the watchpoint address not to
>> be stepped over, and the system hangs.
>
> Does patch [1] resolve this issue as well? I hope it does. Patch [1] has
> still not been sent for review. Your test result will be helpful.
>
> ~Pratyush
>
> [1] https://github.com/pratyushanand/linux/commit/7623c8099ac22eaa00e7e0f52430f7a4bd154652

Could you please provide a test program for your case so we can test it
on our devices? I guess setting a breakpoint on a "copy_from_user()"
accessing an invalid address can trigger this problem?

Thank you.
On 21/03/2016:06:38:31 PM, Wangnan (F) wrote:
> On 2016/3/21 18:24, Pratyush Anand wrote:
> > On 21/03/2016:08:37:50 AM, He Kuang wrote:
> >> On arm64, the watchpoint handler enables single-step to bypass the
> >> next instruction so that the watchpoint is not entered recursively.
> >> If an irq is triggered right after the watchpoint, a single-step
> >> will be wrongly triggered in the irq handler, which causes the
> >> watchpoint address not to be stepped over, and the system hangs.
> > Does patch [1] resolve this issue as well? I hope it does. Patch [1]
> > has still not been sent for review. Your test result will be helpful.
> >
> > ~Pratyush
> >
> > [1] https://github.com/pratyushanand/linux/commit/7623c8099ac22eaa00e7e0f52430f7a4bd154652
>
> Could you please provide a test program for your case so we can test
> it on our devices? I guess setting a breakpoint on a "copy_from_user()"
> accessing an invalid address can trigger this problem?

My test case was to test kprobing of copy_from_user. I used kprobe64-v11.
I reverted "patch v11 3/9" and used the following script for
__copy_to_user(), which instruments a kprobe at every instruction of a
given function. I can easily see "Unexpected kernel single-step
exception at EL1".

-------------------------------------------------------------
#kprobe_at_function_all_inst.sh
-------------------------------------------------------------
#!/bin/sh
#$1: function name
echo 0 > /sys/kernel/debug/tracing/events/kprobes/enable
echo > /sys/kernel/debug/tracing/trace
echo > /sys/kernel/debug/tracing/kprobe_events
func=$(cat /proc/kallsyms | grep -A 1 -w $1 | cut -d ' ' -f 1)
func_start=$((0x$(echo $func | cut -d ' ' -f 1)))
func_end=$((0x$(echo $func | cut -d ' ' -f 2)))
offset=0
while [ $(($func_start + $offset)) -lt $func_end ]
do
	printf -v cmd "p:probe_%x $1+0x%x" $offset $offset
	echo $cmd >> /sys/kernel/debug/tracing/kprobe_events
	offset=$((offset + 4))
done
echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
-------------------------------------------------------------

# ./kprobe_at_function_all_inst.sh __copy_to_user

Now, if I apply the patch I referred to in [1], I no longer see any
"Unexpected kernel single-step exception at EL1" with the above test
script.

If I understood correctly, the problem you described in your patch is
that an irq (el1_irq) is raised while the watchpoint is being handled
by the kernel (specifically, before the kernel can call
reinstall_suspended_bps() to disable single stepping). My patch
disables single stepping on entry to all the el1 exception modes if
kernel_enable_single_step() had been called but
kernel_disable_single_step() had not yet been called. So, your test
case could be another good test for my patch.

~Pratyush
Hi Pratyush,

on 2016/3/21 18:24, Pratyush Anand wrote:
> On 21/03/2016:08:37:50 AM, He Kuang wrote:
>> On arm64, the watchpoint handler enables single-step to bypass the next
>> instruction so that the watchpoint is not entered recursively. If an irq
>> is triggered right after the watchpoint, a single-step will be wrongly
>> triggered in the irq handler, which causes the watchpoint address not to
>> be stepped over, and the system hangs.
>
> Does patch [1] resolve this issue as well? I hope it does. Patch [1] has
> still not been sent for review. Your test result will be helpful.
>
> ~Pratyush
>
> [1] https://github.com/pratyushanand/linux/commit/7623c8099ac22eaa00e7e0f52430f7a4bd154652

This patch did not consider that, on exception return, the single-step
flag should be restored; otherwise the intended single step will not be
triggered. Right?

Thanks,
Li Bin

> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Hi Li,

On 31/03/2016:08:45:05 PM, Li Bin wrote:
> Hi Pratyush,
>
> on 2016/3/21 18:24, Pratyush Anand wrote:
> > On 21/03/2016:08:37:50 AM, He Kuang wrote:
> >> On arm64, the watchpoint handler enables single-step to bypass the
> >> next instruction so that the watchpoint is not entered recursively.
> >> If an irq is triggered right after the watchpoint, a single-step
> >> will be wrongly triggered in the irq handler, which causes the
> >> watchpoint address not to be stepped over, and the system hangs.
> >
> > Does patch [1] resolve this issue as well? I hope it does. Patch [1]
> > has still not been sent for review. Your test result will be helpful.
> >
> > ~Pratyush
> >
> > [1] https://github.com/pratyushanand/linux/commit/7623c8099ac22eaa00e7e0f52430f7a4bd154652
>
> This patch did not consider that, on exception return, the single-step
> flag should be restored; otherwise the intended single step will not be
> triggered. Right?

Yes, you are right, and there are other problems as well. Will Deacon
pointed out [1] that kernel debugging is per-cpu rather than per-task.
So, I did think of a per-cpu implementation by introducing a new element
"flags" in struct pt_regs. But even with that I see issues. For example:

- While executing a single-step instruction, we get a data abort.
- In the kernel_entry of the data abort we disable single stepping based
  on the "flags" bit field.
- While handling the data abort we receive another interrupt, so we are
  again in kernel_entry (for el1_irq). Single stepping will be disabled
  again (although it does not matter).

Now the issue is: what condition should be verified in kernel_exit for
enabling single step again? In the above scenario, the kernel_exit for
el1_irq should not enable single stepping, but how to prevent that
elegantly?

[1] http://www.spinics.net/lists/arm-kernel/msg491844.html
diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index b5902e8..fe6939e 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -133,7 +133,10 @@ int kernel_active_single_step(void);
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
 int reinstall_suspended_bps(struct pt_regs *regs);
 u64 signal_single_step_enable_bps(void);
+u64 irq_single_step_enable_bps(void);
+
 void signal_reinstall_single_step(u64 pstate);
+void irq_reinstall_single_step(struct pt_regs *regs);
 #else
 static inline int reinstall_suspended_bps(struct pt_regs *regs)
 {
@@ -145,7 +148,13 @@ static inline u64 signal_single_step_enable_bps(void)
 	return 0;
 }
 
+static inline u64 irq_single_step_enable_bps(void)
+{
+	return 0;
+}
+
 static inline void signal_reinstall_single_step(u64 pstate) { }
+static inline void irq_reinstall_single_step(struct pt_regs *regs) { }
 #endif
 
 int aarch32_break_handler(struct pt_regs *regs);
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index c536c9e..fab1faa 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -245,9 +245,22 @@ static void send_user_sigtrap(int si_code)
 	force_sig_info(SIGTRAP, &info, current);
 }
 
+extern unsigned long el1_irq_ss_entry[];
+
 static int single_step_handler(unsigned long addr, unsigned int esr,
 			       struct pt_regs *regs)
 {
+	void *pc = (void *)instruction_pointer(regs);
+
+	if (pc == &el1_irq_ss_entry) {
+		struct pt_regs *irq_regs = (struct pt_regs *)(regs->sp);
+
+		irq_regs->pstate |= irq_single_step_enable_bps();
+		kernel_disable_single_step();
+
+		return 0;
+	}
+
 	/*
 	 * If we are stepping a pending breakpoint, call the hw_breakpoint
 	 * handler first.
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 1f7f5a2..836d98e 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -402,12 +402,18 @@ ENDPROC(el1_sync)
 el1_irq:
 	kernel_entry 1
 	enable_dbg
+	.global el1_irq_ss_entry
+el1_irq_ss_entry:
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_off
 #endif
 
 	get_thread_info tsk
 	irq_handler
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+	mov	x0, sp
+	bl	irq_reinstall_single_step
+#endif
 
 #ifdef CONFIG_PREEMPT
 	ldr	w24, [tsk, #TI_PREEMPT]		// get preempt count
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index 18fd3d3..0cf13ee 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -540,11 +540,12 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
  * exception level at the register level.
  * This is used when single-stepping after a breakpoint exception.
  */
-static void toggle_bp_registers(int reg, enum dbg_active_el el, int enable)
+static bool toggle_bp_registers(int reg, enum dbg_active_el el, int enable)
 {
 	int i, max_slots, privilege;
 	u32 ctrl;
 	struct perf_event **slots;
+	bool origin_state = false;
 
 	switch (reg) {
 	case AARCH64_DBG_REG_BCR:
@@ -556,7 +557,7 @@ static void toggle_bp_registers(int reg, enum dbg_active_el el, int enable)
 		max_slots = core_num_wrps;
 		break;
 	default:
-		return;
+		return false;
 	}
 
 	for (i = 0; i < max_slots; ++i) {
@@ -568,12 +569,16 @@ static void toggle_bp_registers(int reg, enum dbg_active_el el, int enable)
 			continue;
 
 		ctrl = read_wb_reg(reg, i);
+		if (ctrl & 0x1)
+			origin_state = true;
 		if (enable)
 			ctrl |= 0x1;
 		else
 			ctrl &= ~0x1;
 		write_wb_reg(reg, i, ctrl);
 	}
+
+	return origin_state;
 }
 
 /*
@@ -982,6 +987,41 @@ u64 signal_single_step_enable_bps(void)
 	return retval;
 }
 
+u64 irq_single_step_enable_bps(void)
+{
+	u64 retval = 0;
+
+	if (!toggle_bp_registers(AARCH64_DBG_REG_WCR, DBG_ACTIVE_EL1, 1))
+		retval |= PSR_LINUX_HW_WP_SS;
+
+	if (!toggle_bp_registers(AARCH64_DBG_REG_BCR, DBG_ACTIVE_EL1, 1))
+		retval |= PSR_LINUX_HW_BP_SS;
+
+	return retval;
+}
+
+void irq_reinstall_single_step(struct pt_regs *regs)
+{
+	u64 pstate = regs->pstate;
+
+	if (likely(!(regs->pstate & PSR_LINUX_HW_SS)))
+		return;
+
+	if (!user_mode(regs)) {
+		if (pstate & PSR_LINUX_HW_BP_SS)
+			toggle_bp_registers(AARCH64_DBG_REG_BCR,
+					    DBG_ACTIVE_EL1, 0);
+		if (pstate & PSR_LINUX_HW_WP_SS)
+			toggle_bp_registers(AARCH64_DBG_REG_WCR,
+					    DBG_ACTIVE_EL1, 0);
+
+		if (!kernel_active_single_step()) {
+			asm volatile ("msr daifset, #8\n");
+			kernel_enable_single_step(regs);
+		}
+	}
+}
+
 void signal_reinstall_single_step(u64 pstate)
 {
 	struct debug_info *debug_info = &current->thread.debug;