Message ID | 1555070699-3685-3-git-send-email-wangxiongfeng2@huawei.com (mailing list archive) |
---|---|
State | RFC |
Headers | show |
Series | Enable kprobe to monitor sdei event handler | expand |
Hi Xiongfeng Wang, On 12/04/2019 13:04, Xiongfeng Wang wrote: > When we monitor a sdei_event callback using 'kprobe', the singlestep > handler can not be called because dbg is masked in sdei_handler. For a good reason: the debug hardware may be in use single-stepping the kernel, or worse: in use by a KVM guest. The DAIF flags are all set because none of these things are safe for an SDEI-event/NMI-handler. We haven't done KVM's guest exit work yet, so things like the debug hardware belong to the guest. Its not safe to take an exception from NMI context, avoid it at all costs! > This patch enable dbg in '_sdei_handler'. This is very bad as the debug hardware may have been in use by something else. A malicious guest can now cause you to take breakpoint/watchpoint exceptions. > When SDEI events interrupt the userspace, 'vbar_el1' contains > 'tramp_vectors' if CONFIG_UNMAP_KERNEL_AT_EL0 is enabled. So we need to > restore 'vbar_el1' with kernel vectors, otherwise we will go to the > wrong place when brk exception or dbg exception happens. This is the tip of the iceberg. You may have been partway through KVM's world-switch, you'd need to temporarily save/restore the host context. This ends up being a parody-world-switch, which has to be kept in-sync with KVM. We didn't go this way because its fragile and unmaintainable. > SDEI events may interrupt 'kernel_entry' before we save 'spsr_el1' and > 'elr_el1', and dbg exception will corrupts these two registers. So we > also need to save and restore these two registers. (or don't do anything that would cause them to get clobbered) > If SDEI events interrupt an instruction being singlestepped, the > instruction in '_sdei_handler' will begin to be singlestepped after we > enable dbg. So we need to disable singlestep in the beginning of > _sdei_handler if we find out we interrupt a singlestep process. And now the arch code's do_debug_exception() is re-rentrant, which it doesn't expect. The kprobes core code can't be kprobed, but it can be interrupted by NMI. If you can kprobe the NMI, you've made the kprobes core re-entrant too. A quick look shows raw_spin_lock() that could deadlock, and it passes values around with a per-cpu variable. You could interrupt any part of the krpobes machinery with an SDEI event, take a kprobe, then take a critical SDEI event, and a third kprobe. I don't see how its possible to make this safe without re-writing kprobes to be NMI safe. > diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c > index ea94cf8..9dd9cf6 100644 > --- a/arch/arm64/kernel/sdei.c > +++ b/arch/arm64/kernel/sdei.c > - if (elr != read_sysreg(elr_el1)) { > - /* > - * We took a synchronous exception from the SDEI handler. > - * This could deadlock, and if you interrupt KVM it will > - * hyp-panic instead. > - */ > - pr_warn("unsafe: exception during handler\n"); > - } This is here to catch a WARN_ON() firing and not killing the system. Its not safe to take an exception from the NMI handler. Why do you need to kprobe an NMI handler? Thanks, James
diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h index a44cf52..a730266 100644 --- a/arch/arm64/include/asm/debug-monitors.h +++ b/arch/arm64/include/asm/debug-monitors.h @@ -121,6 +121,7 @@ enum dbg_active_el { void user_fastforward_single_step(struct task_struct *task); void kernel_enable_single_step(struct pt_regs *regs); +void kernel_enable_single_step_noregs(void); void kernel_disable_single_step(void); int kernel_active_single_step(void); diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c index d7bb6ae..d6074f4 100644 --- a/arch/arm64/kernel/debug-monitors.c +++ b/arch/arm64/kernel/debug-monitors.c @@ -404,6 +404,14 @@ void kernel_enable_single_step(struct pt_regs *regs) } NOKPROBE_SYMBOL(kernel_enable_single_step); +void kernel_enable_single_step_noregs(void) +{ + WARN_ON(!irqs_disabled()); + mdscr_write(mdscr_read() | DBG_MDSCR_SS); + enable_debug_monitors(DBG_ACTIVE_EL1); +} +NOKPROBE_SYMBOL(kernel_enable_single_step_noregs); + void kernel_disable_single_step(void) { WARN_ON(!irqs_disabled()); diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c index ea94cf8..9dd9cf6 100644 --- a/arch/arm64/kernel/sdei.c +++ b/arch/arm64/kernel/sdei.c @@ -9,6 +9,7 @@ #include <linux/uaccess.h> #include <asm/alternative.h> +#include <asm/debug-monitors.h> #include <asm/kprobes.h> #include <asm/mmu.h> #include <asm/ptrace.h> @@ -176,6 +177,8 @@ unsigned long sdei_arch_get_entry_point(int conduit) } +extern char vectors[]; /* kernel exception vectors */ + /* * __sdei_handler() returns one of: * SDEI_EV_HANDLED - success, return to the interrupted context. @@ -189,8 +192,10 @@ static __kprobes unsigned long _sdei_handler(struct pt_regs *regs, int i, err = 0; int clobbered_registers = 4; u64 elr = read_sysreg(elr_el1); + u64 spsr = read_sysreg(spsr_el1); u32 kernel_mode = read_sysreg(CurrentEL) | 1; /* +SPSel */ unsigned long vbar = read_sysreg(vbar_el1); + int ss_active = 0; if (arm64_kernel_unmapped_at_el0()) clobbered_registers++; @@ -207,19 +212,39 @@ static __kprobes unsigned long _sdei_handler(struct pt_regs *regs, */ __uaccess_enable_hw_pan(); + /* + * Enable dbg here so that we can kprobe a sdei event handler. Before we + * enable dbg, we need to restore vbar_el1 with kernel vectors + */ +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 + write_sysreg(vectors, vbar_el1); + isb(); +#endif + ss_active = kernel_active_single_step(); + if (ss_active) + kernel_disable_single_step(); + local_dbg_enable(); + err = sdei_event_handler(regs, arg); + + local_dbg_disable(); + if (ss_active) + kernel_enable_single_step_noregs(); + + /* + * brk exception will corrupt elr_el1 and spsr_el1, and trust firmware + * doesn't save it for us. So we need to restore these two registers + * after 'sdei_event_handler'. + */ + write_sysreg(elr, elr_el1); + write_sysreg(spsr, spsr_el1); +#ifdef CONFIG_UNMAP_KERNEL_AT_EL0 + write_sysreg(vbar, vbar_el1); +#endif + if (err) return SDEI_EV_FAILED; - if (elr != read_sysreg(elr_el1)) { - /* - * We took a synchronous exception from the SDEI handler. - * This could deadlock, and if you interrupt KVM it will - * hyp-panic instead. - */ - pr_warn("unsafe: exception during handler\n"); - } - mode = regs->pstate & (PSR_MODE32_BIT | PSR_MODE_MASK); /*
When we monitor a sdei_event callback using 'kprobe', the singlestep handler can not be called because dbg is masked in sdei_handler. This patch enable dbg in '_sdei_handler'. When SDEI events interrupt the userspace, 'vbar_el1' contains 'tramp_vectors' if CONFIG_UNMAP_KERNEL_AT_EL0 is enabled. So we need to restore 'vbar_el1' with kernel vectors, otherwise we will go to the wrong place when brk exception or dbg exception happens. SDEI events may interrupt 'kernel_entry' before we save 'spsr_el1' and 'elr_el1', and dbg exception will corrupts these two registers. So we also need to save and restore these two registers. If SDEI events interrupt an instruction being singlestepped, the instruction in '_sdei_handler' will begin to be singlestepped after we enable dbg. So we need to disable singlestep in the beginning of _sdei_handler if we find out we interrupt a singlestep process. Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> --- arch/arm64/include/asm/debug-monitors.h | 1 + arch/arm64/kernel/debug-monitors.c | 8 ++++++ arch/arm64/kernel/sdei.c | 43 ++++++++++++++++++++++++++------- 3 files changed, 43 insertions(+), 9 deletions(-)