diff mbox series

[RFC,2/3] sdei: enable dbg in '_sdei_handler'

Message ID 1555070699-3685-3-git-send-email-wangxiongfeng2@huawei.com (mailing list archive)
State RFC
Headers show
Series Enable kprobe to monitor sdei event handler | expand

Commit Message

Xiongfeng Wang April 12, 2019, 12:04 p.m. UTC
When we monitor a sdei_event callback using 'kprobe', the singlestep
handler can not be called because dbg is masked in sdei_handler. This
patch enable dbg in '_sdei_handler'.

When SDEI events interrupt the userspace, 'vbar_el1' contains
'tramp_vectors' if CONFIG_UNMAP_KERNEL_AT_EL0 is enabled. So we need to
restore 'vbar_el1' with kernel vectors, otherwise we will go to the
wrong place when brk exception or dbg exception happens.

SDEI events may interrupt 'kernel_entry' before we save 'spsr_el1' and
'elr_el1', and dbg exception will corrupts these two registers. So we
also need to save and restore these two registers.

If SDEI events interrupt an instruction being singlestepped, the
instruction in '_sdei_handler' will begin to be singlestepped after we
enable dbg. So we need to disable singlestep in the beginning of
_sdei_handler if we find out we interrupt a singlestep process.

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
---
 arch/arm64/include/asm/debug-monitors.h |  1 +
 arch/arm64/kernel/debug-monitors.c      |  8 ++++++
 arch/arm64/kernel/sdei.c                | 43 ++++++++++++++++++++++++++-------
 3 files changed, 43 insertions(+), 9 deletions(-)

Comments

James Morse April 24, 2019, 4:21 p.m. UTC | #1
Hi Xiongfeng Wang,

On 12/04/2019 13:04, Xiongfeng Wang wrote:
> When we monitor a sdei_event callback using 'kprobe', the singlestep
> handler can not be called because dbg is masked in sdei_handler.

For a good reason: the debug hardware may be in use single-stepping the kernel, or worse:
in use by a KVM guest.

The DAIF flags are all set because none of these things are safe for an
SDEI-event/NMI-handler. We haven't done KVM's guest exit work yet, so things like the
debug hardware belong to the guest.

Its not safe to take an exception from NMI context, avoid it at all costs!


> This patch enable dbg in '_sdei_handler'.

This is very bad as the debug hardware may have been in use by something else. A malicious
guest can now cause you to take breakpoint/watchpoint exceptions.


> When SDEI events interrupt the userspace, 'vbar_el1' contains
> 'tramp_vectors' if CONFIG_UNMAP_KERNEL_AT_EL0 is enabled. So we need to
> restore 'vbar_el1' with kernel vectors, otherwise we will go to the
> wrong place when brk exception or dbg exception happens.

This is the tip of the iceberg. You may have been partway through KVM's world-switch,
you'd need to temporarily save/restore the host context. This ends up being a
parody-world-switch, which has to be kept in-sync with KVM. We didn't go this way because
its fragile and unmaintainable.


> SDEI events may interrupt 'kernel_entry' before we save 'spsr_el1' and
> 'elr_el1', and dbg exception will corrupts these two registers. So we
> also need to save and restore these two registers.

(or don't do anything that would cause them to get clobbered)


> If SDEI events interrupt an instruction being singlestepped, the
> instruction in '_sdei_handler' will begin to be singlestepped after we
> enable dbg. So we need to disable singlestep in the beginning of
> _sdei_handler if we find out we interrupt a singlestep process.

And now the arch code's do_debug_exception() is re-rentrant, which it doesn't expect.

The kprobes core code can't be kprobed, but it can be interrupted by NMI. If you can
kprobe the NMI, you've made the kprobes core re-entrant too. A quick look shows
raw_spin_lock() that could deadlock, and it passes values around with a per-cpu variable.

You could interrupt any part of the krpobes machinery with an SDEI event, take a kprobe,
then take a critical SDEI event, and a third kprobe.

I don't see how its possible to make this safe without re-writing kprobes to be NMI safe.


> diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c
> index ea94cf8..9dd9cf6 100644
> --- a/arch/arm64/kernel/sdei.c
> +++ b/arch/arm64/kernel/sdei.c

> -	if (elr != read_sysreg(elr_el1)) {
> -		/*
> -		 * We took a synchronous exception from the SDEI handler.
> -		 * This could deadlock, and if you interrupt KVM it will
> -		 * hyp-panic instead.
> -		 */
> -		pr_warn("unsafe: exception during handler\n");
> -	}

This is here to catch a WARN_ON() firing and not killing the system. Its not safe to take
an exception from the NMI handler.


Why do you need to kprobe an NMI handler?


Thanks,

James
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index a44cf52..a730266 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -121,6 +121,7 @@  enum dbg_active_el {
 void user_fastforward_single_step(struct task_struct *task);
 
 void kernel_enable_single_step(struct pt_regs *regs);
+void kernel_enable_single_step_noregs(void);
 void kernel_disable_single_step(void);
 int kernel_active_single_step(void);
 
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index d7bb6ae..d6074f4 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -404,6 +404,14 @@  void kernel_enable_single_step(struct pt_regs *regs)
 }
 NOKPROBE_SYMBOL(kernel_enable_single_step);
 
+void kernel_enable_single_step_noregs(void)
+{
+	WARN_ON(!irqs_disabled());
+	mdscr_write(mdscr_read() | DBG_MDSCR_SS);
+	enable_debug_monitors(DBG_ACTIVE_EL1);
+}
+NOKPROBE_SYMBOL(kernel_enable_single_step_noregs);
+
 void kernel_disable_single_step(void)
 {
 	WARN_ON(!irqs_disabled());
diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c
index ea94cf8..9dd9cf6 100644
--- a/arch/arm64/kernel/sdei.c
+++ b/arch/arm64/kernel/sdei.c
@@ -9,6 +9,7 @@ 
 #include <linux/uaccess.h>
 
 #include <asm/alternative.h>
+#include <asm/debug-monitors.h>
 #include <asm/kprobes.h>
 #include <asm/mmu.h>
 #include <asm/ptrace.h>
@@ -176,6 +177,8 @@  unsigned long sdei_arch_get_entry_point(int conduit)
 
 }
 
+extern char vectors[];  /* kernel exception vectors */
+
 /*
  * __sdei_handler() returns one of:
  *  SDEI_EV_HANDLED -  success, return to the interrupted context.
@@ -189,8 +192,10 @@  static __kprobes unsigned long _sdei_handler(struct pt_regs *regs,
 	int i, err = 0;
 	int clobbered_registers = 4;
 	u64 elr = read_sysreg(elr_el1);
+	u64 spsr = read_sysreg(spsr_el1);
 	u32 kernel_mode = read_sysreg(CurrentEL) | 1;	/* +SPSel */
 	unsigned long vbar = read_sysreg(vbar_el1);
+	int ss_active = 0;
 
 	if (arm64_kernel_unmapped_at_el0())
 		clobbered_registers++;
@@ -207,19 +212,39 @@  static __kprobes unsigned long _sdei_handler(struct pt_regs *regs,
 	 */
 	__uaccess_enable_hw_pan();
 
+	/*
+	 * Enable dbg here so that we can kprobe a sdei event handler. Before we
+	 * enable dbg, we need to restore vbar_el1 with kernel vectors
+	 */
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+	write_sysreg(vectors, vbar_el1);
+	isb();
+#endif
+	ss_active = kernel_active_single_step();
+	if (ss_active)
+		kernel_disable_single_step();
+	local_dbg_enable();
+
 	err = sdei_event_handler(regs, arg);
+
+	local_dbg_disable();
+	if (ss_active)
+		kernel_enable_single_step_noregs();
+
+	/*
+	 * brk exception will corrupt elr_el1 and spsr_el1, and trust firmware
+	 * doesn't save it for us. So we need to restore these two registers
+	 * after 'sdei_event_handler'.
+	 */
+	write_sysreg(elr, elr_el1);
+	write_sysreg(spsr, spsr_el1);
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+	write_sysreg(vbar, vbar_el1);
+#endif
+
 	if (err)
 		return SDEI_EV_FAILED;
 
-	if (elr != read_sysreg(elr_el1)) {
-		/*
-		 * We took a synchronous exception from the SDEI handler.
-		 * This could deadlock, and if you interrupt KVM it will
-		 * hyp-panic instead.
-		 */
-		pr_warn("unsafe: exception during handler\n");
-	}
-
 	mode = regs->pstate & (PSR_MODE32_BIT | PSR_MODE_MASK);
 
 	/*