diff mbox

x86/debug: Avoid crashing in early boot because of debugger_trap_entry()

Message ID 1470244085-1389-1-git-send-email-andrew.cooper3@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Andrew Cooper Aug. 3, 2016, 5:08 p.m. UTC
debugger_trap_entry() is not safe to use during early boot, as it follows
current before it is necesserily safe to do so.  Futhermore it does this
unconditionally, despite most callsites turning into no-ops because of the
vector test.

Inline debugger_trap_entry() into the two callers where doesn't become a
no-op, and reposition the guest_kernel_mode() test to be after the
guest_mode() test, avoiding the reference to current if the exception occured
from hypervisor context.  This makes the exception handlers safe in early
boot.  The repositioning also has bugfix in the TRAP_debug case, where dr6
will be correct in the debuggers view of state.

This causes the deletion of DEBUGGER_trap_entry(), which is a good thing as
hiding a return statement in a function-like macro is very antisocial
programming.

While cleaning this area up, drop the DEBUGGER_trap_fatal() wrapper as well,
making the code flow more apparent, and switch debugger_trap_fatal()'s return
type to boolean to match its semantics.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/arch/x86/traps.c           | 54 ++++++++++++++++++++++++++++--------------
 xen/include/asm-x86/debugger.h | 29 +++--------------------
 2 files changed, 39 insertions(+), 44 deletions(-)

Comments

Jan Beulich Aug. 4, 2016, 7:05 a.m. UTC | #1
>>> On 03.08.16 at 19:08, <andrew.cooper3@citrix.com> wrote:
> debugger_trap_entry() is not safe to use during early boot, as it follows
> current before it is necesserily safe to do so.  Futhermore it does this
> unconditionally, despite most callsites turning into no-ops because of the
> vector test.
> 
> Inline debugger_trap_entry() into the two callers where doesn't become a
> no-op,

Implied from that you delete all other invocations. While from a
(current) functionality pov that's no functional change, to me these
invocations have always served as an indication that some action
_could_ be taken there as well. I'm not sure it's a good idea to
remove that.

> @@ -3814,6 +3825,13 @@ void do_debug(struct cpu_user_regs *regs)
>      /* Save debug status register where guest OS can peek at it */
>      v->arch.debugreg[6] = read_debugreg(6);
>  
> +    if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
> +    {
> +        v->arch.gdbsx_vcpu_event = TRAP_debug;

Note how in the original code this write didn't happen. I can't tell why,
but it's a behavioral change which certainly would need explaining and
blessing by the gdbsx maintainer.

Jan
Andrew Cooper Aug. 4, 2016, 9:20 a.m. UTC | #2
On 04/08/16 08:05, Jan Beulich wrote:
>>>> On 03.08.16 at 19:08, <andrew.cooper3@citrix.com> wrote:
>> debugger_trap_entry() is not safe to use during early boot, as it follows
>> current before it is necesserily safe to do so.  Futhermore it does this
>> unconditionally, despite most callsites turning into no-ops because of the
>> vector test.
>>
>> Inline debugger_trap_entry() into the two callers where doesn't become a
>> no-op,
> Implied from that you delete all other invocations. While from a
> (current) functionality pov that's no functional change, to me these
> invocations have always served as an indication that some action
> _could_ be taken there as well. I'm not sure it's a good idea to
> remove that.

I considered that point specifically, but I disagree.

The current callsites require debugger_trap_entry() to double up all the
safety checks the main, which isn't reasonable IMO.

This code hasn't changed in years and don't think it is likely to change
in the forseeable future.  If people did want something a little like
how it currently is, the current code is not a good starting point.

OTOH, I do have an alternative patch, but would prefer not to use it.

>
>> @@ -3814,6 +3825,13 @@ void do_debug(struct cpu_user_regs *regs)
>>      /* Save debug status register where guest OS can peek at it */
>>      v->arch.debugreg[6] = read_debugreg(6);
>>  
>> +    if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
>> +    {
>> +        v->arch.gdbsx_vcpu_event = TRAP_debug;
> Note how in the original code this write didn't happen. I can't tell why,
> but it's a behavioral change which certainly would need explaining

This was mentioned in the commit message.

~Andrew
Jan Beulich Aug. 4, 2016, 9:30 a.m. UTC | #3
>>> On 04.08.16 at 11:20, <andrew.cooper3@citrix.com> wrote:
> On 04/08/16 08:05, Jan Beulich wrote:
>>>>> On 03.08.16 at 19:08, <andrew.cooper3@citrix.com> wrote:
>>> debugger_trap_entry() is not safe to use during early boot, as it follows
>>> current before it is necesserily safe to do so.  Futhermore it does this
>>> unconditionally, despite most callsites turning into no-ops because of the
>>> vector test.
>>>
>>> Inline debugger_trap_entry() into the two callers where doesn't become a
>>> no-op,
>> Implied from that you delete all other invocations. While from a
>> (current) functionality pov that's no functional change, to me these
>> invocations have always served as an indication that some action
>> _could_ be taken there as well. I'm not sure it's a good idea to
>> remove that.
> 
> I considered that point specifically, but I disagree.
> 
> The current callsites require debugger_trap_entry() to double up all the
> safety checks the main, which isn't reasonable IMO.
> 
> This code hasn't changed in years and don't think it is likely to change
> in the forseeable future.  If people did want something a little like
> how it currently is, the current code is not a good starting point.

If I had ever gotten to port my Linux kernel debugger to Xen, I
would most likely have customized debugger_trap_entry(), and
the deletion of its invocations would then have broken everything.

>>> @@ -3814,6 +3825,13 @@ void do_debug(struct cpu_user_regs *regs)
>>>      /* Save debug status register where guest OS can peek at it */
>>>      v->arch.debugreg[6] = read_debugreg(6);
>>>  
>>> +    if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
>>> +    {
>>> +        v->arch.gdbsx_vcpu_event = TRAP_debug;
>> Note how in the original code this write didn't happen. I can't tell why,
>> but it's a behavioral change which certainly would need explaining
> 
> This was mentioned in the commit message.

Well, before writing the reply I did read it a second time. Now I've
read a third and fourth time, and still can't spot any mention of this.
I'm sorry if I'm being dense...

Jan
diff mbox

Patch

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 767d0b0..933435d 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -736,7 +736,9 @@  void do_reserved_trap(struct cpu_user_regs *regs)
 {
     unsigned int trapnr = regs->entry_vector;
 
-    DEBUGGER_trap_fatal(trapnr, regs);
+    if ( debugger_trap_fatal(trapnr, regs) )
+        return;
+
     show_execution_state(regs);
     panic("FATAL RESERVED TRAP %#x: %s", trapnr, trapstr(trapnr));
 }
@@ -750,8 +752,6 @@  static void do_trap(struct cpu_user_regs *regs, int use_error_code)
     if ( regs->error_code & X86_XEC_EXT )
         goto hardware_trap;
 
-    DEBUGGER_trap_entry(trapnr, regs);
-
     if ( guest_mode(regs) )
     {
         do_guest_trap(trapnr, regs, use_error_code);
@@ -777,7 +777,8 @@  static void do_trap(struct cpu_user_regs *regs, int use_error_code)
     }
 
  hardware_trap:
-    DEBUGGER_trap_fatal(trapnr, regs);
+    if ( debugger_trap_fatal(trapnr, regs) )
+        return;
 
     show_execution_state(regs);
     panic("FATAL TRAP: vector = %d (%s)\n"
@@ -1307,8 +1308,6 @@  void do_invalid_op(struct cpu_user_regs *regs)
     int id = -1, lineno;
     const struct virtual_region *region;
 
-    DEBUGGER_trap_entry(TRAP_invalid_op, regs);
-
     if ( likely(guest_mode(regs)) )
     {
         if ( !emulate_invalid_rdtscp(regs) &&
@@ -1377,7 +1376,10 @@  void do_invalid_op(struct cpu_user_regs *regs)
 
     case BUGFRAME_bug:
         printk("Xen BUG at %s%s:%d\n", prefix, filename, lineno);
-        DEBUGGER_trap_fatal(TRAP_invalid_op, regs);
+
+        if ( debugger_trap_fatal(TRAP_invalid_op, regs) )
+            return;
+
         show_execution_state(regs);
         panic("Xen BUG at %s%s:%d", prefix, filename, lineno);
 
@@ -1389,7 +1391,10 @@  void do_invalid_op(struct cpu_user_regs *regs)
 
         printk("Assertion '%s' failed at %s%s:%d\n",
                predicate, prefix, filename, lineno);
-        DEBUGGER_trap_fatal(TRAP_invalid_op, regs);
+
+        if ( debugger_trap_fatal(TRAP_invalid_op, regs) )
+            return;
+
         show_execution_state(regs);
         panic("Assertion '%s' failed at %s%s:%d",
               predicate, prefix, filename, lineno);
@@ -1402,14 +1407,17 @@  void do_invalid_op(struct cpu_user_regs *regs)
         regs->eip = fixup;
         return;
     }
-    DEBUGGER_trap_fatal(TRAP_invalid_op, regs);
+
+    if ( debugger_trap_fatal(TRAP_invalid_op, regs) )
+        return;
+
     show_execution_state(regs);
     panic("FATAL TRAP: vector = %d (invalid opcode)", TRAP_invalid_op);
 }
 
 void do_int3(struct cpu_user_regs *regs)
 {
-    DEBUGGER_trap_entry(TRAP_int3, regs);
+    struct vcpu *curr = current;
 
     if ( !guest_mode(regs) )
     {
@@ -1417,6 +1425,13 @@  void do_int3(struct cpu_user_regs *regs)
         return;
     } 
 
+    if ( guest_kernel_mode(curr, regs) && curr->domain->debugger_attached )
+    {
+        curr->arch.gdbsx_vcpu_event = TRAP_int3;
+        domain_pause_for_debugger();
+        return;
+    }
+
     do_guest_trap(TRAP_int3, regs, 0);
 }
 
@@ -1744,8 +1759,6 @@  void do_page_fault(struct cpu_user_regs *regs)
     /* fixup_page_fault() might change regs->error_code, so cache it here. */
     error_code = regs->error_code;
 
-    DEBUGGER_trap_entry(TRAP_page_fault, regs);
-
     perfc_incr(page_faults);
 
     if ( unlikely(fixup_page_fault(addr, regs) != 0) )
@@ -1774,7 +1787,8 @@  void do_page_fault(struct cpu_user_regs *regs)
             return;
         }
 
-        DEBUGGER_trap_fatal(TRAP_page_fault, regs);
+        if ( debugger_trap_fatal(TRAP_page_fault, regs) )
+            return;
 
         show_execution_state(regs);
         show_page_walk(addr);
@@ -3471,8 +3485,6 @@  void do_general_protection(struct cpu_user_regs *regs)
     struct vcpu *v = current;
     unsigned long fixup;
 
-    DEBUGGER_trap_entry(TRAP_gp_fault, regs);
-
     if ( regs->error_code & X86_XEC_EXT )
         goto hardware_gp;
 
@@ -3543,7 +3555,8 @@  void do_general_protection(struct cpu_user_regs *regs)
     }
 
  hardware_gp:
-    DEBUGGER_trap_fatal(TRAP_gp_fault, regs);
+    if ( debugger_trap_fatal(TRAP_gp_fault, regs) )
+        return;
 
     show_execution_state(regs);
     panic("GENERAL PROTECTION FAULT\n[error_code=%04x]", regs->error_code);
@@ -3778,8 +3791,6 @@  void do_debug(struct cpu_user_regs *regs)
 {
     struct vcpu *v = current;
 
-    DEBUGGER_trap_entry(TRAP_debug, regs);
-
     if ( !guest_mode(regs) )
     {
         if ( regs->eflags & X86_EFLAGS_TF )
@@ -3814,6 +3825,13 @@  void do_debug(struct cpu_user_regs *regs)
     /* Save debug status register where guest OS can peek at it */
     v->arch.debugreg[6] = read_debugreg(6);
 
+    if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached )
+    {
+        v->arch.gdbsx_vcpu_event = TRAP_debug;
+        domain_pause_for_debugger();
+        return;
+    }
+
     ler_enable();
     do_guest_trap(TRAP_debug, regs, 0);
     return;
diff --git a/xen/include/asm-x86/debugger.h b/xen/include/asm-x86/debugger.h
index 271cbaa..0f95fe9 100644
--- a/xen/include/asm-x86/debugger.h
+++ b/xen/include/asm-x86/debugger.h
@@ -33,17 +33,11 @@ 
 #include <asm/regs.h>
 #include <asm/processor.h>
 
-/* The main trap handlers use these helper macros which include early bail. */
-#define DEBUGGER_trap_entry(_v, _r) \
-    if ( debugger_trap_entry(_v, _r) ) return;
-#define DEBUGGER_trap_fatal(_v, _r) \
-    if ( debugger_trap_fatal(_v, _r) ) return;
-
 #ifdef CONFIG_CRASH_DEBUG
 
 #include <xen/gdbstub.h>
 
-static inline int debugger_trap_fatal(
+static inline bool debugger_trap_fatal(
     unsigned int vector, struct cpu_user_regs *regs)
 {
     int rc = __trap_to_gdb(regs, vector);
@@ -55,33 +49,16 @@  static inline int debugger_trap_fatal(
 
 #else
 
-static inline int debugger_trap_fatal(
+static inline bool debugger_trap_fatal(
     unsigned int vector, struct cpu_user_regs *regs)
 {
-    return 0;
+    return false;
 }
 
 #define debugger_trap_immediate() ((void)0)
 
 #endif
 
-static inline int debugger_trap_entry(
-    unsigned int vector, struct cpu_user_regs *regs)
-{
-    struct vcpu *v = current;
-
-    if ( guest_kernel_mode(v, regs) && v->domain->debugger_attached &&
-         ((vector == TRAP_int3) || (vector == TRAP_debug)) )
-    {
-        if ( vector != TRAP_debug ) /* domain pause is good enough */
-            current->arch.gdbsx_vcpu_event = vector;
-        domain_pause_for_debugger();
-        return 1;
-    }
-
-    return 0;
-}
-
 unsigned int dbg_rw_mem(void * __user addr, void * __user buf,
                         unsigned int len, domid_t domid, bool_t toaddr,
                         uint64_t pgd3);