[v2,4/6] x86: track when in #NMI context

Message ID 20200217184324.73762-5-roger.pau@citrix.com
State New
Series
  • x86: fixes/improvements for scratch cpumask

Commit Message

Roger Pau Monné Feb. 17, 2020, 6:43 p.m. UTC
Add helpers to track when running in #MC context. This is modeled
after the in_irq helpers, but does not support reentry.

Note that there are no users of in_mc() introduced by the change,
further users will be added by followup changes.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/traps.c          |  6 ++++++
 xen/include/asm-x86/hardirq.h | 18 +++++++++++++++++-
 2 files changed, 23 insertions(+), 1 deletion(-)

Comments

Andrew Cooper Feb. 18, 2020, 10:40 a.m. UTC | #1
On 17/02/2020 18:43, Roger Pau Monne wrote:
> Add helpers to track when running in #MC context. This is modeled
> after the in_irq helpers, but does not support reentry.
>
> Note that there are no users of in_mc() introduced by the change,
> further users will be added by followup changes.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

You probably mean s/mc/nmi/ throughout the commit message, but I'm
afraid these are rather problematic.

NMIs can be recursively entered, especially on corner cases in the crash
path.  Asserting that the crash path is not recursive can lead to never
entering the crash kernel.

~Andrew
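
[Editor's note: the reentrancy-tolerant alternative Andrew is hinting at is the existing in_irq()-style nesting counter rather than a boolean plus ASSERT. A minimal stand-alone sketch of that pattern follows; names are hypothetical and the counter is a plain global rather than the per-CPU field real code would use. Such a counter survives nested NMIs on the crash path, at the cost of only catching enter/exit imbalance rather than recursion.]

```c
#include <assert.h>

/* Hypothetical sketch: per-CPU in real code, a plain global here. */
static unsigned int nmi_nest_count;

static inline void nmi_enter(void)
{
    /* Nesting is tolerated: just bump the depth. */
    nmi_nest_count++;
}

static inline void nmi_exit(void)
{
    /* Only an unbalanced exit is a bug. */
    assert(nmi_nest_count > 0);
    nmi_nest_count--;
}

static inline int in_nmi(void)
{
    return nmi_nest_count != 0;
}
```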
Roger Pau Monné Feb. 18, 2020, 10:59 a.m. UTC | #2
On Tue, Feb 18, 2020 at 10:40:02AM +0000, Andrew Cooper wrote:
> On 17/02/2020 18:43, Roger Pau Monne wrote:
> > Add helpers to track when running in #MC context. This is modeled
> > after the in_irq helpers, but does not support reentry.
> >
> > Note that there are no users of in_mc() introduced by the change,
> > further users will be added by followup changes.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> You probably mean s/mc/nmi/ throughout the commit message, but I'm
> afraid these are rather problematic.

Er, yes, sorry, c&p from the previous commit and I failed to adjust
it.

> 
> NMIs can be recursively entered, especially on corner cases in the crash
> path.  Asserting that the crash path is not recursive can lead to never
> entering the crash kernel.

Is this specific to how Xen handles #NMI?

Intel SDM states that #NMI is not reentrant, as further #NMIs are
blocked until the execution of the iret instruction:

"While an NMI interrupt handler is executing, the processor blocks
delivery of subsequent NMIs until the next execution of the IRET
instruction. This blocking of NMIs prevents nested execution of the
NMI handler. It is recommended that the NMI interrupt handler be
accessed through an interrupt gate to disable maskable hardware
interrupts (see Section 6.8.1, "Masking Maskable Hardware
Interrupts")."

AFAICT there's no iret in do_nmi until it has finished execution.

Roger.
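
[Editor's note: the semantics of the helpers in the posted patch can be modeled in a few lines of plain C; the assertion in nmi_enter() is precisely what would fire in the reentry scenario Andrew describes, since the hardware blocking Roger quotes only holds until the first IRET executed on that CPU. A sketch with hypothetical names, not Xen code:]

```c
#include <assert.h>
#include <stdbool.h>

/* Stands in for the per-CPU in_nmi field added by the patch. */
static bool in_nmi_flag;

static void nmi_enter(void)
{
    /* Fires if a second NMI is taken before nmi_exit(), e.g. after a
     * nested fault's IRET has reopened the NMI window. */
    assert(!in_nmi_flag);
    in_nmi_flag = true;
}

static void nmi_exit(void)
{
    assert(in_nmi_flag);
    in_nmi_flag = false;
}
```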

Patch

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 3dbc66bb64..f4f2c13ae9 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1692,9 +1692,13 @@  void do_nmi(const struct cpu_user_regs *regs)
     bool handle_unknown = false;
 
     this_cpu(nmi_count)++;
+    nmi_enter();
 
     if ( nmi_callback(regs, cpu) )
+    {
+        nmi_exit();
         return;
+    }
 
     /*
      * Accessing port 0x61 may trap to SMM which has been actually
@@ -1720,6 +1724,8 @@  void do_nmi(const struct cpu_user_regs *regs)
         if ( !(reason & 0xc0) && handle_unknown )
             unknown_nmi_error(regs, reason);
     }
+
+    nmi_exit();
 }
 
 nmi_callback_t *set_nmi_callback(nmi_callback_t *callback)
diff --git a/xen/include/asm-x86/hardirq.h b/xen/include/asm-x86/hardirq.h
index af3eab6a4d..8bcae99eac 100644
--- a/xen/include/asm-x86/hardirq.h
+++ b/xen/include/asm-x86/hardirq.h
@@ -2,12 +2,14 @@ 
 #define __ASM_HARDIRQ_H
 
 #include <xen/cache.h>
+#include <xen/lib.h>
+#include <xen/smp.h>
 #include <xen/types.h>
 
 typedef struct {
 	unsigned int __softirq_pending;
 	unsigned int __local_irq_count;
-	unsigned int __nmi_count;
+	bool in_nmi;
 	unsigned int mc_count;
 	bool_t __mwait_wakeup;
 } __cacheline_aligned irq_cpustat_t;
@@ -23,6 +25,20 @@  typedef struct {
 #define mc_enter()	(mc_count(smp_processor_id())++)
 #define mc_exit()	(mc_count(smp_processor_id())--)
 
+#define in_nmi()	__IRQ_STAT(smp_processor_id(), in_nmi)
+
+static inline void nmi_enter(void)
+{
+    ASSERT(!in_nmi());
+    in_nmi() = true;
+}
+
+static inline void nmi_exit(void)
+{
+    ASSERT(in_nmi());
+    in_nmi() = false;
+}
+
 void ack_bad_irq(unsigned int irq);
 
 extern void apic_intr_init(void);