diff mbox series

[v3,1/3] xen/x86: add nmi continuation framework

Message ID 20201016085350.10233-2-jgross@suse.com (mailing list archive)
State Superseded
Headers show
Series xen/x86: implement NMI continuation | expand

Commit Message

Jürgen Groß Oct. 16, 2020, 8:53 a.m. UTC
Actions in NMI context are rather limited as e.g. locking is rather
fragile.

Add a generic framework to continue processing in normal interrupt
context after leaving NMI processing.

This is done by a high priority interrupt vector triggered via a
self IPI from NMI context, which will then call the continuation
function specified during NMI handling.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- add prototype for continuation function (Roger Pau Monné)
- switch from softirq to explicit self-IPI (Jan Beulich)
---
 xen/arch/x86/apic.c       | 13 +++++++---
 xen/arch/x86/traps.c      | 52 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/nmi.h | 13 +++++++++-
 3 files changed, 74 insertions(+), 4 deletions(-)

Comments

Jan Beulich Oct. 20, 2020, 1:33 p.m. UTC | #1
On 16.10.2020 10:53, Juergen Gross wrote:
> Actions in NMI context are rather limited as e.g. locking is rather
> fragile.
> 
> Add a generic framework to continue processing in normal interrupt
> context after leaving NMI processing.
> 
> This is done by a high priority interrupt vector triggered via a
> self IPI from NMI context, which will then call the continuation
> function specified during NMI handling.

I'm concerned by there being just a single handler allowed, when
the series already introduces two uses. A single NMI instance
may signal multiple things in one go. At the very least we then
need a priority, such that SERR could override oprofile.

> @@ -1799,6 +1800,57 @@ void unset_nmi_callback(void)
>      nmi_callback = dummy_nmi_callback;
>  }
>  
> +static DEFINE_PER_CPU(nmi_contfunc_t *, nmi_cont_func);
> +static DEFINE_PER_CPU(void *, nmi_cont_arg);
> +static DEFINE_PER_CPU(bool, nmi_cont_busy);
> +
> +bool nmi_check_continuation(void)
> +{
> +    unsigned int cpu = smp_processor_id();
> +    nmi_contfunc_t *func = per_cpu(nmi_cont_func, cpu);
> +    void *arg = per_cpu(nmi_cont_arg, cpu);
> +
> +    if ( per_cpu(nmi_cont_busy, cpu) )
> +    {
> +        per_cpu(nmi_cont_busy, cpu) = false;
> +        printk("Trying to set NMI continuation while still one active!\n");

I guess the bool would better be a const void *, written
with __builtin_return_address(0) and logged accordingly here
(also including the CPU number).

Also (nit) maybe "... while one still active"?

> +    }
> +
> +    /* Reads must be done before following write (local cpu ordering only). */
> +    barrier();
> +
> +    per_cpu(nmi_cont_func, cpu) = NULL;

I think this still is too lax, and you really need to use
xchg() to make sure you won't lose any case of "busy" (which
may become set between the if() above and the write here).

Jan
Jürgen Groß Nov. 5, 2020, 10:22 a.m. UTC | #2
On 20.10.20 15:33, Jan Beulich wrote:
> On 16.10.2020 10:53, Juergen Gross wrote:
>> Actions in NMI context are rather limited as e.g. locking is rather
>> fragile.
>>
>> Add a generic framework to continue processing in normal interrupt
>> context after leaving NMI processing.
>>
>> This is done by a high priority interrupt vector triggered via a
>> self IPI from NMI context, which will then call the continuation
>> function specified during NMI handling.
> 
> I'm concerned by there being just a single handler allowed, when
> the series already introduces two uses. A single NMI instance
> may signal multiple things in one go. At the very least we then
> need a priority, such that SERR could override oprofile.

A different approach could be not to introduce a generic interface,
but to explicitly call the continuation handlers in the interrupt
handler.

Instead of a function pointer, a parameter pointer and a busy
indicator (probably another function pointer) per cpu, we'd need for
now only a parameter value per cpu (for the oprofile case) and a
global flag (for the SERR case).

The downside would be having to add additional fields for other
use cases, but for now I think this could be the better way,
especially as this would remove the theoretical case of multiple
issues overwriting one another.

Thoughts?


Juergen
Jan Beulich Nov. 5, 2020, 11:25 a.m. UTC | #3
On 05.11.2020 11:22, Jürgen Groß wrote:
> On 20.10.20 15:33, Jan Beulich wrote:
>> On 16.10.2020 10:53, Juergen Gross wrote:
>>> Actions in NMI context are rather limited as e.g. locking is rather
>>> fragile.
>>>
>>> Add a generic framework to continue processing in normal interrupt
>>> context after leaving NMI processing.
>>>
>>> This is done by a high priority interrupt vector triggered via a
>>> self IPI from NMI context, which will then call the continuation
>>> function specified during NMI handling.
>>
>> I'm concerned by there being just a single handler allowed, when
>> the series already introduces two uses. A single NMI instance
>> may signal multiple things in one go. At the very least we then
>> need a priority, such that SERR could override oprofile.
> 
> A different approach could be not to introduce a generic interface,
> but to explicitly call the continuation handlers in the interrupt
> handler.
> 
> Instead of a function pointer, a parameter pointer and a busy
> indicator (probably another function pointer) per cpu, we'd need for
> now only a parameter value per cpu (for the oprofile case) and a
> global flag (for the SERR case).
> 
> The downside would be having to add additional fields for other
> use cases, but for now I think this could be the better way,
> especially as this would remove the theoretical case of multiple
> issues overwriting one another.

Yes, perhaps less abstraction is the better approach here, for now
at least. Perhaps give Andrew and Roger a little bit of time to
object before going down that route.

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index 60627fd6e6..7497ddb5da 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -40,6 +40,7 @@ 
 #include <irq_vectors.h>
 #include <xen/kexec.h>
 #include <asm/guest.h>
+#include <asm/nmi.h>
 #include <asm/time.h>
 
 static bool __read_mostly tdt_enabled;
@@ -1376,16 +1377,22 @@  void spurious_interrupt(struct cpu_user_regs *regs)
 {
     /*
      * Check if this is a vectored interrupt (most likely, as this is probably
-     * a request to dump local CPU state). Vectored interrupts are ACKed;
-     * spurious interrupts are not.
+     * a request to dump local CPU state or to continue NMI handling).
+     * Vectored interrupts are ACKed; spurious interrupts are not.
      */
     if (apic_isr_read(SPURIOUS_APIC_VECTOR)) {
+        bool is_spurious;
+
         ack_APIC_irq();
+        is_spurious = !nmi_check_continuation();
         if (this_cpu(state_dump_pending)) {
             this_cpu(state_dump_pending) = false;
             dump_execstate(regs);
-            return;
+            is_spurious = false;
         }
+
+        if ( !is_spurious )
+            return;
     }
 
     /* see sw-dev-man vol 3, chapter 7.4.13.5 */
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index bc5b8f8ea3..6f4db9d549 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -79,6 +79,7 @@ 
 #include <public/hvm/params.h>
 #include <asm/cpuid.h>
 #include <xsm/xsm.h>
+#include <asm/mach-default/irq_vectors.h>
 #include <asm/pv/traps.h>
 #include <asm/pv/mm.h>
 
@@ -1799,6 +1800,57 @@  void unset_nmi_callback(void)
     nmi_callback = dummy_nmi_callback;
 }
 
+static DEFINE_PER_CPU(nmi_contfunc_t *, nmi_cont_func);
+static DEFINE_PER_CPU(void *, nmi_cont_arg);
+static DEFINE_PER_CPU(bool, nmi_cont_busy);
+
+bool nmi_check_continuation(void)
+{
+    unsigned int cpu = smp_processor_id();
+    nmi_contfunc_t *func = per_cpu(nmi_cont_func, cpu);
+    void *arg = per_cpu(nmi_cont_arg, cpu);
+
+    if ( per_cpu(nmi_cont_busy, cpu) )
+    {
+        per_cpu(nmi_cont_busy, cpu) = false;
+        printk("Trying to set NMI continuation while still one active!\n");
+    }
+
+    /* Reads must be done before following write (local cpu ordering only). */
+    barrier();
+
+    per_cpu(nmi_cont_func, cpu) = NULL;
+
+    if ( func )
+        func(arg);
+
+    return func;
+}
+
+int set_nmi_continuation(nmi_contfunc_t *func, void *arg)
+{
+    unsigned int cpu = smp_processor_id();
+
+    if ( per_cpu(nmi_cont_func, cpu) )
+    {
+        per_cpu(nmi_cont_busy, cpu) = true;
+        return -EBUSY;
+    }
+
+    per_cpu(nmi_cont_func, cpu) = func;
+    per_cpu(nmi_cont_arg, cpu) = arg;
+
+    /*
+     * Issue a self-IPI. Handling is done in spurious_interrupt().
+     * NMI could have happened in IPI sequence, so wait for ICR being idle
+     * again before leaving NMI handler.
+     */
+    send_IPI_self(SPURIOUS_APIC_VECTOR);
+    apic_wait_icr_idle();
+
+    return 0;
+}
+
 void do_device_not_available(struct cpu_user_regs *regs)
 {
 #ifdef CONFIG_PV
diff --git a/xen/include/asm-x86/nmi.h b/xen/include/asm-x86/nmi.h
index a288f02a50..68db75b1ed 100644
--- a/xen/include/asm-x86/nmi.h
+++ b/xen/include/asm-x86/nmi.h
@@ -33,5 +33,16 @@  nmi_callback_t *set_nmi_callback(nmi_callback_t *callback);
 void unset_nmi_callback(void);
 
 DECLARE_PER_CPU(unsigned int, nmi_count);
- 
+
+typedef void nmi_contfunc_t(void *arg);
+
+/**
+ * set_nmi_continuation
+ *
+ * Schedule a function to be started in interrupt context after NMI handling.
+ */
+int set_nmi_continuation(nmi_contfunc_t *func, void *arg);
+
+/* Check for NMI continuation pending. */
+bool nmi_check_continuation(void);
 #endif /* ASM_NMI_H */