diff mbox series

[RFC,05/13] x86/irq: Reserve a user IPI notification vector

Message ID 20210913200132.3396598-6-sohil.mehta@intel.com (mailing list archive)
State New
Headers show
Series x86 User Interrupts support | expand

Commit Message

Sohil Mehta Sept. 13, 2021, 8:01 p.m. UTC
A user interrupt notification vector is used on the receiver's cpu to
identify an interrupt as a user interrupt (and not a kernel interrupt).
Hardware uses the same notification vector to generate an IPI from a
sender's cpu core when the SENDUIPI instruction is executed.

Typically, the kernel shouldn't receive an interrupt with this vector.
However, it is possible that the kernel might receive this vector.

Scenario that can cause the spurious interrupt:

Step	cpu 0 (receiver task)		cpu 1 (sender task)
----	---------------------		-------------------
1	task is running
2					executes SENDUIPI
3					IPI sent
4	context switched out
5	IPI delivered
	(kernel interrupt detected)

A kernel interrupt can be detected, if a receiver task gets scheduled
out after the SENDUIPI-based IPI was sent but before the IPI was
delivered.

The kernel doesn't need to do anything in this case other than receiving
the interrupt and clearing the local APIC. The user interrupt is always
stored in the receiver's UPID before the IPI is generated. When the
receiver gets scheduled back the interrupt would be delivered based on
its UPID.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
---
 arch/x86/include/asm/hardirq.h     |  3 +++
 arch/x86/include/asm/idtentry.h    |  4 ++++
 arch/x86/include/asm/irq_vectors.h |  5 ++++-
 arch/x86/kernel/idt.c              |  3 +++
 arch/x86/kernel/irq.c              | 33 ++++++++++++++++++++++++++++++
 5 files changed, 47 insertions(+), 1 deletion(-)

Comments

Thomas Gleixner Sept. 23, 2021, 11:07 p.m. UTC | #1
On Mon, Sep 13 2021 at 13:01, Sohil Mehta wrote:
> A user interrupt notification vector is used on the receiver's cpu to
> identify an interrupt as a user interrupt (and not a kernel interrupt).
> Hardware uses the same notification vector to generate an IPI from a
> sender's cpu core when the SENDUIPI instruction is executed.
>
> Typically, the kernel shouldn't receive an interrupt with this vector.
> However, it is possible that the kernel might receive this vector.
>
> Scenario that can cause the spurious interrupt:
>
> Step	cpu 0 (receiver task)		cpu 1 (sender task)
> ----	---------------------		-------------------
> 1	task is running
> 2					executes SENDUIPI
> 3					IPI sent
> 4	context switched out
> 5	IPI delivered
> 	(kernel interrupt detected)
>
> A kernel interrupt can be detected, if a receiver task gets scheduled
> out after the SENDUIPI-based IPI was sent but before the IPI was
> delivered.

What happens if the SENDUIPI is issued when the target task is not on
the CPU? How is that any different from the above?

> The kernel doesn't need to do anything in this case other than receiving
> the interrupt and clearing the local APIC. The user interrupt is always
> stored in the receiver's UPID before the IPI is generated. When the
> receiver gets scheduled back the interrupt would be delivered based on
> its UPID.

So why on earth is that vector reaching the CPU at all?

> +#ifdef CONFIG_X86_USER_INTERRUPTS
> +	seq_printf(p, "%*s: ", prec, "UIS");

No point in printing that when user interrupts are not available/enabled
on the system.

> +	for_each_online_cpu(j)
> +		seq_printf(p, "%10u ", irq_stats(j)->uintr_spurious_count);
> +	seq_puts(p, "  User-interrupt spurious event\n");
>  #endif
>  	return 0;
>  }
> @@ -325,6 +331,33 @@ DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_kvm_posted_intr_nested_ipi)
>  }
>  #endif
>  
> +#ifdef CONFIG_X86_USER_INTERRUPTS
> +/*
> + * Handler for UINTR_NOTIFICATION_VECTOR.
> + *
> + * The notification vector is used by the cpu to detect a User Interrupt. In
> + * the typical usage, the cpu would handle this interrupt and clear the local
> + * apic.
> + *
> + * However, it is possible that the kernel might receive this vector. This can
> + * happen if the receiver thread was running when the interrupt was sent but it
> + * got scheduled out before the interrupt was delivered. The kernel doesn't
> + * need to do anything other than clearing the local APIC. A pending user
> + * interrupt is always saved in the receiver's UPID which can be referenced
> + * when the receiver gets scheduled back.
> + *
> + * If the kernel receives a storm of these, it could mean an issue with the
> + * kernel's saving and restoring of the User Interrupt MSR state; Specifically,
> + * the notification vector bits in the IA32_UINTR_MISC_MSR.

Definitely well thought out hardware that.

Thanks,

        tglx
Thomas Gleixner Sept. 25, 2021, 1:30 p.m. UTC | #2
On Fri, Sep 24 2021 at 01:07, Thomas Gleixner wrote:
> On Mon, Sep 13 2021 at 13:01, Sohil Mehta wrote:
>> The kernel doesn't need to do anything in this case other than receiving
>> the interrupt and clearing the local APIC. The user interrupt is always
>> stored in the receiver's UPID before the IPI is generated. When the
>> receiver gets scheduled back the interrupt would be delivered based on
>> its UPID.
>
> So why on earth is that vector reaching the CPU at all?

Let's see how this works:

  task starts using UINTR.
    set UINTR_NOTIFACTION_VECTOR in MSR_IA32_UINTR_MISC
    
So from that point on the User-Interrupt Notification Identification
mechanism swallows the vector.

Where this stops working is not limited to context switch. The wreckage
comes from XSAVES:

 "After saving the user-interrupt state component, XSAVES clears
  UINV. (UINV is IA32_UINTR_MISC[39:32]; XSAVES does not modify the
  remainder of that MSR.)"

So the problem is _not_ context switch. The problem is XSAVES and that
can be issued even without a context switch.

The obvious question is: What is the value of clearing UINV?

Absolutely none. That notification vector cannot be used for anything
else, so why would the OS be interested to see it ever? This is about
user space interupts, right?

UINV should be set _ONCE_ when CR4.UINTR is enabled and not be touched
by XSAVES/XRSTORS at all. Any delivery of this vector to the OS should
be considered a hardware bug.

Thanks,

         tglx
Thomas Gleixner Sept. 26, 2021, 12:39 p.m. UTC | #3
On Sat, Sep 25 2021 at 15:30, Thomas Gleixner wrote:
> On Fri, Sep 24 2021 at 01:07, Thomas Gleixner wrote:
> The obvious question is: What is the value of clearing UINV?
>
> Absolutely none. That notification vector cannot be used for anything
> else, so why would the OS be interested to see it ever? This is about
> user space interupts, right?
>
> UINV should be set _ONCE_ when CR4.UINTR is enabled and not be touched
> by XSAVES/XRSTORS at all. Any delivery of this vector to the OS should
> be considered a hardware bug.

After decoding the documentation (sigh) and staring at the implications of
keeping UINV armed, I can see the point vs. the UPID lifetime issue when
a task gets scheduled out and migrated to a different CPU.

Not the most pretty solution, but as there needs to be some invalidation
which needs to be undone on return to user space it probably does not
matter much. 

As the whole thing is tightly coupled to XSAVES/RSTORS we need to
integrate it into that machinery and not pretend that it's something
half independent.

That means we have to handle the setting of the SN bit in UPID whenever
XSTATE is saved either during context switch, when the kernel uses the
FPU or in other places (signals, fpu_clone ...). They all end up in
save_fpregs_to_fpstate() so that might be the place to look at.

While talking about that: fpu_clone() has to invalidate the UINTR state
in the clone's xstate after the memcpy() or xsaves() operation.

Also the restore portion on the way back to user space has to be coupled
more tightly:

arch_exit_to_user_mode_prepare()
{
        ...
        if (unlikely(ti_work & _TIF_UPID))
        	uintr_restore_upid(ti_work & _TIF_NEED_FPU_LOAD);
        if (unlikely(ti_work & _TIF_NEED_FPU_LOAD))
        	switch_fpu_return();
}

upid_set_ndst(upid)
{
	apicid = __this_cpu_read(x86_cpu_to_apicid);

        if (x2apic_enabled())
            upid->ndst.x2apic = apicid;
        else
            upid->ndst.apic = apicid;
}

uintr_restore_upid(bool xrstors_pending)
{
        clear_thread_flag(TIF_UPID);
        
	// Update destination
        upid_set_ndst(upid);

        // Do we need something stronger here?
        barrier();

        clear_bit(SN, upid->status);

        // Any SENDUIPI after this point sends to this CPU
           
        // Any bit which was set in upid->pir after SN was set
        // and/or UINV was cleared by XSAVES up to the point
        // where SN was cleared above is not reflected in UIRR.

	// As this runs with interrupts disabled the current state
        // of upid->pir can be read and used for restore. A SENDUIPI
        // which sets a bit in upid->pir after that read will send
        // the notification vector which is going to be handled once
        // the task reenables interrupts on return to user space.
        // If the SENDUIPI set the bit before the read then the
        // notification vector handling will just observe the same
        // PIR state.

        // Needs to be a locked access as there might be a
        // concurrent SENDUIPI modiying it.
        pir = read_locked(upid->pir);

        if (xrstors_pending)) {
        	// Update the saved xstate for xrstors
           	current->xstate.uintr.uinv = UINTR_NOTIFICATION_VECTOR;
                current->xstate.uintr.uirr = pir;
        } else {
                // Manually restore UIRR and UINV
                wrmsrl(IA32_UINTR_RR, pir);

	        misc.val64 = 0;
                misc.uittsz = current->uintr->uittsz;
                misc.uinv = UINTR_NOTIFICATION_VECTOR;
                wrmsrl(IA32_UINTR_MISC, misc.val64);
        }
}

That's how I deciphered the documentation and I don't think this is far
from reality, but I might be wrong as usual.

Hmm?

Thanks,

        tglx
Sohil Mehta Sept. 27, 2021, 7:07 p.m. UTC | #4
On 9/26/2021 5:39 AM, Thomas Gleixner wrote:
> On Sat, Sep 25 2021 at 15:30, Thomas Gleixner wrote:
>> On Fri, Sep 24 2021 at 01:07, Thomas Gleixner wrote:
>> The obvious question is: What is the value of clearing UINV?
>>
>> Absolutely none. That notification vector cannot be used for anything
>> else, so why would the OS be interested to see it ever? This is about
>> user space interupts, right?
>>
>> UINV should be set _ONCE_ when CR4.UINTR is enabled and not be touched
>> by XSAVES/XRSTORS at all. Any delivery of this vector to the OS should
>> be considered a hardware bug.
> After decoding the documentation (sigh) and staring at the implications of
> keeping UINV armed, I can see the point vs. the UPID lifetime issue when
> a task gets scheduled out and migrated to a different CPU.


I think you got it right. Here is my understanding of this.

The User-interrupt notification processing moves all the pending 
interrupts from UPID.PIR to the UIRR.

As you mentioned below, XSTATE is saved due to several reasons which 
saves the UIRR into memory. UIRR should no longer be updated after it 
has been saved.

XSAVES clears UINV is to stop detecting additional interrupts in the 
UIRR after it has been saved.


> Not the most pretty solution, but as there needs to be some invalidation
> which needs to be undone on return to user space it probably does not
> matter much.
>
> As the whole thing is tightly coupled to XSAVES/RSTORS we need to
> integrate it into that machinery and not pretend that it's something
> half independent.


I agree. Thank you for pointing this out.

> That means we have to handle the setting of the SN bit in UPID whenever
> XSTATE is saved either during context switch, when the kernel uses the
> FPU or in other places (signals, fpu_clone ...). They all end up in
> save_fpregs_to_fpstate() so that might be the place to look at.

  Yes. The current code doesn't do this. SN bit should be set whenever 
UINTR XSTATE is saved.

> While talking about that: fpu_clone() has to invalidate the UINTR state
> in the clone's xstate after the memcpy() or xsaves() operation.
>
> Also the restore portion on the way back to user space has to be coupled
> more tightly:
>
> arch_exit_to_user_mode_prepare()
> {
>          ...
>          if (unlikely(ti_work & _TIF_UPID))
>          	uintr_restore_upid(ti_work & _TIF_NEED_FPU_LOAD);
>          if (unlikely(ti_work & _TIF_NEED_FPU_LOAD))
>          	switch_fpu_return();
> }

I am assuming _TIF_UPID would be set everytime SN is set and XSTATE is 
saved.

> upid_set_ndst(upid)
> {
> 	apicid = __this_cpu_read(x86_cpu_to_apicid);
>
>          if (x2apic_enabled())
>              upid->ndst.x2apic = apicid;
>          else
>              upid->ndst.apic = apicid;
> }
>
> uintr_restore_upid(bool xrstors_pending)
> {
>          clear_thread_flag(TIF_UPID);
>          
> 	// Update destination
>          upid_set_ndst(upid);
>
>          // Do we need something stronger here?
>          barrier();
>
>          clear_bit(SN, upid->status);
>
>          // Any SENDUIPI after this point sends to this CPU
>             
>          // Any bit which was set in upid->pir after SN was set
>          // and/or UINV was cleared by XSAVES up to the point
>          // where SN was cleared above is not reflected in UIRR.
>
> 	// As this runs with interrupts disabled the current state
>          // of upid->pir can be read and used for restore. A SENDUIPI
>          // which sets a bit in upid->pir after that read will send
>          // the notification vector which is going to be handled once
>          // the task reenables interrupts on return to user space.
>          // If the SENDUIPI set the bit before the read then the
>          // notification vector handling will just observe the same
>          // PIR state.
>
>          // Needs to be a locked access as there might be a
>          // concurrent SENDUIPI modiying it.
>          pir = read_locked(upid->pir);
>
>          if (xrstors_pending)) {
>          	// Update the saved xstate for xrstors
>             	current->xstate.uintr.uinv = UINTR_NOTIFICATION_VECTOR;

XSAVES saves the UINV value into the XSTATE buffer. I am not sure if we 
need this again. Is it because it could have been overwritten by calling 
XSAVES twice?


>                  current->xstate.uintr.uirr = pir;

I believe PIR should be ORed. There could be some bits already set in 
the UIRR.

Also, shouldn't UPID->PIR be cleared? If not, we would detect these 
interrupts all over again during the next ring transition.

>          } else {
>                  // Manually restore UIRR and UINV
>                  wrmsrl(IA32_UINTR_RR, pir);
I believe read-modify-write here as well.
> 	        misc.val64 = 0;
>                  misc.uittsz = current->uintr->uittsz;
>                  misc.uinv = UINTR_NOTIFICATION_VECTOR;
>                  wrmsrl(IA32_UINTR_MISC, misc.val64);

Thanks! This helps reduce the additional MSR read.

>          }
> }
>
> That's how I deciphered the documentation and I don't think this is far
> from reality, but I might be wrong as usual.
>
> Hmm?

Thank you for the simplification. This is very helpful.

Sohil
Sohil Mehta Sept. 27, 2021, 7:26 p.m. UTC | #5
On 9/23/2021 4:07 PM, Thomas Gleixner wrote:
> On Mon, Sep 13 2021 at 13:01, Sohil Mehta wrote:
>> A user interrupt notification vector is used on the receiver's cpu to
>> identify an interrupt as a user interrupt (and not a kernel interrupt).
>> Hardware uses the same notification vector to generate an IPI from a
>> sender's cpu core when the SENDUIPI instruction is executed.
>>
>> Typically, the kernel shouldn't receive an interrupt with this vector.
>> However, it is possible that the kernel might receive this vector.
>>
>> Scenario that can cause the spurious interrupt:
>>
>> Step	cpu 0 (receiver task)		cpu 1 (sender task)
>> ----	---------------------		-------------------
>> 1	task is running
>> 2					executes SENDUIPI
>> 3					IPI sent
>> 4	context switched out
>> 5	IPI delivered
>> 	(kernel interrupt detected)
>>
>> A kernel interrupt can be detected, if a receiver task gets scheduled
>> out after the SENDUIPI-based IPI was sent but before the IPI was
>> delivered.
> What happens if the SENDUIPI is issued when the target task is not on
> the CPU? How is that any different from the above?


This didn't get covered in the other thread. Thought, I would clarify 
this a bit more.

A notification IPI is sent from the CPU that executes SENDUIPI if the 
target task is running (SN is 0).

If the target task is not running SN bit in the UPID is set, which 
prevents any notification interrupts from being generated.

However, it is possible that SN is 0 when SENDUIPI was executed which 
generates the notification IPI. But when the IPI arrives on receiver 
CPU, SN has been set, the task state has been saved and UINV has been 
cleared.

A kernel interrupt is detected in this case. I have a sample that demos 
this. I'll fix the current code and then send out the results.


>> The kernel doesn't need to do anything in this case other than receiving
>> the interrupt and clearing the local APIC. The user interrupt is always
>> stored in the receiver's UPID before the IPI is generated. When the
>> receiver gets scheduled back the interrupt would be delivered based on
>> its UPID.
> So why on earth is that vector reaching the CPU at all?

You covered this in the other thread.

>> +#ifdef CONFIG_X86_USER_INTERRUPTS
>> +	seq_printf(p, "%*s: ", prec, "UIS");
> No point in printing that when user interrupts are not available/enabled
> on the system.
>
Will fix this.

Thanks,

Sohil
Thomas Gleixner Sept. 28, 2021, 8:11 a.m. UTC | #6
Sohil,

On Mon, Sep 27 2021 at 12:07, Sohil Mehta wrote:
> On 9/26/2021 5:39 AM, Thomas Gleixner wrote:
>
> The User-interrupt notification processing moves all the pending 
> interrupts from UPID.PIR to the UIRR.

Indeed that makes sense. Should have thought about that myself.

>> Also the restore portion on the way back to user space has to be coupled
>> more tightly:
>>
>> arch_exit_to_user_mode_prepare()
>> {
>>          ...
>>          if (unlikely(ti_work & _TIF_UPID))
>>          	uintr_restore_upid(ti_work & _TIF_NEED_FPU_LOAD);
>>          if (unlikely(ti_work & _TIF_NEED_FPU_LOAD))
>>          	switch_fpu_return();
>> }
>
> I am assuming _TIF_UPID would be set everytime SN is set and XSTATE is 
> saved.

Yes.

>> upid_set_ndst(upid)
>> {
>> 	apicid = __this_cpu_read(x86_cpu_to_apicid);
>>
>>          if (x2apic_enabled())
>>              upid->ndst.x2apic = apicid;
>>          else
>>              upid->ndst.apic = apicid;
>> }
>>
>> uintr_restore_upid(bool xrstors_pending)
>> {
>>          clear_thread_flag(TIF_UPID);
>>          
>> 	// Update destination
>>          upid_set_ndst(upid);
>>
>>          // Do we need something stronger here?
>>          barrier();
>>
>>          clear_bit(SN, upid->status);
>>
>>          // Any SENDUIPI after this point sends to this CPU
>>             
>>          // Any bit which was set in upid->pir after SN was set
>>          // and/or UINV was cleared by XSAVES up to the point
>>          // where SN was cleared above is not reflected in UIRR.
>>
>> 	// As this runs with interrupts disabled the current state
>>          // of upid->pir can be read and used for restore. A SENDUIPI
>>          // which sets a bit in upid->pir after that read will send
>>          // the notification vector which is going to be handled once
>>          // the task reenables interrupts on return to user space.
>>          // If the SENDUIPI set the bit before the read then the
>>          // notification vector handling will just observe the same
>>          // PIR state.
>>
>>          // Needs to be a locked access as there might be a
>>          // concurrent SENDUIPI modiying it.
>>          pir = read_locked(upid->pir);
>>
>>          if (xrstors_pending)) {
>>          	// Update the saved xstate for xrstors
>>             	current->xstate.uintr.uinv = UINTR_NOTIFICATION_VECTOR;
>
> XSAVES saves the UINV value into the XSTATE buffer. I am not sure if we 
> need this again. Is it because it could have been overwritten by calling 
> XSAVES twice?

Yes that can happen AFAICT. I haven't done a deep analysis, but this
needs to looked at.

>>                  current->xstate.uintr.uirr = pir;
>
> I believe PIR should be ORed. There could be some bits already set in 
> the UIRR.
>
> Also, shouldn't UPID->PIR be cleared? If not, we would detect these 
> interrupts all over again during the next ring transition.

Right. So that PIR read above needs to be a locked cmpxchg().

>>          } else {
>>                  // Manually restore UIRR and UINV
>>                  wrmsrl(IA32_UINTR_RR, pir);
> I believe read-modify-write here as well.

Sigh, yes.

Thanks,

        tglx
diff mbox series

Patch

diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 275e7fd20310..279afc01f1ac 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -19,6 +19,9 @@  typedef struct {
 	unsigned int kvm_posted_intr_ipis;
 	unsigned int kvm_posted_intr_wakeup_ipis;
 	unsigned int kvm_posted_intr_nested_ipis;
+#endif
+#ifdef CONFIG_X86_USER_INTERRUPTS
+	unsigned int uintr_spurious_count;
 #endif
 	unsigned int x86_platform_ipis;	/* arch dependent */
 	unsigned int apic_perf_irqs;
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 1345088e9902..5929a6f9eeee 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -671,6 +671,10 @@  DECLARE_IDTENTRY_SYSVEC(POSTED_INTR_WAKEUP_VECTOR,	sysvec_kvm_posted_intr_wakeup
 DECLARE_IDTENTRY_SYSVEC(POSTED_INTR_NESTED_VECTOR,	sysvec_kvm_posted_intr_nested_ipi);
 #endif
 
+#ifdef CONFIG_X86_USER_INTERRUPTS
+DECLARE_IDTENTRY_SYSVEC(UINTR_NOTIFICATION_VECTOR,	sysvec_uintr_spurious_interrupt);
+#endif
+
 #if IS_ENABLED(CONFIG_HYPERV)
 DECLARE_IDTENTRY_SYSVEC(HYPERVISOR_CALLBACK_VECTOR,	sysvec_hyperv_callback);
 DECLARE_IDTENTRY_SYSVEC(HYPERV_REENLIGHTENMENT_VECTOR,	sysvec_hyperv_reenlightenment);
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 43dcb9284208..d26faa504931 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -104,7 +104,10 @@ 
 #define HYPERV_STIMER0_VECTOR		0xed
 #endif
 
-#define LOCAL_TIMER_VECTOR		0xec
+/* Vector for User interrupt notifications */
+#define UINTR_NOTIFICATION_VECTOR       0xec
+
+#define LOCAL_TIMER_VECTOR		0xeb
 
 #define NR_VECTORS			 256
 
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index df0fa695bb09..d8c45e0728f0 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -147,6 +147,9 @@  static const __initconst struct idt_data apic_idts[] = {
 	INTG(POSTED_INTR_WAKEUP_VECTOR,		asm_sysvec_kvm_posted_intr_wakeup_ipi),
 	INTG(POSTED_INTR_NESTED_VECTOR,		asm_sysvec_kvm_posted_intr_nested_ipi),
 # endif
+#ifdef CONFIG_X86_USER_INTERRUPTS
+	INTG(UINTR_NOTIFICATION_VECTOR,		asm_sysvec_uintr_spurious_interrupt),
+#endif
 # ifdef CONFIG_IRQ_WORK
 	INTG(IRQ_WORK_VECTOR,			asm_sysvec_irq_work),
 # endif
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index e28f6a5d14f1..e3c35668c7c5 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -181,6 +181,12 @@  int arch_show_interrupts(struct seq_file *p, int prec)
 		seq_printf(p, "%10u ",
 			   irq_stats(j)->kvm_posted_intr_wakeup_ipis);
 	seq_puts(p, "  Posted-interrupt wakeup event\n");
+#endif
+#ifdef CONFIG_X86_USER_INTERRUPTS
+	seq_printf(p, "%*s: ", prec, "UIS");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ", irq_stats(j)->uintr_spurious_count);
+	seq_puts(p, "  User-interrupt spurious event\n");
 #endif
 	return 0;
 }
@@ -325,6 +331,33 @@  DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_kvm_posted_intr_nested_ipi)
 }
 #endif
 
+#ifdef CONFIG_X86_USER_INTERRUPTS
+/*
+ * Handler for UINTR_NOTIFICATION_VECTOR.
+ *
+ * The notification vector is used by the cpu to detect a User Interrupt. In
+ * the typical usage, the cpu would handle this interrupt and clear the local
+ * apic.
+ *
+ * However, it is possible that the kernel might receive this vector. This can
+ * happen if the receiver thread was running when the interrupt was sent but it
+ * got scheduled out before the interrupt was delivered. The kernel doesn't
+ * need to do anything other than clearing the local APIC. A pending user
+ * interrupt is always saved in the receiver's UPID which can be referenced
+ * when the receiver gets scheduled back.
+ *
+ * If the kernel receives a storm of these, it could mean an issue with the
+ * kernel's saving and restoring of the User Interrupt MSR state; Specifically,
+ * the notification vector bits in the IA32_UINTR_MISC_MSR.
+ */
+DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_uintr_spurious_interrupt)
+{
+	/* TODO: Add entry-exit tracepoints */
+	ack_APIC_irq();
+	inc_irq_stat(uintr_spurious_count);
+}
+#endif
+
 
 #ifdef CONFIG_HOTPLUG_CPU
 /* A cpu has been removed from cpu_online_mask.  Reset irq affinities. */