
[v2,10/13] x86/irq: Extend checks for pending vectors to posted interrupts

Message ID 20240405223110.1609888-11-jacob.jun.pan@linux.intel.com (mailing list archive)
State New, archived
Series Coalesced Interrupt Delivery with posted MSI

Commit Message

Jacob Pan April 5, 2024, 10:31 p.m. UTC
During an interrupt affinity change, it is possible for interrupts to be delivered
to the old CPU after the affinity has changed to the new one. To prevent lost
interrupts, the local APIC IRR is checked on the old CPU. Similar checks must be
done for posted MSIs, for the same reason.

Consider the following scenario:
	Device		system agent		iommu		memory 		CPU/LAPIC
1	FEEX_XXXX
2			Interrupt request
3						Fetch IRTE	->
4						->Atomic Swap PID.PIR(vec)
						Push to Global Observable(GO)
5						if (ON*)
	i						done;*
						else
6							send a notification ->

* ON: outstanding notification, 1 will suppress new notifications

If the affinity change happens between steps 3 and 5 in the IOMMU, the old CPU's posted
interrupt request (PIR) could have the pending bit set for the vector being moved.

Add a helper function to check an individual vector's status, then use the
helper to check for pending interrupts on the source CPU's PID.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>

---
v2: Fold in helper function patch.
---
 arch/x86/include/asm/apic.h        |  3 ++-
 arch/x86/include/asm/posted_intr.h | 18 ++++++++++++++++++
 2 files changed, 20 insertions(+), 1 deletion(-)
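
The notification-suppression behaviour in steps 4-6 of the table above can be
modelled in a few lines of C. This is an illustration only, not code from this
series; struct pid_model and post_vector() are made-up names, and the real
descriptor layout lives in arch/x86/include/asm/posted_intr.h.

#include <asm/posted_intr.h>	/* POSTED_INTR_ON */
#include <linux/bitops.h>
#include <linux/types.h>

/* Toy model of a posted-interrupt descriptor: 256 PIR bits plus the ON bit. */
struct pid_model {
	unsigned long pir[256 / BITS_PER_LONG];
	unsigned long control;			/* bit 0 == ON */
};

/* Returns true when a notification interrupt would actually be sent. */
static bool post_vector(struct pid_model *pid, unsigned int vec)
{
	set_bit(vec, pid->pir);			/* step 4: mark the vector pending */
	if (test_and_set_bit(POSTED_INTR_ON, &pid->control))
		return false;			/* step 5: ON already set, notification suppressed */
	return true;				/* step 6: send a notification */
}

When the notification is suppressed, the PIR bit set in step 4 stays behind
until the next notification is serviced; if the vector is being moved away from
this CPU in that window, only the new PID check will notice it.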

Comments

Tian, Kevin April 12, 2024, 9:25 a.m. UTC | #1
> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Saturday, April 6, 2024 6:31 AM
> 
> During interrupt affinity change, it is possible to have interrupts delivered
> to the old CPU after the affinity has changed to the new one. To prevent lost
> interrupts, local APIC IRR is checked on the old CPU. Similar checks must be
> done for posted MSIs given the same reason.
> 
> Consider the following scenario:
> 	Device		system agent		iommu		memory 		CPU/LAPIC
> 1	FEEX_XXXX
> 2			Interrupt request
> 3						Fetch IRTE	->
> 4						->Atomic Swap PID.PIR(vec)
> 						Push to Global Observable(GO)
> 5						if (ON*)
> 	i						done;*

there is a stray 'i'

> 						else
> 6							send a notification ->
> 
> * ON: outstanding notification, 1 will suppress new notifications
> 
> If the affinity change happens between 3 and 5 in IOMMU, the old CPU's
> posted
> interrupt request (PIR) could have pending bit set for the vector being moved.

how could affinity change be possible in 3/4 when the cache line is
locked by IOMMU? Strictly speaking it's about a change after 4 and
before 6.
Jacob Pan April 12, 2024, 6:23 p.m. UTC | #2
Hi Kevin,

On Fri, 12 Apr 2024 09:25:57 +0000, "Tian, Kevin" <kevin.tian@intel.com>
wrote:

> > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Sent: Saturday, April 6, 2024 6:31 AM
> > 
> > During interrupt affinity change, it is possible to have interrupts
> > delivered to the old CPU after the affinity has changed to the new one.
> > To prevent lost interrupts, local APIC IRR is checked on the old CPU.
> > Similar checks must be done for posted MSIs given the same reason.
> > 
> > Consider the following scenario:
> > 	Device		system agent		iommu		memory		CPU/LAPIC
> > 1	FEEX_XXXX
> > 2			Interrupt request
> > 3						Fetch IRTE	->
> > 4						->Atomic Swap PID.PIR(vec)
> > 						Push to Global Observable(GO)
> > 5						if (ON*)
> > 	i						done;*  
> 
> there is a stray 'i'
will fix, thanks

> 
> > 						else
> > 6							send a notification ->
> > 
> > * ON: outstanding notification, 1 will suppress new notifications
> > 
> > If the affinity change happens between 3 and 5 in IOMMU, the old CPU's
> > posted
> > interrupt request (PIR) could have pending bit set for the vector being
> > moved.  
> 
> how could affinity change be possible in 3/4 when the cache line is
> locked by IOMMU? Strictly speaking it's about a change after 4 and
> before 6.
SW can still perform an affinity change on the IRTE and flush the IR
cache after the IOMMU fetched it (step 3). They are async events.

In step 4, the atomic swap is on the PID cacheline, not IRTE.


Thanks,

Jacob
Tian, Kevin April 16, 2024, 3:47 a.m. UTC | #3
> From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Sent: Saturday, April 13, 2024 2:24 AM
> 
> Hi Kevin,
> 
> On Fri, 12 Apr 2024 09:25:57 +0000, "Tian, Kevin" <kevin.tian@intel.com>
> wrote:
> 
> > > From: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > > Sent: Saturday, April 6, 2024 6:31 AM
> > >
> > > During interrupt affinity change, it is possible to have interrupts
> > > delivered to the old CPU after the affinity has changed to the new one.
> > > To prevent lost interrupts, local APIC IRR is checked on the old CPU.
> > > Similar checks must be done for posted MSIs given the same reason.
> > >
> > > Consider the following scenario:
> > > 	Device		system agent		iommu		memory		CPU/LAPIC
> > > 1	FEEX_XXXX
> > > 2			Interrupt request
> > > 3						Fetch IRTE	->
> > > 4						->Atomic Swap PID.PIR(vec)
> > > 						Push to Global Observable(GO)
> > > 5						if (ON*)
> > > 	i						done;*
> >
> > there is a stray 'i'
> will fix, thanks
> 
> >
> > > 						else
> > > 6							send a notification ->
> > >
> > > * ON: outstanding notification, 1 will suppress new notifications
> > >
> > > If the affinity change happens between 3 and 5 in IOMMU, the old CPU's
> > > posted
> > > interrupt request (PIR) could have pending bit set for the vector being
> > > moved.
> >
> > how could affinity change be possible in 3/4 when the cache line is
> > locked by IOMMU? Strictly speaking it's about a change after 4 and
> > before 6.
> SW can still perform an affinity change on the IRTE and flush the IR
> cache after the IOMMU fetched it (step 3). They are async events.
> 
> In step 4, the atomic swap is on the PID cacheline, not IRTE.
> 

yeah, I mixed IRTE with PID.
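
Restating the exchange above as a sketch: the IRTE rewrite and the IR cache
flush run on a CPU, asynchronously to a request the IOMMU fetched in step 3,
and neither of them touches the PID cacheline that step 4 swaps. The helper
names below are hypothetical, used only to show the ordering; this is not the
driver's actual API.

/* Hypothetical helpers, named only for this sketch. */
void irte_retarget(unsigned int irte_index, unsigned int new_cpu);
void ir_cache_flush(unsigned int irte_index);

/*
 * Software side of an affinity change.  Nothing here serializes against a
 * request the IOMMU has already fetched (step 3), which can still atomically
 * set the old vector in the old CPU's PID (step 4) after the flush completes.
 */
static void set_affinity_sketch(unsigned int irte_index, unsigned int new_cpu)
{
	irte_retarget(irte_index, new_cpu);	/* rewrite the IRTE */
	ir_cache_flush(irte_index);		/* drop the IOMMU's cached copy */
}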

Patch

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index ebf80718912d..5bf0d6c2523b 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -13,6 +13,7 @@ 
 #include <asm/mpspec.h>
 #include <asm/msr.h>
 #include <asm/hardirq.h>
+#include <asm/posted_intr.h>
 
 #define ARCH_APICTIMER_STOPS_ON_C3	1
 
@@ -507,7 +508,7 @@  static inline bool is_vector_pending(unsigned int vector)
 	if (irr  & (1 << (vector % 32)))
 		return true;
 
-	return false;
+	return pi_pending_this_cpu(vector);
 }
 
 /*
diff --git a/arch/x86/include/asm/posted_intr.h b/arch/x86/include/asm/posted_intr.h
index 4e5eea2d20e0..8aaa15515490 100644
--- a/arch/x86/include/asm/posted_intr.h
+++ b/arch/x86/include/asm/posted_intr.h
@@ -1,6 +1,7 @@ 
 /* SPDX-License-Identifier: GPL-2.0 */
 #ifndef _X86_POSTED_INTR_H
 #define _X86_POSTED_INTR_H
+#include <asm/irq_vectors.h>
 
 #define POSTED_INTR_ON  0
 #define POSTED_INTR_SN  1
@@ -81,9 +82,26 @@  static inline bool pi_test_sn(struct pi_desc *pi_desc)
 }
 
 #ifdef CONFIG_X86_POSTED_MSI
+/*
+ * Not all external vectors are subject to interrupt remapping, e.g. IOMMU's
+ * own interrupts. Here we do not distinguish them since those vector bits in
+ * PIR will always be zero.
+ */
+static inline bool pi_pending_this_cpu(unsigned int vector)
+{
+	struct pi_desc *pid = this_cpu_ptr(&posted_interrupt_desc);
+
+	if (WARN_ON_ONCE(vector > NR_VECTORS || vector < FIRST_EXTERNAL_VECTOR))
+		return false;
+
+	return test_bit(vector, (unsigned long *)pid->pir);
+}
+
 extern void intel_posted_msi_init(void);
 
 #else
+static inline bool pi_pending_this_cpu(unsigned int vector) { return false; }
+
 static inline void intel_posted_msi_init(void) {};
 
 #endif /* X86_POSTED_MSI */
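
For completeness, a usage sketch of the extended check. This is not a hunk
from the series: the caller shown is illustrative, and rearm_cleanup_timer()
and release_old_vector() are hypothetical helpers. The idea is that the vector
cleanup path on the old CPU keeps deferring the release while
is_vector_pending() reports the vector in either the LAPIC IRR or this CPU's
PIR.

#include <asm/apic.h>	/* is_vector_pending() */

/* Hypothetical helpers, named only for this sketch. */
void rearm_cleanup_timer(void);
void release_old_vector(unsigned int vector);

/*
 * The vector vacated by an affinity change may only be released once neither
 * the LAPIC IRR nor this CPU's posted-interrupt PIR shows it pending;
 * otherwise the interrupt can still fire here and the release must be retried.
 */
static void cleanup_moved_vector(unsigned int vector)
{
	if (is_vector_pending(vector)) {
		rearm_cleanup_timer();		/* check again later */
		return;
	}
	release_old_vector(vector);
}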