xen: Remove event channel notification through Xen PCI platform device

Message ID 1472248536-2063-1-git-send-email-karahmed@amazon.de (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

KarimAllah Ahmed Aug. 26, 2016, 9:55 p.m. UTC
Ever since commit 254d1a3f02eb ("xen/pv-on-hvm kexec: shutdown watches
from old kernel"), using the INTx interrupt from the Xen PCI platform device
for event channel notification locks up the guest during boot. A
postcore_initcall now calls xs_reset_watches, which eventually tries to read
a value from XenStore and gets stuck in read_reply in XenBus forever, because
the platform driver has not been probed yet and its INTx interrupt handler is
not registered. At that point the guest cannot be notified of any pending
event channels, so none of the per-event handlers (including the XenStore
one) is ever invoked and the reply is never picked up by the kernel.

The exact stack where things get stuck during xenbus_init:

-xenbus_init
 -xs_init
  -xs_reset_watches
   -xenbus_scanf
    -xenbus_read
     -xs_single
      -xs_single
       -xs_talkv

Vector callbacks have been the preferred event notification mechanism since
their introduction in commit 38e20b07efd5 ("x86/xen: event channels
delivery on HVM."), and Xen has advertised the vector callback feature for
quite some time. That is why INTx could stay broken for several years
without impacting anyone.

Luckily, this also means that event channel notification through INTx is
effectively dead code that can be removed without impacting anybody: it has
been effectively disabled for more than 4 years and nobody has complained
about it (at least as far as I am aware).

This commit removes event channel notification through the Xen PCI platform
device.

Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Julien Grall <julien.grall@citrix.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
---
 arch/x86/include/asm/xen/events.h | 11 -------
 arch/x86/pci/xen.c                |  2 +-
 arch/x86/xen/enlighten.c          | 19 ++++--------
 arch/x86/xen/smp.c                |  2 --
 arch/x86/xen/time.c               |  5 ---
 drivers/xen/events/events_base.c  | 26 ++++++----------
 drivers/xen/platform-pci.c        | 64 ---------------------------------------
 include/xen/xen.h                 |  3 +-
 8 files changed, 17 insertions(+), 115 deletions(-)

Comments

Boris Ostrovsky Aug. 29, 2016, 5:29 p.m. UTC | #1
On 08/26/2016 05:55 PM, KarimAllah Ahmed wrote:
> Ever since commit 254d1a3f02eb ("xen/pv-on-hvm kexec: shutdown watches
> from old kernel") using the INTx interrupt from Xen PCI platform device for
> event channel notification would just lockup the guest during bootup.
> postcore_initcall now calls xs_reset_watches which will eventually try to read
> a value from XenStore and will get stuck on read_reply at XenBus forever since
> the platform driver is not probed yet and its INTx interrupt handler is not
> registered yet. That means that the guest can not be notified at this moment of
> any pending event channels and none of the per-event handlers will ever be
> invoked (including the XenStore one) and the reply will never be picked up by
> the kernel.
>
> The exact stack where things get stuck during xenbus_init:
>
> -xenbus_init
>  -xs_init
>   -xs_reset_watches
>    -xenbus_scanf
>     -xenbus_read
>      -xs_single
>       -xs_single
>        -xs_talkv
>
> Vector callbacks have always been the favourite event notification mechanism
> since their introduction in commit 38e20b07efd5 ("x86/xen: event channels
> delivery on HVM.") and the vector callback feature has always been advertised
> for quite some time by Xen that's why INTx was broken for several years now
> without impacting anyone.
>
> Luckily this also means that event channel notification through INTx is
> basically dead-code which can be safely removed without impacting anybody since
> it has been effectively disabled for more than 4 years with nobody complaining
> about it (at least as far as I'm aware of).
>
> This commit removes event channel notification through Xen PCI platform device.
>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Julien Grall <julien.grall@citrix.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
> Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
> Cc: xen-devel@lists.xenproject.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-pci@vger.kernel.org
> Cc: Anthony Liguori <aliguori@amazon.com>
> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>




--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Vrabel Sept. 30, 2016, 2:46 p.m. UTC | #2
On 26/08/16 22:55, KarimAllah Ahmed wrote:
> Ever since commit 254d1a3f02eb ("xen/pv-on-hvm kexec: shutdown watches
> from old kernel") using the INTx interrupt from Xen PCI platform device for
> event channel notification would just lockup the guest during bootup.
> postcore_initcall now calls xs_reset_watches which will eventually try to read
> a value from XenStore and will get stuck on read_reply at XenBus forever since
> the platform driver is not probed yet and its INTx interrupt handler is not
> registered yet. That means that the guest can not be notified at this moment of
> any pending event channels and none of the per-event handlers will ever be
> invoked (including the XenStore one) and the reply will never be picked up by
> the kernel.

Applied to for-linus-4.9, thanks.

David

KarimAllah Ahmed April 10, 2017, 12:28 p.m. UTC | #3
Unfortunately, this commit is a potential candidate for reverting. After a
lengthy qualification I realized that a function called
xen_strict_xenbus_quirk() is invoked in the offending path and
short-circuits the offending code!

So at the moment any domU kernel with this commit will not boot on any Xen
version < 4.0! Nobody with Xen < 4.0 was complaining, not because nobody is
using it, but because a short-circuit in the code avoids hitting the
offending code in the first place. So the original assumption that the code
is dead might not be 100% correct!

So even though the INTx code is broken for any Xen >= 4.0, the right thing
to do now is to fix INTx properly and completely revert this commit (commit
da72ff5bfcb0 now also needs to be reverted in order to revert this one
cleanly) to avoid any potential regression.

David,
Does this make sense to you?

I will send a patch to fix INTx shortly as well.

On 9/30/16, 4:46 PM, "David Vrabel" <david.vrabel@citrix.com> wrote:

    On 26/08/16 22:55, KarimAllah Ahmed wrote:
    > Ever since commit 254d1a3f02eb ("xen/pv-on-hvm kexec: shutdown watches
    > from old kernel") using the INTx interrupt from Xen PCI platform device for
    > event channel notification would just lockup the guest during bootup.
    > postcore_initcall now calls xs_reset_watches which will eventually try to read
    > a value from XenStore and will get stuck on read_reply at XenBus forever since
    > the platform driver is not probed yet and its INTx interrupt handler is not
    > registered yet. That means that the guest can not be notified at this moment of
    > any pending event channels and none of the per-event handlers will ever be
    > invoked (including the XenStore one) and the reply will never be picked up by
    > the kernel.

    Applied to for-linus-4.9, thanks.

    David


Patch

diff --git a/arch/x86/include/asm/xen/events.h b/arch/x86/include/asm/xen/events.h
index e6911ca..608a79d 100644
--- a/arch/x86/include/asm/xen/events.h
+++ b/arch/x86/include/asm/xen/events.h
@@ -20,15 +20,4 @@  static inline int xen_irqs_disabled(struct pt_regs *regs)
 /* No need for a barrier -- XCHG is a barrier on x86. */
 #define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
 
-extern int xen_have_vector_callback;
-
-/*
- * Events delivered via platform PCI interrupts are always
- * routed to vcpu 0 and hence cannot be rebound.
- */
-static inline bool xen_support_evtchn_rebind(void)
-{
-	return (!xen_hvm_domain() || xen_have_vector_callback);
-}
-
 #endif /* _ASM_X86_XEN_EVENTS_H */
diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 3a483cb..bedfab9 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -456,7 +456,7 @@  void __init xen_msi_init(void)
 
 int __init pci_xen_hvm_init(void)
 {
-	if (!xen_have_vector_callback || !xen_feature(XENFEAT_hvm_pirqs))
+	if (!xen_feature(XENFEAT_hvm_pirqs))
 		return 0;
 
 #ifdef CONFIG_ACPI
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index b86ebb1..2f8fd7f 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -137,8 +137,6 @@  struct shared_info xen_dummy_shared_info;
 void *xen_initial_gdt;
 
 RESERVE_BRK(shared_info_page_brk, PAGE_SIZE);
-__read_mostly int xen_have_vector_callback;
-EXPORT_SYMBOL_GPL(xen_have_vector_callback);
 
 /*
  * Point at some empty memory to start with. We map the real shared_info
@@ -1520,10 +1518,7 @@  static void __init xen_pvh_early_guest_init(void)
 	if (!xen_feature(XENFEAT_auto_translated_physmap))
 		return;
 
-	if (!xen_feature(XENFEAT_hvm_callback_vector))
-		return;
-
-	xen_have_vector_callback = 1;
+	BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
 
 	xen_pvh_early_cpu_init(0, false);
 	xen_pvh_set_cr_flags(0);
@@ -1831,10 +1826,8 @@  static int xen_hvm_cpu_notify(struct notifier_block *self, unsigned long action,
 		else
 			per_cpu(xen_vcpu_id, cpu) = cpu;
 		xen_vcpu_setup(cpu);
-		if (xen_have_vector_callback) {
-			if (xen_feature(XENFEAT_hvm_safe_pvclock))
-				xen_setup_timer(cpu);
-		}
+		if (xen_feature(XENFEAT_hvm_safe_pvclock))
+			xen_setup_timer(cpu);
 		break;
 	default:
 		break;
@@ -1872,8 +1865,8 @@  static void __init xen_hvm_guest_init(void)
 
 	xen_panic_handler_init();
 
-	if (xen_feature(XENFEAT_hvm_callback_vector))
-		xen_have_vector_callback = 1;
+	BUG_ON(!xen_feature(XENFEAT_hvm_callback_vector));
+
 	xen_hvm_smp_init();
 	register_cpu_notifier(&xen_hvm_cpu_notifier);
 	xen_unplug_emulated_devices();
@@ -1911,7 +1904,7 @@  bool xen_hvm_need_lapic(void)
 		return false;
 	if (!xen_hvm_domain())
 		return false;
-	if (xen_feature(XENFEAT_hvm_pirqs) && xen_have_vector_callback)
+	if (xen_feature(XENFEAT_hvm_pirqs))
 		return false;
 	return true;
 }
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 0b4d04c..59928cc 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -805,8 +805,6 @@  static int xen_hvm_cpu_up(unsigned int cpu, struct task_struct *tidle)
 
 void __init xen_hvm_smp_init(void)
 {
-	if (!xen_have_vector_callback)
-		return;
 	smp_ops.smp_prepare_cpus = xen_hvm_smp_prepare_cpus;
 	smp_ops.smp_send_reschedule = xen_smp_send_reschedule;
 	smp_ops.cpu_up = xen_hvm_cpu_up;
diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
index 67356d2..33d8f6a 100644
--- a/arch/x86/xen/time.c
+++ b/arch/x86/xen/time.c
@@ -432,11 +432,6 @@  static void xen_hvm_setup_cpu_clockevents(void)
 
 void __init xen_hvm_init_time_ops(void)
 {
-	/* vector callback is needed otherwise we cannot receive interrupts
-	 * on cpu > 0 and at this point we don't know how many cpus are
-	 * available */
-	if (!xen_have_vector_callback)
-		return;
 	if (!xen_feature(XENFEAT_hvm_safe_pvclock)) {
 		printk(KERN_INFO "Xen doesn't support pvclock on HVM,"
 				"disable pv timer\n");
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index d5dbdb9..9ecfcdc 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -1314,9 +1314,6 @@  static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
 	if (!VALID_EVTCHN(evtchn))
 		return -1;
 
-	if (!xen_support_evtchn_rebind())
-		return -1;
-
 	/* Send future instances of this interrupt to other vcpu. */
 	bind_vcpu.port = evtchn;
 	bind_vcpu.vcpu = xen_vcpu_nr(tcpu);
@@ -1650,20 +1647,15 @@  void xen_callback_vector(void)
 {
 	int rc;
 	uint64_t callback_via;
-	if (xen_have_vector_callback) {
-		callback_via = HVM_CALLBACK_VECTOR(HYPERVISOR_CALLBACK_VECTOR);
-		rc = xen_set_callback_via(callback_via);
-		if (rc) {
-			pr_err("Request for Xen HVM callback vector failed\n");
-			xen_have_vector_callback = 0;
-			return;
-		}
-		pr_info("Xen HVM callback vector for event delivery is enabled\n");
-		/* in the restore case the vector has already been allocated */
-		if (!test_bit(HYPERVISOR_CALLBACK_VECTOR, used_vectors))
-			alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR,
-					xen_hvm_callback_vector);
-	}
+
+	callback_via = HVM_CALLBACK_VECTOR(HYPERVISOR_CALLBACK_VECTOR);
+	rc = xen_set_callback_via(callback_via);
+	BUG_ON(rc);
+	pr_info("Xen HVM callback vector for event delivery is enabled\n");
+	/* in the restore case the vector has already been allocated */
+	if (!test_bit(HYPERVISOR_CALLBACK_VECTOR, used_vectors))
+		alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR,
+				xen_hvm_callback_vector);
 }
 #else
 void xen_callback_vector(void) {}
diff --git a/drivers/xen/platform-pci.c b/drivers/xen/platform-pci.c
index cf96666..b59c9455 100644
--- a/drivers/xen/platform-pci.c
+++ b/drivers/xen/platform-pci.c
@@ -42,7 +42,6 @@ 
 static unsigned long platform_mmio;
 static unsigned long platform_mmio_alloc;
 static unsigned long platform_mmiolen;
-static uint64_t callback_via;
 
 static unsigned long alloc_xen_mmio(unsigned long len)
 {
@@ -55,51 +54,6 @@  static unsigned long alloc_xen_mmio(unsigned long len)
 	return addr;
 }
 
-static uint64_t get_callback_via(struct pci_dev *pdev)
-{
-	u8 pin;
-	int irq;
-
-	irq = pdev->irq;
-	if (irq < 16)
-		return irq; /* ISA IRQ */
-
-	pin = pdev->pin;
-
-	/* We don't know the GSI. Specify the PCI INTx line instead. */
-	return ((uint64_t)0x01 << 56) | /* PCI INTx identifier */
-		((uint64_t)pci_domain_nr(pdev->bus) << 32) |
-		((uint64_t)pdev->bus->number << 16) |
-		((uint64_t)(pdev->devfn & 0xff) << 8) |
-		((uint64_t)(pin - 1) & 3);
-}
-
-static irqreturn_t do_hvm_evtchn_intr(int irq, void *dev_id)
-{
-	xen_hvm_evtchn_do_upcall();
-	return IRQ_HANDLED;
-}
-
-static int xen_allocate_irq(struct pci_dev *pdev)
-{
-	return request_irq(pdev->irq, do_hvm_evtchn_intr,
-			IRQF_NOBALANCING | IRQF_TRIGGER_RISING,
-			"xen-platform-pci", pdev);
-}
-
-static int platform_pci_resume(struct pci_dev *pdev)
-{
-	int err;
-	if (xen_have_vector_callback)
-		return 0;
-	err = xen_set_callback_via(callback_via);
-	if (err) {
-		dev_err(&pdev->dev, "platform_pci_resume failure!\n");
-		return err;
-	}
-	return 0;
-}
-
 static int platform_pci_probe(struct pci_dev *pdev,
 			      const struct pci_device_id *ent)
 {
@@ -138,21 +92,6 @@  static int platform_pci_probe(struct pci_dev *pdev,
 	platform_mmio = mmio_addr;
 	platform_mmiolen = mmio_len;
 
-	if (!xen_have_vector_callback) {
-		ret = xen_allocate_irq(pdev);
-		if (ret) {
-			dev_warn(&pdev->dev, "request_irq failed err=%d\n", ret);
-			goto out;
-		}
-		callback_via = get_callback_via(pdev);
-		ret = xen_set_callback_via(callback_via);
-		if (ret) {
-			dev_warn(&pdev->dev, "Unable to set the evtchn callback "
-					 "err=%d\n", ret);
-			goto out;
-		}
-	}
-
 	max_nr_gframes = gnttab_max_grant_frames();
 	grant_frames = alloc_xen_mmio(PAGE_SIZE * max_nr_gframes);
 	ret = gnttab_setup_auto_xlat_frames(grant_frames);
@@ -184,9 +123,6 @@  static struct pci_driver platform_driver = {
 	.name =           DRV_NAME,
 	.probe =          platform_pci_probe,
 	.id_table =       platform_pci_tbl,
-#ifdef CONFIG_PM
-	.resume_early =   platform_pci_resume,
-#endif
 };
 
 static int __init platform_pci_init(void)
diff --git a/include/xen/xen.h b/include/xen/xen.h
index 0c0e3ef..f0f0252 100644
--- a/include/xen/xen.h
+++ b/include/xen/xen.h
@@ -38,8 +38,7 @@  extern enum xen_domain_type xen_domain_type;
  */
 #include <xen/features.h>
 #define xen_pvh_domain() (xen_pv_domain() && \
-			  xen_feature(XENFEAT_auto_translated_physmap) && \
-			  xen_have_vector_callback)
+			  xen_feature(XENFEAT_auto_translated_physmap))
 #else
 #define xen_pvh_domain()	(0)
 #endif