[3/3] x86/vmsi: tolerate unsupported MSI address/data fields

Message ID 20210126110606.21741-4-roger.pau@citrix.com (mailing list archive)
State New
Series x86/intr: interrupt related fixes

Commit Message

Roger Pau Monné Jan. 26, 2021, 11:06 a.m. UTC
Plain MSI doesn't allow caching the MSI address and data fields while
the capability is enabled and not masked, hence we need to allow any
changes to those fields to update the binding of the interrupt. For
reference, the same doesn't apply to MSI-X, which is allowed to cache
the data and address fields while the entry is unmasked; see section
6.8.3.5 of the PCI Local Bus Specification 3.0.

Allowing such updates means that a guest can write an invalid address
(i.e. all zeros) and then a valid one, so the PIRQs shouldn't be
unmapped when the interrupt cannot be bound to the guest, since
further updates to the address or data fields can result in the
binding succeeding.

Modify the vPCI MSI arch helpers to track whether the interrupt is
bound, and make failures in vpci_msi_update not unmap the PIRQ, so
that further calls can attempt to bind the PIRQ again.

Note this requires some modifications to the MSI-X handlers, but there
shouldn't be any functional changes in that area.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/vmsi.c      | 93 +++++++++++++++++++-----------------
 xen/drivers/vpci/msi.c       |  3 +-
 xen/include/asm-x86/hvm/io.h |  1 +
 xen/include/xen/vpci.h       |  3 +-
 4 files changed, 51 insertions(+), 49 deletions(-)
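
The behaviour described in the commit message can be modelled in a few lines. The following is a minimal sketch, not the Xen code: `struct model_msi`, `model_bind()`, `model_update()` and `model_write_address()` are hypothetical names, and the rule that only an all-zero address is rejected is an assumption made to keep the model small. It only illustrates that a failed bind clears the bound flag and leaves the PIRQ mapped, so a later write can bind again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_INVALID_PIRQ (-1)

/* Hypothetical stand-in for struct vpci_msi and its arch-specific data. */
struct model_msi {
    uint64_t address;
    uint32_t data;
    int pirq;   /* mapped on enable, kept until disable */
    bool bound; /* true only if the last bind attempt succeeded */
};

/*
 * Placeholder for vpci_msi_update(): 0 on success, negative on failure.
 * Assumption: only an all-zero address is rejected.
 */
static int model_bind(const struct model_msi *msi)
{
    return msi->address != 0 ? 0 : -1;
}

/*
 * Shape of the reworked update path: drop any existing binding, then retry.
 * A failed bind only clears 'bound'; the PIRQ is left mapped.
 */
static void model_update(struct model_msi *msi)
{
    assert(msi->pirq != MODEL_INVALID_PIRQ);

    if ( msi->bound )
        msi->bound = false; /* stands in for pt_irq_destroy_bind() */

    msi->bound = !model_bind(msi);
}

/* Guest write to the address field while MSI is enabled. */
static void model_write_address(struct model_msi *msi, uint64_t address)
{
    msi->address = address;
    model_update(msi);
}
```

Writing 0 and then 0xfee00000 through model_write_address() leaves pirq unchanged throughout, while bound goes false and then true again.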

Comments

Jan Beulich Jan. 26, 2021, 3:17 p.m. UTC | #1
On 26.01.2021 12:06, Roger Pau Monne wrote:
> Plain MSI doesn't allow caching the MSI address and data fields while
> the capability is enabled and not masked, hence we need to allow any
> changes to those fields to update the binding of the interrupt. For
> reference, the same doesn't apply to MSI-X, which is allowed to cache
> the data and address fields while the entry is unmasked; see section
> 6.8.3.5 of the PCI Local Bus Specification 3.0.
> 
> Allowing such updates means that a guest can write an invalid address
> (i.e. all zeros) and then a valid one, so the PIRQs shouldn't be
> unmapped when the interrupt cannot be bound to the guest, since
> further updates to the address or data fields can result in the
> binding succeeding.

IOW the breakage from the other patch was because rubbish was
written first, and suitable data was written later on? I didn't
think core PCI code in Linux would do such, which would make me
suspect a driver having custom MSI handling code ...

> Modify the vPCI MSI arch helpers to track whether the interrupt is
> bound, and make failures in vpci_msi_update not unmap the PIRQ, so
> that further calls can attempt to bind the PIRQ again.
> 
> Note this requires some modifications to the MSI-X handlers, but there
> shouldn't be any functional changes in that area.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Am I understanding correctly that this change is independent of
the initial 2 patches (where I have reservations), and hence
it could go in ahead of them (or all alone)?

Jan

Roger Pau Monné Jan. 26, 2021, 3:57 p.m. UTC | #2
On Tue, Jan 26, 2021 at 04:17:59PM +0100, Jan Beulich wrote:
> On 26.01.2021 12:06, Roger Pau Monne wrote:
> > Plain MSI doesn't allow caching the MSI address and data fields while
> > the capability is enabled and not masked, hence we need to allow any
> > changes to those fields to update the binding of the interrupt. For
> > reference, the same doesn't apply to MSI-X, which is allowed to cache
> > the data and address fields while the entry is unmasked; see section
> > 6.8.3.5 of the PCI Local Bus Specification 3.0.
> > 
> > Allowing such updates means that a guest can write an invalid address
> > (i.e. all zeros) and then a valid one, so the PIRQs shouldn't be
> > unmapped when the interrupt cannot be bound to the guest, since
> > further updates to the address or data fields can result in the
> > binding succeeding.
> 
> IOW the breakage from the other patch was because rubbish was
> written first, and suitable data was written later on? I didn't
> think core PCI code in Linux would do such, which would make me
> suspect a driver having custom MSI handling code ...

So it seems that one specific Linux driver writes 0s to the address
field at some point during initialization, but it also doesn't end up
using MSI interrupts anyway, so I assume it's somehow broken. FTR it's
the snd_hda_intel driver.

However, it seems that Linux zeroes all address fields on shutdown
for MSI (not MSI-X) with the capability enabled, and I do see:

vmsi.c:688:d0v2 0000:00:1c.3: PIRQ 643: unsupported address 0
vmsi.c:688:d0v2 0000:00:1c.3: PIRQ 643: unsupported address 0
vmsi.c:688:d0v2 0000:00:1c.0: PIRQ 644: unsupported address 0
vmsi.c:688:d0v2 0000:00:1c.0: PIRQ 644: unsupported address 0
vmsi.c:688:d0v2 0000:00:14.0: PIRQ 641: unsupported address 0
vmsi.c:688:d0v2 0000:00:14.0: PIRQ 641: unsupported address 0
vmsi.c:688:d0v2 0000:00:14.0: PIRQ 641: unsupported address 0
vmsi.c:688:d0v2 0000:00:01.2: PIRQ 645: unsupported address 0
vmsi.c:688:d0v2 0000:00:01.2: PIRQ 645: unsupported address 0
vmsi.c:688:d0v2 0000:00:01.1: PIRQ 646: unsupported address 0
vmsi.c:688:d0v2 0000:00:01.1: PIRQ 646: unsupported address 0
vmsi.c:688:d0v2 0000:00:01.0: PIRQ 647: unsupported address 0
vmsi.c:688:d0v2 0000:00:01.0: PIRQ 647: unsupported address 0

when dom0 is shutting down. That's with the 5.4 kernel; other
versions may not do it.

> > Modify the vPCI MSI arch helpers to track whether the interrupt is
> > bound, and make failures in vpci_msi_update not unmap the PIRQ, so
> > that further calls can attempt to bind the PIRQ again.
> > 
> > Note this requires some modifications to the MSI-X handlers, but there
> > shouldn't be any functional changes in that area.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> Am I understanding correctly that this change is independent of
> the initial 2 patches (where I have reservations), and hence
> it could go in ahead of them (or all alone)?

Yes, it's fully independent.

Thanks, Roger.
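
For readers unfamiliar with the "unsupported address 0" messages quoted above: on x86, MSI message addresses fall in the 0xFEExxxxx window, so an all-zero address cannot be a valid target. The following is a minimal sketch of such a check, not the code at vmsi.c:688. The `SKETCH_*` macros and `sketch_check_msi_address()` are hypothetical names, and returning -EOPNOTSUPP is an assumption.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* x86 MSI addresses have bits 31:20 equal to 0xFEE and no higher bits set. */
#define SKETCH_MSI_ADDR_BASE_MASK (~UINT64_C(0xfffff))
#define SKETCH_MSI_ADDR_HEADER    UINT64_C(0xfee00000)

/*
 * Hypothetical helper: returns 0 if the address lies in the MSI window,
 * otherwise logs a message and returns a negative errno value (the exact
 * error code is an assumption).
 */
static int sketch_check_msi_address(int pirq, uint64_t address)
{
    if ( (address & SKETCH_MSI_ADDR_BASE_MASK) != SKETCH_MSI_ADDR_HEADER )
    {
        fprintf(stderr, "PIRQ %d: unsupported address %llx\n",
                pirq, (unsigned long long)address);
        return -EOPNOTSUPP;
    }

    return 0;
}
```

With this check, a zeroed address field is rejected, while any address of the form 0xFEExxxxx passes.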
Patch

diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index a2ac82c95c..13e2a190b4 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -715,32 +715,37 @@  static int vpci_msi_update(const struct pci_dev *pdev, uint32_t data,
     return 0;
 }
 
-int vpci_msi_arch_update(struct vpci_msi *msi, const struct pci_dev *pdev)
+void vpci_msi_arch_update(struct vpci_msi *msi, const struct pci_dev *pdev)
 {
+    unsigned int i;
     int rc;
 
     ASSERT(msi->arch.pirq != INVALID_PIRQ);
 
     pcidevs_lock();
-    rc = vpci_msi_update(pdev, msi->data, msi->address, msi->vectors,
-                         msi->arch.pirq, msi->mask);
-    if ( rc )
+    for ( i = 0; i < msi->vectors && msi->arch.bound; i++ )
     {
-        spin_lock(&pdev->domain->event_lock);
-        unmap_domain_pirq(pdev->domain, msi->arch.pirq);
-        spin_unlock(&pdev->domain->event_lock);
-        pcidevs_unlock();
-        msi->arch.pirq = INVALID_PIRQ;
-        return rc;
+        struct xen_domctl_bind_pt_irq unbind = {
+            .machine_irq = msi->arch.pirq + i,
+            .irq_type = PT_IRQ_TYPE_MSI,
+        };
+
+        rc = pt_irq_destroy_bind(pdev->domain, &unbind);
+        if ( rc )
+        {
+            ASSERT_UNREACHABLE();
+            domain_crash(pdev->domain);
+            return;
+        }
     }
-    pcidevs_unlock();
 
-    return 0;
+    msi->arch.bound = !vpci_msi_update(pdev, msi->data, msi->address,
+                                       msi->vectors, msi->arch.pirq, msi->mask);
+    pcidevs_unlock();
 }
 
-static int vpci_msi_enable(const struct pci_dev *pdev, uint32_t data,
-                           uint64_t address, unsigned int nr,
-                           paddr_t table_base, uint32_t mask)
+static int vpci_msi_enable(const struct pci_dev *pdev, unsigned int nr,
+                           paddr_t table_base)
 {
     struct msi_info msi_info = {
         .seg = pdev->seg,
@@ -749,7 +754,6 @@  static int vpci_msi_enable(const struct pci_dev *pdev, uint32_t data,
         .table_base = table_base,
         .entry_nr = nr,
     };
-    unsigned vectors = table_base ? 1 : nr;
     int rc, pirq = INVALID_PIRQ;
 
     /* Get a PIRQ. */
@@ -763,18 +767,6 @@  static int vpci_msi_enable(const struct pci_dev *pdev, uint32_t data,
         return rc;
     }
 
-    pcidevs_lock();
-    rc = vpci_msi_update(pdev, data, address, vectors, pirq, mask);
-    if ( rc )
-    {
-        spin_lock(&pdev->domain->event_lock);
-        unmap_domain_pirq(pdev->domain, pirq);
-        spin_unlock(&pdev->domain->event_lock);
-        pcidevs_unlock();
-        return rc;
-    }
-    pcidevs_unlock();
-
     return pirq;
 }
 
@@ -784,25 +776,28 @@  int vpci_msi_arch_enable(struct vpci_msi *msi, const struct pci_dev *pdev,
     int rc;
 
     ASSERT(msi->arch.pirq == INVALID_PIRQ);
-    rc = vpci_msi_enable(pdev, msi->data, msi->address, vectors, 0, msi->mask);
-    if ( rc >= 0 )
-    {
-        msi->arch.pirq = rc;
-        rc = 0;
-    }
+    rc = vpci_msi_enable(pdev, vectors, 0);
+    if ( rc < 0 )
+        return rc;
+    msi->arch.pirq = rc;
 
-    return rc;
+    pcidevs_lock();
+    msi->arch.bound = !vpci_msi_update(pdev, msi->data, msi->address, vectors,
+                                       msi->arch.pirq, msi->mask);
+    pcidevs_unlock();
+
+    return 0;
 }
 
 static void vpci_msi_disable(const struct pci_dev *pdev, int pirq,
-                             unsigned int nr)
+                             unsigned int nr, bool bound)
 {
     unsigned int i;
 
     ASSERT(pirq != INVALID_PIRQ);
 
     pcidevs_lock();
-    for ( i = 0; i < nr; i++ )
+    for ( i = 0; i < nr && bound; i++ )
     {
         struct xen_domctl_bind_pt_irq bind = {
             .machine_irq = pirq + i,
@@ -822,7 +817,7 @@  static void vpci_msi_disable(const struct pci_dev *pdev, int pirq,
 
 void vpci_msi_arch_disable(struct vpci_msi *msi, const struct pci_dev *pdev)
 {
-    vpci_msi_disable(pdev, msi->arch.pirq, msi->vectors);
+    vpci_msi_disable(pdev, msi->arch.pirq, msi->vectors, msi->arch.bound);
     msi->arch.pirq = INVALID_PIRQ;
 }
 
@@ -857,14 +852,22 @@  int vpci_msix_arch_enable_entry(struct vpci_msix_entry *entry,
     int rc;
 
     ASSERT(entry->arch.pirq == INVALID_PIRQ);
-    rc = vpci_msi_enable(pdev, entry->data, entry->addr,
-                         vmsix_entry_nr(pdev->vpci->msix, entry),
-                         table_base, entry->masked);
-    if ( rc >= 0 )
+    rc = vpci_msi_enable(pdev, vmsix_entry_nr(pdev->vpci->msix, entry),
+                         table_base);
+    if ( rc < 0 )
+        return rc;
+
+    entry->arch.pirq = rc;
+
+    pcidevs_lock();
+    rc = vpci_msi_update(pdev, entry->data, entry->addr, 1, entry->arch.pirq,
+                         entry->masked);
+    if ( rc )
     {
-        entry->arch.pirq = rc;
-        rc = 0;
+        vpci_msi_disable(pdev, entry->arch.pirq, 1, false);
+        entry->arch.pirq = INVALID_PIRQ;
     }
+    pcidevs_unlock();
 
     return rc;
 }
@@ -875,7 +878,7 @@  int vpci_msix_arch_disable_entry(struct vpci_msix_entry *entry,
     if ( entry->arch.pirq == INVALID_PIRQ )
         return -ENOENT;
 
-    vpci_msi_disable(pdev, entry->arch.pirq, 1);
+    vpci_msi_disable(pdev, entry->arch.pirq, 1, true);
     entry->arch.pirq = INVALID_PIRQ;
 
     return 0;
diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index 65db438d24..5757a7aed2 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -85,8 +85,7 @@  static void update_msi(const struct pci_dev *pdev, struct vpci_msi *msi)
     if ( !msi->enabled )
         return;
 
-    if ( vpci_msi_arch_update(msi, pdev) )
-        msi->enabled = false;
+    vpci_msi_arch_update(msi, pdev);
 }
 
 /* Handlers for the address field (32bit or low part of a 64bit address). */
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index 9453b9b2b7..3d2e877110 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -130,6 +130,7 @@  static inline void msixtbl_init(struct domain *d) {}
 /* Arch-specific MSI data for vPCI. */
 struct vpci_arch_msi {
     int pirq;
+    bool bound;
 };
 
 /* Arch-specific MSI-X entry data for vPCI. */
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 5295d4c990..9f5b5d52e1 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -160,8 +160,7 @@  int __must_check vpci_msi_arch_enable(struct vpci_msi *msi,
                                       const struct pci_dev *pdev,
                                       unsigned int vectors);
 void vpci_msi_arch_disable(struct vpci_msi *msi, const struct pci_dev *pdev);
-int __must_check vpci_msi_arch_update(struct vpci_msi *msi,
-                                      const struct pci_dev *pdev);
+void vpci_msi_arch_update(struct vpci_msi *msi, const struct pci_dev *pdev);
 void vpci_msi_arch_init(struct vpci_msi *msi);
 void vpci_msi_arch_print(const struct vpci_msi *msi);