diff mbox

i386: pci-assign: Fix MSI-X table size

Message ID 1466282525-3018-1-git-send-email-ido@wizery.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ido Yariv June 18, 2016, 8:42 p.m. UTC
The current code creates a whole page mmio region for the MSI-X table
size.

However, the page containing the MSI-X table may contain other registers
not related to MSI-X. Creating an mmio region for the whole page masks
such registers and may break drivers in the guest OS.

Since maximal number of entries is known, use that instead to deduce the
table size when setting up the mmio region.

Signed-off-by: Ido Yariv <ido@wizery.com>
---
 hw/i386/kvm/pci-assign.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Comments

Paolo Bonzini June 20, 2016, 2:54 p.m. UTC | #1
On 18/06/2016 22:42, Ido Yariv wrote:
> The current code creates a whole page mmio region for the MSI-X table
> size.
> 
> However, the page containing the MSI-X table may contain other registers
> not related to MSI-X. Creating an mmio region for the whole page masks
> such registers and may break drivers in the guest OS.
> 
> Since maximal number of entries is known, use that instead to deduce the
> table size when setting up the mmio region.
> 
> Signed-off-by: Ido Yariv <ido@wizery.com>

I can take this patch, but I'd like to warn you that pci-assign is
deprecated (and replaced by VFIO).  I seem to recall VFIO does this
correctly, but it would be great if you could check that.

Also, I would prefer the mmap/munmap to keep using MSIX_PAGE_SIZE, just
to limit the number of things that could break.

Paolo
Ido Yariv June 20, 2016, 3:40 p.m. UTC | #2
Hi Paolo,

On Mon, Jun 20, 2016 at 04:54:17PM +0200, Paolo Bonzini wrote:
> On 18/06/2016 22:42, Ido Yariv wrote:
> > The current code creates a whole page mmio region for the MSI-X table
> > size.
> > 
> > However, the page containing the MSI-X table may contain other registers
> > not related to MSI-X. Creating an mmio region for the whole page masks
> > such registers and may break drivers in the guest OS.
> > 
> > Since maximal number of entries is known, use that instead to deduce the
> > table size when setting up the mmio region.
> > 
> > Signed-off-by: Ido Yariv <ido@wizery.com>
> 
> I can take this patch, but I'd like to warn you that pci-assign is
> deprecated (and replaced by VFIO).  I seem to recall VFIO does this
> correctly, but it would be great if you could check that.

Just gave it a quick shot and everything seems to be working fine with
VFIO, thanks!

> Also, I would prefer the mmap/munmap to keep using MSIX_PAGE_SIZE, just
> to limit the number of things that could break.

I believe there might be another issue with this. The number of entries
can be up to 2048, so the MSI-X table size can theoretically exceed one
page and take up to 8 pages.

We can just modify MSIX_PAGE_SIZE to 0x8000, but wouldn't it be better
to just map exactly what we need?

Cheers,
Ido.
Alex Williamson June 20, 2016, 5:51 p.m. UTC | #3
On Mon, 20 Jun 2016 11:40:17 -0400
Ido Yariv <ido@wizery.com> wrote:

> Hi Paolo,
> 
> On Mon, Jun 20, 2016 at 04:54:17PM +0200, Paolo Bonzini wrote:
> > On 18/06/2016 22:42, Ido Yariv wrote:  
> > > The current code creates a whole page mmio region for the MSI-X table
> > > size.
> > > 
> > > However, the page containing the MSI-X table may contain other registers
> > > not related to MSI-X. Creating an mmio region for the whole page masks
> > > such registers and may break drivers in the guest OS.
> > > 
> > > Since maximal number of entries is known, use that instead to deduce the
> > > table size when setting up the mmio region.
> > > 
> > > Signed-off-by: Ido Yariv <ido@wizery.com>  
> > 
> > I can take this patch, but I'd like to warn you that pci-assign is
> > deprecated (and replaced by VFIO).  I seem to recall VFIO does this
> > correctly, but it would be great if you could check that.  
> 
> Just gave it a quick shot and everything seems to be working fine with
> VFIO, thanks!
> 
> > Also, I would prefer the mmap/munmap to keep using MSIX_PAGE_SIZE, just
> > to limit the number of things that could break.  
> 
> I believe there might be another issue with this. The number of entries
> can be up to 2048, so the MSI-X table size can theoretically exceed one
> page and take up to 8 pages.
> 
> We can just modify MSIX_PAGE_SIZE to 0x8000, but wouldn't it be better
> to just map exactly what we need?

Or, let's just let pci-assign be broken and use vfio-pci :-D  We should
probably actually put an error_report() in pci-assign indicating that
it's deprecated and subject to removal to ferret out the remaining
users.  Fixing anything in it sets the wrong precedent.  Thanks,

Alex
Michael S. Tsirkin June 21, 2016, 4:10 a.m. UTC | #4
On Sat, Jun 18, 2016 at 04:42:05PM -0400, Ido Yariv wrote:
> The current code creates a whole page mmio region for the MSI-X table
> size.
> 
> However, the page containing the MSI-X table may contain other registers
> not related to MSI-X. Creating an mmio region for the whole page masks
> such registers and may break drivers in the guest OS.
> 
> Since maximal number of entries is known, use that instead to deduce the
> table size when setting up the mmio region.
> 
> Signed-off-by: Ido Yariv <ido@wizery.com>

I personally don't want to spend cycles maintaining
the old pci-assign but if someone does I don't have
an issue with that.

I remember why I coded up MSIX_PAGE_SIZE - this was before
the new memory API which made it very tricky to support
sub-page mappings.

The PCI spec stringly recommends against sharing a page between page
table and other registers but I guess if a device violates that rule
it's a better idea to make it work slowly than fail.

Patch looks good to me so:

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  hw/i386/kvm/pci-assign.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
> index f9c9014..98997d1 100644
> --- a/hw/i386/kvm/pci-assign.c
> +++ b/hw/i386/kvm/pci-assign.c
> @@ -36,8 +36,6 @@
>  #include "kvm_i386.h"
>  #include "hw/pci/pci-assign.h"
>  
> -#define MSIX_PAGE_SIZE 0x1000
> -
>  /* From linux/ioport.h */
>  #define IORESOURCE_IO       0x00000100  /* Resource type */
>  #define IORESOURCE_MEM      0x00000200
> @@ -122,6 +120,7 @@ typedef struct AssignedDevice {
>      int *msi_virq;
>      MSIXTableEntry *msix_table;
>      hwaddr msix_table_addr;
> +    uint16_t msix_table_size;
>      uint16_t msix_max;
>      MemoryRegion mmio;
>      char *configfd_name;
> @@ -1310,6 +1309,7 @@ static int assigned_device_pci_cap_init(PCIDevice *pci_dev, Error **errp)
>          bar_nr = msix_table_entry & PCI_MSIX_FLAGS_BIRMASK;
>          msix_table_entry &= ~PCI_MSIX_FLAGS_BIRMASK;
>          dev->msix_table_addr = pci_region[bar_nr].base_addr + msix_table_entry;
> +        dev->msix_table_size = msix_max * sizeof(MSIXTableEntry);
>          dev->msix_max = msix_max;
>      }
>  
> @@ -1633,7 +1633,7 @@ static void assigned_dev_msix_reset(AssignedDevice *dev)
>          return;
>      }
>  
> -    memset(dev->msix_table, 0, MSIX_PAGE_SIZE);
> +    memset(dev->msix_table, 0, dev->msix_table_size);
>  
>      for (i = 0, entry = dev->msix_table; i < dev->msix_max; i++, entry++) {
>          entry->ctrl = cpu_to_le32(0x1); /* Masked */
> @@ -1642,8 +1642,8 @@ static void assigned_dev_msix_reset(AssignedDevice *dev)
>  
>  static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp)
>  {
> -    dev->msix_table = mmap(NULL, MSIX_PAGE_SIZE, PROT_READ|PROT_WRITE,
> -                           MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
> +    dev->msix_table = mmap(NULL, dev->msix_table_size, PROT_READ | PROT_WRITE,
> +                           MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
>      if (dev->msix_table == MAP_FAILED) {
>          error_setg_errno(errp, errno, "failed to allocate msix_table");
>          dev->msix_table = NULL;
> @@ -1653,7 +1653,7 @@ static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp)
>      assigned_dev_msix_reset(dev);
>  
>      memory_region_init_io(&dev->mmio, OBJECT(dev), &assigned_dev_msix_mmio_ops,
> -                          dev, "assigned-dev-msix", MSIX_PAGE_SIZE);
> +                          dev, "assigned-dev-msix", dev->msix_table_size);
>  }
>  
>  static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
> @@ -1662,7 +1662,7 @@ static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
>          return;
>      }
>  
> -    if (munmap(dev->msix_table, MSIX_PAGE_SIZE) == -1) {
> +    if (munmap(dev->msix_table, dev->msix_table_size) == -1) {
>          error_report("error unmapping msix_table! %s", strerror(errno));
>      }
>      dev->msix_table = NULL;
> -- 
> 2.5.5
diff mbox

Patch

diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
index f9c9014..98997d1 100644
--- a/hw/i386/kvm/pci-assign.c
+++ b/hw/i386/kvm/pci-assign.c
@@ -36,8 +36,6 @@ 
 #include "kvm_i386.h"
 #include "hw/pci/pci-assign.h"
 
-#define MSIX_PAGE_SIZE 0x1000
-
 /* From linux/ioport.h */
 #define IORESOURCE_IO       0x00000100  /* Resource type */
 #define IORESOURCE_MEM      0x00000200
@@ -122,6 +120,7 @@  typedef struct AssignedDevice {
     int *msi_virq;
     MSIXTableEntry *msix_table;
     hwaddr msix_table_addr;
+    uint16_t msix_table_size;
     uint16_t msix_max;
     MemoryRegion mmio;
     char *configfd_name;
@@ -1310,6 +1309,7 @@  static int assigned_device_pci_cap_init(PCIDevice *pci_dev, Error **errp)
         bar_nr = msix_table_entry & PCI_MSIX_FLAGS_BIRMASK;
         msix_table_entry &= ~PCI_MSIX_FLAGS_BIRMASK;
         dev->msix_table_addr = pci_region[bar_nr].base_addr + msix_table_entry;
+        dev->msix_table_size = msix_max * sizeof(MSIXTableEntry);
         dev->msix_max = msix_max;
     }
 
@@ -1633,7 +1633,7 @@  static void assigned_dev_msix_reset(AssignedDevice *dev)
         return;
     }
 
-    memset(dev->msix_table, 0, MSIX_PAGE_SIZE);
+    memset(dev->msix_table, 0, dev->msix_table_size);
 
     for (i = 0, entry = dev->msix_table; i < dev->msix_max; i++, entry++) {
         entry->ctrl = cpu_to_le32(0x1); /* Masked */
@@ -1642,8 +1642,8 @@  static void assigned_dev_msix_reset(AssignedDevice *dev)
 
 static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp)
 {
-    dev->msix_table = mmap(NULL, MSIX_PAGE_SIZE, PROT_READ|PROT_WRITE,
-                           MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
+    dev->msix_table = mmap(NULL, dev->msix_table_size, PROT_READ | PROT_WRITE,
+                           MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
     if (dev->msix_table == MAP_FAILED) {
         error_setg_errno(errp, errno, "failed to allocate msix_table");
         dev->msix_table = NULL;
@@ -1653,7 +1653,7 @@  static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp)
     assigned_dev_msix_reset(dev);
 
     memory_region_init_io(&dev->mmio, OBJECT(dev), &assigned_dev_msix_mmio_ops,
-                          dev, "assigned-dev-msix", MSIX_PAGE_SIZE);
+                          dev, "assigned-dev-msix", dev->msix_table_size);
 }
 
 static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
@@ -1662,7 +1662,7 @@  static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
         return;
     }
 
-    if (munmap(dev->msix_table, MSIX_PAGE_SIZE) == -1) {
+    if (munmap(dev->msix_table, dev->msix_table_size) == -1) {
         error_report("error unmapping msix_table! %s", strerror(errno));
     }
     dev->msix_table = NULL;