Message ID | 1466282525-3018-1-git-send-email-ido@wizery.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 18/06/2016 22:42, Ido Yariv wrote: > The current code creates a whole page mmio region for the MSI-X table > size. > > However, the page containing the MSI-X table may contain other registers > not related to MSI-X. Creating an mmio region for the whole page masks > such registers and may break drivers in the guest OS. > > Since maximal number of entries is known, use that instead to deduce the > table size when setting up the mmio region. > > Signed-off-by: Ido Yariv <ido@wizery.com> I can take this patch, but I'd like to warn you that pci-assign is deprecated (and replaced by VFIO). I seem to recall VFIO does this correctly, but it would be great if you could check that. Also, I would prefer the mmap/munmap to keep using MSIX_PAGE_SIZE, just to limit the number of things that could break. Paolo
Hi Paolo, On Mon, Jun 20, 2016 at 04:54:17PM +0200, Paolo Bonzini wrote: > On 18/06/2016 22:42, Ido Yariv wrote: > > The current code creates a whole page mmio region for the MSI-X table > > size. > > > > However, the page containing the MSI-X table may contain other registers > > not related to MSI-X. Creating an mmio region for the whole page masks > > such registers and may break drivers in the guest OS. > > > > Since maximal number of entries is known, use that instead to deduce the > > table size when setting up the mmio region. > > > > Signed-off-by: Ido Yariv <ido@wizery.com> > > I can take this patch, but I'd like to warn you that pci-assign is > deprecated (and replaced by VFIO). I seem to recall VFIO does this > correctly, but it would be great if you could check that. Just gave it a quick shot and everything seems to be working fine with VFIO, thanks! > Also, I would prefer the mmap/munmap to keep using MSIX_PAGE_SIZE, just > to limit the number of things that could break. I believe there might be another issue with this. The number of entries can be up to 2048, so the MSI-X table size can theoretically exceed one page and take up to 8 pages. We can just modify MSIX_PAGE_SIZE to 0x8000, but wouldn't it be better to just map exactly what we need? Cheers, Ido.
On Mon, 20 Jun 2016 11:40:17 -0400 Ido Yariv <ido@wizery.com> wrote: > Hi Paolo, > > On Mon, Jun 20, 2016 at 04:54:17PM +0200, Paolo Bonzini wrote: > > On 18/06/2016 22:42, Ido Yariv wrote: > > > The current code creates a whole page mmio region for the MSI-X table > > > size. > > > > > > However, the page containing the MSI-X table may contain other registers > > > not related to MSI-X. Creating an mmio region for the whole page masks > > > such registers and may break drivers in the guest OS. > > > > > > Since maximal number of entries is known, use that instead to deduce the > > > table size when setting up the mmio region. > > > > > > Signed-off-by: Ido Yariv <ido@wizery.com> > > > > I can take this patch, but I'd like to warn you that pci-assign is > > deprecated (and replaced by VFIO). I seem to recall VFIO does this > > correctly, but it would be great if you could check that. > > Just gave it a quick shot and everything seems to be working fine with > VFIO, thanks! > > > Also, I would prefer the mmap/munmap to keep using MSIX_PAGE_SIZE, just > > to limit the number of things that could break. > > I believe there might be another issue with this. The number of entries > can be up to 2048, so the MSI-X table size can theoretically exceed one > page and take up to 8 pages. > > We can just modify MSIX_PAGE_SIZE to 0x8000, but wouldn't it be better > to just map exactly what we need? Or, let's just let pci-assign be broken and use vfio-pci :-D We should probably actually put an error_report() in pci-assign indicating that it's deprecated and subject to removal to ferret out the remaining users. Fixing anything in it sets the wrong precedent. Thanks, Alex
On Sat, Jun 18, 2016 at 04:42:05PM -0400, Ido Yariv wrote: > The current code creates a whole page mmio region for the MSI-X table > size. > > However, the page containing the MSI-X table may contain other registers > not related to MSI-X. Creating an mmio region for the whole page masks > such registers and may break drivers in the guest OS. > > Since maximal number of entries is known, use that instead to deduce the > table size when setting up the mmio region. > > Signed-off-by: Ido Yariv <ido@wizery.com> I personally don't want to spend cycles maintaining the old pci-assign but if someone does I don't have an issue with that. I remember why I coded up MSIX_PAGE_SIZE - this was before the new memory API which made it very tricky to support sub-page mappings. The PCI spec stringly recommends against sharing a page between page table and other registers but I guess if a device violates that rule it's a better idea to make it work slowly than fail. Patch looks good to me so: Reviewed-by: Michael S. Tsirkin <mst@redhat.com> > --- > hw/i386/kvm/pci-assign.c | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c > index f9c9014..98997d1 100644 > --- a/hw/i386/kvm/pci-assign.c > +++ b/hw/i386/kvm/pci-assign.c > @@ -36,8 +36,6 @@ > #include "kvm_i386.h" > #include "hw/pci/pci-assign.h" > > -#define MSIX_PAGE_SIZE 0x1000 > - > /* From linux/ioport.h */ > #define IORESOURCE_IO 0x00000100 /* Resource type */ > #define IORESOURCE_MEM 0x00000200 > @@ -122,6 +120,7 @@ typedef struct AssignedDevice { > int *msi_virq; > MSIXTableEntry *msix_table; > hwaddr msix_table_addr; > + uint16_t msix_table_size; > uint16_t msix_max; > MemoryRegion mmio; > char *configfd_name; > @@ -1310,6 +1309,7 @@ static int assigned_device_pci_cap_init(PCIDevice *pci_dev, Error **errp) > bar_nr = msix_table_entry & PCI_MSIX_FLAGS_BIRMASK; > msix_table_entry &= ~PCI_MSIX_FLAGS_BIRMASK; > dev->msix_table_addr = pci_region[bar_nr].base_addr + msix_table_entry; > + dev->msix_table_size = msix_max * sizeof(MSIXTableEntry); > dev->msix_max = msix_max; > } > > @@ -1633,7 +1633,7 @@ static void assigned_dev_msix_reset(AssignedDevice *dev) > return; > } > > - memset(dev->msix_table, 0, MSIX_PAGE_SIZE); > + memset(dev->msix_table, 0, dev->msix_table_size); > > for (i = 0, entry = dev->msix_table; i < dev->msix_max; i++, entry++) { > entry->ctrl = cpu_to_le32(0x1); /* Masked */ > @@ -1642,8 +1642,8 @@ static void assigned_dev_msix_reset(AssignedDevice *dev) > > static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp) > { > - dev->msix_table = mmap(NULL, MSIX_PAGE_SIZE, PROT_READ|PROT_WRITE, > - MAP_ANONYMOUS|MAP_PRIVATE, 0, 0); > + dev->msix_table = mmap(NULL, dev->msix_table_size, PROT_READ | PROT_WRITE, > + MAP_ANONYMOUS | MAP_PRIVATE, 0, 0); > if (dev->msix_table == MAP_FAILED) { > error_setg_errno(errp, errno, "failed to allocate msix_table"); > dev->msix_table = NULL; > @@ -1653,7 +1653,7 @@ static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp) > assigned_dev_msix_reset(dev); > > memory_region_init_io(&dev->mmio, OBJECT(dev), &assigned_dev_msix_mmio_ops, > - dev, "assigned-dev-msix", MSIX_PAGE_SIZE); > + dev, "assigned-dev-msix", dev->msix_table_size); > } > > static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev) > @@ -1662,7 +1662,7 @@ static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev) > return; > } > > - if (munmap(dev->msix_table, MSIX_PAGE_SIZE) == -1) { > + if (munmap(dev->msix_table, dev->msix_table_size) == -1) { > error_report("error unmapping msix_table! %s", strerror(errno)); > } > dev->msix_table = NULL; > -- > 2.5.5
diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c index f9c9014..98997d1 100644 --- a/hw/i386/kvm/pci-assign.c +++ b/hw/i386/kvm/pci-assign.c @@ -36,8 +36,6 @@ #include "kvm_i386.h" #include "hw/pci/pci-assign.h" -#define MSIX_PAGE_SIZE 0x1000 - /* From linux/ioport.h */ #define IORESOURCE_IO 0x00000100 /* Resource type */ #define IORESOURCE_MEM 0x00000200 @@ -122,6 +120,7 @@ typedef struct AssignedDevice { int *msi_virq; MSIXTableEntry *msix_table; hwaddr msix_table_addr; + uint16_t msix_table_size; uint16_t msix_max; MemoryRegion mmio; char *configfd_name; @@ -1310,6 +1309,7 @@ static int assigned_device_pci_cap_init(PCIDevice *pci_dev, Error **errp) bar_nr = msix_table_entry & PCI_MSIX_FLAGS_BIRMASK; msix_table_entry &= ~PCI_MSIX_FLAGS_BIRMASK; dev->msix_table_addr = pci_region[bar_nr].base_addr + msix_table_entry; + dev->msix_table_size = msix_max * sizeof(MSIXTableEntry); dev->msix_max = msix_max; } @@ -1633,7 +1633,7 @@ static void assigned_dev_msix_reset(AssignedDevice *dev) return; } - memset(dev->msix_table, 0, MSIX_PAGE_SIZE); + memset(dev->msix_table, 0, dev->msix_table_size); for (i = 0, entry = dev->msix_table; i < dev->msix_max; i++, entry++) { entry->ctrl = cpu_to_le32(0x1); /* Masked */ @@ -1642,8 +1642,8 @@ static void assigned_dev_msix_reset(AssignedDevice *dev) static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp) { - dev->msix_table = mmap(NULL, MSIX_PAGE_SIZE, PROT_READ|PROT_WRITE, - MAP_ANONYMOUS|MAP_PRIVATE, 0, 0); + dev->msix_table = mmap(NULL, dev->msix_table_size, PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_PRIVATE, 0, 0); if (dev->msix_table == MAP_FAILED) { error_setg_errno(errp, errno, "failed to allocate msix_table"); dev->msix_table = NULL; @@ -1653,7 +1653,7 @@ static void assigned_dev_register_msix_mmio(AssignedDevice *dev, Error **errp) assigned_dev_msix_reset(dev); memory_region_init_io(&dev->mmio, OBJECT(dev), &assigned_dev_msix_mmio_ops, - dev, "assigned-dev-msix", MSIX_PAGE_SIZE); + dev, "assigned-dev-msix", dev->msix_table_size); } static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev) @@ -1662,7 +1662,7 @@ static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev) return; } - if (munmap(dev->msix_table, MSIX_PAGE_SIZE) == -1) { + if (munmap(dev->msix_table, dev->msix_table_size) == -1) { error_report("error unmapping msix_table! %s", strerror(errno)); } dev->msix_table = NULL;
The current code creates a whole page mmio region for the MSI-X table size. However, the page containing the MSI-X table may contain other registers not related to MSI-X. Creating an mmio region for the whole page masks such registers and may break drivers in the guest OS. Since maximal number of entries is known, use that instead to deduce the table size when setting up the mmio region. Signed-off-by: Ido Yariv <ido@wizery.com> --- hw/i386/kvm/pci-assign.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)