[RESEND] fix MSI injection on Xen
diff mbox

Message ID alpine.DEB.2.02.1601131434120.13564@kaball.uk.xensource.com
State New, archived
Headers show

Commit Message

Stefano Stabellini Jan. 13, 2016, 2:59 p.m. UTC
On Xen MSIs can be remapped into pirqs, which are a type of event
channels. It's mostly for the benefit of PCI passthrough devices, to
avoid the overhead of interacting with the emulated lapic.

However remapping interrupts and MSIs is also supported for emulated
devices, such as the e1000 and virtio-net.

When an interrupt or an MSI is remapped into a pirq, masking and
unmasking is done by masking and unmasking the event channel. The
masking bit on the PCI config space or MSI-X table should be ignored,
but it isn't at the moment.

As a consequence emulated devices which use MSI or MSI-X, such as
virtio-net, don't work properly (the guest doesn't receive any
notifications). The mechanism was working properly when xen_apic was
introduced, but I haven't narrowed down which commit in particular is
causing the regression.

Fix the issue by ignoring the masking bit for MSI and MSI-X which have
been remapped into pirqs.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Comments

Konrad Rzeszutek Wilk Jan. 27, 2016, 5:41 p.m. UTC | #1
On Wed, Jan 13, 2016 at 02:59:09PM +0000, Stefano Stabellini wrote:
> On Xen MSIs can be remapped into pirqs, which are a type of event
> channels. It's mostly for the benefit of PCI passthrough devices, to
> avoid the overhead of interacting with the emulated lapic.
> 
> However remapping interrupts and MSIs is also supported for emulated
> devices, such as the e1000 and virtio-net.
> 
> When an interrupt or an MSI is remapped into a pirq, masking and
> unmasking is done by masking and unmasking the event channel. The
> masking bit on the PCI config space or MSI-X table should be ignored,
> but it isn't at the moment.
> 
> As a consequence emulated devices which use MSI or MSI-X, such as
> virtio-net, don't work properly (the guest doesn't receive any
> notifications). The mechanism was working properly when xen_apic was

By xen_apic I presume the Linux Xen APIC code?

> introduced, but I haven't narrowed down which commit in particular is
> causing the regression.

It sounds like the issue was due to the Xen APIC code? Or am I misreading it?
Stefano Stabellini Jan. 28, 2016, 10:32 a.m. UTC | #2
On Wed, 27 Jan 2016, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 13, 2016 at 02:59:09PM +0000, Stefano Stabellini wrote:
> > On Xen MSIs can be remapped into pirqs, which are a type of event
> > channels. It's mostly for the benefit of PCI passthrough devices, to
> > avoid the overhead of interacting with the emulated lapic.
> > 
> > However remapping interrupts and MSIs is also supported for emulated
> > devices, such as the e1000 and virtio-net.
> > 
> > When an interrupt or an MSI is remapped into a pirq, masking and
> > unmasking is done by masking and unmasking the event channel. The
> > masking bit on the PCI config space or MSI-X table should be ignored,
> > but it isn't at the moment.
> > 
> > As a consequence emulated devices which use MSI or MSI-X, such as
> > virtio-net, don't work properly (the guest doesn't receive any
> > notifications). The mechanism was working properly when xen_apic was
> 
> By xen_apic I presume the Linux Xen APIC code?

Nope, I mean hw/i386/xen/xen_apic.c in QEMU.


> > introduced, but I haven't narrowed down which commit in particular is
> > causing the regression.
> 
> It sounds like the issue was due to the Xen APIC code? Or am I misreading it?

When xen_apic was introduced in QEMU, MSIs did work correctly. After all
xen_apic was created to make emulated MSIs work on Xen, it wouldn't have
been possible to test it otherwise.
Stefano Stabellini Feb. 4, 2016, 5:05 p.m. UTC | #3
Hi Michael,

do you have any comments on this?

On Wed, 13 Jan 2016, Stefano Stabellini wrote:
> On Xen MSIs can be remapped into pirqs, which are a type of event
> channels. It's mostly for the benefit of PCI passthrough devices, to
> avoid the overhead of interacting with the emulated lapic.
> 
> However remapping interrupts and MSIs is also supported for emulated
> devices, such as the e1000 and virtio-net.
> 
> When an interrupt or an MSI is remapped into a pirq, masking and
> unmasking is done by masking and unmasking the event channel. The
> masking bit on the PCI config space or MSI-X table should be ignored,
> but it isn't at the moment.
> 
> As a consequence emulated devices which use MSI or MSI-X, such as
> virtio-net, don't work properly (the guest doesn't receive any
> notifications). The mechanism was working properly when xen_apic was
> introduced, but I haven't narrowed down which commit in particular is
> causing the regression.
> 
> Fix the issue by ignoring the masking bit for MSI and MSI-X which have
> been remapped into pirqs.
> 
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> 
> 
> diff --git a/hw/pci/msi.c b/hw/pci/msi.c
> index f9c0484..3998725 100644
> --- a/hw/pci/msi.c
> +++ b/hw/pci/msi.c
> @@ -19,6 +19,7 @@
>   */
>  
>  #include "hw/pci/msi.h"
> +#include "hw/xen/xen.h"
>  #include "qemu/range.h"
>  
>  /* PCI_MSI_ADDRESS_LO */
> @@ -253,13 +254,19 @@ void msi_reset(PCIDevice *dev)
>  static bool msi_is_masked(const PCIDevice *dev, unsigned int vector)
>  {
>      uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
> -    uint32_t mask;
> +    uint32_t mask, data;
> +    bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
>      assert(vector < PCI_MSI_VECTORS_MAX);
>  
>      if (!(flags & PCI_MSI_FLAGS_MASKBIT)) {
>          return false;
>      }
>  
> +    data = pci_get_word(dev->config + msi_data_off(dev, msi64bit));
> +    if (xen_is_pirq_msi(data)) {
> +        return false;
> +    }
> +
>      mask = pci_get_long(dev->config +
>                          msi_mask_off(dev, flags & PCI_MSI_FLAGS_64BIT));
>      return mask & (1U << vector);
> diff --git a/hw/pci/msix.c b/hw/pci/msix.c
> index 7716bf3..96281c2 100644
> --- a/hw/pci/msix.c
> +++ b/hw/pci/msix.c
> @@ -18,6 +18,7 @@
>  #include "hw/pci/msi.h"
>  #include "hw/pci/msix.h"
>  #include "hw/pci/pci.h"
> +#include "hw/xen/xen.h"
>  #include "qemu/range.h"
>  
>  #define MSIX_CAP_LENGTH 12
> @@ -77,8 +78,15 @@ static void msix_clr_pending(PCIDevice *dev, int vector)
>  
>  static bool msix_vector_masked(PCIDevice *dev, unsigned int vector, bool fmask)
>  {
> -    unsigned offset = vector * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL;
> -    return fmask || dev->msix_table[offset] & PCI_MSIX_ENTRY_CTRL_MASKBIT;
> +    unsigned offset = vector * PCI_MSIX_ENTRY_SIZE;
> +    uint32_t *data = (uint32_t *)&dev->msix_table[offset + PCI_MSIX_ENTRY_DATA];
> +    /* MSIs on Xen can be remapped into pirqs. In those cases, masking
> +     * and unmasking go through the PV evtchn path. */
> +    if (xen_is_pirq_msi(*data)) {
> +        return false;
> +    }
> +    return fmask || dev->msix_table[offset + PCI_MSIX_ENTRY_VECTOR_CTRL] &
> +        PCI_MSIX_ENTRY_CTRL_MASKBIT;
>  }
>  
>  bool msix_is_masked(PCIDevice *dev, unsigned int vector)
> diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
> index 82de2bc..375707e 100644
> --- a/hw/xen/xen_pt_msi.c
> +++ b/hw/xen/xen_pt_msi.c
> @@ -113,9 +113,7 @@ static int msi_msix_setup(XenPCIPassthroughState *s,
>  
>      assert((!is_msix && msix_entry == 0) || is_msix);
>  
> -    if (gvec == 0) {
> -        /* if gvec is 0, the guest is asking for a particular pirq that
> -         * is passed as dest_id */
> +    if (xen_is_pirq_msi(data)) {
>          *ppirq = msi_ext_dest_id(addr >> 32) | msi_dest_id(addr);
>          if (!*ppirq) {
>              /* this probably identifies an misconfiguration of the guest,
> diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
> index 4356af4..b15b2f5 100644
> --- a/include/hw/xen/xen.h
> +++ b/include/hw/xen/xen.h
> @@ -32,6 +32,7 @@ int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num);
>  void xen_piix3_set_irq(void *opaque, int irq_num, int level);
>  void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);
>  void xen_hvm_inject_msi(uint64_t addr, uint32_t data);
> +int xen_is_pirq_msi(uint32_t msi_data);
>  
>  qemu_irq *xen_interrupt_controller_init(void);
>  
> diff --git a/xen-hvm-stub.c b/xen-hvm-stub.c
> index 46867d8..ce22a82 100644
> --- a/xen-hvm-stub.c
> +++ b/xen-hvm-stub.c
> @@ -30,6 +30,11 @@ void xen_hvm_inject_msi(uint64_t addr, uint32_t data)
>  {
>  }
>  
> +int xen_is_pirq_msi(uint32_t msi_data)
> +{
> +    return 0;
> +}
> +
>  void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
>  {
>  }
> diff --git a/xen-hvm.c b/xen-hvm.c
> index 3371c4e..21dd301 100644
> --- a/xen-hvm.c
> +++ b/xen-hvm.c
> @@ -12,6 +12,7 @@
>  
>  #include "hw/pci/pci.h"
>  #include "hw/i386/pc.h"
> +#include "hw/i386/apic-msidef.h"
>  #include "hw/xen/xen_common.h"
>  #include "hw/xen/xen_backend.h"
>  #include "qmp-commands.h"
> @@ -156,6 +157,14 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
>      }
>  }
>  
> +int xen_is_pirq_msi(uint32_t msi_data)
> +{
> +    /* If vector is 0, the msi is remapped into a pirq, passed as
> +     * dest_id.
> +     */
> +    return ((msi_data & MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT) == 0;
> +}
> +
>  void xen_hvm_inject_msi(uint64_t addr, uint32_t data)
>  {
>      xen_xc_hvm_inject_msi(xen_xc, xen_domid, addr, data);
>
Michael S. Tsirkin Feb. 4, 2016, 6:16 p.m. UTC | #4
On Thu, Feb 04, 2016 at 05:05:46PM +0000, Stefano Stabellini wrote:
> Hi Michael,
> 
> do you have any comments on this?

I dislike how it spreads xen specific stuff around,
but I don't have a better idea at the moment, so
I applied this.

> On Wed, 13 Jan 2016, Stefano Stabellini wrote:
> > On Xen MSIs can be remapped into pirqs, which are a type of event
> > channels. It's mostly for the benefit of PCI passthrough devices, to
> > avoid the overhead of interacting with the emulated lapic.
> > 
> > However remapping interrupts and MSIs is also supported for emulated
> > devices, such as the e1000 and virtio-net.
> > 
> > When an interrupt or an MSI is remapped into a pirq, masking and
> > unmasking is done by masking and unmasking the event channel. The
> > masking bit on the PCI config space or MSI-X table should be ignored,
> > but it isn't at the moment.
> > 
> > As a consequence emulated devices which use MSI or MSI-X, such as
> > virtio-net, don't work properly (the guest doesn't receive any
> > notifications). The mechanism was working properly when xen_apic was
> > introduced, but I haven't narrowed down which commit in particular is
> > causing the regression.
> > 
> > Fix the issue by ignoring the masking bit for MSI and MSI-X which have
> > been remapped into pirqs.
> > 
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > 
> > 
> > diff --git a/hw/pci/msi.c b/hw/pci/msi.c
> > index f9c0484..3998725 100644
> > --- a/hw/pci/msi.c
> > +++ b/hw/pci/msi.c
> > @@ -19,6 +19,7 @@
> >   */
> >  
> >  #include "hw/pci/msi.h"
> > +#include "hw/xen/xen.h"
> >  #include "qemu/range.h"
> >  
> >  /* PCI_MSI_ADDRESS_LO */
> > @@ -253,13 +254,19 @@ void msi_reset(PCIDevice *dev)
> >  static bool msi_is_masked(const PCIDevice *dev, unsigned int vector)
> >  {
> >      uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
> > -    uint32_t mask;
> > +    uint32_t mask, data;
> > +    bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
> >      assert(vector < PCI_MSI_VECTORS_MAX);
> >  
> >      if (!(flags & PCI_MSI_FLAGS_MASKBIT)) {
> >          return false;
> >      }
> >  
> > +    data = pci_get_word(dev->config + msi_data_off(dev, msi64bit));
> > +    if (xen_is_pirq_msi(data)) {
> > +        return false;
> > +    }
> > +
> >      mask = pci_get_long(dev->config +
> >                          msi_mask_off(dev, flags & PCI_MSI_FLAGS_64BIT));
> >      return mask & (1U << vector);
> > diff --git a/hw/pci/msix.c b/hw/pci/msix.c
> > index 7716bf3..96281c2 100644
> > --- a/hw/pci/msix.c
> > +++ b/hw/pci/msix.c
> > @@ -18,6 +18,7 @@
> >  #include "hw/pci/msi.h"
> >  #include "hw/pci/msix.h"
> >  #include "hw/pci/pci.h"
> > +#include "hw/xen/xen.h"
> >  #include "qemu/range.h"
> >  
> >  #define MSIX_CAP_LENGTH 12
> > @@ -77,8 +78,15 @@ static void msix_clr_pending(PCIDevice *dev, int vector)
> >  
> >  static bool msix_vector_masked(PCIDevice *dev, unsigned int vector, bool fmask)
> >  {
> > -    unsigned offset = vector * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL;
> > -    return fmask || dev->msix_table[offset] & PCI_MSIX_ENTRY_CTRL_MASKBIT;
> > +    unsigned offset = vector * PCI_MSIX_ENTRY_SIZE;
> > +    uint32_t *data = (uint32_t *)&dev->msix_table[offset + PCI_MSIX_ENTRY_DATA];
> > +    /* MSIs on Xen can be remapped into pirqs. In those cases, masking
> > +     * and unmasking go through the PV evtchn path. */
> > +    if (xen_is_pirq_msi(*data)) {
> > +        return false;
> > +    }
> > +    return fmask || dev->msix_table[offset + PCI_MSIX_ENTRY_VECTOR_CTRL] &
> > +        PCI_MSIX_ENTRY_CTRL_MASKBIT;
> >  }
> >  
> >  bool msix_is_masked(PCIDevice *dev, unsigned int vector)
> > diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
> > index 82de2bc..375707e 100644
> > --- a/hw/xen/xen_pt_msi.c
> > +++ b/hw/xen/xen_pt_msi.c
> > @@ -113,9 +113,7 @@ static int msi_msix_setup(XenPCIPassthroughState *s,
> >  
> >      assert((!is_msix && msix_entry == 0) || is_msix);
> >  
> > -    if (gvec == 0) {
> > -        /* if gvec is 0, the guest is asking for a particular pirq that
> > -         * is passed as dest_id */
> > +    if (xen_is_pirq_msi(data)) {
> >          *ppirq = msi_ext_dest_id(addr >> 32) | msi_dest_id(addr);
> >          if (!*ppirq) {
> >              /* this probably identifies an misconfiguration of the guest,
> > diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
> > index 4356af4..b15b2f5 100644
> > --- a/include/hw/xen/xen.h
> > +++ b/include/hw/xen/xen.h
> > @@ -32,6 +32,7 @@ int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num);
> >  void xen_piix3_set_irq(void *opaque, int irq_num, int level);
> >  void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);
> >  void xen_hvm_inject_msi(uint64_t addr, uint32_t data);
> > +int xen_is_pirq_msi(uint32_t msi_data);
> >  
> >  qemu_irq *xen_interrupt_controller_init(void);
> >  
> > diff --git a/xen-hvm-stub.c b/xen-hvm-stub.c
> > index 46867d8..ce22a82 100644
> > --- a/xen-hvm-stub.c
> > +++ b/xen-hvm-stub.c
> > @@ -30,6 +30,11 @@ void xen_hvm_inject_msi(uint64_t addr, uint32_t data)
> >  {
> >  }
> >  
> > +int xen_is_pirq_msi(uint32_t msi_data)
> > +{
> > +    return 0;
> > +}
> > +
> >  void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
> >  {
> >  }
> > diff --git a/xen-hvm.c b/xen-hvm.c
> > index 3371c4e..21dd301 100644
> > --- a/xen-hvm.c
> > +++ b/xen-hvm.c
> > @@ -12,6 +12,7 @@
> >  
> >  #include "hw/pci/pci.h"
> >  #include "hw/i386/pc.h"
> > +#include "hw/i386/apic-msidef.h"
> >  #include "hw/xen/xen_common.h"
> >  #include "hw/xen/xen_backend.h"
> >  #include "qmp-commands.h"
> > @@ -156,6 +157,14 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
> >      }
> >  }
> >  
> > +int xen_is_pirq_msi(uint32_t msi_data)
> > +{
> > +    /* If vector is 0, the msi is remapped into a pirq, passed as
> > +     * dest_id.
> > +     */
> > +    return ((msi_data & MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT) == 0;
> > +}
> > +
> >  void xen_hvm_inject_msi(uint64_t addr, uint32_t data)
> >  {
> >      xen_xc_hvm_inject_msi(xen_xc, xen_domid, addr, data);
> >
Stefano Stabellini Feb. 8, 2016, 11:18 a.m. UTC | #5
On Thu, 4 Feb 2016, Michael S. Tsirkin wrote:
> On Thu, Feb 04, 2016 at 05:05:46PM +0000, Stefano Stabellini wrote:
> > Hi Michael,
> > 
> > do you have any comments on this?
> 
> I dislike how it spreads xen specific stuff around,
> but I don't have a better idea at the moment, so
> I applied this.

Thank you


> > On Wed, 13 Jan 2016, Stefano Stabellini wrote:
> > > On Xen MSIs can be remapped into pirqs, which are a type of event
> > > channels. It's mostly for the benefit of PCI passthrough devices, to
> > > avoid the overhead of interacting with the emulated lapic.
> > > 
> > > However remapping interrupts and MSIs is also supported for emulated
> > > devices, such as the e1000 and virtio-net.
> > > 
> > > When an interrupt or an MSI is remapped into a pirq, masking and
> > > unmasking is done by masking and unmasking the event channel. The
> > > masking bit on the PCI config space or MSI-X table should be ignored,
> > > but it isn't at the moment.
> > > 
> > > As a consequence emulated devices which use MSI or MSI-X, such as
> > > virtio-net, don't work properly (the guest doesn't receive any
> > > notifications). The mechanism was working properly when xen_apic was
> > > introduced, but I haven't narrowed down which commit in particular is
> > > causing the regression.
> > > 
> > > Fix the issue by ignoring the masking bit for MSI and MSI-X which have
> > > been remapped into pirqs.
> > > 
> > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > > 
> > > 
> > > diff --git a/hw/pci/msi.c b/hw/pci/msi.c
> > > index f9c0484..3998725 100644
> > > --- a/hw/pci/msi.c
> > > +++ b/hw/pci/msi.c
> > > @@ -19,6 +19,7 @@
> > >   */
> > >  
> > >  #include "hw/pci/msi.h"
> > > +#include "hw/xen/xen.h"
> > >  #include "qemu/range.h"
> > >  
> > >  /* PCI_MSI_ADDRESS_LO */
> > > @@ -253,13 +254,19 @@ void msi_reset(PCIDevice *dev)
> > >  static bool msi_is_masked(const PCIDevice *dev, unsigned int vector)
> > >  {
> > >      uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
> > > -    uint32_t mask;
> > > +    uint32_t mask, data;
> > > +    bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
> > >      assert(vector < PCI_MSI_VECTORS_MAX);
> > >  
> > >      if (!(flags & PCI_MSI_FLAGS_MASKBIT)) {
> > >          return false;
> > >      }
> > >  
> > > +    data = pci_get_word(dev->config + msi_data_off(dev, msi64bit));
> > > +    if (xen_is_pirq_msi(data)) {
> > > +        return false;
> > > +    }
> > > +
> > >      mask = pci_get_long(dev->config +
> > >                          msi_mask_off(dev, flags & PCI_MSI_FLAGS_64BIT));
> > >      return mask & (1U << vector);
> > > diff --git a/hw/pci/msix.c b/hw/pci/msix.c
> > > index 7716bf3..96281c2 100644
> > > --- a/hw/pci/msix.c
> > > +++ b/hw/pci/msix.c
> > > @@ -18,6 +18,7 @@
> > >  #include "hw/pci/msi.h"
> > >  #include "hw/pci/msix.h"
> > >  #include "hw/pci/pci.h"
> > > +#include "hw/xen/xen.h"
> > >  #include "qemu/range.h"
> > >  
> > >  #define MSIX_CAP_LENGTH 12
> > > @@ -77,8 +78,15 @@ static void msix_clr_pending(PCIDevice *dev, int vector)
> > >  
> > >  static bool msix_vector_masked(PCIDevice *dev, unsigned int vector, bool fmask)
> > >  {
> > > -    unsigned offset = vector * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL;
> > > -    return fmask || dev->msix_table[offset] & PCI_MSIX_ENTRY_CTRL_MASKBIT;
> > > +    unsigned offset = vector * PCI_MSIX_ENTRY_SIZE;
> > > +    uint32_t *data = (uint32_t *)&dev->msix_table[offset + PCI_MSIX_ENTRY_DATA];
> > > +    /* MSIs on Xen can be remapped into pirqs. In those cases, masking
> > > +     * and unmasking go through the PV evtchn path. */
> > > +    if (xen_is_pirq_msi(*data)) {
> > > +        return false;
> > > +    }
> > > +    return fmask || dev->msix_table[offset + PCI_MSIX_ENTRY_VECTOR_CTRL] &
> > > +        PCI_MSIX_ENTRY_CTRL_MASKBIT;
> > >  }
> > >  
> > >  bool msix_is_masked(PCIDevice *dev, unsigned int vector)
> > > diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
> > > index 82de2bc..375707e 100644
> > > --- a/hw/xen/xen_pt_msi.c
> > > +++ b/hw/xen/xen_pt_msi.c
> > > @@ -113,9 +113,7 @@ static int msi_msix_setup(XenPCIPassthroughState *s,
> > >  
> > >      assert((!is_msix && msix_entry == 0) || is_msix);
> > >  
> > > -    if (gvec == 0) {
> > > -        /* if gvec is 0, the guest is asking for a particular pirq that
> > > -         * is passed as dest_id */
> > > +    if (xen_is_pirq_msi(data)) {
> > >          *ppirq = msi_ext_dest_id(addr >> 32) | msi_dest_id(addr);
> > >          if (!*ppirq) {
> > >              /* this probably identifies an misconfiguration of the guest,
> > > diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
> > > index 4356af4..b15b2f5 100644
> > > --- a/include/hw/xen/xen.h
> > > +++ b/include/hw/xen/xen.h
> > > @@ -32,6 +32,7 @@ int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num);
> > >  void xen_piix3_set_irq(void *opaque, int irq_num, int level);
> > >  void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);
> > >  void xen_hvm_inject_msi(uint64_t addr, uint32_t data);
> > > +int xen_is_pirq_msi(uint32_t msi_data);
> > >  
> > >  qemu_irq *xen_interrupt_controller_init(void);
> > >  
> > > diff --git a/xen-hvm-stub.c b/xen-hvm-stub.c
> > > index 46867d8..ce22a82 100644
> > > --- a/xen-hvm-stub.c
> > > +++ b/xen-hvm-stub.c
> > > @@ -30,6 +30,11 @@ void xen_hvm_inject_msi(uint64_t addr, uint32_t data)
> > >  {
> > >  }
> > >  
> > > +int xen_is_pirq_msi(uint32_t msi_data)
> > > +{
> > > +    return 0;
> > > +}
> > > +
> > >  void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
> > >  {
> > >  }
> > > diff --git a/xen-hvm.c b/xen-hvm.c
> > > index 3371c4e..21dd301 100644
> > > --- a/xen-hvm.c
> > > +++ b/xen-hvm.c
> > > @@ -12,6 +12,7 @@
> > >  
> > >  #include "hw/pci/pci.h"
> > >  #include "hw/i386/pc.h"
> > > +#include "hw/i386/apic-msidef.h"
> > >  #include "hw/xen/xen_common.h"
> > >  #include "hw/xen/xen_backend.h"
> > >  #include "qmp-commands.h"
> > > @@ -156,6 +157,14 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
> > >      }
> > >  }
> > >  
> > > +int xen_is_pirq_msi(uint32_t msi_data)
> > > +{
> > > +    /* If vector is 0, the msi is remapped into a pirq, passed as
> > > +     * dest_id.
> > > +     */
> > > +    return ((msi_data & MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT) == 0;
> > > +}
> > > +
> > >  void xen_hvm_inject_msi(uint64_t addr, uint32_t data)
> > >  {
> > >      xen_xc_hvm_inject_msi(xen_xc, xen_domid, addr, data);
> > > 
>

Patch
diff mbox

diff --git a/hw/pci/msi.c b/hw/pci/msi.c
index f9c0484..3998725 100644
--- a/hw/pci/msi.c
+++ b/hw/pci/msi.c
@@ -19,6 +19,7 @@ 
  */
 
 #include "hw/pci/msi.h"
+#include "hw/xen/xen.h"
 #include "qemu/range.h"
 
 /* PCI_MSI_ADDRESS_LO */
@@ -253,13 +254,19 @@  void msi_reset(PCIDevice *dev)
 static bool msi_is_masked(const PCIDevice *dev, unsigned int vector)
 {
     uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
-    uint32_t mask;
+    uint32_t mask, data;
+    bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
     assert(vector < PCI_MSI_VECTORS_MAX);
 
     if (!(flags & PCI_MSI_FLAGS_MASKBIT)) {
         return false;
     }
 
+    data = pci_get_word(dev->config + msi_data_off(dev, msi64bit));
+    if (xen_is_pirq_msi(data)) {
+        return false;
+    }
+
     mask = pci_get_long(dev->config +
                         msi_mask_off(dev, flags & PCI_MSI_FLAGS_64BIT));
     return mask & (1U << vector);
diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index 7716bf3..96281c2 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -18,6 +18,7 @@ 
 #include "hw/pci/msi.h"
 #include "hw/pci/msix.h"
 #include "hw/pci/pci.h"
+#include "hw/xen/xen.h"
 #include "qemu/range.h"
 
 #define MSIX_CAP_LENGTH 12
@@ -77,8 +78,15 @@  static void msix_clr_pending(PCIDevice *dev, int vector)
 
 static bool msix_vector_masked(PCIDevice *dev, unsigned int vector, bool fmask)
 {
-    unsigned offset = vector * PCI_MSIX_ENTRY_SIZE + PCI_MSIX_ENTRY_VECTOR_CTRL;
-    return fmask || dev->msix_table[offset] & PCI_MSIX_ENTRY_CTRL_MASKBIT;
+    unsigned offset = vector * PCI_MSIX_ENTRY_SIZE;
+    uint32_t *data = (uint32_t *)&dev->msix_table[offset + PCI_MSIX_ENTRY_DATA];
+    /* MSIs on Xen can be remapped into pirqs. In those cases, masking
+     * and unmasking go through the PV evtchn path. */
+    if (xen_is_pirq_msi(*data)) {
+        return false;
+    }
+    return fmask || dev->msix_table[offset + PCI_MSIX_ENTRY_VECTOR_CTRL] &
+        PCI_MSIX_ENTRY_CTRL_MASKBIT;
 }
 
 bool msix_is_masked(PCIDevice *dev, unsigned int vector)
diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
index 82de2bc..375707e 100644
--- a/hw/xen/xen_pt_msi.c
+++ b/hw/xen/xen_pt_msi.c
@@ -113,9 +113,7 @@  static int msi_msix_setup(XenPCIPassthroughState *s,
 
     assert((!is_msix && msix_entry == 0) || is_msix);
 
-    if (gvec == 0) {
-        /* if gvec is 0, the guest is asking for a particular pirq that
-         * is passed as dest_id */
+    if (xen_is_pirq_msi(data)) {
         *ppirq = msi_ext_dest_id(addr >> 32) | msi_dest_id(addr);
         if (!*ppirq) {
             /* this probably identifies an misconfiguration of the guest,
diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index 4356af4..b15b2f5 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -32,6 +32,7 @@  int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num);
 void xen_piix3_set_irq(void *opaque, int irq_num, int level);
 void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len);
 void xen_hvm_inject_msi(uint64_t addr, uint32_t data);
+int xen_is_pirq_msi(uint32_t msi_data);
 
 qemu_irq *xen_interrupt_controller_init(void);
 
diff --git a/xen-hvm-stub.c b/xen-hvm-stub.c
index 46867d8..ce22a82 100644
--- a/xen-hvm-stub.c
+++ b/xen-hvm-stub.c
@@ -30,6 +30,11 @@  void xen_hvm_inject_msi(uint64_t addr, uint32_t data)
 {
 }
 
+int xen_is_pirq_msi(uint32_t msi_data)
+{
+    return 0;
+}
+
 void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
 {
 }
diff --git a/xen-hvm.c b/xen-hvm.c
index 3371c4e..21dd301 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -12,6 +12,7 @@ 
 
 #include "hw/pci/pci.h"
 #include "hw/i386/pc.h"
+#include "hw/i386/apic-msidef.h"
 #include "hw/xen/xen_common.h"
 #include "hw/xen/xen_backend.h"
 #include "qmp-commands.h"
@@ -156,6 +157,14 @@  void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
     }
 }
 
+int xen_is_pirq_msi(uint32_t msi_data)
+{
+    /* If vector is 0, the msi is remapped into a pirq, passed as
+     * dest_id.
+     */
+    return ((msi_data & MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT) == 0;
+}
+
 void xen_hvm_inject_msi(uint64_t addr, uint32_t data)
 {
     xen_xc_hvm_inject_msi(xen_xc, xen_domid, addr, data);