Message ID: 20250310095535.46033-3-roger.pau@citrix.com (mailing list archive)
State: Superseded
Series: x86/pci: reduce PCI accesses
On 10.03.2025 10:55, Roger Pau Monne wrote:
> Attempt to reduce the MSI entry writes, and the associated checking whether
> memory decoding and MSI-X is enabled for the PCI device, when the MSI data
> hasn't changed.
>
> When using Interrupt Remapping the MSI entry will contain an index into
> the remapping table, and it's in such remapping table where the MSI vector
> and destination CPU is stored. As such, when using interrupt remapping,
> changes to the interrupt affinity shouldn't result in changes to the MSI
> entry, and the MSI entry update can be avoided.
>
> Signal from the IOMMU update_ire_from_msi hook whether the MSI data or
> address fields have changed, and thus need writing to the device registers.
> Such signaling is done by returning 1 from the function. Otherwise
> returning 0 means no update of the MSI fields, and thus no write
> required.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with two purely cosmetic suggestions and an only loosely related question below.

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -415,7 +415,9 @@ static int cf_check vmx_pi_update_irte(const struct vcpu *v,
>
>      ASSERT_PDEV_LIST_IS_READ_LOCKED(msi_desc->dev->domain);
>
> -    return iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
> +    rc = iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
> +
> +    return rc < 0 ? rc : 0;

Only tangential here, but: Why does this function have a return type of
non-void, when neither caller cares?

> --- a/xen/drivers/passthrough/amd/iommu_intr.c
> +++ b/xen/drivers/passthrough/amd/iommu_intr.c
> @@ -492,7 +492,7 @@ static int update_intremap_entry_from_msi_msg(
>              get_ivrs_mappings(iommu->seg)[alias_id].intremap_table);
>      }
>
> -    return 0;
> +    return !fresh ? 0 : 1;
>  }

Simply

    return fresh;

?

> @@ -546,7 +546,7 @@ int cf_check amd_iommu_msi_msg_update_ire(
>      rc = update_intremap_entry_from_msi_msg(iommu, bdf, nr,
>                                              &msi_desc->remap_index,
>                                              msg, &data);
> -    if ( !rc )
> +    if ( rc > 0 )
>      {
>          for ( i = 1; i < nr; ++i )
>              msi_desc[i].remap_index = msi_desc->remap_index + i;
> --- a/xen/drivers/passthrough/vtd/intremap.c
> +++ b/xen/drivers/passthrough/vtd/intremap.c
> @@ -506,6 +506,7 @@ static int msi_msg_to_remap_entry(
>      unsigned int index, i, nr = 1;
>      unsigned long flags;
>      const struct pi_desc *pi_desc = msi_desc->pi_desc;
> +    bool alloc = false;
>
>      if ( msi_desc->msi_attrib.type == PCI_CAP_ID_MSI )
>          nr = msi_desc->msi.nvec;
> @@ -529,6 +530,7 @@ static int msi_msg_to_remap_entry(
>          index = alloc_remap_entry(iommu, nr);
>          for ( i = 0; i < nr; ++i )
>              msi_desc[i].remap_index = index + i;
> +        alloc = true;
>      }
>      else
>          index = msi_desc->remap_index;
> @@ -601,7 +603,7 @@ static int msi_msg_to_remap_entry(
>      unmap_vtd_domain_page(iremap_entries);
>      spin_unlock_irqrestore(&iommu->intremap.lock, flags);
>
> -    return 0;
> +    return alloc ? 1 : 0;
>  }

Like above, simply

    return alloc;

?

Jan
On Mon, Mar 10, 2025 at 11:51:09AM +0100, Jan Beulich wrote:
> On 10.03.2025 10:55, Roger Pau Monne wrote:
> > Attempt to reduce the MSI entry writes, and the associated checking whether
> > memory decoding and MSI-X is enabled for the PCI device, when the MSI data
> > hasn't changed.
> >
> > When using Interrupt Remapping the MSI entry will contain an index into
> > the remapping table, and it's in such remapping table where the MSI vector
> > and destination CPU is stored. As such, when using interrupt remapping,
> > changes to the interrupt affinity shouldn't result in changes to the MSI
> > entry, and the MSI entry update can be avoided.
> >
> > Signal from the IOMMU update_ire_from_msi hook whether the MSI data or
> > address fields have changed, and thus need writing to the device registers.
> > Such signaling is done by returning 1 from the function. Otherwise
> > returning 0 means no update of the MSI fields, and thus no write
> > required.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> with two purely cosmetic suggestions and an only loosely related question below.
>
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -415,7 +415,9 @@ static int cf_check vmx_pi_update_irte(const struct vcpu *v,
> >
> >      ASSERT_PDEV_LIST_IS_READ_LOCKED(msi_desc->dev->domain);
> >
> > -    return iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
> > +    rc = iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
> > +
> > +    return rc < 0 ? rc : 0;
>
> Only tangential here, but: Why does this function have a return type of
> non-void, when neither caller cares?

I'm afraid there's more wrong in vmx_pi_update_irte() that I've just
spotted afterwards.

vmx_pi_update_irte() passes to iommu_update_ire_from_msi() the
msi_desc->msg field, but that field is supposed to always contain the
non-translated MSI data, as you correctly pointed out in v1 it's
consumed by dump_msi(). vmx_pi_update_irte() using msi_desc->msg to
store the translated MSI effectively breaks dump_msi().

Also vmx_pi_update_irte() relies on the IRT index never changing, as
otherwise it's missing any logic to update the MSI registers. I will
fix that in a pre-patch.

> > --- a/xen/drivers/passthrough/amd/iommu_intr.c
> > +++ b/xen/drivers/passthrough/amd/iommu_intr.c
> > @@ -492,7 +492,7 @@ static int update_intremap_entry_from_msi_msg(
> >              get_ivrs_mappings(iommu->seg)[alias_id].intremap_table);
> >      }
> >
> > -    return 0;
> > +    return !fresh ? 0 : 1;
> > }
>
> Simply
>
>     return fresh;
>
> ?
>
> > @@ -546,7 +546,7 @@ int cf_check amd_iommu_msi_msg_update_ire(
> >      rc = update_intremap_entry_from_msi_msg(iommu, bdf, nr,
> >                                              &msi_desc->remap_index,
> >                                              msg, &data);
> > -    if ( !rc )
> > +    if ( rc > 0 )
> >      {
> >          for ( i = 1; i < nr; ++i )
> >              msi_desc[i].remap_index = msi_desc->remap_index + i;
> > --- a/xen/drivers/passthrough/vtd/intremap.c
> > +++ b/xen/drivers/passthrough/vtd/intremap.c
> > @@ -506,6 +506,7 @@ static int msi_msg_to_remap_entry(
> >      unsigned int index, i, nr = 1;
> >      unsigned long flags;
> >      const struct pi_desc *pi_desc = msi_desc->pi_desc;
> > +    bool alloc = false;
> >
> >      if ( msi_desc->msi_attrib.type == PCI_CAP_ID_MSI )
> >          nr = msi_desc->msi.nvec;
> > @@ -529,6 +530,7 @@ static int msi_msg_to_remap_entry(
> >          index = alloc_remap_entry(iommu, nr);
> >          for ( i = 0; i < nr; ++i )
> >              msi_desc[i].remap_index = index + i;
> > +        alloc = true;
> >      }
> >      else
> >          index = msi_desc->remap_index;
> > @@ -601,7 +603,7 @@ static int msi_msg_to_remap_entry(
> >      unmap_vtd_domain_page(iremap_entries);
> >      spin_unlock_irqrestore(&iommu->intremap.lock, flags);
> >
> > -    return 0;
> > +    return alloc ? 1 : 0;
> > }
>
> Like above, simply
>
>     return alloc;
>
> ?

I wasn't sure whether this was overloading the boolean type and
possibly breaking some MISRA rule. I can adjust.

Thanks, Roger.
On 10.03.2025 16:42, Roger Pau Monné wrote:
> On Mon, Mar 10, 2025 at 11:51:09AM +0100, Jan Beulich wrote:
>> On 10.03.2025 10:55, Roger Pau Monne wrote:
>>> Attempt to reduce the MSI entry writes, and the associated checking whether
>>> memory decoding and MSI-X is enabled for the PCI device, when the MSI data
>>> hasn't changed.
>>>
>>> When using Interrupt Remapping the MSI entry will contain an index into
>>> the remapping table, and it's in such remapping table where the MSI vector
>>> and destination CPU is stored. As such, when using interrupt remapping,
>>> changes to the interrupt affinity shouldn't result in changes to the MSI
>>> entry, and the MSI entry update can be avoided.
>>>
>>> Signal from the IOMMU update_ire_from_msi hook whether the MSI data or
>>> address fields have changed, and thus need writing to the device registers.
>>> Such signaling is done by returning 1 from the function. Otherwise
>>> returning 0 means no update of the MSI fields, and thus no write
>>> required.
>>>
>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>> with two purely cosmetic suggestions and an only loosely related question below.
>>
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -415,7 +415,9 @@ static int cf_check vmx_pi_update_irte(const struct vcpu *v,
>>>
>>>      ASSERT_PDEV_LIST_IS_READ_LOCKED(msi_desc->dev->domain);
>>>
>>> -    return iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
>>> +    rc = iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
>>> +
>>> +    return rc < 0 ? rc : 0;
>>
>> Only tangential here, but: Why does this function have a return type of
>> non-void, when neither caller cares?
>
> I'm afraid there's more wrong in vmx_pi_update_irte() that I've just
> spotted afterwards.
>
> vmx_pi_update_irte() passes to iommu_update_ire_from_msi() the
> msi_desc->msg field, but that field is supposed to always contain the
> non-translated MSI data, as you correctly pointed out in v1 it's
> consumed by dump_msi(). vmx_pi_update_irte() using msi_desc->msg to
> store the translated MSI effectively breaks dump_msi().

Oh, indeed - it violates what write_msi_msg() specifically checks by
an assertion.

> Also vmx_pi_update_irte() relies on the IRT index never changing, as
> otherwise it's missing any logic to update the MSI registers.

Isn't this a valid assumption here? msi_msg_to_remap_entry() will only
ever allocate a new index if none was previously allocated.

>>> --- a/xen/drivers/passthrough/amd/iommu_intr.c
>>> +++ b/xen/drivers/passthrough/amd/iommu_intr.c
>>> @@ -492,7 +492,7 @@ static int update_intremap_entry_from_msi_msg(
>>>              get_ivrs_mappings(iommu->seg)[alias_id].intremap_table);
>>>      }
>>>
>>> -    return 0;
>>> +    return !fresh ? 0 : 1;
>>> }
>>
>> Simply
>>
>>     return fresh;
>>
>> ?
>>
>>> @@ -546,7 +546,7 @@ int cf_check amd_iommu_msi_msg_update_ire(
>>>      rc = update_intremap_entry_from_msi_msg(iommu, bdf, nr,
>>>                                              &msi_desc->remap_index,
>>>                                              msg, &data);
>>> -    if ( !rc )
>>> +    if ( rc > 0 )
>>>      {
>>>          for ( i = 1; i < nr; ++i )
>>>              msi_desc[i].remap_index = msi_desc->remap_index + i;
>>> --- a/xen/drivers/passthrough/vtd/intremap.c
>>> +++ b/xen/drivers/passthrough/vtd/intremap.c
>>> @@ -506,6 +506,7 @@ static int msi_msg_to_remap_entry(
>>>      unsigned int index, i, nr = 1;
>>>      unsigned long flags;
>>>      const struct pi_desc *pi_desc = msi_desc->pi_desc;
>>> +    bool alloc = false;
>>>
>>>      if ( msi_desc->msi_attrib.type == PCI_CAP_ID_MSI )
>>>          nr = msi_desc->msi.nvec;
>>> @@ -529,6 +530,7 @@ static int msi_msg_to_remap_entry(
>>>          index = alloc_remap_entry(iommu, nr);
>>>          for ( i = 0; i < nr; ++i )
>>>              msi_desc[i].remap_index = index + i;
>>> +        alloc = true;
>>>      }
>>>      else
>>>          index = msi_desc->remap_index;
>>> @@ -601,7 +603,7 @@ static int msi_msg_to_remap_entry(
>>>      unmap_vtd_domain_page(iremap_entries);
>>>      spin_unlock_irqrestore(&iommu->intremap.lock, flags);
>>>
>>> -    return 0;
>>> +    return alloc ? 1 : 0;
>>> }
>>
>> Like above, simply
>>
>>     return alloc;
>>
>> ?
>
> I wasn't sure whether this was overloading the boolean type and
> possibly breaking some MISRA rule. I can adjust.

What we are to do with Misra's essential type system is entirely unclear at
this point, aiui.

Jan
On Tue, Mar 11, 2025 at 08:35:35AM +0100, Jan Beulich wrote:
> On 10.03.2025 16:42, Roger Pau Monné wrote:
> > On Mon, Mar 10, 2025 at 11:51:09AM +0100, Jan Beulich wrote:
> >> On 10.03.2025 10:55, Roger Pau Monne wrote:
> >>> Attempt to reduce the MSI entry writes, and the associated checking whether
> >>> memory decoding and MSI-X is enabled for the PCI device, when the MSI data
> >>> hasn't changed.
> >>>
> >>> When using Interrupt Remapping the MSI entry will contain an index into
> >>> the remapping table, and it's in such remapping table where the MSI vector
> >>> and destination CPU is stored. As such, when using interrupt remapping,
> >>> changes to the interrupt affinity shouldn't result in changes to the MSI
> >>> entry, and the MSI entry update can be avoided.
> >>>
> >>> Signal from the IOMMU update_ire_from_msi hook whether the MSI data or
> >>> address fields have changed, and thus need writing to the device registers.
> >>> Such signaling is done by returning 1 from the function. Otherwise
> >>> returning 0 means no update of the MSI fields, and thus no write
> >>> required.
> >>>
> >>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >>
> >> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> >> with two purely cosmetic suggestions and an only loosely related question below.
> >>
> >>> --- a/xen/arch/x86/hvm/vmx/vmx.c
> >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >>> @@ -415,7 +415,9 @@ static int cf_check vmx_pi_update_irte(const struct vcpu *v,
> >>>
> >>>      ASSERT_PDEV_LIST_IS_READ_LOCKED(msi_desc->dev->domain);
> >>>
> >>> -    return iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
> >>> +    rc = iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
> >>> +
> >>> +    return rc < 0 ? rc : 0;
> >>
> >> Only tangential here, but: Why does this function have a return type of
> >> non-void, when neither caller cares?
> >
> > I'm afraid there's more wrong in vmx_pi_update_irte() that I've just
> > spotted afterwards.
> >
> > vmx_pi_update_irte() passes to iommu_update_ire_from_msi() the
> > msi_desc->msg field, but that field is supposed to always contain the
> > non-translated MSI data, as you correctly pointed out in v1 it's
> > consumed by dump_msi(). vmx_pi_update_irte() using msi_desc->msg to
> > store the translated MSI effectively breaks dump_msi().
>
> Oh, indeed - it violates what write_msi_msg() specifically checks by
> an assertion.

Indeed. I have a pre-patch to fix this.

> > Also vmx_pi_update_irte() relies on the IRT index never changing, as
> > otherwise it's missing any logic to update the MSI registers.
>
> Isn't this a valid assumption here? msi_msg_to_remap_entry() will only
> ever allocate a new index if none was previously allocated.

Yes, but I think it needs to be accounted for, I've now switched this
to:

    rc = iommu_update_ire_from_msi(msi_desc, &msg);
    if ( rc > 0 )
    {
        /*
         * Callers of vmx_pi_update_irte() won't propagate the updated MSI
         * fields to the hardware, must assert there are no changes.
         */
        ASSERT_UNREACHABLE();
        rc = -EILSEQ;
    }

    return rc;

Which I think better reflects the expectations of the function.

> >>> --- a/xen/drivers/passthrough/amd/iommu_intr.c
> >>> +++ b/xen/drivers/passthrough/amd/iommu_intr.c
> >>> @@ -492,7 +492,7 @@ static int update_intremap_entry_from_msi_msg(
> >>>              get_ivrs_mappings(iommu->seg)[alias_id].intremap_table);
> >>>      }
> >>>
> >>> -    return 0;
> >>> +    return !fresh ? 0 : 1;
> >>> }
> >>
> >> Simply
> >>
> >>     return fresh;
> >>
> >> ?
> >>
> >>> @@ -546,7 +546,7 @@ int cf_check amd_iommu_msi_msg_update_ire(
> >>>      rc = update_intremap_entry_from_msi_msg(iommu, bdf, nr,
> >>>                                              &msi_desc->remap_index,
> >>>                                              msg, &data);
> >>> -    if ( !rc )
> >>> +    if ( rc > 0 )
> >>>      {
> >>>          for ( i = 1; i < nr; ++i )
> >>>              msi_desc[i].remap_index = msi_desc->remap_index + i;
> >>> --- a/xen/drivers/passthrough/vtd/intremap.c
> >>> +++ b/xen/drivers/passthrough/vtd/intremap.c
> >>> @@ -506,6 +506,7 @@ static int msi_msg_to_remap_entry(
> >>>      unsigned int index, i, nr = 1;
> >>>      unsigned long flags;
> >>>      const struct pi_desc *pi_desc = msi_desc->pi_desc;
> >>> +    bool alloc = false;
> >>>
> >>>      if ( msi_desc->msi_attrib.type == PCI_CAP_ID_MSI )
> >>>          nr = msi_desc->msi.nvec;
> >>> @@ -529,6 +530,7 @@ static int msi_msg_to_remap_entry(
> >>>          index = alloc_remap_entry(iommu, nr);
> >>>          for ( i = 0; i < nr; ++i )
> >>>              msi_desc[i].remap_index = index + i;
> >>> +        alloc = true;
> >>>      }
> >>>      else
> >>>          index = msi_desc->remap_index;
> >>> @@ -601,7 +603,7 @@ static int msi_msg_to_remap_entry(
> >>>      unmap_vtd_domain_page(iremap_entries);
> >>>      spin_unlock_irqrestore(&iommu->intremap.lock, flags);
> >>>
> >>> -    return 0;
> >>> +    return alloc ? 1 : 0;
> >>> }
> >>
> >> Like above, simply
> >>
> >>     return alloc;
> >>
> >> ?
> >
> > I wasn't sure whether this was overloading the boolean type and
> > possibly breaking some MISRA rule. I can adjust.
>
> What we are to do with Misra's essential type system is entirely unclear at
> this point, aiui.

Thanks, given the findings above I will post v4 with the extra
adjustments.

Roger.
diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index 51ff7f12f5c0..1bca8c8b670d 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -283,8 +283,12 @@ static int hpet_msi_write(struct hpet_event_channel *ch, struct msi_msg *msg)
     {
         int rc = iommu_update_ire_from_msi(&ch->msi, msg);
 
-        if ( rc )
+        if ( rc < 0 )
             return rc;
+        /*
+         * Always propagate writes, to avoid having to pass a flag for handling
+         * a forceful write in the resume from suspension case.
+         */
     }
 
     hpet_write32(msg->data, HPET_Tn_ROUTE(ch->idx));
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 0241303b4bf4..764d2ff9517a 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -415,7 +415,9 @@ static int cf_check vmx_pi_update_irte(const struct vcpu *v,
 
     ASSERT_PDEV_LIST_IS_READ_LOCKED(msi_desc->dev->domain);
 
-    return iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
+    rc = iommu_update_ire_from_msi(msi_desc, &msi_desc->msg);
+
+    return rc < 0 ? rc : 0;
 
  unlock_out:
     spin_unlock_irq(&desc->lock);
diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
index 6c11d76015fb..163ccf874720 100644
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -184,7 +184,8 @@ void msi_compose_msg(unsigned vector, const cpumask_t *cpu_mask, struct msi_msg
         MSI_DATA_VECTOR(vector);
 }
 
-static int write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
+static int write_msi_msg(struct msi_desc *entry, struct msi_msg *msg,
+                         bool force)
 {
     entry->msg = *msg;
 
@@ -194,7 +195,7 @@ static int write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
         ASSERT(msg != &entry->msg);
 
         rc = iommu_update_ire_from_msi(entry, msg);
-        if ( rc )
+        if ( rc < 0 || (rc == 0 && !force) )
             return rc;
     }
 
@@ -259,7 +260,7 @@ void cf_check set_msi_affinity(struct irq_desc *desc, const cpumask_t *mask)
     msg.address_lo |= MSI_ADDR_DEST_ID(dest);
     msg.dest32 = dest;
 
-    write_msi_msg(msi_desc, &msg);
+    write_msi_msg(msi_desc, &msg, false);
 }
 
 void __msi_set_enable(pci_sbdf_t sbdf, int pos, int enable)
@@ -522,7 +523,7 @@ int __setup_msi_irq(struct irq_desc *desc, struct msi_desc *msidesc,
     desc->msi_desc = msidesc;
     desc->handler = handler;
     msi_compose_msg(desc->arch.vector, desc->arch.cpu_mask, &msg);
-    ret = write_msi_msg(msidesc, &msg);
+    ret = write_msi_msg(msidesc, &msg, false);
     if ( unlikely(ret) )
     {
         desc->handler = &no_irq_type;
@@ -1403,7 +1404,7 @@ int pci_restore_msi_state(struct pci_dev *pdev)
         type = entry->msi_attrib.type;
 
         msg = entry->msg;
-        write_msi_msg(entry, &msg);
+        write_msi_msg(entry, &msg, true);
 
         for ( i = 0; ; )
         {
diff --git a/xen/drivers/passthrough/amd/iommu_intr.c b/xen/drivers/passthrough/amd/iommu_intr.c
index c0273059cb1d..07b21c6043ef 100644
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -492,7 +492,7 @@ static int update_intremap_entry_from_msi_msg(
             get_ivrs_mappings(iommu->seg)[alias_id].intremap_table);
     }
 
-    return 0;
+    return !fresh ? 0 : 1;
 }
 
 static struct amd_iommu *_find_iommu_for_device(int seg, int bdf)
@@ -546,7 +546,7 @@ int cf_check amd_iommu_msi_msg_update_ire(
     rc = update_intremap_entry_from_msi_msg(iommu, bdf, nr,
                                             &msi_desc->remap_index,
                                             msg, &data);
-    if ( !rc )
+    if ( rc > 0 )
     {
         for ( i = 1; i < nr; ++i )
             msi_desc[i].remap_index = msi_desc->remap_index + i;
diff --git a/xen/drivers/passthrough/vtd/intremap.c b/xen/drivers/passthrough/vtd/intremap.c
index 1aeaeb5ec595..a9d96fcdbac8 100644
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -506,6 +506,7 @@ static int msi_msg_to_remap_entry(
     unsigned int index, i, nr = 1;
     unsigned long flags;
     const struct pi_desc *pi_desc = msi_desc->pi_desc;
+    bool alloc = false;
 
     if ( msi_desc->msi_attrib.type == PCI_CAP_ID_MSI )
         nr = msi_desc->msi.nvec;
@@ -529,6 +530,7 @@ static int msi_msg_to_remap_entry(
         index = alloc_remap_entry(iommu, nr);
         for ( i = 0; i < nr; ++i )
             msi_desc[i].remap_index = index + i;
+        alloc = true;
     }
     else
         index = msi_desc->remap_index;
@@ -601,7 +603,7 @@ static int msi_msg_to_remap_entry(
     unmap_vtd_domain_page(iremap_entries);
     spin_unlock_irqrestore(&iommu->intremap.lock, flags);
 
-    return 0;
+    return alloc ? 1 : 0;
 }
 
 int cf_check msi_msg_write_remap_rte(
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 77a514019cc6..984f0735d4a9 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -435,6 +435,12 @@ extern struct page_list_head iommu_pt_cleanup_list;
 bool arch_iommu_use_permitted(const struct domain *d);
 
 #ifdef CONFIG_X86
+/*
+ * Return values:
+ *  - < 0 on error.
+ *  - 0 on success and no need to write msi_msg to the hardware.
+ *  - 1 on success and msi_msg must be propagated to the hardware.
+ */
 static inline int iommu_update_ire_from_msi(
     struct msi_desc *msi_desc, struct msi_msg *msg)
 {
Attempt to reduce the MSI entry writes, and the associated checking whether
memory decoding and MSI-X is enabled for the PCI device, when the MSI data
hasn't changed.

When using Interrupt Remapping the MSI entry will contain an index into
the remapping table, and it's in such remapping table where the MSI vector
and destination CPU is stored. As such, when using interrupt remapping,
changes to the interrupt affinity shouldn't result in changes to the MSI
entry, and the MSI entry update can be avoided.

Signal from the IOMMU update_ire_from_msi hook whether the MSI data or
address fields have changed, and thus need writing to the device registers.
Such signaling is done by returning 1 from the function. Otherwise
returning 0 means no update of the MSI fields, and thus no write
required.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
---
Changes since v2:
 - New approach.

Changes since v1:
 - Add more comments.
 - Simplify dma_msi_set_affinity().
---
 xen/arch/x86/hpet.c                      |  6 +++++-
 xen/arch/x86/hvm/vmx/vmx.c               |  4 +++-
 xen/arch/x86/msi.c                       | 11 ++++++-----
 xen/drivers/passthrough/amd/iommu_intr.c |  4 ++--
 xen/drivers/passthrough/vtd/intremap.c   |  4 +++-
 xen/include/xen/iommu.h                  |  6 ++++++
 6 files changed, 25 insertions(+), 10 deletions(-)