diff mbox series

[1/2] iommu/intel: Avoid SAC address trick for PCIe devices

Message ID e583fc6dd1fb4ffc90310ff4372ee776f9cc7a3c.1594207679.git.robin.murphy@arm.com (mailing list archive)
State Not Applicable, archived
Headers show
Series [1/2] iommu/intel: Avoid SAC address trick for PCIe devices | expand

Commit Message

Robin Murphy July 8, 2020, 11:32 a.m. UTC
For devices stuck behind a conventional PCI bus, saving extra cycles at
33MHz is probably fairly significant. However since native PCI Express
is now the norm for high-performance devices, the optimisation to always
prefer 32-bit addresses for the sake of avoiding DAC is starting to look
rather anachronistic. Technically 32-bit addresses do have shorter TLPs
on PCIe, but unless the device is saturating its link bandwidth with
small transfers it seems unlikely that the difference is appreciable.

What definitely is appreciable, however, is that the IOVA allocator
doesn't behave all that well once the 32-bit space starts getting full.
As DMA working sets get bigger, this optimisation increasingly backfires
and adds considerable overhead to the dma_map path for use-cases like
high-bandwidth networking.

As such, let's simply take it out of consideration for PCIe devices.
Technically this might work out suboptimal for a PCIe device stuck
behind a conventional PCI bridge, or for PCI-X devices that also have
native 64-bit addressing, but neither of those are likely to be found
in performance-critical parts of modern systems.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/intel/iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Christoph Hellwig July 8, 2020, 12:26 p.m. UTC | #1
Looks pretty sensible.

> -	if (!dmar_forcedac && dma_mask > DMA_BIT_MASK(32)) {
> +	if (!dmar_forcedac && dma_mask > DMA_BIT_MASK(32) &&
> +	    dev_is_pci(dev) && !pci_is_pcie(to_pci_dev(dev))) {

The only thing I wonder is if it is worth to add a little helper
for this check, but with everything moving to dma-iommu that might
not be needed.

Btw, does anyone know what the status of the intel-iommu conversion
to dma-iommu is?
Joerg Roedel July 10, 2020, 2:20 p.m. UTC | #2
On Wed, Jul 08, 2020 at 12:32:41PM +0100, Robin Murphy wrote:
> For devices stuck behind a conventional PCI bus, saving extra cycles at
> 33MHz is probably fairly significant. However since native PCI Express
> is now the norm for high-performance devices, the optimisation to always
> prefer 32-bit addresses for the sake of avoiding DAC is starting to look
> rather anachronistic. Technically 32-bit addresses do have shorter TLPs
> on PCIe, but unless the device is saturating its link bandwidth with
> small transfers it seems unlikely that the difference is appreciable.
> 
> What definitely is appreciable, however, is that the IOVA allocator
> doesn't behave all that well once the 32-bit space starts getting full.
> As DMA working sets get bigger, this optimisation increasingly backfires
> and adds considerable overhead to the dma_map path for use-cases like
> high-bandwidth networking.
> 
> As such, let's simply take it out of consideration for PCIe devices.
> Technically this might work out suboptimal for a PCIe device stuck
> behind a conventional PCI bridge, or for PCI-X devices that also have
> native 64-bit addressing, but neither of those are likely to be found
> in performance-critical parts of modern systems.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/intel/iommu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Applied both, thanks.
diff mbox series

Patch

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9129663a7406..21b11de23a53 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3348,7 +3348,8 @@  static unsigned long intel_alloc_iova(struct device *dev,
 	/* Ensure we reserve the whole size-aligned region */
 	nrpages = __roundup_pow_of_two(nrpages);
 
-	if (!dmar_forcedac && dma_mask > DMA_BIT_MASK(32)) {
+	if (!dmar_forcedac && dma_mask > DMA_BIT_MASK(32) &&
+	    dev_is_pci(dev) && !pci_is_pcie(to_pci_dev(dev))) {
 		/*
 		 * First try to allocate an io virtual address in
 		 * DMA_BIT_MASK(32) and if that fails then try allocating