mbox series

[v3,0/4] Make the iommu driver no-snoop block feature consistent

Message ID 0-v3-2cf356649677+a32-intel_no_snoop_jgg@nvidia.com (mailing list archive)
Headers show
Series Make the iommu driver no-snoop block feature consistent | expand

Message

Jason Gunthorpe April 11, 2022, 3:16 p.m. UTC
PCIe defines a 'no-snoop' bit in each the TLP which is usually implemented
by a platform as bypassing elements in the DMA coherent CPU cache
hierarchy. A driver can command a device to set this bit on some of its
transactions as a micro-optimization.

However, the driver is now responsible to synchronize the CPU cache with
the DMA that bypassed it. On x86 this may be done through the wbinvd
instruction, and the i915 GPU driver is the only Linux DMA driver that
calls it.

The problem comes that KVM on x86 will normally disable the wbinvd
instruction in the guest and render it a NOP. As the driver running in the
guest is not aware the wbinvd doesn't work it may still cause the device
to set the no-snoop bit and the platform will bypass the CPU cache.
Without a working wbinvd there is no way to re-synchronize the CPU cache
and the driver in the VM has data corruption.

Thus, we see a general direction on x86 that the IOMMU HW is able to block
the no-snoop bit in the TLP. This NOP's the optimization and allows KVM to
to NOP the wbinvd without causing any data corruption.

This control for Intel IOMMU was exposed by using IOMMU_CACHE and
IOMMU_CAP_CACHE_COHERENCY, however these two values now have multiple
meanings and usages beyond blocking no-snoop and the whole thing has
become confused. AMD IOMMU has the same feature and same IOPTE bits
however it unconditionally blocks no-snoop.

Change it so that:
 - IOMMU_CACHE is only about the DMA coherence of normal DMAs from a
   device. It is used by the DMA API/VFIO/etc when the user of the
   iommu_domain will not be doing manual cache coherency operations.

 - IOMMU_CAP_CACHE_COHERENCY indicates if IOMMU_CACHE can be used with the
   device.

 - The new optional domain op enforce_cache_coherency() will cause the
   entire domain to block no-snoop requests - ie there is no way for any
   device attached to the domain to opt out of the IOMMU_CACHE behavior.
   This is permanent on the domain and must apply to any future devices
   attached to it.

Ideally an iommu driver should implement enforce_cache_coherency() so that
DMA API domains allow the no-snoop optimization. This leaves it available
to kernel drivers like i915. VFIO will call enforce_cache_coherency()
before establishing any mappings and the domain should then permanently
block no-snoop.

If enforce_cache_coherency() fails VFIO will communicate back through to
KVM into the arch code via kvm_arch_register_noncoherent_dma()
(only implemented by x86) which triggers a working wbinvd to be made
available to the VM.

While other iommu drivers are certainly welcome to implement
enforce_cache_coherency(), it is not clear there is any benefit in doing
so right now.

This is on github: https://github.com/jgunthorpe/linux/commits/intel_no_snoop

Joerg, I think we are ready with this now - if there are no futher
comments I won't resend it unless there is some conflict with rc3. Thanks
everyone!

v3:
 - Rename dmar_domain->enforce_no_snoop to dmar_domain->force_snooping
 - Accumulate acks
 - Rebase to 5.18-rc2
 - Revise commit language
v2: https://lore.kernel.org/r/0-v2-f090ae795824+6ad-intel_no_snoop_jgg@nvidia.com
 - Abandon removing IOMMU_CAP_CACHE_COHERENCY - instead make it the cap
   flag that indicates IOMMU_CACHE is supported
 - Put the VFIO tests for IOMMU_CACHE at VFIO device registration
 - In the Intel driver remove the domain->iommu_snooping value, this is
   global not per-domain
v1: https://lore.kernel.org/r/0-v1-ef02c60ddb76+12ca2-intel_no_snoop_jgg@nvidia.com

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Jason Gunthorpe (4):
  iommu: Introduce the domain op enforce_cache_coherency()
  vfio: Move the Intel no-snoop control off of IOMMU_CACHE
  iommu: Redefine IOMMU_CAP_CACHE_COHERENCY as the cap flag for
    IOMMU_CACHE
  vfio: Require that devices support DMA cache coherence

 drivers/iommu/amd/iommu.c       |  7 +++++++
 drivers/iommu/intel/iommu.c     | 17 +++++++++++++----
 drivers/vfio/vfio.c             |  7 +++++++
 drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++++++++++-----------
 include/linux/intel-iommu.h     |  2 +-
 include/linux/iommu.h           |  7 +++++--
 6 files changed, 52 insertions(+), 18 deletions(-)


base-commit: ce522ba9ef7e2d9fb22a39eb3371c0c64e2a433e

Comments

Joerg Roedel April 28, 2022, 8:56 a.m. UTC | #1
On Mon, Apr 11, 2022 at 12:16:04PM -0300, Jason Gunthorpe wrote:
> Jason Gunthorpe (4):
>   iommu: Introduce the domain op enforce_cache_coherency()
>   vfio: Move the Intel no-snoop control off of IOMMU_CACHE
>   iommu: Redefine IOMMU_CAP_CACHE_COHERENCY as the cap flag for
>     IOMMU_CACHE
>   vfio: Require that devices support DMA cache coherence
> 
>  drivers/iommu/amd/iommu.c       |  7 +++++++
>  drivers/iommu/intel/iommu.c     | 17 +++++++++++++----
>  drivers/vfio/vfio.c             |  7 +++++++
>  drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++++++++++-----------
>  include/linux/intel-iommu.h     |  2 +-
>  include/linux/iommu.h           |  7 +++++--
>  6 files changed, 52 insertions(+), 18 deletions(-)

Applied to core branch, thanks everyone.