mbox series

[v1,0/4] vfio/iommu: Flag to allow userspace to set DMA buffers system cacheable

Message ID 20240826071641.2691374-1-manojvishy@google.com (mailing list archive)
Headers show
Series vfio/iommu: Flag to allow userspace to set DMA buffers system cacheable | expand

Message

Manoj Vishwanathan Aug. 26, 2024, 7:16 a.m. UTC
Hi maintainers,

This RFC patch introduces the ability for userspace to control whether
device (DMA) buffers are marked as cacheable, enabling them to utilize
the system-level cache.

The specific changes made in this patch are:

* Introduce a new flag in `include/linux/iommu.h`: 
    * `IOMMU_SYS_CACHE` -  Indicates if the associated page should be cached in the system's cache hierarchy.
* Add `VFIO_DMA_MAP_FLAG_SYS_CACHE` to `include/uapi/linux/vfio.h`:
    * Allows userspace to set the cacheable attribute to PTE when mapping DMA regions using the VFIO interface.
* Update `vfio_iommu_type1.c`:
    * Handle the `VFIO_DMA_MAP_FLAG_SYS_CACHE` flag during DMA mapping operations.
    * Set the `IOMMU_SYS_CACHE` flag in the IOMMU page table entry if the `VFIO_DMA_MAP_FLAG_SYS_CACHE` is set.

* arm/smmu/io-pgtable-arm: Set the MAIR for SYS_CACHE

The reasoning behind these changes is to provide userspace with finer-grained control over memory access patterns for devices,
potentially improving performance in scenarios where caching is beneficial. We saw in some of the use cases where the buffers were
for transient data ( in and out without processing).

I have tested this patch on certain arm64 machines and observed the following:

* There is 14-21% improvement in jitter measurements, where the buffer on System Level Cache vs DDR buffers
* There was not much of an improvement in latency in the diration of the tests that I have tried.

I am open to feedback and suggestions for further improvements. Please let me know if you have any questions or concerns.
Also, I am working on adding a check in the VFIO layer to ensure that if there is no architecture supported implementation for
sys cache, if should not apply them.

Thanks,
Manoj Vishwanathan

Manoj Vishwanathan (4):
  iommu: Add IOMMU_SYS_CACHE flag for system cache control
  iommu/io-pgtable-arm: Force outer cache for page-level MAIR via user
    flag
  vfio: Add VFIO_DMA_MAP_FLAG_SYS_CACHE to control device access to
    system cache
  vfio/type1: Add support for VFIO_DMA_MAP_FLAG_SYS_CACHE

 drivers/iommu/io-pgtable-arm.c  | 3 +++
 drivers/vfio/vfio_iommu_type1.c | 5 +++--
 include/linux/iommu.h           | 6 ++++++
 include/uapi/linux/vfio.h       | 1 +
 4 files changed, 13 insertions(+), 2 deletions(-)

Comments

Alex Williamson Aug. 26, 2024, 5:04 p.m. UTC | #1
On Mon, 26 Aug 2024 07:16:37 +0000
Manoj Vishwanathan <manojvishy@google.com> wrote:

> Hi maintainers,
> 
> This RFC patch introduces the ability for userspace to control whether
> device (DMA) buffers are marked as cacheable, enabling them to utilize
> the system-level cache.
> 
> The specific changes made in this patch are:
> 
> * Introduce a new flag in `include/linux/iommu.h`: 
>     * `IOMMU_SYS_CACHE` -  Indicates if the associated page should be cached in the system's cache hierarchy.
> * Add `VFIO_DMA_MAP_FLAG_SYS_CACHE` to `include/uapi/linux/vfio.h`:
>     * Allows userspace to set the cacheable attribute to PTE when mapping DMA regions using the VFIO interface.
> * Update `vfio_iommu_type1.c`:
>     * Handle the `VFIO_DMA_MAP_FLAG_SYS_CACHE` flag during DMA mapping operations.
>     * Set the `IOMMU_SYS_CACHE` flag in the IOMMU page table entry if
> the `VFIO_DMA_MAP_FLAG_SYS_CACHE` is set.

We intend to eventually drop vfio type1 in favor of using IOMMUFD,
therefore all new type1 features need to first be available in IOMMUFD.
Once there we may consider use cases for providing the feature in the
legacy type1 interface and the IOMMUFD compatibility interface.  Thanks,

Alex

> * arm/smmu/io-pgtable-arm: Set the MAIR for SYS_CACHE
> 
> The reasoning behind these changes is to provide userspace with
> finer-grained control over memory access patterns for devices,
> potentially improving performance in scenarios where caching is
> beneficial. We saw in some of the use cases where the buffers were
> for transient data ( in and out without processing).
> 
> I have tested this patch on certain arm64 machines and observed the
> following:
> 
> * There is 14-21% improvement in jitter measurements, where the
> buffer on System Level Cache vs DDR buffers
> * There was not much of an improvement in latency in the diration of
> the tests that I have tried.
> 
> I am open to feedback and suggestions for further improvements.
> Please let me know if you have any questions or concerns. Also, I am
> working on adding a check in the VFIO layer to ensure that if there
> is no architecture supported implementation for sys cache, if should
> not apply them.
> 
> Thanks,
> Manoj Vishwanathan
> 
> Manoj Vishwanathan (4):
>   iommu: Add IOMMU_SYS_CACHE flag for system cache control
>   iommu/io-pgtable-arm: Force outer cache for page-level MAIR via user
>     flag
>   vfio: Add VFIO_DMA_MAP_FLAG_SYS_CACHE to control device access to
>     system cache
>   vfio/type1: Add support for VFIO_DMA_MAP_FLAG_SYS_CACHE
> 
>  drivers/iommu/io-pgtable-arm.c  | 3 +++
>  drivers/vfio/vfio_iommu_type1.c | 5 +++--
>  include/linux/iommu.h           | 6 ++++++
>  include/uapi/linux/vfio.h       | 1 +
>  4 files changed, 13 insertions(+), 2 deletions(-)
>
Manoj Vishwanathan Aug. 26, 2024, 5:36 p.m. UTC | #2
On Mon, Aug 26, 2024 at 10:04 AM Alex Williamson
<alex.williamson@redhat.com> wrote:
>
> On Mon, 26 Aug 2024 07:16:37 +0000
> Manoj Vishwanathan <manojvishy@google.com> wrote:
>
> > Hi maintainers,
> >
> > This RFC patch introduces the ability for userspace to control whether
> > device (DMA) buffers are marked as cacheable, enabling them to utilize
> > the system-level cache.
> >
> > The specific changes made in this patch are:
> >
> > * Introduce a new flag in `include/linux/iommu.h`:
> >     * `IOMMU_SYS_CACHE` -  Indicates if the associated page should be cached in the system's cache hierarchy.
> > * Add `VFIO_DMA_MAP_FLAG_SYS_CACHE` to `include/uapi/linux/vfio.h`:
> >     * Allows userspace to set the cacheable attribute to PTE when mapping DMA regions using the VFIO interface.
> > * Update `vfio_iommu_type1.c`:
> >     * Handle the `VFIO_DMA_MAP_FLAG_SYS_CACHE` flag during DMA mapping operations.
> >     * Set the `IOMMU_SYS_CACHE` flag in the IOMMU page table entry if
> > the `VFIO_DMA_MAP_FLAG_SYS_CACHE` is set.
>
> We intend to eventually drop vfio type1 in favor of using IOMMUFD,
> therefore all new type1 features need to first be available in IOMMUFD.
> Once there we may consider use cases for providing the feature in the
> legacy type1 interface and the IOMMUFD compatibility interface.  Thanks,
>
> Alex
Thank You, Alex! I will redirect this patch to iommufd in the next version.
> > * arm/smmu/io-pgtable-arm: Set the MAIR for SYS_CACHE
> >
> > The reasoning behind these changes is to provide userspace with
> > finer-grained control over memory access patterns for devices,
> > potentially improving performance in scenarios where caching is
> > beneficial. We saw in some of the use cases where the buffers were
> > for transient data ( in and out without processing).
> >
> > I have tested this patch on certain arm64 machines and observed the
> > following:
> >
> > * There is 14-21% improvement in jitter measurements, where the
> > buffer on System Level Cache vs DDR buffers
> > * There was not much of an improvement in latency in the diration of
> > the tests that I have tried.
> >
> > I am open to feedback and suggestions for further improvements.
> > Please let me know if you have any questions or concerns. Also, I am
> > working on adding a check in the VFIO layer to ensure that if there
> > is no architecture supported implementation for sys cache, if should
> > not apply them.
> >
> > Thanks,
> > Manoj Vishwanathan
> >
> > Manoj Vishwanathan (4):
> >   iommu: Add IOMMU_SYS_CACHE flag for system cache control
> >   iommu/io-pgtable-arm: Force outer cache for page-level MAIR via user
> >     flag
> >   vfio: Add VFIO_DMA_MAP_FLAG_SYS_CACHE to control device access to
> >     system cache
> >   vfio/type1: Add support for VFIO_DMA_MAP_FLAG_SYS_CACHE
> >
> >  drivers/iommu/io-pgtable-arm.c  | 3 +++
> >  drivers/vfio/vfio_iommu_type1.c | 5 +++--
> >  include/linux/iommu.h           | 6 ++++++
> >  include/uapi/linux/vfio.h       | 1 +
> >  4 files changed, 13 insertions(+), 2 deletions(-)
> >
>
Jason Gunthorpe Aug. 26, 2024, 11:17 p.m. UTC | #3
On Mon, Aug 26, 2024 at 11:04:47AM -0600, Alex Williamson wrote:
> On Mon, 26 Aug 2024 07:16:37 +0000
> Manoj Vishwanathan <manojvishy@google.com> wrote:
> 
> > Hi maintainers,
> > 
> > This RFC patch introduces the ability for userspace to control whether
> > device (DMA) buffers are marked as cacheable, enabling them to utilize
> > the system-level cache.
> > 
> > The specific changes made in this patch are:
> > 
> > * Introduce a new flag in `include/linux/iommu.h`: 
> >     * `IOMMU_SYS_CACHE` -  Indicates if the associated page should be cached in the system's cache hierarchy.
> > * Add `VFIO_DMA_MAP_FLAG_SYS_CACHE` to `include/uapi/linux/vfio.h`:

You'll need a much better description of what this is supposed to do
when you resend it.

IOMMU_CACHE already largely means that pages should be cached.

So I don't know what ARM's "INC_OCACHE" actually is doing. Causing
writes to land in a cache somewhere in hierarchy? Something platform
specific? I have no idea. By your description it sounds similar to the
x86 data placement stuff, whatever that was called, and the more
modern TPH approach.

Jason
Robin Murphy Aug. 27, 2024, 5:31 p.m. UTC | #4
On 27/08/2024 12:17 am, Jason Gunthorpe wrote:
> On Mon, Aug 26, 2024 at 11:04:47AM -0600, Alex Williamson wrote:
>> On Mon, 26 Aug 2024 07:16:37 +0000
>> Manoj Vishwanathan <manojvishy@google.com> wrote:
>>
>>> Hi maintainers,
>>>
>>> This RFC patch introduces the ability for userspace to control whether
>>> device (DMA) buffers are marked as cacheable, enabling them to utilize
>>> the system-level cache.
>>>
>>> The specific changes made in this patch are:
>>>
>>> * Introduce a new flag in `include/linux/iommu.h`:
>>>      * `IOMMU_SYS_CACHE` -  Indicates if the associated page should be cached in the system's cache hierarchy.
>>> * Add `VFIO_DMA_MAP_FLAG_SYS_CACHE` to `include/uapi/linux/vfio.h`:
> 
> You'll need a much better description of what this is supposed to do
> when you resend it.
> 
> IOMMU_CACHE already largely means that pages should be cached.
> 
> So I don't know what ARM's "INC_OCACHE" actually is doing. Causing
> writes to land in a cache somewhere in hierarchy? Something platform
> specific?

Kinda both - the Inner Non-Cacheable attribute means it's still 
fundamentally non-snooping and non-coherent with CPU caches, but the 
Outer Write-back Write-allocate attribute can still control allocation 
in a system cache downstream of the point of coherency if the platform 
is built with such a thing (it's not overly common).

However, as you point out, this is in direct conflict with the Inner 
Write-back Write-allocate attribute implied by the IOMMU_CACHE which 
VFIO adds in further down in vfio_iommu_map(). Plus the way it's 
actually implemented in patch #2, IOMMU_CACHE still takes precedence and 
would lead to this new value being completely ignored, so there's a lot 
which smells suspicious here...

Thanks,
Robin.

> I have no idea. By your description it sounds similar to the
> x86 data placement stuff, whatever that was called, and the more
> modern TPH approach.
> 
> Jason
Manoj Vishwanathan Aug. 28, 2024, 2:55 a.m. UTC | #5
On Tue, Aug 27, 2024 at 10:31 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 27/08/2024 12:17 am, Jason Gunthorpe wrote:
> > On Mon, Aug 26, 2024 at 11:04:47AM -0600, Alex Williamson wrote:
> >> On Mon, 26 Aug 2024 07:16:37 +0000
> >> Manoj Vishwanathan <manojvishy@google.com> wrote:
> >>
> >>> Hi maintainers,
> >>>
> >>> This RFC patch introduces the ability for userspace to control whether
> >>> device (DMA) buffers are marked as cacheable, enabling them to utilize
> >>> the system-level cache.
> >>>
> >>> The specific changes made in this patch are:
> >>>
> >>> * Introduce a new flag in `include/linux/iommu.h`:
> >>>      * `IOMMU_SYS_CACHE` -  Indicates if the associated page should be cached in the system's cache hierarchy.
> >>> * Add `VFIO_DMA_MAP_FLAG_SYS_CACHE` to `include/uapi/linux/vfio.h`:
> >
> > You'll need a much better description of what this is supposed to do
> > when you resend it.
> >
Thanks Jason, I will add more information before re-sending the patch.
> > IOMMU_CACHE already largely means that pages should be cached.
> >
> > So I don't know what ARM's "INC_OCACHE" actually is doing. Causing
> > writes to land in a cache somewhere in hierarchy? Something platform
> > specific?
>
> Kinda both - the Inner Non-Cacheable attribute means it's still
> fundamentally non-snooping and non-coherent with CPU caches, but the
> Outer Write-back Write-allocate attribute can still control allocation
> in a system cache downstream of the point of coherency if the platform
> is built with such a thing (it's not overly common).
>
> However, as you point out, this is in direct conflict with the Inner
> Write-back Write-allocate attribute implied by the IOMMU_CACHE which
> VFIO adds in further down in vfio_iommu_map(). Plus the way it's
> actually implemented in patch #2, IOMMU_CACHE still takes precedence and
> would lead to this new value being completely ignored, so there's a lot
> which smells suspicious here...
>
Thanks for your quick feedback.
I tested this with a 5.15-based kernel and applied the patch to get
early results.
I see the issue with IOMMU_CACHE overriding IOMMU_SYS_CACHE in patch #2.
This would likely be a problem on 5.15 as well, and I need to
investigate further
to understand how we observed better performance in our tests
while trying to exercise the system cache.
Let me do some more testing and get back.
Thanks,
Manoj
> Thanks,
> Robin.
>
> > I have no idea. By your description it sounds similar to the
> > x86 data placement stuff, whatever that was called, and the more
> > modern TPH approach.
> >
> > Jason