
[v4,07/12] iommufd: Add data structure for Intel VT-d stage-1 cache invalidation

Message ID 20230724111335.107427-8-yi.l.liu@intel.com (mailing list archive)
State New
Series Add Intel VT-d nested translation

Commit Message

Yi Liu July 24, 2023, 11:13 a.m. UTC
This adds the data structure for flushing the IOTLB for nested domains
allocated with the IOMMU_HWPT_TYPE_VTD_S1 type.

The cache invalidation path is a performance path, so it's better to avoid
memory allocation there. To achieve this, the path reuses the ucmd_buffer
to copy user data, so the new data structures are added to the ucmd_buffer
union to avoid overflowing it.

This only supports invalidating the IOTLB, not the device-TLB, as device-TLB
invalidation is covered automatically by the IOTLB invalidation when the
underlying IOMMU driver has enabled ATS for the affected device.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/main.c |  6 ++++
 include/uapi/linux/iommufd.h | 58 ++++++++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+)
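
[Editor's note: for illustration, a hedged userspace sketch of how the two
new structures would be filled in; the wrapping IOMMU_HWPT_INVALIDATE ioctl
and its common struct are introduced elsewhere in the series and are assumed
here, as is an installed copy of the updated linux/iommufd.h header.]

	#include <stdint.h>
	#include <linux/iommufd.h>

	void build_invalidate_args(void)
	{
		struct iommu_hwpt_vtd_s1_invalidate_desc descs[2] = {
			/* invalidate leaf PTE caching for 16 contiguous 4K pages */
			{ .addr = 0x100000, .npages = 16,
			  .flags = IOMMU_VTD_QI_FLAGS_LEAF },
			/* invalidate all caches related to this hw_pagetable */
			{ .addr = 0, .npages = (uint64_t)-1, .flags = 0 },
		};
		/* in: number of requests submitted; out: number handled */
		uint32_t entry_nr = 2;
		struct iommu_hwpt_vtd_s1_invalidate inv = {
			.flags = 0,
			.entry_size = sizeof(descs[0]),
			.entry_nr_uptr = (uintptr_t)&entry_nr,
			.inv_data_uptr = (uintptr_t)descs,
		};

		/* &inv would then be passed as the hwpt_type-specific data
		 * of the IOMMU_HWPT_INVALIDATE ioctl (not in this patch). */
		(void)inv;
	}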

Comments

Tian, Kevin Aug. 2, 2023, 7:41 a.m. UTC | #1
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Monday, July 24, 2023 7:13 PM
> 
> This adds the data structure for flushing iotlb for the nested domain
> allocated with IOMMU_HWPT_TYPE_VTD_S1 type.
> 
> Cache invalidation path is performance path, so it's better to avoid
> memory allocation in such path. To achieve it, this path reuses the
> ucmd_buffer to copy user data. So the new data structures are added in
> the ucmd_buffer union to avoid overflow.

this patch has nothing to do with ucmd_buffer

> +
> +/**
> + * struct iommu_hwpt_vtd_s1_invalidate - Intel VT-d cache invalidation
> + *                                       (IOMMU_HWPT_TYPE_VTD_S1)
> + * @flags: Must be 0
> + * @entry_size: Size in bytes of each cache invalidation request
> + * @entry_nr_uptr: User pointer to the number of invalidation requests.
> + *                 Kernel reads it to get the number of requests and
> + *                 updates the buffer with the number of requests that
> + *                 have been processed successfully. This pointer must
> + *                 point to a __u32 type of memory location.
> + * @inv_data_uptr: Pointer to the cache invalidation requests
> + *
> + * The Intel VT-d specific invalidation data for a set of cache invalidation
> + * requests. Kernel loops the requests one-by-one and stops when failure
> + * is encountered. The number of handled requests is reported to user by
> + * writing the buffer pointed by @entry_nr_uptr.
> + */
> +struct iommu_hwpt_vtd_s1_invalidate {
> +	__u32 flags;
> +	__u32 entry_size;
> +	__aligned_u64 entry_nr_uptr;
> +	__aligned_u64 inv_data_uptr;
> +};
> +

I wonder whether this array can be defined directly in the common
struct iommu_hwpt_invalidate, so there is no need for the underlying
iommu driver to further deal with user buffers, including the various
minsz/backward-compat checks.

smmu may not require it, as it can use a native queue format. But that
could be considered a special case of a 1-entry array. With careful
coding the added cost should be negligible.
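
[Editor's note: purely for illustration, a hypothetical shape of that
suggestion — field names are invented here, not taken from the series.
The common ioctl struct would describe the entry array itself, so the
core does all the user copying and drivers never see user pointers.]

	struct iommu_hwpt_invalidate {
		__u32 size;
		__u32 hwpt_id;
		__u32 entry_size;	/* size of one driver-specific entry */
		__u32 entry_nr;		/* in: entries submitted; out: handled */
		__aligned_u64 entries_uptr;	/* array of entry_nr entries */
	};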

Jason, your thought?
Jason Gunthorpe Aug. 2, 2023, 1:47 p.m. UTC | #2
On Wed, Aug 02, 2023 at 07:41:05AM +0000, Tian, Kevin wrote:
> > +/**
> > + * struct iommu_hwpt_vtd_s1_invalidate - Intel VT-d cache invalidation
> > + *                                       (IOMMU_HWPT_TYPE_VTD_S1)
> > + * @flags: Must be 0
> > + * @entry_size: Size in bytes of each cache invalidation request
> > + * @entry_nr_uptr: User pointer to the number of invalidation requests.
> > + *                 Kernel reads it to get the number of requests and
> > + *                 updates the buffer with the number of requests that
> > + *                 have been processed successfully. This pointer must
> > + *                 point to a __u32 type of memory location.
> > + * @inv_data_uptr: Pointer to the cache invalidation requests
> > + *
> > + * The Intel VT-d specific invalidation data for a set of cache invalidation
> > + * requests. Kernel loops the requests one-by-one and stops when failure
> > + * is encountered. The number of handled requests is reported to user by
> > + * writing the buffer pointed by @entry_nr_uptr.
> > + */
> > +struct iommu_hwpt_vtd_s1_invalidate {
> > +	__u32 flags;
> > +	__u32 entry_size;
> > +	__aligned_u64 entry_nr_uptr;
> > +	__aligned_u64 inv_data_uptr;
> > +};
> > +
> 
> I wonder whether this array can be defined directly in the common
> struct iommu_hwpt_invalidate so there is no need for underlying
> iommu driver to further deal with user buffers, including various
> minsz/backward compat. check. 

You want to have an array and another chunk of data?

What is the array for? To do batching?

It means we have to allocate memory on this path, which doesn't seem
like the right direction for a performance improvement..

Having the driver copy in a loop might be better

Jason
Tian, Kevin Aug. 3, 2023, 12:38 a.m. UTC | #3
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, August 2, 2023 9:48 PM
> 
> On Wed, Aug 02, 2023 at 07:41:05AM +0000, Tian, Kevin wrote:
> > > +/**
> > > + * struct iommu_hwpt_vtd_s1_invalidate - Intel VT-d cache invalidation
> > > + *                                       (IOMMU_HWPT_TYPE_VTD_S1)
> > > + * @flags: Must be 0
> > > + * @entry_size: Size in bytes of each cache invalidation request
> > > + * @entry_nr_uptr: User pointer to the number of invalidation requests.
> > > + *                 Kernel reads it to get the number of requests and
> > > + *                 updates the buffer with the number of requests that
> > > + *                 have been processed successfully. This pointer must
> > > + *                 point to a __u32 type of memory location.
> > > + * @inv_data_uptr: Pointer to the cache invalidation requests
> > > + *
> > > + * The Intel VT-d specific invalidation data for a set of cache invalidation
> > > + * requests. Kernel loops the requests one-by-one and stops when
> failure
> > > + * is encountered. The number of handled requests is reported to user
> by
> > > + * writing the buffer pointed by @entry_nr_uptr.
> > > + */
> > > +struct iommu_hwpt_vtd_s1_invalidate {
> > > +	__u32 flags;
> > > +	__u32 entry_size;
> > > +	__aligned_u64 entry_nr_uptr;
> > > +	__aligned_u64 inv_data_uptr;
> > > +};
> > > +
> >
> > I wonder whether this array can be defined directly in the common
> > struct iommu_hwpt_invalidate so there is no need for underlying
> > iommu driver to further deal with user buffers, including various
> > minsz/backward compat. check.
> 
> You want to have an array and another chunk of data?
> 
> What is the array for? To do batching?

yes, it's for batching

> 
> It means we have to allocate memory on this path, that doesn't seem
> like the right direction for a performance improvement..

It reuses the ucmd_buffer to avoid memory allocation:

@@ -485,6 +485,12 @@ union ucmd_buffer {
 #ifdef CONFIG_IOMMUFD_TEST
 	struct iommu_test_cmd test;
 #endif
+	/*
+	 * hwpt_type specific structure used in the cache invalidation
+	 * path.
+	 */
+	struct iommu_hwpt_vtd_s1_invalidate vtd;
+	struct iommu_hwpt_vtd_s1_invalidate_desc req_vtd;
 };

I don't quite like this way.

> 
> Having the driver copy in a loop might be better
> 

Can you elaborate?
Yi Liu Aug. 4, 2023, 1:04 p.m. UTC | #4
> From: Tian, Kevin <kevin.tian@intel.com>
> Sent: Thursday, August 3, 2023 8:39 AM
> 
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, August 2, 2023 9:48 PM
> >
> > On Wed, Aug 02, 2023 at 07:41:05AM +0000, Tian, Kevin wrote:
> > > > +/**
> > > > + * struct iommu_hwpt_vtd_s1_invalidate - Intel VT-d cache invalidation
> > > > + *                                       (IOMMU_HWPT_TYPE_VTD_S1)
> > > > + * @flags: Must be 0
> > > > + * @entry_size: Size in bytes of each cache invalidation request
> > > > + * @entry_nr_uptr: User pointer to the number of invalidation requests.
> > > > + *                 Kernel reads it to get the number of requests and
> > > > + *                 updates the buffer with the number of requests that
> > > > + *                 have been processed successfully. This pointer must
> > > > + *                 point to a __u32 type of memory location.
> > > > + * @inv_data_uptr: Pointer to the cache invalidation requests
> > > > + *
> > > > + * The Intel VT-d specific invalidation data for a set of cache invalidation
> > > > + * requests. Kernel loops the requests one-by-one and stops when
> > failure
> > > > + * is encountered. The number of handled requests is reported to user
> > by
> > > > + * writing the buffer pointed by @entry_nr_uptr.
> > > > + */
> > > > +struct iommu_hwpt_vtd_s1_invalidate {
> > > > +	__u32 flags;
> > > > +	__u32 entry_size;
> > > > +	__aligned_u64 entry_nr_uptr;
> > > > +	__aligned_u64 inv_data_uptr;
> > > > +};
> > > > +
> > >
> > > I wonder whether this array can be defined directly in the common
> > > struct iommu_hwpt_invalidate so there is no need for underlying
> > > iommu driver to further deal with user buffers, including various
> > > minsz/backward compat. check.
> >
> > You want to have an array and another chunk of data?
> >
> > What is the array for? To do batching?
> 
> yes, it's for batching
> 
> >
> > It means we have to allocate memory on this path, that doesn't seem
> > like the right direction for a performance improvement..
> 
> It reuses the ucmd_buffer to avoid memory allocation:

I guess your point is to copy each invalidation descriptor in the common
layer and pass the descriptor to the iommu driver, right?

> @@ -485,6 +485,12 @@ union ucmd_buffer {
>  #ifdef CONFIG_IOMMUFD_TEST
>  	struct iommu_test_cmd test;
>  #endif
> +	/*
> +	 * hwpt_type specific structure used in the cache invalidation
> +	 * path.
> +	 */
> +	struct iommu_hwpt_vtd_s1_invalidate vtd;
> +	struct iommu_hwpt_vtd_s1_invalidate_desc req_vtd;
>  };
> 
> I don't quite like this way.

This is because each descriptor is stored in the ucmd_buffer, so we
need to put struct iommu_hwpt_vtd_s1_invalidate_desc here.

> >
> > Having the driver copy in a loop might be better
> >
> 
> Can you elaborate?

I think Jason means the way in patch 09.

Regards,
Yi Liu
Jason Gunthorpe Aug. 4, 2023, 2:03 p.m. UTC | #5
On Fri, Aug 04, 2023 at 01:04:57PM +0000, Liu, Yi L wrote:
> > > Having the driver copy in a loop might be better
> > >
> > 
> > Can you elaborate?
> 
> I think Jason means the way in patch 09.

Yeah, you can't reuse the stack buffer for an array, so patch 9 copies
each element individually.

This means more calls to copy_from_user, which has some cost

But we avoid a memory allocation

Patch 9 should not abuse the user_data, cast it to the inv_info and
just put req on the stack:

	struct iommu_hwpt_vtd_s1_invalidate *inv_info = user_data;
	struct iommu_hwpt_vtd_s1_invalidate_desc req;

But I'm not sure about this entry_size logic: what happens if
entry_size is larger than what the kernel supports? I think it should
fail..

Jason
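
[Editor's note: a minimal kernel-side sketch of the per-entry copy loop
Jason describes, with a hypothetical function name. An oversized
entry_size is handled the copy_struct_from_user() way: it fails with
-E2BIG unless the extra user bytes are zero, matching the concern above.]

	static int vtd_s1_cache_invalidate_user(struct iommu_hwpt_vtd_s1_invalidate *inv_info)
	{
		u32 __user *nr_uptr = u64_to_user_ptr(inv_info->entry_nr_uptr);
		struct iommu_hwpt_vtd_s1_invalidate_desc req;
		u32 entry_nr, done;
		int ret = 0;

		if (!inv_info->entry_size)
			return -EINVAL;
		if (get_user(entry_nr, nr_uptr))
			return -EFAULT;

		for (done = 0; done < entry_nr; done++) {
			void __user *uptr =
				u64_to_user_ptr(inv_info->inv_data_uptr +
						(u64)done * inv_info->entry_size);

			/* -E2BIG if entry_size exceeds sizeof(req) with a
			 * non-zero tail; zero-fills req if it is smaller */
			ret = copy_struct_from_user(&req, sizeof(req), uptr,
						    inv_info->entry_size);
			if (ret)
				break;

			/* ... issue the stage-1 IOTLB invalidation for req ... */
		}

		/* report the number of handled requests, even on failure */
		if (put_user(done, nr_uptr))
			return -EFAULT;
		return ret;
	}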
Yi Liu Aug. 7, 2023, 2:02 p.m. UTC | #6
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, August 4, 2023 10:04 PM
> 
> On Fri, Aug 04, 2023 at 01:04:57PM +0000, Liu, Yi L wrote:
> > > > Having the driver copy in a loop might be better
> > > >
> > >
> > > Can you elaborate?
> >
> > I think Jason means the way in patch 09.
> 
> Yeah, you can't reuse the stack buffer for an array, so patch 9 copies
> each element uniquely.
> 
> This is more calls to copy_to_user, which has some cost
> 
> But we avoid a memory allocation

Yes.

> Patch 9 should not abuse the user_data, cast it to the inv_info and
> just put req on the stack:
> 
> 	struct iommu_hwpt_vtd_s1_invalidate *inv_info = user_data;
> 	struct iommu_hwpt_vtd_s1_invalidate_desc req;

Sure. The way in patch 09 is a bit tricky. The above is better and clearer. 

Patch

diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index d49837397dfa..b927ace7f3af 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -485,6 +485,12 @@  union ucmd_buffer {
 #ifdef CONFIG_IOMMUFD_TEST
 	struct iommu_test_cmd test;
 #endif
+	/*
+	 * hwpt_type specific structure used in the cache invalidation
+	 * path.
+	 */
+	struct iommu_hwpt_vtd_s1_invalidate vtd;
+	struct iommu_hwpt_vtd_s1_invalidate_desc req_vtd;
 };
 
 struct iommufd_ioctl_op {
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 90b0d3f603a7..2c1241448c87 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -523,6 +523,64 @@  struct iommu_resv_iova_ranges {
 };
 #define IOMMU_RESV_IOVA_RANGES _IO(IOMMUFD_TYPE, IOMMUFD_CMD_RESV_IOVA_RANGES)
 
+/**
+ * enum iommu_hwpt_vtd_s1_invalidate_flags - Flags for Intel VT-d
+ *                                           stage-1 cache invalidation
+ * @IOMMU_VTD_QI_FLAGS_LEAF: The LEAF flag indicates whether only the
+ *                           leaf PTE caching needs to be invalidated
+ *                           and other paging structure caches can be
+ *                           preserved.
+ */
+enum iommu_hwpt_vtd_s1_invalidate_flags {
+	IOMMU_VTD_QI_FLAGS_LEAF = 1 << 0,
+};
+
+/**
+ * struct iommu_hwpt_vtd_s1_invalidate_desc - Intel VT-d stage-1 cache
+ *                                            invalidation descriptor
+ * @addr: The start address of the addresses to be invalidated.
+ * @npages: Number of contiguous 4K pages to be invalidated.
+ * @flags: Combination of enum iommu_hwpt_vtd_s1_invalidate_flags
+ * @__reserved: Must be 0
+ *
+ * The Intel VT-d specific invalidation data for user-managed stage-1 cache
+ * invalidation under nested translation. Userspace uses this structure to
+ * tell host about the impacted caches after modifying the stage-1 page table.
+ *
+ * Invalidating all the caches related to the hw_pagetable by setting @addr
+ * to be 0 and @npages to be __aligned_u64(-1).
+ */
+struct iommu_hwpt_vtd_s1_invalidate_desc {
+	__aligned_u64 addr;
+	__aligned_u64 npages;
+	__u32 flags;
+	__u32 __reserved;
+};
+
+/**
+ * struct iommu_hwpt_vtd_s1_invalidate - Intel VT-d cache invalidation
+ *                                       (IOMMU_HWPT_TYPE_VTD_S1)
+ * @flags: Must be 0
+ * @entry_size: Size in bytes of each cache invalidation request
+ * @entry_nr_uptr: User pointer to the number of invalidation requests.
+ *                 Kernel reads it to get the number of requests and
+ *                 updates the buffer with the number of requests that
+ *                 have been processed successfully. This pointer must
+ *                 point to a __u32 type of memory location.
+ * @inv_data_uptr: Pointer to the cache invalidation requests
+ *
+ * The Intel VT-d specific invalidation data for a set of cache invalidation
+ * requests. Kernel loops the requests one-by-one and stops when failure
+ * is encountered. The number of handled requests is reported to user by
+ * writing the buffer pointed by @entry_nr_uptr.
+ */
+struct iommu_hwpt_vtd_s1_invalidate {
+	__u32 flags;
+	__u32 entry_size;
+	__aligned_u64 entry_nr_uptr;
+	__aligned_u64 inv_data_uptr;
+};
+
 /**
  * struct iommu_hwpt_invalidate - ioctl(IOMMU_HWPT_INVALIDATE)
  * @size: sizeof(struct iommu_hwpt_invalidate)