Message ID | 7e4302064e0d02137c1b1e139342affc0485ed3f.1730836219.git.nicolinc@nvidia.com (mailing list archive) |
---|---|
State | New |
Series | iommufd: Add vIOMMU infrastructure (Part-1) |
On Tue, Nov 05, 2024 at 12:04:29PM -0800, Nicolin Chen wrote:
> With the introduction of the new object and its infrastructure, update the
> doc to reflect that and add a new graph.
>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>  Documentation/userspace-api/iommufd.rst | 69 ++++++++++++++++++++++++-
>  1 file changed, 68 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
> index 2deba93bf159..a8b7766c2849 100644
> --- a/Documentation/userspace-api/iommufd.rst
> +++ b/Documentation/userspace-api/iommufd.rst
> @@ -63,6 +63,37 @@ Following IOMMUFD objects are exposed to userspace:
>    space usually has mappings from guest-level I/O virtual addresses to guest-
>    level physical addresses.
>
> +- IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
> +  passed to or shared with a VM. It may be some HW-accelerated virtualization
> +  features and some SW resources used by the VM. For examples:
> +  * Security namespace for guest owned ID, e.g. guest-controlled cache tags
> +  * Non-device-affiliated event reporting, e.g. invalidation queue errors
> +  * Access to a sharable nesting parent pagetable across physical IOMMUs
> +  * Virtualization of various platforms IDs, e.g. RIDs and others
> +  * Delivery of paravirtualized invalidation
> +  * Direct assigned invalidation queues
> +  * Direct assigned interrupts

The bullet list above is outputted in htmldocs build as long-running paragraph
instead.

> +  Such a vIOMMU object generally has the access to a nesting parent pagetable
> +  to support some HW-accelerated virtualization features. So, a vIOMMU object
> +  must be created given a nesting parent HWPT_PAGING object, and then it would
> +  encapsulate that HWPT_PAGING object. Therefore, a vIOMMU object can be used
> +  to allocate an HWPT_NESTED object in place of the encapsulated HWPT_PAGING.

Thanks.

On Thu, Nov 07, 2024 at 07:56:31AM +0700, Bagas Sanjaya wrote:
> On Tue, Nov 05, 2024 at 12:04:29PM -0800, Nicolin Chen wrote:
> > With the introduction of the new object and its infrastructure, update the
> > doc to reflect that and add a new graph.
> >
> > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > ---
> >  Documentation/userspace-api/iommufd.rst | 69 ++++++++++++++++++++++++-
> >  1 file changed, 68 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
> > index 2deba93bf159..a8b7766c2849 100644
> > --- a/Documentation/userspace-api/iommufd.rst
> > +++ b/Documentation/userspace-api/iommufd.rst
> > @@ -63,6 +63,37 @@ Following IOMMUFD objects are exposed to userspace:
> >    space usually has mappings from guest-level I/O virtual addresses to guest-
> >    level physical addresses.
> >
> > +- IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
> > +  passed to or shared with a VM. It may be some HW-accelerated virtualization
> > +  features and some SW resources used by the VM. For examples:
> > +  * Security namespace for guest owned ID, e.g. guest-controlled cache tags
> > +  * Non-device-affiliated event reporting, e.g. invalidation queue errors
> > +  * Access to a sharable nesting parent pagetable across physical IOMMUs
> > +  * Virtualization of various platforms IDs, e.g. RIDs and others
> > +  * Delivery of paravirtualized invalidation
> > +  * Direct assigned invalidation queues
> > +  * Direct assigned interrupts
>
> The bullet list above is outputted in htmldocs build as long-running paragraph
> instead.

Oh, I overlooked this list.

Would the following change be okay?

-------------------------------------------------
diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
index 0ef22b3ca30b..011cbc71b6f5 100644
--- a/Documentation/userspace-api/iommufd.rst
+++ b/Documentation/userspace-api/iommufd.rst
@@ -68,2 +68,3 @@ Following IOMMUFD objects are exposed to userspace:
   features and some SW resources used by the VM. For examples:
+
   * Security namespace for guest owned ID, e.g. guest-controlled cache tags
@@ -75,2 +76,3 @@ Following IOMMUFD objects are exposed to userspace:
   * Direct assigned interrupts
+
   Such a vIOMMU object generally has the access to a nesting parent pagetable
-------------------------------------------------

The outputted html is showing a list with this.

Thanks!
Nicolin

On Wed, Nov 06, 2024 at 05:35:45PM -0800, Nicolin Chen wrote:
> On Thu, Nov 07, 2024 at 07:56:31AM +0700, Bagas Sanjaya wrote:
> > On Tue, Nov 05, 2024 at 12:04:29PM -0800, Nicolin Chen wrote:
> > > With the introduction of the new object and its infrastructure, update the
> > > doc to reflect that and add a new graph.
> > >
> > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> > > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > > ---
> > >  Documentation/userspace-api/iommufd.rst | 69 ++++++++++++++++++++++++-
> > >  1 file changed, 68 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
> > > index 2deba93bf159..a8b7766c2849 100644
> > > --- a/Documentation/userspace-api/iommufd.rst
> > > +++ b/Documentation/userspace-api/iommufd.rst
> > > @@ -63,6 +63,37 @@ Following IOMMUFD objects are exposed to userspace:
> > >    space usually has mappings from guest-level I/O virtual addresses to guest-
> > >    level physical addresses.
> > >
> > > +- IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
> > > +  passed to or shared with a VM. It may be some HW-accelerated virtualization
> > > +  features and some SW resources used by the VM. For examples:
> > > +  * Security namespace for guest owned ID, e.g. guest-controlled cache tags
> > > +  * Non-device-affiliated event reporting, e.g. invalidation queue errors
> > > +  * Access to a sharable nesting parent pagetable across physical IOMMUs
> > > +  * Virtualization of various platforms IDs, e.g. RIDs and others
> > > +  * Delivery of paravirtualized invalidation
> > > +  * Direct assigned invalidation queues
> > > +  * Direct assigned interrupts
> >
> > The bullet list above is outputted in htmldocs build as long-running paragraph
> > instead.
>
> Oh, I overlooked this list.
>
> Would the following change be okay?
>
> -------------------------------------------------
> diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
> index 0ef22b3ca30b..011cbc71b6f5 100644
> --- a/Documentation/userspace-api/iommufd.rst
> +++ b/Documentation/userspace-api/iommufd.rst
> @@ -68,2 +68,3 @@ Following IOMMUFD objects are exposed to userspace:
>    features and some SW resources used by the VM. For examples:
> +
>    * Security namespace for guest owned ID, e.g. guest-controlled cache tags
> @@ -75,2 +76,3 @@ Following IOMMUFD objects are exposed to userspace:
>    * Direct assigned interrupts
> +
>    Such a vIOMMU object generally has the access to a nesting parent pagetable
> -------------------------------------------------
>
> The outputted html is showing a list with this.

Yup, that's right!

diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
index 2deba93bf159..a8b7766c2849 100644
--- a/Documentation/userspace-api/iommufd.rst
+++ b/Documentation/userspace-api/iommufd.rst
@@ -63,6 +63,37 @@ Following IOMMUFD objects are exposed to userspace:
   space usually has mappings from guest-level I/O virtual addresses to guest-
   level physical addresses.
 
+- IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
+  passed to or shared with a VM. It may be some HW-accelerated virtualization
+  features and some SW resources used by the VM. For examples:
+  * Security namespace for guest owned ID, e.g. guest-controlled cache tags
+  * Non-device-affiliated event reporting, e.g. invalidation queue errors
+  * Access to a sharable nesting parent pagetable across physical IOMMUs
+  * Virtualization of various platforms IDs, e.g. RIDs and others
+  * Delivery of paravirtualized invalidation
+  * Direct assigned invalidation queues
+  * Direct assigned interrupts
+  Such a vIOMMU object generally has the access to a nesting parent pagetable
+  to support some HW-accelerated virtualization features. So, a vIOMMU object
+  must be created given a nesting parent HWPT_PAGING object, and then it would
+  encapsulate that HWPT_PAGING object. Therefore, a vIOMMU object can be used
+  to allocate an HWPT_NESTED object in place of the encapsulated HWPT_PAGING.
+
+  .. note::
+
+     The name "vIOMMU" isn't necessarily identical to a virtualized IOMMU in a
+     VM. A VM can have one giant virtualized IOMMU running on a machine having
+     multiple physical IOMMUs, in which case the VMM will dispatch the requests
+     or configurations from this single virtualized IOMMU instance to multiple
+     vIOMMU objects created for individual slices of different physical IOMMUs.
+     In other words, a vIOMMU object is always a representation of one physical
+     IOMMU, not necessarily of a virtualized IOMMU. For VMMs that want the full
+     virtualization features from physical IOMMUs, it is suggested to build the
+     same number of virtualized IOMMUs as the number of physical IOMMUs, so the
+     passed-through devices would be connected to their own virtualized IOMMUs
+     backed by corresponding vIOMMU objects, in which case a guest OS would do
+     the "dispatch" naturally instead of VMM trappings.
+
 All user-visible objects are destroyed via the IOMMU_DESTROY uAPI.
 
 The diagrams below show relationships between user-visible objects and kernel
@@ -101,6 +132,28 @@ creating the objects and links::
            |------------>|iommu_domain|<----|iommu_domain|<----|device|
                          |____________|     |____________|     |______|
 
+  _______________________________________________________________________
+ |                      iommufd (with vIOMMU)                            |
+ |                                                                       |
+ |                             [5]                                       |
+ |                        _____________                                  |
+ |                       |             |                                 |
+ |      |----------------|    vIOMMU   |                                 |
+ |      |                |             |                                 |
+ |      |                |             |                                 |
+ |      |      [1]       |             |          [4]             [2]    |
+ |      |     ______     |             |     _____________     ________  |
+ |      |    |      |    |     [3]     |    |             |   |        | |
+ |      |    | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
+ |      |    |______|    |_____________|    |_____________|   |________| |
+ |      |        |              |                  |               |     |
+ |______|________|______________|__________________|_______________|_____|
+        |        |              |                  |               |
+  ______v_____   |        ______v_____       ______v_____       ___v__
+ |   struct   |  |  PFN  |  (paging)  |     |  (nested)  |     |struct|
+ |iommu_device|  |------>|iommu_domain|<----|iommu_domain|<----|device|
+ |____________|   storage|____________|     |____________|     |______|
+
 1. IOMMUFD_OBJ_IOAS is created via the IOMMU_IOAS_ALLOC uAPI. An iommufd can
    hold multiple IOAS objects. IOAS is the most generic object and does not
    expose interfaces that are specific to single IOMMU drivers. All operations
@@ -132,7 +185,8 @@ creating the objects and links::
    flag is set.
 
 4. IOMMUFD_OBJ_HWPT_NESTED can be only manually created via the IOMMU_HWPT_ALLOC
-   uAPI, provided an hwpt_id via @pt_id to associate the new HWPT_NESTED object
+   uAPI, provided an hwpt_id or a viommu_id of a vIOMMU object encapsulating a
+   nesting parent HWPT_PAGING via @pt_id to associate the new HWPT_NESTED object
    to the corresponding HWPT_PAGING object. The associating HWPT_PAGING object
    must be a nesting parent manually allocated via the same uAPI previously with
    an IOMMU_HWPT_ALLOC_NEST_PARENT flag, otherwise the allocation will fail. The
@@ -149,6 +203,18 @@ creating the objects and links::
    created via the same IOMMU_HWPT_ALLOC uAPI. The difference is at the type of
    the object passed in via the @pt_id field of struct iommufd_hwpt_alloc.
 
+5. IOMMUFD_OBJ_VIOMMU can be only manually created via the IOMMU_VIOMMU_ALLOC
+   uAPI, provided a dev_id (for the device's physical IOMMU to back the vIOMMU)
+   and an hwpt_id (to associate the vIOMMU to a nesting parent HWPT_PAGING). The
+   iommufd core will link the vIOMMU object to the struct iommu_device that the
+   struct device is behind. And an IOMMU driver can implement a viommu_alloc op
+   to allocate its own vIOMMU data structure embedding the core-level structure
+   iommufd_viommu and some driver-specific data. If necessary, the driver can
+   also configure its HW virtualization feature for that vIOMMU (and thus for
+   the VM). Successful completion of this operation sets up the linkages between
+   the vIOMMU object and the HWPT_PAGING, then this vIOMMU object can be used
+   as a nesting parent object to allocate an HWPT_NESTED object described above.
+
 A device can only bind to an iommufd due to DMA ownership claim and attach to at
 most one IOAS object (no support of PASID yet).
 
@@ -161,6 +227,7 @@ User visible objects are backed by following datastructures:
 - iommufd_device for IOMMUFD_OBJ_DEVICE.
 - iommufd_hwpt_paging for IOMMUFD_OBJ_HWPT_PAGING.
 - iommufd_hwpt_nested for IOMMUFD_OBJ_HWPT_NESTED.
+- iommufd_viommu for IOMMUFD_OBJ_VIOMMU.
 
 Several terminologies when looking at these datastructures:
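
For readers who want to check their understanding of the object rules in items 4 and 5 of the patch, they can be modelled with a small toy sketch. This is only an in-memory illustration, not the kernel uAPI: the class ToyIommufd, its method names, and the use of ValueError for a failed allocation are all hypothetical and stand in for the IOMMU_IOAS_ALLOC, IOMMU_HWPT_ALLOC and IOMMU_VIOMMU_ALLOC calls that the documentation describes.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Dict, Optional


class ObjType(Enum):
    IOAS = auto()
    DEVICE = auto()
    HWPT_PAGING = auto()
    HWPT_NESTED = auto()
    VIOMMU = auto()


@dataclass
class Obj:
    type: ObjType
    parent_id: Optional[int] = None  # id of the object this one is linked to
    nest_parent: bool = False        # only meaningful for HWPT_PAGING
    dev_id: Optional[int] = None     # only meaningful for VIOMMU


class ToyIommufd:
    """Hypothetical in-memory model of the documented rules (not the uAPI)."""

    def __init__(self) -> None:
        self.objs: Dict[int, Obj] = {}
        self._next_id = 1

    def _add(self, obj: Obj) -> int:
        obj_id = self._next_id
        self._next_id += 1
        self.objs[obj_id] = obj
        return obj_id

    def _get(self, obj_id: int, *types: ObjType) -> Obj:
        obj = self.objs.get(obj_id)
        if obj is None or obj.type not in types:
            names = [t.name for t in types]
            raise ValueError(f"id {obj_id} is not one of {names}")
        return obj

    def ioas_alloc(self) -> int:
        return self._add(Obj(ObjType.IOAS))

    def device_bind(self) -> int:
        return self._add(Obj(ObjType.DEVICE))

    def hwpt_paging_alloc(self, ioas_id: int, nest_parent: bool = False) -> int:
        self._get(ioas_id, ObjType.IOAS)
        return self._add(
            Obj(ObjType.HWPT_PAGING, parent_id=ioas_id, nest_parent=nest_parent)
        )

    def viommu_alloc(self, dev_id: int, hwpt_id: int) -> int:
        # Item 5: needs a device and a nesting parent HWPT_PAGING.
        self._get(dev_id, ObjType.DEVICE)
        hwpt = self._get(hwpt_id, ObjType.HWPT_PAGING)
        if not hwpt.nest_parent:
            raise ValueError("vIOMMU requires a nesting parent HWPT_PAGING")
        return self._add(Obj(ObjType.VIOMMU, parent_id=hwpt_id, dev_id=dev_id))

    def hwpt_nested_alloc(self, pt_id: int) -> int:
        # Item 4: pt_id is either an hwpt_id or a viommu_id that
        # encapsulates a nesting parent HWPT_PAGING.
        src = self._get(pt_id, ObjType.HWPT_PAGING, ObjType.VIOMMU)
        paging_id = src.parent_id if src.type is ObjType.VIOMMU else pt_id
        if not self.objs[paging_id].nest_parent:
            raise ValueError("HWPT_NESTED requires a nesting parent")
        return self._add(Obj(ObjType.HWPT_NESTED, parent_id=paging_id))


fd = ToyIommufd()
ioas = fd.ioas_alloc()
dev = fd.device_bind()
parent = fd.hwpt_paging_alloc(ioas, nest_parent=True)
viommu = fd.viommu_alloc(dev, parent)
nested = fd.hwpt_nested_alloc(viommu)  # vIOMMU used in place of the HWPT_PAGING
```

In this model the nested object ends up linked to the same HWPT_PAGING whether pt_id names the paging object directly or the vIOMMU that encapsulates it, which is the relationship the new diagram draws with (HWPT_PAGING) inside the vIOMMU box.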