[v7,13/13] Documentation: userspace-api: iommufd: Update vIOMMU

Message ID 7e4302064e0d02137c1b1e139342affc0485ed3f.1730836219.git.nicolinc@nvidia.com
State New
Series iommufd: Add vIOMMU infrastructure (Part-1)

Commit Message

Nicolin Chen Nov. 5, 2024, 8:04 p.m. UTC
With the introduction of the new object and its infrastructure, update the
doc to reflect that and add a new graph.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 Documentation/userspace-api/iommufd.rst | 69 ++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

Comments

Bagas Sanjaya Nov. 7, 2024, 12:56 a.m. UTC | #1
On Tue, Nov 05, 2024 at 12:04:29PM -0800, Nicolin Chen wrote:
> With the introduction of the new object and its infrastructure, update the
> doc to reflect that and add a new graph.
> 
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> ---
>  Documentation/userspace-api/iommufd.rst | 69 ++++++++++++++++++++++++-
>  1 file changed, 68 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
> index 2deba93bf159..a8b7766c2849 100644
> --- a/Documentation/userspace-api/iommufd.rst
> +++ b/Documentation/userspace-api/iommufd.rst
> @@ -63,6 +63,37 @@ Following IOMMUFD objects are exposed to userspace:
>    space usually has mappings from guest-level I/O virtual addresses to guest-
>    level physical addresses.
>  
> +- IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
> +  passed to or shared with a VM. It may be some HW-accelerated virtualization
> +  features and some SW resources used by the VM. For examples:
> +  * Security namespace for guest owned ID, e.g. guest-controlled cache tags
> +  * Non-device-affiliated event reporting, e.g. invalidation queue errors
> +  * Access to a sharable nesting parent pagetable across physical IOMMUs
> +  * Virtualization of various platforms IDs, e.g. RIDs and others
> +  * Delivery of paravirtualized invalidation
> +  * Direct assigned invalidation queues
> +  * Direct assigned interrupts

The bullet list above is rendered as one long paragraph in the htmldocs
build instead of as a list.

> +  Such a vIOMMU object generally has the access to a nesting parent pagetable
> +  to support some HW-accelerated virtualization features. So, a vIOMMU object
> +  must be created given a nesting parent HWPT_PAGING object, and then it would
> +  encapsulate that HWPT_PAGING object. Therefore, a vIOMMU object can be used
> +  to allocate an HWPT_NESTED object in place of the encapsulated HWPT_PAGING.

Thanks.
Nicolin Chen Nov. 7, 2024, 1:35 a.m. UTC | #2
On Thu, Nov 07, 2024 at 07:56:31AM +0700, Bagas Sanjaya wrote:
> On Tue, Nov 05, 2024 at 12:04:29PM -0800, Nicolin Chen wrote:
> > With the introduction of the new object and its infrastructure, update the
> > doc to reflect that and add a new graph.
> > 
> > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > ---
> >  Documentation/userspace-api/iommufd.rst | 69 ++++++++++++++++++++++++-
> >  1 file changed, 68 insertions(+), 1 deletion(-)
> > 
> > diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
> > index 2deba93bf159..a8b7766c2849 100644
> > --- a/Documentation/userspace-api/iommufd.rst
> > +++ b/Documentation/userspace-api/iommufd.rst
> > @@ -63,6 +63,37 @@ Following IOMMUFD objects are exposed to userspace:
> >    space usually has mappings from guest-level I/O virtual addresses to guest-
> >    level physical addresses.
> >  
> > +- IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
> > +  passed to or shared with a VM. It may be some HW-accelerated virtualization
> > +  features and some SW resources used by the VM. For examples:
> > +  * Security namespace for guest owned ID, e.g. guest-controlled cache tags
> > +  * Non-device-affiliated event reporting, e.g. invalidation queue errors
> > +  * Access to a sharable nesting parent pagetable across physical IOMMUs
> > +  * Virtualization of various platforms IDs, e.g. RIDs and others
> > +  * Delivery of paravirtualized invalidation
> > +  * Direct assigned invalidation queues
> > +  * Direct assigned interrupts
> 
> The bullet list above is outputted in htmldocs build as long-running paragraph
> instead.

Oh, I overlooked this list.

Would the following change be okay?

-------------------------------------------------
diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
index 0ef22b3ca30b..011cbc71b6f5 100644
--- a/Documentation/userspace-api/iommufd.rst
+++ b/Documentation/userspace-api/iommufd.rst
@@ -68,2 +68,3 @@ Following IOMMUFD objects are exposed to userspace:
   features and some SW resources used by the VM. For examples:
+
   * Security namespace for guest owned ID, e.g. guest-controlled cache tags
@@ -75,2 +76,3 @@ Following IOMMUFD objects are exposed to userspace:
   * Direct assigned interrupts
+
   Such a vIOMMU object generally has the access to a nesting parent pagetable
-------------------------------------------------

With this change, the generated HTML shows a proper list.

Thanks!
Nicolin
Bagas Sanjaya Nov. 7, 2024, 3:20 a.m. UTC | #3
On Wed, Nov 06, 2024 at 05:35:45PM -0800, Nicolin Chen wrote:
> On Thu, Nov 07, 2024 at 07:56:31AM +0700, Bagas Sanjaya wrote:
> > On Tue, Nov 05, 2024 at 12:04:29PM -0800, Nicolin Chen wrote:
> > > With the introduction of the new object and its infrastructure, update the
> > > doc to reflect that and add a new graph.
> > > 
> > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> > > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > > ---
> > >  Documentation/userspace-api/iommufd.rst | 69 ++++++++++++++++++++++++-
> > >  1 file changed, 68 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
> > > index 2deba93bf159..a8b7766c2849 100644
> > > --- a/Documentation/userspace-api/iommufd.rst
> > > +++ b/Documentation/userspace-api/iommufd.rst
> > > @@ -63,6 +63,37 @@ Following IOMMUFD objects are exposed to userspace:
> > >    space usually has mappings from guest-level I/O virtual addresses to guest-
> > >    level physical addresses.
> > >  
> > > +- IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
> > > +  passed to or shared with a VM. It may be some HW-accelerated virtualization
> > > +  features and some SW resources used by the VM. For examples:
> > > +  * Security namespace for guest owned ID, e.g. guest-controlled cache tags
> > > +  * Non-device-affiliated event reporting, e.g. invalidation queue errors
> > > +  * Access to a sharable nesting parent pagetable across physical IOMMUs
> > > +  * Virtualization of various platforms IDs, e.g. RIDs and others
> > > +  * Delivery of paravirtualized invalidation
> > > +  * Direct assigned invalidation queues
> > > +  * Direct assigned interrupts
> > 
> > The bullet list above is outputted in htmldocs build as long-running paragraph
> > instead.
> 
> Oh, I overlooked this list.
> 
> Would the following change be okay?
> 
> -------------------------------------------------
> diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
> index 0ef22b3ca30b..011cbc71b6f5 100644
> --- a/Documentation/userspace-api/iommufd.rst
> +++ b/Documentation/userspace-api/iommufd.rst
> @@ -68,2 +68,3 @@ Following IOMMUFD objects are exposed to userspace:
>    features and some SW resources used by the VM. For examples:
> +
>    * Security namespace for guest owned ID, e.g. guest-controlled cache tags
> @@ -75,2 +76,3 @@ Following IOMMUFD objects are exposed to userspace:
>    * Direct assigned interrupts
> +
>    Such a vIOMMU object generally has the access to a nesting parent pagetable
> -------------------------------------------------
> 
> The outputted html is showing a list with this.

Yup, that's right!
Nicolin Chen Nov. 7, 2024, 4:04 a.m. UTC | #4
On Thu, Nov 07, 2024 at 10:20:49AM +0700, Bagas Sanjaya wrote:
> On Wed, Nov 06, 2024 at 05:35:45PM -0800, Nicolin Chen wrote:
> > On Thu, Nov 07, 2024 at 07:56:31AM +0700, Bagas Sanjaya wrote:
> > > On Tue, Nov 05, 2024 at 12:04:29PM -0800, Nicolin Chen wrote:
> > > > With the introduction of the new object and its infrastructure, update the
> > > > doc to reflect that and add a new graph.
> > > > 
> > > > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> > > > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > > > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > > > ---
> > > >  Documentation/userspace-api/iommufd.rst | 69 ++++++++++++++++++++++++-
> > > >  1 file changed, 68 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
> > > > index 2deba93bf159..a8b7766c2849 100644
> > > > --- a/Documentation/userspace-api/iommufd.rst
> > > > +++ b/Documentation/userspace-api/iommufd.rst
> > > > @@ -63,6 +63,37 @@ Following IOMMUFD objects are exposed to userspace:
> > > >    space usually has mappings from guest-level I/O virtual addresses to guest-
> > > >    level physical addresses.
> > > >  
> > > > +- IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
> > > > +  passed to or shared with a VM. It may be some HW-accelerated virtualization
> > > > +  features and some SW resources used by the VM. For examples:
> > > > +  * Security namespace for guest owned ID, e.g. guest-controlled cache tags
> > > > +  * Non-device-affiliated event reporting, e.g. invalidation queue errors
> > > > +  * Access to a sharable nesting parent pagetable across physical IOMMUs
> > > > +  * Virtualization of various platforms IDs, e.g. RIDs and others
> > > > +  * Delivery of paravirtualized invalidation
> > > > +  * Direct assigned invalidation queues
> > > > +  * Direct assigned interrupts
> > > 
> > > The bullet list above is outputted in htmldocs build as long-running paragraph
> > > instead.
> > 
> > Oh, I overlooked this list.
> > 
> > Would the following change be okay?
> > 
> > -------------------------------------------------
> > diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
> > index 0ef22b3ca30b..011cbc71b6f5 100644
> > --- a/Documentation/userspace-api/iommufd.rst
> > +++ b/Documentation/userspace-api/iommufd.rst
> > @@ -68,2 +68,3 @@ Following IOMMUFD objects are exposed to userspace:
> >    features and some SW resources used by the VM. For examples:
> > +
> >    * Security namespace for guest owned ID, e.g. guest-controlled cache tags
> > @@ -75,2 +76,3 @@ Following IOMMUFD objects are exposed to userspace:
> >    * Direct assigned interrupts
> > +
> >    Such a vIOMMU object generally has the access to a nesting parent pagetable
> > -------------------------------------------------
> > 
> > The outputted html is showing a list with this.
> 
> Yup, that's right!

Thank you! Could you give a Reviewed-by, on the condition that this
diff gets squashed in?

Likely, Jason will help squash it when taking this v7 via his
iommufd tree. So, we might not respin a v8.

Nicolin

Patch

diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
index 2deba93bf159..a8b7766c2849 100644
--- a/Documentation/userspace-api/iommufd.rst
+++ b/Documentation/userspace-api/iommufd.rst
@@ -63,6 +63,37 @@  Following IOMMUFD objects are exposed to userspace:
   space usually has mappings from guest-level I/O virtual addresses to guest-
   level physical addresses.
 
+- IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
+  passed to or shared with a VM. It may be some HW-accelerated virtualization
+  features and some SW resources used by the VM. For examples:
+  * Security namespace for guest owned ID, e.g. guest-controlled cache tags
+  * Non-device-affiliated event reporting, e.g. invalidation queue errors
+  * Access to a sharable nesting parent pagetable across physical IOMMUs
+  * Virtualization of various platforms IDs, e.g. RIDs and others
+  * Delivery of paravirtualized invalidation
+  * Direct assigned invalidation queues
+  * Direct assigned interrupts
+  Such a vIOMMU object generally has the access to a nesting parent pagetable
+  to support some HW-accelerated virtualization features. So, a vIOMMU object
+  must be created given a nesting parent HWPT_PAGING object, and then it would
+  encapsulate that HWPT_PAGING object. Therefore, a vIOMMU object can be used
+  to allocate an HWPT_NESTED object in place of the encapsulated HWPT_PAGING.
+
+  .. note::
+
+     The name "vIOMMU" isn't necessarily identical to a virtualized IOMMU in a
+     VM. A VM can have one giant virtualized IOMMU running on a machine having
+     multiple physical IOMMUs, in which case the VMM will dispatch the requests
+     or configurations from this single virtualized IOMMU instance to multiple
+     vIOMMU objects created for individual slices of different physical IOMMUs.
+     In other words, a vIOMMU object is always a representation of one physical
+     IOMMU, not necessarily of a virtualized IOMMU. For VMMs that want the full
+     virtualization features from physical IOMMUs, it is suggested to build the
+     same number of virtualized IOMMUs as the number of physical IOMMUs, so the
+     passed-through devices would be connected to their own virtualized IOMMUs
+     backed by corresponding vIOMMU objects, in which case a guest OS would do
+     the "dispatch" naturally instead of VMM trappings.
+
 All user-visible objects are destroyed via the IOMMU_DESTROY uAPI.
 
 The diagrams below show relationships between user-visible objects and kernel
@@ -101,6 +132,28 @@  creating the objects and links::
            |------------>|iommu_domain|<----|iommu_domain|<----|device|
                          |____________|     |____________|     |______|
 
+  _______________________________________________________________________
+ |                      iommufd (with vIOMMU)                            |
+ |                                                                       |
+ |                             [5]                                       |
+ |                        _____________                                  |
+ |                       |             |                                 |
+ |      |----------------|    vIOMMU   |                                 |
+ |      |                |             |                                 |
+ |      |                |             |                                 |
+ |      |      [1]       |             |          [4]             [2]    |
+ |      |     ______     |             |     _____________     ________  |
+ |      |    |      |    |     [3]     |    |             |   |        | |
+ |      |    | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
+ |      |    |______|    |_____________|    |_____________|   |________| |
+ |      |        |              |                  |               |     |
+ |______|________|______________|__________________|_______________|_____|
+        |        |              |                  |               |
+  ______v_____   |        ______v_____       ______v_____       ___v__
+ |   struct   |  |  PFN  |  (paging)  |     |  (nested)  |     |struct|
+ |iommu_device|  |------>|iommu_domain|<----|iommu_domain|<----|device|
+ |____________|   storage|____________|     |____________|     |______|
+
 1. IOMMUFD_OBJ_IOAS is created via the IOMMU_IOAS_ALLOC uAPI. An iommufd can
    hold multiple IOAS objects. IOAS is the most generic object and does not
    expose interfaces that are specific to single IOMMU drivers. All operations
@@ -132,7 +185,8 @@  creating the objects and links::
      flag is set.
 
 4. IOMMUFD_OBJ_HWPT_NESTED can be only manually created via the IOMMU_HWPT_ALLOC
-   uAPI, provided an hwpt_id via @pt_id to associate the new HWPT_NESTED object
+   uAPI, provided an hwpt_id or a viommu_id of a vIOMMU object encapsulating a
+   nesting parent HWPT_PAGING via @pt_id to associate the new HWPT_NESTED object
    to the corresponding HWPT_PAGING object. The associating HWPT_PAGING object
    must be a nesting parent manually allocated via the same uAPI previously with
    an IOMMU_HWPT_ALLOC_NEST_PARENT flag, otherwise the allocation will fail. The
@@ -149,6 +203,18 @@  creating the objects and links::
       created via the same IOMMU_HWPT_ALLOC uAPI. The difference is at the type
       of the object passed in via the @pt_id field of struct iommufd_hwpt_alloc.
 
+5. IOMMUFD_OBJ_VIOMMU can be only manually created via the IOMMU_VIOMMU_ALLOC
+   uAPI, provided a dev_id (for the device's physical IOMMU to back the vIOMMU)
+   and an hwpt_id (to associate the vIOMMU to a nesting parent HWPT_PAGING). The
+   iommufd core will link the vIOMMU object to the struct iommu_device that the
+   struct device is behind. And an IOMMU driver can implement a viommu_alloc op
+   to allocate its own vIOMMU data structure embedding the core-level structure
+   iommufd_viommu and some driver-specific data. If necessary, the driver can
+   also configure its HW virtualization feature for that vIOMMU (and thus for
+   the VM). Successful completion of this operation sets up the linkages between
+   the vIOMMU object and the HWPT_PAGING, then this vIOMMU object can be used
+   as a nesting parent object to allocate an HWPT_NESTED object described above.
+
 A device can only bind to an iommufd due to DMA ownership claim and attach to at
 most one IOAS object (no support of PASID yet).
 
@@ -161,6 +227,7 @@  User visible objects are backed by following datastructures:
 - iommufd_device for IOMMUFD_OBJ_DEVICE.
 - iommufd_hwpt_paging for IOMMUFD_OBJ_HWPT_PAGING.
 - iommufd_hwpt_nested for IOMMUFD_OBJ_HWPT_NESTED.
+- iommufd_viommu for IOMMUFD_OBJ_VIOMMU.
 
 Several terminologies when looking at these datastructures:
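
For readers following the object relationships in the patch above, here is a
minimal userspace model of the rule that a HWPT_NESTED can be allocated from
either a nesting parent HWPT_PAGING or a vIOMMU that encapsulates one. It is
only a sketch: every name in it (model_hwpt_paging_alloc, model_viommu_alloc,
model_hwpt_nested_alloc, model_parent_of, the obj table) is hypothetical and
is not part of the iommufd uAPI or the kernel. It also leaves out the dev_id
and the link to the physical IOMMU, and it returns 0 where the real uAPI
would fail the ioctl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model only; these are NOT kernel or uAPI definitions. */
enum obj_type { OBJ_NONE, OBJ_HWPT_PAGING, OBJ_HWPT_NESTED, OBJ_VIOMMU };

struct obj {
	enum obj_type type;
	bool nest_parent;   /* HWPT_PAGING only: allocated as a nesting parent */
	uint32_t parent_id; /* VIOMMU / HWPT_NESTED: the HWPT_PAGING behind it */
};

#define MAX_OBJS 16
static struct obj objs[MAX_OBJS]; /* id 0 is reserved as "invalid" */
static uint32_t next_id = 1;

static uint32_t obj_new(enum obj_type type, bool nest_parent, uint32_t parent_id)
{
	assert(type != OBJ_NONE);
	if (next_id >= MAX_OBJS)
		return 0;
	objs[next_id] = (struct obj){ type, nest_parent, parent_id };
	return next_id++;
}

static bool id_valid(uint32_t id)
{
	return id != 0 && id < next_id;
}

/* Models allocating a HWPT_PAGING, optionally as a nesting parent. */
static uint32_t model_hwpt_paging_alloc(bool nest_parent)
{
	return obj_new(OBJ_HWPT_PAGING, nest_parent, 0);
}

/* A vIOMMU must be given a nesting parent HWPT_PAGING, which it encapsulates. */
static uint32_t model_viommu_alloc(uint32_t hwpt_id)
{
	if (!id_valid(hwpt_id))
		return 0;
	if (objs[hwpt_id].type != OBJ_HWPT_PAGING || !objs[hwpt_id].nest_parent)
		return 0;
	return obj_new(OBJ_VIOMMU, false, hwpt_id);
}

/* pt_id may name a nesting parent HWPT_PAGING or a vIOMMU wrapping one. */
static uint32_t model_hwpt_nested_alloc(uint32_t pt_id)
{
	uint32_t parent;

	if (!id_valid(pt_id))
		return 0;
	switch (objs[pt_id].type) {
	case OBJ_HWPT_PAGING:
		parent = pt_id;
		break;
	case OBJ_VIOMMU:
		parent = objs[pt_id].parent_id;
		break;
	default:
		return 0; /* e.g. nesting on top of another HWPT_NESTED */
	}
	if (!objs[parent].nest_parent)
		return 0;
	return obj_new(OBJ_HWPT_NESTED, false, parent);
}

/* Returns the HWPT_PAGING id an object is associated with, or 0. */
static uint32_t model_parent_of(uint32_t id)
{
	return id_valid(id) ? objs[id].parent_id : 0;
}
```

In this model, allocating a nested object through the vIOMMU id and through
the parent HWPT_PAGING id both end up associated with the same parent, while
a HWPT_PAGING that was not allocated as a nesting parent is rejected in both
model_viommu_alloc() and model_hwpt_nested_alloc().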