[v4,02/15] iommu: Report domain nesting info

Message ID 1593861989-35920-3-git-send-email-yi.l.liu@intel.com (mailing list archive)
State New, archived
Series vfio: expose virtual Shared Virtual Addressing to VMs

Commit Message

Yi Liu July 4, 2020, 11:26 a.m. UTC
IOMMUs that support nesting translation need to report the capability info
to userspace, e.g. the format of first level/stage paging structures.

This patch reports nesting info through DOMAIN_ATTR_NESTING. A caller can get
the nesting info after setting DOMAIN_ATTR_NESTING.

Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
v3 -> v4:
*) split the SMMU driver changes to be a separate patch
*) move the @addr_width and @pasid_bits from vendor specific
   part to generic part.
*) tweak the description for the @features field of struct
   iommu_nesting_info.
*) add description on the @data[] field of struct iommu_nesting_info

v2 -> v3:
*) remove cap/ecap_mask in iommu_nesting_info.
*) reuse DOMAIN_ATTR_NESTING to get nesting info.
*) return an empty iommu_nesting_info for SMMU drivers per Jean's
   suggestion.
---
 include/uapi/linux/iommu.h | 78 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)
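
As an illustration of how a consumer might use the reported info, the sketch below validates @size, selects the @data[] type from @format, and copies out the VT-d payload. It is not part of the patch: the struct names are local userspace mirrors of the structures added here, `read_vtd_info()` is a hypothetical helper, and the numeric value of the format constant is an assumption (the patch only refers to it by name).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed value of IOMMU_PASID_FORMAT_INTEL_VTD; the patch only names it. */
#define PASID_FORMAT_INTEL_VTD 1u

/* Local mirror of the fixed part of struct iommu_nesting_info. */
struct nesting_info_hdr {
	uint32_t size;
	uint32_t format;
	uint32_t features;
	uint32_t flags;
	uint16_t addr_width;
	uint16_t pasid_bits;
	uint32_t padding;
};

/* Local mirror of struct iommu_nesting_info_vtd. */
struct nesting_info_vtd {
	uint32_t flags;
	uint32_t padding;
	uint64_t cap_reg;
	uint64_t ecap_reg;
};

/*
 * Hypothetical helper: returns 0 and fills *out when buf holds a report
 * whose @format is VT-d and whose @size covers the vendor payload;
 * returns -1 otherwise (short buffer, other format, or empty data[]).
 */
static int read_vtd_info(const void *buf, size_t len,
			 struct nesting_info_vtd *out)
{
	struct nesting_info_hdr hdr;

	if (len < sizeof(hdr))
		return -1;
	memcpy(&hdr, buf, sizeof(hdr));

	/* @size is the size of the whole structure, header plus data[]. */
	if (hdr.size > len || hdr.size < sizeof(hdr) + sizeof(*out))
		return -1;
	if (hdr.format != PASID_FORMAT_INTEL_VTD)
		return -1;

	memcpy(out, (const uint8_t *)buf + sizeof(hdr), sizeof(*out));
	return 0;
}
```

With this shape, an empty report (header only, as the v2 -> v3 note describes for SMMU drivers) is rejected by the size check rather than being read as VT-d data.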

Comments

Eric Auger July 6, 2020, 9:34 a.m. UTC | #1
On 7/4/20 1:26 PM, Liu Yi L wrote:
> IOMMUs that support nesting translation needs report the capability info
need to report
> to userspace, e.g. the format of first level/stage paging structures.
> 
> This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can get
> nesting info after setting DOMAIN_ATTR_NESTING.
> 
> Cc: Kevin Tian <kevin.tian@intel.com>
> CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Eric Auger <eric.auger@redhat.com>
> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
> v3 -> v4:
> *) split the SMMU driver changes to be a separate patch
> *) move the @addr_width and @pasid_bits from vendor specific
>    part to generic part.
> *) tweak the description for the @features field of struct
>    iommu_nesting_info.
> *) add description on the @data[] field of struct iommu_nesting_info
> 
> v2 -> v3:
> *) remvoe cap/ecap_mask in iommu_nesting_info.
> *) reuse DOMAIN_ATTR_NESTING to get nesting info.
> *) return an empty iommu_nesting_info for SMMU drivers per Jean'
>    suggestion.
> ---
>  include/uapi/linux/iommu.h | 78 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 78 insertions(+)
> 
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> index 1afc661..1bfc032 100644
> --- a/include/uapi/linux/iommu.h
> +++ b/include/uapi/linux/iommu.h
> @@ -332,4 +332,82 @@ struct iommu_gpasid_bind_data {
>  	} vendor;
>  };
>  
> +/*
> + * struct iommu_nesting_info - Information for nesting-capable IOMMU.
> + *				user space should check it before using
> + *				nesting capability.
alignment?
> + *
> + * @size:	size of the whole structure
> + * @format:	PASID table entry format, the same definition with
> + *		@format of struct iommu_gpasid_bind_data.
the same definition as struct iommu_gpasid_bind_data @format?
> + * @features:	supported nesting features.
> + * @flags:	currently reserved for future extension.
> + * @addr_width:	The output addr width of first level/stage translation
> + * @pasid_bits:	Maximum supported PASID bits, 0 represents no PASID
> + *		support.
> + * @data:	vendor specific cap info. data[] structure type can be deduced
> + *		from @format field.
> + *
> + * +===============+======================================================+
> + * | feature       |  Notes                                               |
> + * +===============+======================================================+
> + * | SYSWIDE_PASID |  PASIDs are managed in system-wide, instead of per   |
> + * |               |  device. When a device is assigned to userspace or   |
> + * |               |  VM, proper uAPI (userspace driver framework uAPI,   |
> + * |               |  e.g. VFIO) must be used to allocate/free PASIDs for |
> + * |               |  the assigned device.                                |
> + * +---------------+------------------------------------------------------+
> + * | BIND_PGTBL    |  The owner of the first level/stage page table must  |
> + * |               |  explicitly bind the page table to associated PASID  |
> + * |               |  (either the one specified in bind request or the    |
> + * |               |  default PASID of iommu domain), through userspace   |
> + * |               |  driver framework uAPI (e.g. VFIO_IOMMU_NESTING_OP). |
> + * +---------------+------------------------------------------------------+
> + * | CACHE_INVLD   |  The owner of the first level/stage page table must  |
> + * |               |  explicitly invalidate the IOMMU cache through uAPI  |
> + * |               |  provided by userspace driver framework (e.g. VFIO)  |
> + * |               |  according to vendor-specific requirement when       |
> + * |               |  changing the page table.                            |
> + * +---------------+------------------------------------------------------+
Do you foresee cases where BIND_PGTBL and CACHE_INVLD shouldn't be
exposed as features?
> + *
> + * @data[] types defined for @format:
> + * +================================+=====================================+
> + * | @format                        | @data[]                             |
> + * +================================+=====================================+
> + * | IOMMU_PASID_FORMAT_INTEL_VTD   | struct iommu_nesting_info_vtd       |
> + * +--------------------------------+-------------------------------------+
> + *
> + */
> +struct iommu_nesting_info {
> +	__u32	size;
> +	__u32	format;
> +	__u32	features;
> +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID	(1 << 0)
> +#define IOMMU_NESTING_FEAT_BIND_PGTBL		(1 << 1)
> +#define IOMMU_NESTING_FEAT_CACHE_INVLD		(1 << 2)
In other structs the values seem to be defined before the field
> +	__u32	flags;
> +	__u16	addr_width;
> +	__u16	pasid_bits;
> +	__u32	padding;
> +	__u8	data[];
> +};
> +
> +/*
> + * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info
> + *
spurious line
> + *
> + * @flags:	VT-d specific flags. Currently reserved for future
> + *		extension.
> + * @cap_reg:	Describe basic capabilities as defined in VT-d capability
> + *		register.
> + * @ecap_reg:	Describe the extended capabilities as defined in VT-d
> + *		extended capability register.
> + */
> +struct iommu_nesting_info_vtd {
> +	__u32	flags;
> +	__u32	padding;
> +	__u64	cap_reg;
> +	__u64	ecap_reg;
> +};
> +
>  #endif /* _UAPI_IOMMU_H */
> 
Thanks

Eric
Yi Liu July 6, 2020, 12:20 p.m. UTC | #2
Hi Eric,

> From: Auger Eric <eric.auger@redhat.com>
> Sent: Monday, July 6, 2020 5:34 PM
> 
> On 7/4/20 1:26 PM, Liu Yi L wrote:
> > IOMMUs that support nesting translation needs report the capability info
> need to report
> > to userspace, e.g. the format of first level/stage paging structures.
> >
> > This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can get
> > nesting info after setting DOMAIN_ATTR_NESTING.
> >
> > Cc: Kevin Tian <kevin.tian@intel.com>
> > CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Cc: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Eric Auger <eric.auger@redhat.com>
> > Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Cc: Joerg Roedel <joro@8bytes.org>
> > Cc: Lu Baolu <baolu.lu@linux.intel.com>
> > Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > ---
> > v3 -> v4:
> > *) split the SMMU driver changes to be a separate patch
> > *) move the @addr_width and @pasid_bits from vendor specific
> >    part to generic part.
> > *) tweak the description for the @features field of struct
> >    iommu_nesting_info.
> > *) add description on the @data[] field of struct iommu_nesting_info
> >
> > v2 -> v3:
> > *) remvoe cap/ecap_mask in iommu_nesting_info.
> > *) reuse DOMAIN_ATTR_NESTING to get nesting info.
> > *) return an empty iommu_nesting_info for SMMU drivers per Jean'
> >    suggestion.
> > ---
> >  include/uapi/linux/iommu.h | 78 ++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 78 insertions(+)
> >
> > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> > index 1afc661..1bfc032 100644
> > --- a/include/uapi/linux/iommu.h
> > +++ b/include/uapi/linux/iommu.h
> > @@ -332,4 +332,82 @@ struct iommu_gpasid_bind_data {
> >  	} vendor;
> >  };
> >
> > +/*
> > + * struct iommu_nesting_info - Information for nesting-capable IOMMU.
> > + *				user space should check it before using
> > + *				nesting capability.
> alignment?

oh, yes, will do it.

> > + *
> > + * @size:	size of the whole structure
> > + * @format:	PASID table entry format, the same definition with
> > + *		@format of struct iommu_gpasid_bind_data.
> the same definition as struct iommu_gpasid_bind_data @format?

right. yours is much better.

> > + * @features:	supported nesting features.
> > + * @flags:	currently reserved for future extension.
> > + * @addr_width:	The output addr width of first level/stage translation
> > + * @pasid_bits:	Maximum supported PASID bits, 0 represents no PASID
> > + *		support.
> > + * @data:	vendor specific cap info. data[] structure type can be deduced
> > + *		from @format field.
> > + *
> > + * +===============+======================================================+
> > + * | feature       |  Notes                                               |
> > + * +===============+======================================================+
> > + * | SYSWIDE_PASID |  PASIDs are managed in system-wide, instead of per   |
> > + * |               |  device. When a device is assigned to userspace or   |
> > + * |               |  VM, proper uAPI (userspace driver framework uAPI,   |
> > + * |               |  e.g. VFIO) must be used to allocate/free PASIDs for |
> > + * |               |  the assigned device.                                |
> > + * +---------------+------------------------------------------------------+
> > + * | BIND_PGTBL    |  The owner of the first level/stage page table must  |
> > + * |               |  explicitly bind the page table to associated PASID  |
> > + * |               |  (either the one specified in bind request or the    |
> > + * |               |  default PASID of iommu domain), through userspace   |
> > + * |               |  driver framework uAPI (e.g. VFIO_IOMMU_NESTING_OP). |
> > + * +---------------+------------------------------------------------------+
> > + * | CACHE_INVLD   |  The owner of the first level/stage page table must  |
> > + * |               |  explicitly invalidate the IOMMU cache through uAPI  |
> > + * |               |  provided by userspace driver framework (e.g. VFIO)  |
> > + * |               |  according to vendor-specific requirement when       |
> > + * |               |  changing the page table.                            |
> > + * +---------------+------------------------------------------------------+
> Do you foresee cases where BIND_PGTBL and CACHE_INVLD shouldn't be
> exposed as features?

Sorry, I didn't quite get it. Could you explain a little bit more? :-)

> > + *
> > + * @data[] types defined for @format:
> > + * +================================+=====================================+
> > + * | @format                        | @data[]                             |
> > + * +================================+=====================================+
> > + * | IOMMU_PASID_FORMAT_INTEL_VTD   | struct iommu_nesting_info_vtd       |
> > + * +--------------------------------+-------------------------------------+
> > + *
> > + */
> > +struct iommu_nesting_info {
> > +	__u32	size;
> > +	__u32	format;
> > +	__u32	features;
> > +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID	(1 << 0)
> > +#define IOMMU_NESTING_FEAT_BIND_PGTBL		(1 << 1)
> > +#define IOMMU_NESTING_FEAT_CACHE_INVLD		(1 << 2)
> In other structs the values seem to be defined before the field

Not sure. :-) I mimicked the struct below from uapi/vfio.h:

struct vfio_iommu_type1_dma_map {
        __u32   argsz;
        __u32   flags;
#define VFIO_DMA_MAP_FLAG_READ (1 << 0)         /* readable from device */
#define VFIO_DMA_MAP_FLAG_WRITE (1 << 1)        /* writable from device */
        __u64   vaddr;                          /* Process virtual address */
        __u64   iova;                           /* IO virtual address */
        __u64   size;                           /* Size of mapping (bytes) */
};

> > +	__u32	flags;
> > +	__u16	addr_width;
> > +	__u16	pasid_bits;
> > +	__u32	padding;
> > +	__u8	data[];
> > +};
> > +
> > +/*
> > + * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info
> > + *
> spurious line

yes, will remove this line.

Regards,
Yi Liu

> > + *
> > + * @flags:	VT-d specific flags. Currently reserved for future
> > + *		extension.
> > + * @cap_reg:	Describe basic capabilities as defined in VT-d capability
> > + *		register.
> > + * @ecap_reg:	Describe the extended capabilities as defined in VT-d
> > + *		extended capability register.
> > + */
> > +struct iommu_nesting_info_vtd {
> > +	__u32	flags;
> > +	__u32	padding;
> > +	__u64	cap_reg;
> > +	__u64	ecap_reg;
> > +};
> > +
> >  #endif /* _UAPI_IOMMU_H */
> >
> Thanks
> 
> Eric
Eric Auger July 6, 2020, 1 p.m. UTC | #3
On 7/6/20 2:20 PM, Liu, Yi L wrote:
> Hi Eric,
> 
>> From: Auger Eric <eric.auger@redhat.com>
>> Sent: Monday, July 6, 2020 5:34 PM
>>
>> On 7/4/20 1:26 PM, Liu Yi L wrote:
>>> IOMMUs that support nesting translation needs report the capability info
>> need to report
>>> to userspace, e.g. the format of first level/stage paging structures.
>>>
>>> This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can get
>>> nesting info after setting DOMAIN_ATTR_NESTING.
>>>
>>> Cc: Kevin Tian <kevin.tian@intel.com>
>>> CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
>>> Cc: Alex Williamson <alex.williamson@redhat.com>
>>> Cc: Eric Auger <eric.auger@redhat.com>
>>> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>> Cc: Joerg Roedel <joro@8bytes.org>
>>> Cc: Lu Baolu <baolu.lu@linux.intel.com>
>>> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
>>> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
>>> ---
>>> v3 -> v4:
>>> *) split the SMMU driver changes to be a separate patch
>>> *) move the @addr_width and @pasid_bits from vendor specific
>>>    part to generic part.
>>> *) tweak the description for the @features field of struct
>>>    iommu_nesting_info.
>>> *) add description on the @data[] field of struct iommu_nesting_info
>>>
>>> v2 -> v3:
>>> *) remvoe cap/ecap_mask in iommu_nesting_info.
>>> *) reuse DOMAIN_ATTR_NESTING to get nesting info.
>>> *) return an empty iommu_nesting_info for SMMU drivers per Jean'
>>>    suggestion.
>>> ---
>>>  include/uapi/linux/iommu.h | 78 ++++++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 78 insertions(+)
>>>
>>> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
>>> index 1afc661..1bfc032 100644
>>> --- a/include/uapi/linux/iommu.h
>>> +++ b/include/uapi/linux/iommu.h
>>> @@ -332,4 +332,82 @@ struct iommu_gpasid_bind_data {
>>>  	} vendor;
>>>  };
>>>
>>> +/*
>>> + * struct iommu_nesting_info - Information for nesting-capable IOMMU.
>>> + *				user space should check it before using
>>> + *				nesting capability.
>> alignment?
> 
> oh, yes, will do it.
> 
>>> + *
>>> + * @size:	size of the whole structure
>>> + * @format:	PASID table entry format, the same definition with
>>> + *		@format of struct iommu_gpasid_bind_data.
>> the same definition as struct iommu_gpasid_bind_data @format?
> 
> right. yours is much better.
> 
>>> + * @features:	supported nesting features.
>>> + * @flags:	currently reserved for future extension.
>>> + * @addr_width:	The output addr width of first level/stage translation
>>> + * @pasid_bits:	Maximum supported PASID bits, 0 represents no PASID
>>> + *		support.
>>> + * @data:	vendor specific cap info. data[] structure type can be deduced
>>> + *		from @format field.
>>> + *
>>> + * +===============+======================================================+
>>> + * | feature       |  Notes                                               |
>>> + * +===============+======================================================+
>>> + * | SYSWIDE_PASID |  PASIDs are managed in system-wide, instead of per   |
>>> + * |               |  device. When a device is assigned to userspace or   |
>>> + * |               |  VM, proper uAPI (userspace driver framework uAPI,   |
>>> + * |               |  e.g. VFIO) must be used to allocate/free PASIDs for |
>>> + * |               |  the assigned device.                                |
>>> + * +---------------+------------------------------------------------------+
>>> + * | BIND_PGTBL    |  The owner of the first level/stage page table must  |
>>> + * |               |  explicitly bind the page table to associated PASID  |
>>> + * |               |  (either the one specified in bind request or the    |
>>> + * |               |  default PASID of iommu domain), through userspace   |
>>> + * |               |  driver framework uAPI (e.g. VFIO_IOMMU_NESTING_OP). |
>>> + * +---------------+------------------------------------------------------+
>>> + * | CACHE_INVLD   |  The owner of the first level/stage page table must  |
>>> + * |               |  explicitly invalidate the IOMMU cache through uAPI  |
>>> + * |               |  provided by userspace driver framework (e.g. VFIO)  |
>>> + * |               |  according to vendor-specific requirement when       |
>>> + * |               |  changing the page table.                            |
>>> + * +---------------+------------------------------------------------------+
>> Do you foresee cases where BIND_PGTBL and CACHE_INVLD shouldn't be
>> exposed as features?
> 
> sorry, I didn't quite get it. could you explain a little bit more. :-)
For SYSWIDE_PASID I understand SMMU won't advertise it. But do you
foresee any nested implementation that does not require the owner of the
tables to bind them and invalidate caches? In other words, would those 2
features always be supported?
> 
>>> + *
>>> + * @data[] types defined for @format:
>>> + * +================================+=====================================+
>>> + * | @format                        | @data[]                             |
>>> + * +================================+=====================================+
>>> + * | IOMMU_PASID_FORMAT_INTEL_VTD   | struct iommu_nesting_info_vtd       |
>>> + * +--------------------------------+-------------------------------------+
>>> + *
>>> + */
>>> +struct iommu_nesting_info {
>>> +	__u32	size;
>>> +	__u32	format;
>>> +	__u32	features;
>>> +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID	(1 << 0)
>>> +#define IOMMU_NESTING_FEAT_BIND_PGTBL		(1 << 1)
>>> +#define IOMMU_NESTING_FEAT_CACHE_INVLD		(1 << 2)
>> In other structs the values seem to be defined before the field
> 
> not sure. :-) I mimics the below struct from uapi/vfio.h
Yep, I noticed that afterwards. In the IOMMU uapi it looks the opposite
though. So I would align to the style in the same file, but that's not a
big deal.
> 
> struct vfio_iommu_type1_dma_map {
>         __u32   argsz;
>         __u32   flags;
> #define VFIO_DMA_MAP_FLAG_READ (1 << 0)         /* readable from device */
> #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1)        /* writable from device */
>         __u64   vaddr;                          /* Process virtual address */
>         __u64   iova;                           /* IO virtual address */
>         __u64   size;                           /* Size of mapping (bytes) */
> };
> 
>>> +	__u32	flags;
>>> +	__u16	addr_width;
>>> +	__u16	pasid_bits;
>>> +	__u32	padding;
>>> +	__u8	data[];
>>> +};
>>> +
>>> +/*
>>> + * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info
>>> + *
>> spurious line
> 
> yes, will remove this line.
> 
> Regards,
> Yi Liu
> 
>>> + *
>>> + * @flags:	VT-d specific flags. Currently reserved for future
>>> + *		extension.
>>> + * @cap_reg:	Describe basic capabilities as defined in VT-d capability
>>> + *		register.
>>> + * @ecap_reg:	Describe the extended capabilities as defined in VT-d
>>> + *		extended capability register.
>>> + */
>>> +struct iommu_nesting_info_vtd {
>>> +	__u32	flags;
>>> +	__u32	padding;
>>> +	__u64	cap_reg;
>>> +	__u64	ecap_reg;
>>> +};
>>> +
>>>  #endif /* _UAPI_IOMMU_H */
>>>
>> Thanks
>>
>> Eric
> 

Thanks

Eric
Yi Liu July 6, 2020, 1:17 p.m. UTC | #4
> From: Auger Eric <eric.auger@redhat.com>
> Sent: Monday, July 6, 2020 9:01 PM
>
> On 7/6/20 2:20 PM, Liu, Yi L wrote:
> > Hi Eric,
> >
> >> From: Auger Eric <eric.auger@redhat.com>
> >> Sent: Monday, July 6, 2020 5:34 PM
> >>
> >> On 7/4/20 1:26 PM, Liu Yi L wrote:
> >>> IOMMUs that support nesting translation needs report the capability
> >>> info
> >> need to report
> >>> to userspace, e.g. the format of first level/stage paging structures.
> >>>
> >>> This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can
> >>> get nesting info after setting DOMAIN_ATTR_NESTING.
> >>>
> >>> Cc: Kevin Tian <kevin.tian@intel.com>
> >>> CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
> >>> Cc: Alex Williamson <alex.williamson@redhat.com>
> >>> Cc: Eric Auger <eric.auger@redhat.com>
> >>> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >>> Cc: Joerg Roedel <joro@8bytes.org>
> >>> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> >>> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> >>> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> >>> ---
> >>> v3 -> v4:
> >>> *) split the SMMU driver changes to be a separate patch
> >>> *) move the @addr_width and @pasid_bits from vendor specific
> >>>    part to generic part.
> >>> *) tweak the description for the @features field of struct
> >>>    iommu_nesting_info.
> >>> *) add description on the @data[] field of struct iommu_nesting_info
> >>>
> >>> v2 -> v3:
> >>> *) remvoe cap/ecap_mask in iommu_nesting_info.
> >>> *) reuse DOMAIN_ATTR_NESTING to get nesting info.
> >>> *) return an empty iommu_nesting_info for SMMU drivers per Jean'
> >>>    suggestion.
> >>> ---
> >>>  include/uapi/linux/iommu.h | 78 ++++++++++++++++++++++++++++++++++++++++++++++
> >>>  1 file changed, 78 insertions(+)
> >>>
> >>> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> >>> index 1afc661..1bfc032 100644
> >>> --- a/include/uapi/linux/iommu.h
> >>> +++ b/include/uapi/linux/iommu.h
> >>> @@ -332,4 +332,82 @@ struct iommu_gpasid_bind_data {
> >>>  	} vendor;
> >>>  };
> >>>
> >>> +/*
> >>> + * struct iommu_nesting_info - Information for nesting-capable IOMMU.
> >>> + *				user space should check it before using
> >>> + *				nesting capability.
> >> alignment?
> >
> > oh, yes, will do it.
> >
> >>> + *
> >>> + * @size:	size of the whole structure
> >>> + * @format:	PASID table entry format, the same definition with
> >>> + *		@format of struct iommu_gpasid_bind_data.
> >> the same definition as struct iommu_gpasid_bind_data @format?
> >
> > right. yours is much better.
> >
> >>> + * @features:	supported nesting features.
> >>> + * @flags:	currently reserved for future extension.
> >>> + * @addr_width:	The output addr width of first level/stage translation
> >>> + * @pasid_bits:	Maximum supported PASID bits, 0 represents no PASID
> >>> + *		support.
> >>> + * @data:	vendor specific cap info. data[] structure type can be deduced
> >>> + *		from @format field.
> >>> + *
> >>> + * +===============+======================================================+
> >>> + * | feature       |  Notes                                               |
> >>> + * +===============+======================================================+
> >>> + * | SYSWIDE_PASID |  PASIDs are managed in system-wide, instead of per   |
> >>> + * |               |  device. When a device is assigned to userspace or   |
> >>> + * |               |  VM, proper uAPI (userspace driver framework uAPI,   |
> >>> + * |               |  e.g. VFIO) must be used to allocate/free PASIDs for |
> >>> + * |               |  the assigned device.                                |
> >>> + * +---------------+------------------------------------------------------+
> >>> + * | BIND_PGTBL    |  The owner of the first level/stage page table must  |
> >>> + * |               |  explicitly bind the page table to associated PASID  |
> >>> + * |               |  (either the one specified in bind request or the    |
> >>> + * |               |  default PASID of iommu domain), through userspace   |
> >>> + * |               |  driver framework uAPI (e.g. VFIO_IOMMU_NESTING_OP). |
> >>> + * +---------------+------------------------------------------------------+
> >>> + * | CACHE_INVLD   |  The owner of the first level/stage page table must  |
> >>> + * |               |  explicitly invalidate the IOMMU cache through uAPI  |
> >>> + * |               |  provided by userspace driver framework (e.g. VFIO)  |
> >>> + * |               |  according to vendor-specific requirement when       |
> >>> + * |               |  changing the page table.                            |
> >>> + * +---------------+------------------------------------------------------+
> >> Do you foresee cases where BIND_PGTBL and CACHE_INVLD shouldn't be
> >> exposed as features?
> >
> > sorry, I didn't quite get it. could you explain a little bit more. :-)
> For SYSWIDE_PASID I understand SMMU won't advertise it. But do you foresee any
> nested implementation not requesting the owner of the tables to bind and invalidate
> caches. So I understand those 2 features would always be supported?

I think BIND_PGTBL is optional, as ARM will bind the guest PASID table to the
host. As for CACHE_INVLD, per the current nesting implementations, it looks
like every stage-1/level-1 owner should issue a cache invalidation when
stage-1/level-1 changes. But I still added it here per the comments from
Kevin. :-)

"So far this assumption is correct but it may not be true when thinking forward.
For example, a vendor might find a way to allow the owner of 1st-level page
table to directly invalidate cache w/o going through host IOMMU driver. From
this angle I feel explicitly reporting this capability is more robust."

https://lore.kernel.org/linux-iommu/MWHPR11MB1645B09EBDC76514ADC897A68C6F0@MWHPR11MB1645.namprd11.prod.outlook.com/

Regards,
Yi Liu
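
To make the point of the discussion above concrete, here is a minimal sketch of how a consumer could turn the @features bits into the set of operations it has to perform explicitly. The bit definitions are copied from the patch; `struct nesting_duties` and `nesting_duties_from_features()` are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bits as defined in the patch. */
#define IOMMU_NESTING_FEAT_SYSWIDE_PASID	(1 << 0)
#define IOMMU_NESTING_FEAT_BIND_PGTBL		(1 << 1)
#define IOMMU_NESTING_FEAT_CACHE_INVLD		(1 << 2)

/* Hypothetical summary of what the page-table owner must do via uAPI. */
struct nesting_duties {
	bool alloc_pasid_via_uapi;	/* PASIDs are managed system-wide */
	bool bind_pgtbl;		/* explicit page table bind required */
	bool invalidate_cache;		/* explicit cache invalidation required */
};

static struct nesting_duties nesting_duties_from_features(uint32_t features)
{
	struct nesting_duties d = {
		.alloc_pasid_via_uapi =
			(features & IOMMU_NESTING_FEAT_SYSWIDE_PASID) != 0,
		.bind_pgtbl =
			(features & IOMMU_NESTING_FEAT_BIND_PGTBL) != 0,
		.invalidate_cache =
			(features & IOMMU_NESTING_FEAT_CACHE_INVLD) != 0,
	};

	return d;
}
```

Under this reading, a report that sets only CACHE_INVLD (a hypothetical case in which the page table bind is optional) yields a caller that invalidates caches explicitly but skips the bind and PASID allocation steps.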

> >
> >>> + *
> >>> + * @data[] types defined for @format:
> >>> + * +================================+=====================================+
> >>> + * | @format                        | @data[]                             |
> >>> + * +================================+=====================================+
> >>> + * | IOMMU_PASID_FORMAT_INTEL_VTD   | struct iommu_nesting_info_vtd       |
> >>> + * +--------------------------------+-------------------------------------+
> >>> + *
> >>> + */
> >>> +struct iommu_nesting_info {
> >>> +	__u32	size;
> >>> +	__u32	format;
> >>> +	__u32	features;
> >>> +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID	(1 << 0)
> >>> +#define IOMMU_NESTING_FEAT_BIND_PGTBL		(1 << 1)
> >>> +#define IOMMU_NESTING_FEAT_CACHE_INVLD		(1 << 2)
> >> In other structs the values seem to be defined before the field
> >
> > not sure. :-) I mimics the below struct from uapi/vfio.h
> Yep I noticed that afterwards. In IOMMU uapi it looks the opposite though. So I
> would alignto the style in the same file but that's not a big deal.

I see. I may align with the other iommu uapi. :-)

Regards,
Yi Liu

> >
> > struct vfio_iommu_type1_dma_map {
> >         __u32   argsz;
> >         __u32   flags;
> > #define VFIO_DMA_MAP_FLAG_READ (1 << 0)         /* readable from device */
> > #define VFIO_DMA_MAP_FLAG_WRITE (1 << 1)        /* writable from device */
> >         __u64   vaddr;                          /* Process virtual address */
> >         __u64   iova;                           /* IO virtual address */
> >         __u64   size;                           /* Size of mapping (bytes) */
> > };
> >
> >>> +	__u32	flags;
> >>> +	__u16	addr_width;
> >>> +	__u16	pasid_bits;
> >>> +	__u32	padding;
> >>> +	__u8	data[];
> >>> +};
> >>> +
> >>> +/*
> >>> + * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info
> >>> + *
> >> spurious line
> >
> > yes, will remove this line.
> >
> > Regards,
> > Yi Liu
> >
> >>> + *
> >>> + * @flags:	VT-d specific flags. Currently reserved for future
> >>> + *		extension.
> >>> + * @cap_reg:	Describe basic capabilities as defined in VT-d capability
> >>> + *		register.
> >>> + * @ecap_reg:	Describe the extended capabilities as defined in VT-d
> >>> + *		extended capability register.
> >>> + */
> >>> +struct iommu_nesting_info_vtd {
> >>> +	__u32	flags;
> >>> +	__u32	padding;
> >>> +	__u64	cap_reg;
> >>> +	__u64	ecap_reg;
> >>> +};
> >>> +
> >>>  #endif /* _UAPI_IOMMU_H */
> >>>
> >> Thanks
> >>
> >> Eric
> >
> 
> Thanks
> 
> Eric
Patch

diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index 1afc661..1bfc032 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -332,4 +332,82 @@  struct iommu_gpasid_bind_data {
 	} vendor;
 };
 
+/*
+ * struct iommu_nesting_info - Information for nesting-capable IOMMU.
+ *				user space should check it before using
+ *				nesting capability.
+ *
+ * @size:	size of the whole structure
+ * @format:	PASID table entry format, the same definition with
+ *		@format of struct iommu_gpasid_bind_data.
+ * @features:	supported nesting features.
+ * @flags:	currently reserved for future extension.
+ * @addr_width:	The output addr width of first level/stage translation
+ * @pasid_bits:	Maximum supported PASID bits, 0 represents no PASID
+ *		support.
+ * @data:	vendor specific cap info. data[] structure type can be deduced
+ *		from @format field.
+ *
+ * +===============+======================================================+
+ * | feature       |  Notes                                               |
+ * +===============+======================================================+
+ * | SYSWIDE_PASID |  PASIDs are managed in system-wide, instead of per   |
+ * |               |  device. When a device is assigned to userspace or   |
+ * |               |  VM, proper uAPI (userspace driver framework uAPI,   |
+ * |               |  e.g. VFIO) must be used to allocate/free PASIDs for |
+ * |               |  the assigned device.                                |
+ * +---------------+------------------------------------------------------+
+ * | BIND_PGTBL    |  The owner of the first level/stage page table must  |
+ * |               |  explicitly bind the page table to associated PASID  |
+ * |               |  (either the one specified in bind request or the    |
+ * |               |  default PASID of iommu domain), through userspace   |
+ * |               |  driver framework uAPI (e.g. VFIO_IOMMU_NESTING_OP). |
+ * +---------------+------------------------------------------------------+
+ * | CACHE_INVLD   |  The owner of the first level/stage page table must  |
+ * |               |  explicitly invalidate the IOMMU cache through uAPI  |
+ * |               |  provided by userspace driver framework (e.g. VFIO)  |
+ * |               |  according to vendor-specific requirement when       |
+ * |               |  changing the page table.                            |
+ * +---------------+------------------------------------------------------+
+ *
+ * @data[] types defined for @format:
+ * +================================+=====================================+
+ * | @format                        | @data[]                             |
+ * +================================+=====================================+
+ * | IOMMU_PASID_FORMAT_INTEL_VTD   | struct iommu_nesting_info_vtd       |
+ * +--------------------------------+-------------------------------------+
+ *
+ */
+struct iommu_nesting_info {
+	__u32	size;
+	__u32	format;
+	__u32	features;
+#define IOMMU_NESTING_FEAT_SYSWIDE_PASID	(1 << 0)
+#define IOMMU_NESTING_FEAT_BIND_PGTBL		(1 << 1)
+#define IOMMU_NESTING_FEAT_CACHE_INVLD		(1 << 2)
+	__u32	flags;
+	__u16	addr_width;
+	__u16	pasid_bits;
+	__u32	padding;
+	__u8	data[];
+};
+
+/*
+ * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info
+ *
+ *
+ * @flags:	VT-d specific flags. Currently reserved for future
+ *		extension.
+ * @cap_reg:	Describe basic capabilities as defined in VT-d capability
+ *		register.
+ * @ecap_reg:	Describe the extended capabilities as defined in VT-d
+ *		extended capability register.
+ */
+struct iommu_nesting_info_vtd {
+	__u32	flags;
+	__u32	padding;
+	__u64	cap_reg;
+	__u64	ecap_reg;
+};
+
 #endif /* _UAPI_IOMMU_H */
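
The explicit padding fields in the two structures above can be checked with compile-time assertions. The following is a sketch only, using local userspace mirrors of the structs and assuming that __u8/__u16/__u32/__u64 correspond to the fixed-width stdint types; `nesting_info_vtd_total_size()` is a hypothetical helper, not something the patch defines.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof, size_t */
#include <stdint.h>

/* Local mirror of struct iommu_nesting_info from the patch. */
struct nesting_info {
	uint32_t size;
	uint32_t format;
	uint32_t features;
	uint32_t flags;
	uint16_t addr_width;
	uint16_t pasid_bits;
	uint32_t padding;
	uint8_t  data[];
};

/* Local mirror of struct iommu_nesting_info_vtd from the patch. */
struct nesting_info_vtd {
	uint32_t flags;
	uint32_t padding;
	uint64_t cap_reg;
	uint64_t ecap_reg;
};

/* The generic header is 24 bytes, so data[] starts 8-byte aligned. */
static_assert(offsetof(struct nesting_info, data) == 24,
	      "unexpected header size");
static_assert(offsetof(struct nesting_info, data) % 8 == 0,
	      "data[] is not 8-byte aligned");

/* The explicit padding keeps the 64-bit registers at fixed offsets. */
static_assert(offsetof(struct nesting_info_vtd, cap_reg) == 8,
	      "unexpected cap_reg offset");
static_assert(offsetof(struct nesting_info_vtd, ecap_reg) == 16,
	      "unexpected ecap_reg offset");
static_assert(sizeof(struct nesting_info_vtd) == 24,
	      "unexpected VT-d payload size");

/* Bytes a VT-d report would occupy: generic header plus vendor payload. */
static size_t nesting_info_vtd_total_size(void)
{
	return sizeof(struct nesting_info) + sizeof(struct nesting_info_vtd);
}
```

If the layout holds, a VT-d report occupies 48 bytes, which is the value a producer would be expected to store in @size for that format.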