Message ID | 1594552870-55687-3-git-send-email-yi.l.liu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | vfio: expose virtual Shared Virtual Addressing to VMs |
Hi Yi,

On 7/12/20 1:20 PM, Liu Yi L wrote:
> IOMMUs that support nesting translation needs report the capability info
s/needs/need to report
> to userspace, e.g. the format of first level/stage paging structures.
It gives information about requirements the userspace needs to implement
plus other features characterizing the physical implementation.
>
> This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can get
> nesting info after setting DOMAIN_ATTR_NESTING.
I guess you meant after selecting VFIO_TYPE1_NESTING_IOMMU?
>
> Cc: Kevin Tian <kevin.tian@intel.com>
> CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Eric Auger <eric.auger@redhat.com>
> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: Lu Baolu <baolu.lu@linux.intel.com>
> Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
> v4 -> v5:
> *) address comments from Eric Auger.
>
> v3 -> v4:
> *) split the SMMU driver changes to be a separate patch
> *) move the @addr_width and @pasid_bits from vendor specific
>    part to generic part.
> *) tweak the description for the @features field of struct
>    iommu_nesting_info.
> *) add description on the @data[] field of struct iommu_nesting_info
>
> v2 -> v3:
> *) remvoe cap/ecap_mask in iommu_nesting_info.
> *) reuse DOMAIN_ATTR_NESTING to get nesting info.
> *) return an empty iommu_nesting_info for SMMU drivers per Jean'
>    suggestion.
> ---
>  include/uapi/linux/iommu.h | 77 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 77 insertions(+)
>
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> index 1afc661..d2a47c4 100644
> --- a/include/uapi/linux/iommu.h
> +++ b/include/uapi/linux/iommu.h
> @@ -332,4 +332,81 @@ struct iommu_gpasid_bind_data {
>  	} vendor;
>  };
>
> +/*
> + * struct iommu_nesting_info - Information for nesting-capable IOMMU.
> + *			       user space should check it before using
> + *			       nesting capability.
> + *
> + * @size:	size of the whole structure
> + * @format:	PASID table entry format, the same definition as struct
> + *		iommu_gpasid_bind_data @format.
> + * @features:	supported nesting features.
> + * @flags:	currently reserved for future extension.
> + * @addr_width:	The output addr width of first level/stage translation
> + * @pasid_bits:	Maximum supported PASID bits, 0 represents no PASID
> + *		support.
> + * @data:	vendor specific cap info. data[] structure type can be deduced
> + *		from @format field.
> + *
> + * +===============+======================================================+
> + * | feature       | Notes                                                |
> + * +===============+======================================================+
> + * | SYSWIDE_PASID | PASIDs are managed in system-wide, instead of per    |
s/in system-wide/system-wide ?
> + * |               | device. When a device is assigned to userspace or    |
> + * |               | VM, proper uAPI (userspace driver framework uAPI,    |
> + * |               | e.g. VFIO) must be used to allocate/free PASIDs for  |
> + * |               | the assigned device.
Isn't it possible to be more explicit, something like:
                                 |
System-wide PASID management is mandated by the physical IOMMU. All
PASIDs allocation must be mediated through the TBD API.
> + * +---------------+------------------------------------------------------+
> + * | BIND_PGTBL    | The owner of the first level/stage page table must   |
> + * |               | explicitly bind the page table to associated PASID   |
> + * |               | (either the one specified in bind request or the     |
> + * |               | default PASID of iommu domain), through userspace    |
> + * |               | driver framework uAPI (e.g. VFIO_IOMMU_NESTING_OP).  |
As per your answer in https://lkml.org/lkml/2020/7/6/383, I now
understand ARM would not expose that BIND_PGTBL nesting feature, I still
think the above wording is a bit confusing. Maybe you may explicitly
talk about the PASID *entry* that needs to be passed from guest to host.
On ARM we directly pass the PASID table but when reading the above
description I fail to determine if this does not fit that description.
> + * +---------------+------------------------------------------------------+
> + * | CACHE_INVLD   | The owner of the first level/stage page table must   |
> + * |               | explicitly invalidate the IOMMU cache through uAPI   |
> + * |               | provided by userspace driver framework (e.g. VFIO)   |
> + * |               | according to vendor-specific requirement when        |
> + * |               | changing the page table.                             |
> + * +---------------+------------------------------------------------------+
instead of using the "uAPI provided by userspace driver framework (e.g.
VFIO)", can't we use the so-called IOMMU UAPI terminology which now has
a userspace documentation?
> + *
> + * @data[] types defined for @format:
> + * +================================+=====================================+
> + * | @format                        | @data[]                             |
> + * +================================+=====================================+
> + * | IOMMU_PASID_FORMAT_INTEL_VTD   | struct iommu_nesting_info_vtd       |
> + * +--------------------------------+-------------------------------------+
> + *
> + */
> +struct iommu_nesting_info {
> +	__u32	size;
shouldn't it be @argsz to fit the iommu uapi convention and take benefit
to put the flags field just below?
> +	__u32	format;
> +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID	(1 << 0)
> +#define IOMMU_NESTING_FEAT_BIND_PGTBL		(1 << 1)
> +#define IOMMU_NESTING_FEAT_CACHE_INVLD		(1 << 2)
> +	__u32	features;
> +	__u32	flags;
> +	__u16	addr_width;
> +	__u16	pasid_bits;
> +	__u32	padding;
> +	__u8	data[];
> +};
> +
> +/*
> + * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info
> + *
> + * @flags:	VT-d specific flags. Currently reserved for future
> + *		extension.
must be set to 0?
> + * @cap_reg:	Describe basic capabilities as defined in VT-d capability
> + *		register.
> + * @ecap_reg:	Describe the extended capabilities as defined in VT-d
> + *		extended capability register.
> + */
> +struct iommu_nesting_info_vtd {
> +	__u32	flags;
> +	__u32	padding;
> +	__u64	cap_reg;
> +	__u64	ecap_reg;
> +};
> +
>  #endif /* _UAPI_IOMMU_H */

Thanks

Eric
>
Hi Eric,

> From: Auger Eric <eric.auger@redhat.com>
> Sent: Saturday, July 18, 2020 12:29 AM
>
> Hi Yi,
>
> On 7/12/20 1:20 PM, Liu Yi L wrote:
> > IOMMUs that support nesting translation needs report the capability info
> s/needs/need to report

yep.

> > to userspace, e.g. the format of first level/stage paging structures.
> It gives information about requirements the userspace needs to implement
> plus other features characterizing the physical implementation.

got it. will add it in next version.

> >
> > This patch reports nesting info by DOMAIN_ATTR_NESTING. Caller can get
> > nesting info after setting DOMAIN_ATTR_NESTING.
> I guess you meant after selecting VFIO_TYPE1_NESTING_IOMMU?

yes, it is. ok, perhaps, it's better to say get nesting info after selecting
VFIO_TYPE1_NESTING_IOMMU.

> >
> > Cc: Kevin Tian <kevin.tian@intel.com>
> > CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > Cc: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Eric Auger <eric.auger@redhat.com>
> > Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Cc: Joerg Roedel <joro@8bytes.org>
> > Cc: Lu Baolu <baolu.lu@linux.intel.com>
> > Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
> > Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> > ---
> > v4 -> v5:
> > *) address comments from Eric Auger.
> >
> > v3 -> v4:
> > *) split the SMMU driver changes to be a separate patch
> > *) move the @addr_width and @pasid_bits from vendor specific
> >    part to generic part.
> > *) tweak the description for the @features field of struct
> >    iommu_nesting_info.
> > *) add description on the @data[] field of struct iommu_nesting_info
> >
> > v2 -> v3:
> > *) remvoe cap/ecap_mask in iommu_nesting_info.
> > *) reuse DOMAIN_ATTR_NESTING to get nesting info.
> > *) return an empty iommu_nesting_info for SMMU drivers per Jean'
> >    suggestion.
> > ---
> >  include/uapi/linux/iommu.h | 77 ++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 77 insertions(+)
> >
> > diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> > index 1afc661..d2a47c4 100644
> > --- a/include/uapi/linux/iommu.h
> > +++ b/include/uapi/linux/iommu.h
> > @@ -332,4 +332,81 @@ struct iommu_gpasid_bind_data {
> >  	} vendor;
> >  };
> >
> > +/*
> > + * struct iommu_nesting_info - Information for nesting-capable IOMMU.
> > + *			       user space should check it before using
> > + *			       nesting capability.
> > + *
> > + * @size:	size of the whole structure
> > + * @format:	PASID table entry format, the same definition as struct
> > + *		iommu_gpasid_bind_data @format.
> > + * @features:	supported nesting features.
> > + * @flags:	currently reserved for future extension.
> > + * @addr_width:	The output addr width of first level/stage translation
> > + * @pasid_bits:	Maximum supported PASID bits, 0 represents no PASID
> > + *		support.
> > + * @data:	vendor specific cap info. data[] structure type can be deduced
> > + *		from @format field.
> > + *
> > + * +===============+======================================================+
> > + * | feature       | Notes                                                |
> > + * +===============+======================================================+
> > + * | SYSWIDE_PASID | PASIDs are managed in system-wide, instead of per    |
> s/in system-wide/system-wide ?

got it.

> > + * |               | device. When a device is assigned to userspace or    |
> > + * |               | VM, proper uAPI (userspace driver framework uAPI,    |
> > + * |               | e.g. VFIO) must be used to allocate/free PASIDs for  |
> > + * |               | the assigned device.
> Isn't it possible to be more explicit, something like:
>                                  |
> System-wide PASID management is mandated by the physical IOMMU. All
> PASIDs allocation must be mediated through the TBD API.

yep, I can add it.

> > + * +---------------+------------------------------------------------------+
> > + * | BIND_PGTBL    | The owner of the first level/stage page table must   |
> > + * |               | explicitly bind the page table to associated PASID   |
> > + * |               | (either the one specified in bind request or the     |
> > + * |               | default PASID of iommu domain), through userspace    |
> > + * |               | driver framework uAPI (e.g. VFIO_IOMMU_NESTING_OP).  |
> As per your answer in https://lkml.org/lkml/2020/7/6/383, I now
> understand ARM would not expose that BIND_PGTBL nesting feature,

yes, that's my point.

> I still
> think the above wording is a bit confusing. Maybe you may explicitly
> talk about the PASID *entry* that needs to be passed from guest to host.
> On ARM we directly pass the PASID table but when reading the above
> description I fail to determine if this does not fit that description.

yes, I can do it.

> > + * +---------------+------------------------------------------------------+
> > + * | CACHE_INVLD   | The owner of the first level/stage page table must   |
> > + * |               | explicitly invalidate the IOMMU cache through uAPI   |
> > + * |               | provided by userspace driver framework (e.g. VFIO)   |
> > + * |               | according to vendor-specific requirement when        |
> > + * |               | changing the page table.                             |
> > + * +---------------+------------------------------------------------------+
>
> instead of using the "uAPI provided by userspace driver framework (e.g.
> VFIO)", can't we use the so-called IOMMU UAPI terminology which now has
> a userspace documentation?

the problem is current IOMMU UAPI definitions is actually embedded in
other VFIO UAPI. if it can make the description more clear, I can follow
your suggestion. :-)

>
> > + *
> > + * @data[] types defined for @format:
> > + * +================================+=====================================+
> > + * | @format                        | @data[]                             |
> > + * +================================+=====================================+
> > + * | IOMMU_PASID_FORMAT_INTEL_VTD   | struct iommu_nesting_info_vtd       |
> > + * +--------------------------------+-------------------------------------+
> > + *
> > + */
> > +struct iommu_nesting_info {
> > +	__u32	size;
> shouldn't it be @argsz to fit the iommu uapi convention and take benefit
> to put the flags field just below?

make sense.

> > +	__u32	format;
> > +#define IOMMU_NESTING_FEAT_SYSWIDE_PASID	(1 << 0)
> > +#define IOMMU_NESTING_FEAT_BIND_PGTBL		(1 << 1)
> > +#define IOMMU_NESTING_FEAT_CACHE_INVLD		(1 << 2)
> > +	__u32	features;
> > +	__u32	flags;
> > +	__u16	addr_width;
> > +	__u16	pasid_bits;
> > +	__u32	padding;
> > +	__u8	data[];
> > +};
> > +
> > +/*
> > + * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info
> > + *
> > + * @flags:	VT-d specific flags. Currently reserved for future
> > + *		extension.
> must be set to 0?

yes. will add it.

Thanks,
Yi Liu

> > + * @cap_reg:	Describe basic capabilities as defined in VT-d capability
> > + *		register.
> > + * @ecap_reg:	Describe the extended capabilities as defined in VT-d
> > + *		extended capability register.
> > + */
> > +struct iommu_nesting_info_vtd {
> > +	__u32	flags;
> > +	__u32	padding;
> > +	__u64	cap_reg;
> > +	__u64	ecap_reg;
> > +};
> > +
> >  #endif /* _UAPI_IOMMU_H */
> Thanks
>
> Eric
> >
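
For reference, the two layout points agreed in this exchange (renaming @size to @argsz with @flags placed right below it, and documenting the reserved VT-d @flags as "must be set to 0") could look roughly like the sketch below. This is only an illustration under assumptions: the struct name `nesting_info_argsz`, the field order, and both helper functions are hypothetical and are not taken from any posted version of the series. The `usable_bytes()` helper reflects one common reading of the argsz convention, where a reader only interprets the part of the buffer that both sides know about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reordering suggested in the review: argsz first, flags
 * immediately below it. Names and order are assumptions for illustration.
 */
struct nesting_info_argsz {
	uint32_t argsz;       /* size of the whole structure, as filled by the producer */
	uint32_t flags;       /* reserved for future extension */
	uint32_t format;
	uint32_t features;
	uint16_t addr_width;
	uint16_t pasid_bits;
	uint32_t padding;
	uint8_t  data[];      /* vendor specific part, type deduced from format */
};

/* Local mirror of the VT-d specific part from the patch. */
struct nesting_info_vtd {
	uint32_t flags;       /* reserved: must be 0 */
	uint32_t padding;
	uint64_t cap_reg;
	uint64_t ecap_reg;
};

/* The reordering does not change the size of the generic header. */
static_assert(sizeof(struct nesting_info_argsz) == 24, "generic header is 24 bytes");

/* Returns 1 when the reserved VT-d fields are zero, 0 otherwise. */
static int vtd_reserved_fields_ok(const struct nesting_info_vtd *v)
{
	return v->flags == 0 && v->padding == 0;
}

/*
 * Number of bytes a reader may interpret: never more than the producer
 * declared in argsz, never more than the reader's own struct definition.
 */
static size_t usable_bytes(uint32_t argsz, size_t known_size)
{
	return argsz < known_size ? (size_t)argsz : known_size;
}
```

With this layout a consumer built against an older header would stop at its own `known_size`, while a newer consumer talking to an older producer would stop at `argsz`.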
diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
index 1afc661..d2a47c4 100644
--- a/include/uapi/linux/iommu.h
+++ b/include/uapi/linux/iommu.h
@@ -332,4 +332,81 @@ struct iommu_gpasid_bind_data {
 	} vendor;
 };
 
+/*
+ * struct iommu_nesting_info - Information for nesting-capable IOMMU.
+ *			       user space should check it before using
+ *			       nesting capability.
+ *
+ * @size:	size of the whole structure
+ * @format:	PASID table entry format, the same definition as struct
+ *		iommu_gpasid_bind_data @format.
+ * @features:	supported nesting features.
+ * @flags:	currently reserved for future extension.
+ * @addr_width:	The output addr width of first level/stage translation
+ * @pasid_bits:	Maximum supported PASID bits, 0 represents no PASID
+ *		support.
+ * @data:	vendor specific cap info. data[] structure type can be deduced
+ *		from @format field.
+ *
+ * +===============+======================================================+
+ * | feature       | Notes                                                |
+ * +===============+======================================================+
+ * | SYSWIDE_PASID | PASIDs are managed in system-wide, instead of per    |
+ * |               | device. When a device is assigned to userspace or    |
+ * |               | VM, proper uAPI (userspace driver framework uAPI,    |
+ * |               | e.g. VFIO) must be used to allocate/free PASIDs for  |
+ * |               | the assigned device.                                 |
+ * +---------------+------------------------------------------------------+
+ * | BIND_PGTBL    | The owner of the first level/stage page table must   |
+ * |               | explicitly bind the page table to associated PASID   |
+ * |               | (either the one specified in bind request or the     |
+ * |               | default PASID of iommu domain), through userspace    |
+ * |               | driver framework uAPI (e.g. VFIO_IOMMU_NESTING_OP).  |
+ * +---------------+------------------------------------------------------+
+ * | CACHE_INVLD   | The owner of the first level/stage page table must   |
+ * |               | explicitly invalidate the IOMMU cache through uAPI   |
+ * |               | provided by userspace driver framework (e.g. VFIO)   |
+ * |               | according to vendor-specific requirement when        |
+ * |               | changing the page table.                             |
+ * +---------------+------------------------------------------------------+
+ *
+ * @data[] types defined for @format:
+ * +================================+=====================================+
+ * | @format                        | @data[]                             |
+ * +================================+=====================================+
+ * | IOMMU_PASID_FORMAT_INTEL_VTD   | struct iommu_nesting_info_vtd       |
+ * +--------------------------------+-------------------------------------+
+ *
+ */
+struct iommu_nesting_info {
+	__u32	size;
+	__u32	format;
+#define IOMMU_NESTING_FEAT_SYSWIDE_PASID	(1 << 0)
+#define IOMMU_NESTING_FEAT_BIND_PGTBL		(1 << 1)
+#define IOMMU_NESTING_FEAT_CACHE_INVLD		(1 << 2)
+	__u32	features;
+	__u32	flags;
+	__u16	addr_width;
+	__u16	pasid_bits;
+	__u32	padding;
+	__u8	data[];
+};
+
+/*
+ * struct iommu_nesting_info_vtd - Intel VT-d specific nesting info
+ *
+ * @flags:	VT-d specific flags. Currently reserved for future
+ *		extension.
+ * @cap_reg:	Describe basic capabilities as defined in VT-d capability
+ *		register.
+ * @ecap_reg:	Describe the extended capabilities as defined in VT-d
+ *		extended capability register.
+ */
+struct iommu_nesting_info_vtd {
+	__u32	flags;
+	__u32	padding;
+	__u64	cap_reg;
+	__u64	ecap_reg;
+};
+
 #endif /* _UAPI_IOMMU_H */
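
To illustrate how user space might consume the structure added by this patch, here is a minimal sketch of a parser for a buffer that holds `struct iommu_nesting_info` followed by the VT-d specific data. It is self-contained and uses local mirror types with `<stdint.h>` names instead of the kernel header. Everything outside the patch is an assumption: the function names, the numeric value used for `PASID_FORMAT_INTEL_VTD`, and the idea that the buffer has already been obtained from the kernel by some means not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the structs proposed in the patch (v5 layout). */
struct nesting_info {
	uint32_t size;
	uint32_t format;
	uint32_t features;
	uint32_t flags;
	uint16_t addr_width;
	uint16_t pasid_bits;
	uint32_t padding;
	uint8_t  data[];
};

struct nesting_info_vtd {
	uint32_t flags;
	uint32_t padding;
	uint64_t cap_reg;
	uint64_t ecap_reg;
};

#define NESTING_FEAT_SYSWIDE_PASID	(1u << 0)
#define NESTING_FEAT_BIND_PGTBL		(1u << 1)
#define NESTING_FEAT_CACHE_INVLD	(1u << 2)

/* Assumed value, for illustration only; the real constant is in the uAPI header. */
#define PASID_FORMAT_INTEL_VTD		1u

static_assert(offsetof(struct nesting_info, data) == 24, "vendor data starts at byte 24");
static_assert(sizeof(struct nesting_info_vtd) == 24, "VT-d part is 24 bytes");

/*
 * Validate buf (buf_len bytes) and split it into the generic header and
 * the VT-d part. Returns 0 on success, -1 if the buffer is too short,
 * declares an inconsistent size, or uses an unknown format.
 */
static int parse_nesting_info(const void *buf, size_t buf_len,
			      struct nesting_info *hdr,
			      struct nesting_info_vtd *vtd)
{
	const size_t hdr_len = offsetof(struct nesting_info, data);

	if (buf_len < hdr_len)
		return -1;
	memcpy(hdr, buf, hdr_len);	/* memcpy avoids alignment assumptions on buf */

	/* @size covers the whole structure, vendor data included. */
	if (hdr->size < hdr_len || hdr->size > buf_len)
		return -1;
	if (hdr->format != PASID_FORMAT_INTEL_VTD)
		return -1;
	if (hdr->size - hdr_len < sizeof(*vtd))
		return -1;

	memcpy(vtd, (const uint8_t *)buf + hdr_len, sizeof(*vtd));
	return 0;
}

/* Feature helpers matching the three rows of the feature table. */
static int pasids_are_system_wide(const struct nesting_info *hdr)
{
	return (hdr->features & NESTING_FEAT_SYSWIDE_PASID) != 0;
}

static int must_bind_pgtbl(const struct nesting_info *hdr)
{
	return (hdr->features & NESTING_FEAT_BIND_PGTBL) != 0;
}

static int must_invalidate_cache(const struct nesting_info *hdr)
{
	return (hdr->features & NESTING_FEAT_CACHE_INVLD) != 0;
}
```

A caller would check the return value first and only then consult the feature helpers, since `features` is meaningless when the size or format check fails.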