Message ID | e40ee980-9151-101a-1484-b1710aaeafb0@suse.com (mailing list archive)
---|---
State | New, archived
Series | IOMMU: superpage support when not sharing pagetables
On Fri, Sep 24, 2021 at 11:53:59AM +0200, Jan Beulich wrote:
> Having a separate flush-all hook has always been puzzling me some. We
> will want to be able to force a full flush via accumulated flush flags
> from the map/unmap functions. Introduce a respective new flag and fold
> all flush handling to use the single remaining hook.
>
> Note that because of the respective comments in SMMU and IPMMU-VMSA
> code, I've folded the two prior hook functions into one. For SMMU-v3,
> which lacks a comment towards incapable hardware, I've left both
> functions in place on the assumption that selective and full flushes
> will eventually want separating.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Just one nit I think.

> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -731,18 +731,21 @@ static int __must_check iommu_flush_iotl
>                                                  unsigned long page_count,
>                                                  unsigned int flush_flags)
>  {
> -    ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> -    ASSERT(flush_flags);
> +    if ( flush_flags & IOMMU_FLUSHF_all )
> +    {
> +        dfn = INVALID_DFN;
> +        page_count = 0;

Don't we expect callers to already pass an invalid dfn and a 0 page
count when doing a full flush?

In the equivalent AMD code you didn't set those for the FLUSHF_all
case.

Thanks, Roger.
On 13.12.2021 16:04, Roger Pau Monné wrote:
> On Fri, Sep 24, 2021 at 11:53:59AM +0200, Jan Beulich wrote:
>> Having a separate flush-all hook has always been puzzling me some. We
>> will want to be able to force a full flush via accumulated flush flags
>> from the map/unmap functions. Introduce a respective new flag and fold
>> all flush handling to use the single remaining hook.
>>
>> Note that because of the respective comments in SMMU and IPMMU-VMSA
>> code, I've folded the two prior hook functions into one. For SMMU-v3,
>> which lacks a comment towards incapable hardware, I've left both
>> functions in place on the assumption that selective and full flushes
>> will eventually want separating.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Just one nit I think.
>
>> --- a/xen/drivers/passthrough/vtd/iommu.c
>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> @@ -731,18 +731,21 @@ static int __must_check iommu_flush_iotl
>>                                                  unsigned long page_count,
>>                                                  unsigned int flush_flags)
>>  {
>> -    ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
>> -    ASSERT(flush_flags);
>> +    if ( flush_flags & IOMMU_FLUSHF_all )
>> +    {
>> +        dfn = INVALID_DFN;
>> +        page_count = 0;
>
> Don't we expect callers to already pass an invalid dfn and a 0 page
> count when doing a full flush?

I didn't want to introduce such a requirement. The two arguments should
imo be don't-cares with IOMMU_FLUSHF_all, such that callers handing on
(or accumulating) flags don't need to apply extra care.

> In the equivalent AMD code you didn't set those for the FLUSHF_all
> case.

There's no similar dependency there in AMD code. For VT-d,
iommu_flush_iotlb() needs at least one of the two set this way to
actually do a full-address-space flush. (Which, as an aside, I've
recently learned is supposedly wrong when cap_isoch() returns true.
But that's an orthogonal issue, albeit it may be possible to deal with
at the same time as, down the road, limiting the too aggressive
flushing done by subsequent patches using this new flag.)

I could be talked into setting just page_count to zero here.

Jan
On Tue, Dec 14, 2021 at 10:06:37AM +0100, Jan Beulich wrote:
> On 13.12.2021 16:04, Roger Pau Monné wrote:
> > On Fri, Sep 24, 2021 at 11:53:59AM +0200, Jan Beulich wrote:
> >> Having a separate flush-all hook has always been puzzling me some. We
> >> will want to be able to force a full flush via accumulated flush flags
> >> from the map/unmap functions. Introduce a respective new flag and fold
> >> all flush handling to use the single remaining hook.
> >>
> >> Note that because of the respective comments in SMMU and IPMMU-VMSA
> >> code, I've folded the two prior hook functions into one. For SMMU-v3,
> >> which lacks a comment towards incapable hardware, I've left both
> >> functions in place on the assumption that selective and full flushes
> >> will eventually want separating.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> >> --- a/xen/drivers/passthrough/vtd/iommu.c
> >> +++ b/xen/drivers/passthrough/vtd/iommu.c
> >> @@ -731,18 +731,21 @@ static int __must_check iommu_flush_iotl
> >>                                                  unsigned long page_count,
> >>                                                  unsigned int flush_flags)
> >>  {
> >> -    ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> >> -    ASSERT(flush_flags);
> >> +    if ( flush_flags & IOMMU_FLUSHF_all )
> >> +    {
> >> +        dfn = INVALID_DFN;
> >> +        page_count = 0;
> >
> > Don't we expect callers to already pass an invalid dfn and a 0 page
> > count when doing a full flush?
>
> I didn't want to introduce such a requirement. The two arguments should
> imo be don't-cares with IOMMU_FLUSHF_all, such that callers handing on
> (or accumulating) flags don't need to apply extra care.
>
> > In the equivalent AMD code you didn't set those for the FLUSHF_all
> > case.
>
> There's no similar dependency there in AMD code. For VT-d,
> iommu_flush_iotlb() needs at least one of the two set this way to
> actually do a full-address-space flush. (Which, as an aside, I've
> recently learned is supposedly wrong when cap_isoch() returns true.
> But that's an orthogonal issue, albeit it may be possible to deal with
> at the same time as, down the road, limiting the too aggressive
> flushing done by subsequent patches using this new flag.)

I see. AMD flush helper gets the flags as a parameter (because the
flush all function is a wrapper around the flush pages one), so there's
no need to signal a full flush using the other parameters.

> I could be talked into setting just page_count to zero here.

No, I think it's fine.

Thanks, Roger.
On 24.09.21 12:53, Jan Beulich wrote:

Hi Jan

> Having a separate flush-all hook has always been puzzling me some. We
> will want to be able to force a full flush via accumulated flush flags
> from the map/unmap functions. Introduce a respective new flag and fold
> all flush handling to use the single remaining hook.
>
> Note that because of the respective comments in SMMU and IPMMU-VMSA
> code, I've folded the two prior hook functions into one.

Changes to IPMMU-VMSA lgtm, for SMMU-v2 I think the same.

> For SMMU-v3,
> which lacks a comment towards incapable hardware, I've left both
> functions in place on the assumption that selective and full flushes
> will eventually want separating.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> TBD: What we really are going to need is for the map/unmap functions to
>      specify that a wider region needs flushing than just the one
>      covered by the present set of (un)maps. This may still be less than
>      a full flush, but at least as a first step it seemed better to me
>      to keep things simple and go the flush-all route.
> ---
> v2: New.
On 15.12.2021 16:28, Oleksandr wrote:
> On 24.09.21 12:53, Jan Beulich wrote:
>> Having a separate flush-all hook has always been puzzling me some. We
>> will want to be able to force a full flush via accumulated flush flags
>> from the map/unmap functions. Introduce a respective new flag and fold
>> all flush handling to use the single remaining hook.
>>
>> Note that because of the respective comments in SMMU and IPMMU-VMSA
>> code, I've folded the two prior hook functions into one.
>
> Changes to IPMMU-VMSA lgtm, for SMMU-v2 I think the same.

Thanks; I wonder whether I may transform this into some kind of tag.

Jan
On 16.12.21 10:49, Jan Beulich wrote:

Hi Jan

> On 15.12.2021 16:28, Oleksandr wrote:
>> On 24.09.21 12:53, Jan Beulich wrote:
>>> Having a separate flush-all hook has always been puzzling me some. We
>>> will want to be able to force a full flush via accumulated flush flags
>>> from the map/unmap functions. Introduce a respective new flag and fold
>>> all flush handling to use the single remaining hook.
>>>
>>> Note that because of the respective comments in SMMU and IPMMU-VMSA
>>> code, I've folded the two prior hook functions into one.
>>
>> Changes to IPMMU-VMSA lgtm, for SMMU-v2 I think the same.
>
> Thanks; I wonder whether I may transform this into some kind of tag.

[IPMMU-VMSA and SMMU-V2 bits]
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

> Jan
Hi Jan

> On 24 Sep 2021, at 10:53 am, Jan Beulich <jbeulich@suse.com> wrote:
>
> Having a separate flush-all hook has always been puzzling me some. We
> will want to be able to force a full flush via accumulated flush flags
> from the map/unmap functions. Introduce a respective new flag and fold
> all flush handling to use the single remaining hook.
>
> Note that because of the respective comments in SMMU and IPMMU-VMSA
> code, I've folded the two prior hook functions into one. For SMMU-v3,
> which lacks a comment towards incapable hardware, I've left both
> functions in place on the assumption that selective and full flushes
> will eventually want separating.

For SMMUv3 related changes:
Reviewed-by: Rahul Singh <rahul.singh@arm.com>

Regards,
Rahul

> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> TBD: What we really are going to need is for the map/unmap functions to
>      specify that a wider region needs flushing than just the one
>      covered by the present set of (un)maps. This may still be less than
>      a full flush, but at least as a first step it seemed better to me
>      to keep things simple and go the flush-all route.
> ---
> v2: New.
On 24/09/2021 10:53, Jan Beulich wrote:
> Having a separate flush-all hook has always been puzzling me some. We
> will want to be able to force a full flush via accumulated flush flags
> from the map/unmap functions. Introduce a respective new flag and fold
> all flush handling to use the single remaining hook.
>
> Note that because of the respective comments in SMMU and IPMMU-VMSA
> code, I've folded the two prior hook functions into one. For SMMU-v3,
> which lacks a comment towards incapable hardware, I've left both
> functions in place on the assumption that selective and full flushes
> will eventually want separating.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

For the Arm part:

Acked-by: Julien Grall <jgrall@amazon.com>

Cheers,
On 16.12.2021 12:30, Rahul Singh wrote:
>> On 24 Sep 2021, at 10:53 am, Jan Beulich <jbeulich@suse.com> wrote:
>>
>> Having a separate flush-all hook has always been puzzling me some. We
>> will want to be able to force a full flush via accumulated flush flags
>> from the map/unmap functions. Introduce a respective new flag and fold
>> all flush handling to use the single remaining hook.
>>
>> Note that because of the respective comments in SMMU and IPMMU-VMSA
>> code, I've folded the two prior hook functions into one. For SMMU-v3,
>> which lacks a comment towards incapable hardware, I've left both
>> functions in place on the assumption that selective and full flushes
>> will eventually want separating.
>
> For SMMUv3 related changes:
> Reviewed-by: Rahul Singh <rahul.singh@arm.com>

Thanks. Any chance of an ack / R-b also for patch 3?

Jan
--- a/xen/drivers/passthrough/amd/iommu.h +++ b/xen/drivers/passthrough/amd/iommu.h @@ -242,7 +242,6 @@ int amd_iommu_get_reserved_device_memory int __must_check amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn, unsigned long page_count, unsigned int flush_flags); -int __must_check amd_iommu_flush_iotlb_all(struct domain *d); void amd_iommu_print_entries(const struct amd_iommu *iommu, unsigned int dev_id, dfn_t dfn); --- a/xen/drivers/passthrough/amd/iommu_map.c +++ b/xen/drivers/passthrough/amd/iommu_map.c @@ -475,15 +475,18 @@ int amd_iommu_flush_iotlb_pages(struct d { unsigned long dfn_l = dfn_x(dfn); - ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN)); - ASSERT(flush_flags); + if ( !(flush_flags & IOMMU_FLUSHF_all) ) + { + ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN)); + ASSERT(flush_flags); + } /* Unless a PTE was modified, no flush is required */ if ( !(flush_flags & IOMMU_FLUSHF_modified) ) return 0; - /* If the range wraps then just flush everything */ - if ( dfn_l + page_count < dfn_l ) + /* If so requested or if the range wraps then just flush everything. 
 */
+    if ( (flush_flags & IOMMU_FLUSHF_all) || dfn_l + page_count < dfn_l )
     {
         amd_iommu_flush_all_pages(d);
         return 0;
@@ -508,13 +511,6 @@ int amd_iommu_flush_iotlb_pages(struct d
     return 0;
 }
 
-int amd_iommu_flush_iotlb_all(struct domain *d)
-{
-    amd_iommu_flush_all_pages(d);
-
-    return 0;
-}
 
 int amd_iommu_reserve_domain_unity_map(struct domain *d,
                                        const struct ivrs_unity_map *map,
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -642,7 +642,6 @@ static const struct iommu_ops __initcons
     .map_page = amd_iommu_map_page,
     .unmap_page = amd_iommu_unmap_page,
     .iotlb_flush = amd_iommu_flush_iotlb_pages,
-    .iotlb_flush_all = amd_iommu_flush_iotlb_all,
     .reassign_device = reassign_device,
     .get_device_group_id = amd_iommu_group_id,
     .enable_x2apic = iov_enable_xt,
--- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -930,13 +930,19 @@ out:
 }
 
 /* Xen IOMMU ops */
-static int __must_check ipmmu_iotlb_flush_all(struct domain *d)
+static int __must_check ipmmu_iotlb_flush(struct domain *d, dfn_t dfn,
+                                          unsigned long page_count,
+                                          unsigned int flush_flags)
 {
     struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
 
+    ASSERT(flush_flags);
+
     if ( !xen_domain || !xen_domain->root_domain )
         return 0;
 
+    /* The hardware doesn't support selective TLB flush. */
+
     spin_lock(&xen_domain->lock);
     ipmmu_tlb_invalidate(xen_domain->root_domain);
     spin_unlock(&xen_domain->lock);
@@ -944,16 +950,6 @@ static int __must_check ipmmu_iotlb_flus
     return 0;
 }
 
-static int __must_check ipmmu_iotlb_flush(struct domain *d, dfn_t dfn,
-                                          unsigned long page_count,
-                                          unsigned int flush_flags)
-{
-    ASSERT(flush_flags);
-
-    /* The hardware doesn't support selective TLB flush. */
-    return ipmmu_iotlb_flush_all(d);
-}
-
 static struct ipmmu_vmsa_domain *ipmmu_get_cache_domain(struct domain *d,
                                                         struct device *dev)
 {
@@ -1303,7 +1299,6 @@ static const struct iommu_ops ipmmu_iomm
     .hwdom_init = ipmmu_iommu_hwdom_init,
     .teardown = ipmmu_iommu_domain_teardown,
     .iotlb_flush = ipmmu_iotlb_flush,
-    .iotlb_flush_all = ipmmu_iotlb_flush_all,
     .assign_device = ipmmu_assign_device,
     .reassign_device = ipmmu_reassign_device,
     .map_page = arm_iommu_map_page,
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2649,11 +2649,17 @@ static int force_stage = 2;
  */
 static u32 platform_features = ARM_SMMU_FEAT_COHERENT_WALK;
 
-static int __must_check arm_smmu_iotlb_flush_all(struct domain *d)
+static int __must_check arm_smmu_iotlb_flush(struct domain *d, dfn_t dfn,
+                                             unsigned long page_count,
+                                             unsigned int flush_flags)
 {
     struct arm_smmu_xen_domain *smmu_domain = dom_iommu(d)->arch.priv;
     struct iommu_domain *cfg;
 
+    ASSERT(flush_flags);
+
+    /* ARM SMMU v1 doesn't have flush by VMA and VMID */
+
     spin_lock(&smmu_domain->lock);
     list_for_each_entry(cfg, &smmu_domain->contexts, list) {
         /*
@@ -2670,16 +2676,6 @@ static int __must_check arm_smmu_iotlb_f
     return 0;
 }
 
-static int __must_check arm_smmu_iotlb_flush(struct domain *d, dfn_t dfn,
-                                             unsigned long page_count,
-                                             unsigned int flush_flags)
-{
-    ASSERT(flush_flags);
-
-    /* ARM SMMU v1 doesn't have flush by VMA and VMID */
-    return arm_smmu_iotlb_flush_all(d);
-}
-
 static struct iommu_domain *arm_smmu_get_domain(struct domain *d,
                                                 struct device *dev)
 {
@@ -2879,7 +2875,6 @@ static const struct iommu_ops arm_smmu_i
     .add_device = arm_smmu_dt_add_device_generic,
     .teardown = arm_smmu_iommu_domain_teardown,
     .iotlb_flush = arm_smmu_iotlb_flush,
-    .iotlb_flush_all = arm_smmu_iotlb_flush_all,
     .assign_device = arm_smmu_assign_dev,
     .reassign_device = arm_smmu_reassign_dev,
     .map_page = arm_iommu_map_page,
--- a/xen/drivers/passthrough/arm/smmu-v3.c
+++ b/xen/drivers/passthrough/arm/smmu-v3.c
@@ -3431,7 +3431,6 @@ static const struct iommu_ops arm_smmu_i
     .hwdom_init = arm_smmu_iommu_hwdom_init,
     .teardown = arm_smmu_iommu_xen_domain_teardown,
     .iotlb_flush = arm_smmu_iotlb_flush,
-    .iotlb_flush_all = arm_smmu_iotlb_flush_all,
     .assign_device = arm_smmu_assign_dev,
     .reassign_device = arm_smmu_reassign_dev,
     .map_page = arm_iommu_map_page,
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -463,15 +463,12 @@ int iommu_iotlb_flush_all(struct domain
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
 
-    if ( !is_iommu_enabled(d) || !hd->platform_ops->iotlb_flush_all ||
+    if ( !is_iommu_enabled(d) || !hd->platform_ops->iotlb_flush ||
          !flush_flags )
         return 0;
 
-    /*
-     * The operation does a full flush so we don't need to pass the
-     * flush_flags in.
-     */
-    rc = iommu_call(hd->platform_ops, iotlb_flush_all, d);
+    rc = iommu_call(hd->platform_ops, iotlb_flush, d, INVALID_DFN, 0,
+                    flush_flags | IOMMU_FLUSHF_all);
 
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -731,18 +731,21 @@ static int __must_check iommu_flush_iotl
                                          unsigned long page_count,
                                          unsigned int flush_flags)
 {
-    ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
-    ASSERT(flush_flags);
+    if ( flush_flags & IOMMU_FLUSHF_all )
+    {
+        dfn = INVALID_DFN;
+        page_count = 0;
+    }
+    else
+    {
+        ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
+        ASSERT(flush_flags);
+    }
 
     return iommu_flush_iotlb(d, dfn, flush_flags & IOMMU_FLUSHF_modified,
                              page_count);
 }
 
-static int __must_check iommu_flush_iotlb_all(struct domain *d)
-{
-    return iommu_flush_iotlb(d, INVALID_DFN, 0, 0);
-}
-
 static void queue_free_pt(struct domain *d, mfn_t mfn, unsigned int next_level)
 {
     if ( next_level > 1 )
@@ -2841,7 +2844,7 @@ static int __init intel_iommu_quarantine
     spin_unlock(&hd->arch.mapping_lock);
 
     if ( !rc )
-        rc = iommu_flush_iotlb_all(d);
+        rc = iommu_flush_iotlb(d, INVALID_DFN, 0, 0);
 
     /* Pages may be leaked in failure case */
 
     return rc;
@@ -2874,7 +2877,6 @@ static struct iommu_ops __initdata vtd_o
     .resume = vtd_resume,
     .crash_shutdown = vtd_crash_shutdown,
     .iotlb_flush = iommu_flush_iotlb_pages,
-    .iotlb_flush_all = iommu_flush_iotlb_all,
     .get_reserved_device_memory = intel_iommu_get_reserved_device_memory,
     .dump_page_tables = vtd_dump_page_tables,
 };
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -147,9 +147,11 @@ enum
 {
     _IOMMU_FLUSHF_added,
     _IOMMU_FLUSHF_modified,
+    _IOMMU_FLUSHF_all,
 };
 
 #define IOMMU_FLUSHF_added (1u << _IOMMU_FLUSHF_added)
 #define IOMMU_FLUSHF_modified (1u << _IOMMU_FLUSHF_modified)
+#define IOMMU_FLUSHF_all (1u << _IOMMU_FLUSHF_all)
 
 int __must_check iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
                            unsigned long page_count, unsigned int flags,
@@ -282,7 +284,6 @@ struct iommu_ops {
     int __must_check (*iotlb_flush)(struct domain *d, dfn_t dfn,
                                     unsigned long page_count,
                                     unsigned int flush_flags);
-    int __must_check (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
     void (*dump_page_tables)(struct domain *d);
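[Editor's illustration] The semantics the VT-d hunk gives IOMMU_FLUSHF_all can be shown with a small standalone sketch. The flag values mirror the patch's additions to xen/include/xen/iommu.h, but normalize_flush_args() is a simplified stand-in written for this note, not code from the Xen tree:

```c
#include <assert.h>

/* Flag values mirroring xen/include/xen/iommu.h after this patch. */
#define IOMMU_FLUSHF_added    (1u << 0)
#define IOMMU_FLUSHF_modified (1u << 1)
#define IOMMU_FLUSHF_all      (1u << 2)

#define INVALID_DFN (~0ul)

/*
 * Simplified stand-in for the VT-d hook's new prologue: when a full
 * flush is requested, the dfn and page count are don't-cares and get
 * forced to the full-flush values, so callers that merely accumulate
 * and hand on flags need not special-case IOMMU_FLUSHF_all.
 */
static void normalize_flush_args(unsigned long *dfn,
                                 unsigned long *page_count,
                                 unsigned int flush_flags)
{
    if ( flush_flags & IOMMU_FLUSHF_all )
    {
        *dfn = INVALID_DFN;
        *page_count = 0;
    }
    else
        /* Selective flush: arguments must describe a real range. */
        assert(*page_count && *dfn != INVALID_DFN);
}
```

This matches the point made in the review thread: the two arguments are don't-cares with IOMMU_FLUSHF_all, rather than callers being required to pass INVALID_DFN and 0 themselves.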
Having a separate flush-all hook has always been puzzling me some. We
will want to be able to force a full flush via accumulated flush flags
from the map/unmap functions. Introduce a respective new flag and fold
all flush handling to use the single remaining hook.

Note that because of the respective comments in SMMU and IPMMU-VMSA
code, I've folded the two prior hook functions into one. For SMMU-v3,
which lacks a comment towards incapable hardware, I've left both
functions in place on the assumption that selective and full flushes
will eventually want separating.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
TBD: What we really are going to need is for the map/unmap functions
     to specify that a wider region needs flushing than just the one
     covered by the present set of (un)maps. This may still be less
     than a full flush, but at least as a first step it seemed better
     to me to keep things simple and go the flush-all route.
---
v2: New.
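[Editor's illustration] The "accumulated flush flags" usage the description alludes to can be sketched as follows. This is a hypothetical caller-side model written for this note: iotlb_flush() is a local stub standing in for the single remaining hook, and flush_accumulated() is invented here, not a Xen function:

```c
#include <assert.h>

#define IOMMU_FLUSHF_added    (1u << 0)
#define IOMMU_FLUSHF_modified (1u << 1)
#define IOMMU_FLUSHF_all      (1u << 2)

/* Captures the flags the (single) hook was last invoked with. */
static unsigned int observed_flags;

/* Local stub standing in for the per-driver iotlb_flush hook. */
static int iotlb_flush(unsigned int flush_flags)
{
    observed_flags = flush_flags;
    return 0;
}

/*
 * Caller side: each (un)map operation contributes its flags; any one
 * of them may force a full flush by ORing in IOMMU_FLUSHF_all.  A
 * single flush is issued at the end, and only if anything accumulated.
 */
static int flush_accumulated(const unsigned int *op_flags, unsigned int n)
{
    unsigned int accum = 0, i;

    for ( i = 0; i < n; i++ )
        accum |= op_flags[i];

    return accum ? iotlb_flush(accum) : 0;
}
```

With the separate flush-all hook this pattern needed a branch at the call site; with the folded hook the IOMMU_FLUSHF_all bit simply travels along with the other flags.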