Message ID | 20190320173634.21895-3-jean-philippe.brucker@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add PCI ATS support to Arm SMMUv3 | expand |
Hi Jean-Philippe, First off, thanks for posting this: it's definitely something that I'm keen to support, and getting bits in a piece at a time is probably a good idea. On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote: > When removing a mapping from a domain, we need to send an invalidation to > all devices that might have stored it in their Address Translation Cache > (ATC). In addition when updating the context descriptor of a live domain, > we'll need to send invalidations for all devices attached to it. > > Maintain a list of devices in each domain, protected by a spinlock. It is > updated every time we attach or detach devices to and from domains. > > It needs to be a spinlock because we'll invalidate ATC entries from > within hardirq-safe contexts, but it may be possible to relax the read > side with RCU later. > > Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> > --- > drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++ > 1 file changed, 28 insertions(+) > > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c > index d3880010c6cf..66a29c113dbc 100644 > --- a/drivers/iommu/arm-smmu-v3.c > +++ b/drivers/iommu/arm-smmu-v3.c > @@ -594,6 +594,11 @@ struct arm_smmu_device { > struct arm_smmu_master_data { > struct arm_smmu_device *smmu; > struct arm_smmu_strtab_ent ste; > + > + struct arm_smmu_domain *domain; > + struct list_head domain_head; > + > + struct device *dev; > }; > > /* SMMU private data for an IOMMU domain */ > @@ -618,6 +623,9 @@ struct arm_smmu_domain { > }; > > struct iommu_domain domain; > + > + struct list_head devices; > + spinlock_t devices_lock; > }; > > struct arm_smmu_option_prop { > @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) > } > > mutex_init(&smmu_domain->init_mutex); > + INIT_LIST_HEAD(&smmu_domain->devices); > + spin_lock_init(&smmu_domain->devices_lock); I'm wondering whether we can't take this a bit further and re-organise the data structures to make this a little simpler overall. Something along the lines of: struct arm_smmu_master_data { struct list_head list; // masters in the same domain struct arm_smmu_device *smmu; unsigned int num_sids; u32 *sids; // Points into fwspec struct arm_smmu_domain *domain; // NULL -> !assigned }; and then just add a list_head into struct arm_smmu_domain to track the masters that have been attached (if you're feeling brave, you could put this into the s1_cfg). The ATC invalidation logic would then be: - Detaching a device: walk over the sids from the master data - Unmapping a range from a domain: walk over the attached masters I think this would also allow us to remove struct arm_smmu_strtab_ent completely. Dunno: this is one of the those things where you really have to try it to figure out why it doesn't work... Will
On 04/04/2019 15:39, Will Deacon wrote: > Hi Jean-Philippe, > > First off, thanks for posting this: it's definitely something that I'm keen > to support, and getting bits in a piece at a time is probably a good idea. > > On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote: >> When removing a mapping from a domain, we need to send an invalidation to >> all devices that might have stored it in their Address Translation Cache >> (ATC). In addition when updating the context descriptor of a live domain, >> we'll need to send invalidations for all devices attached to it. >> >> Maintain a list of devices in each domain, protected by a spinlock. It is >> updated every time we attach or detach devices to and from domains. >> >> It needs to be a spinlock because we'll invalidate ATC entries from >> within hardirq-safe contexts, but it may be possible to relax the read >> side with RCU later. >> >> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> >> --- >> drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++ >> 1 file changed, 28 insertions(+) >> >> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c >> index d3880010c6cf..66a29c113dbc 100644 >> --- a/drivers/iommu/arm-smmu-v3.c >> +++ b/drivers/iommu/arm-smmu-v3.c >> @@ -594,6 +594,11 @@ struct arm_smmu_device { >> struct arm_smmu_master_data { >> struct arm_smmu_device *smmu; >> struct arm_smmu_strtab_ent ste; >> + >> + struct arm_smmu_domain *domain; >> + struct list_head domain_head; >> + >> + struct device *dev; >> }; >> >> /* SMMU private data for an IOMMU domain */ >> @@ -618,6 +623,9 @@ struct arm_smmu_domain { >> }; >> >> struct iommu_domain domain; >> + >> + struct list_head devices; >> + spinlock_t devices_lock; >> }; >> >> struct arm_smmu_option_prop { >> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) >> } >> >> mutex_init(&smmu_domain->init_mutex); >> + INIT_LIST_HEAD(&smmu_domain->devices); >> + spin_lock_init(&smmu_domain->devices_lock); > > I'm wondering whether we can't take this a bit further and re-organise the > data structures to make this a little simpler overall. Something along the > lines of: > > struct arm_smmu_master_data { > struct list_head list; // masters in the same domain > struct arm_smmu_device *smmu; > unsigned int num_sids; > u32 *sids; // Points into fwspec > struct arm_smmu_domain *domain; // NULL -> !assigned > }; > > and then just add a list_head into struct arm_smmu_domain to track the > masters that have been attached (if you're feeling brave, you could put > this into the s1_cfg). I'm not sure about that last bit, shouldn't the list of masters apply to both s1 and s2? > > The ATC invalidation logic would then be: > > - Detaching a device: walk over the sids from the master data > - Unmapping a range from a domain: walk over the attached masters > > I think this would also allow us to remove struct arm_smmu_strtab_ent > completely. Makes sense, it does work and simplifies the structures. It makes the PASID and PRI patches slightly nicer as well. I'll resend once my tests complete. Thanks, Jean > > Dunno: this is one of the those things where you really have to try it > to figure out why it doesn't work... > > Will > _______________________________________________ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu >
On Fri, Apr 05, 2019 at 05:35:52PM +0100, Jean-Philippe Brucker wrote: > On 04/04/2019 15:39, Will Deacon wrote: > > On Wed, Mar 20, 2019 at 05:36:32PM +0000, Jean-Philippe Brucker wrote: > >> When removing a mapping from a domain, we need to send an invalidation to > >> all devices that might have stored it in their Address Translation Cache > >> (ATC). In addition when updating the context descriptor of a live domain, > >> we'll need to send invalidations for all devices attached to it. > >> > >> Maintain a list of devices in each domain, protected by a spinlock. It is > >> updated every time we attach or detach devices to and from domains. > >> > >> It needs to be a spinlock because we'll invalidate ATC entries from > >> within hardirq-safe contexts, but it may be possible to relax the read > >> side with RCU later. > >> > >> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> > >> --- > >> drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++ > >> 1 file changed, 28 insertions(+) > >> > >> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c > >> index d3880010c6cf..66a29c113dbc 100644 > >> --- a/drivers/iommu/arm-smmu-v3.c > >> +++ b/drivers/iommu/arm-smmu-v3.c > >> @@ -594,6 +594,11 @@ struct arm_smmu_device { > >> struct arm_smmu_master_data { > >> struct arm_smmu_device *smmu; > >> struct arm_smmu_strtab_ent ste; > >> + > >> + struct arm_smmu_domain *domain; > >> + struct list_head domain_head; > >> + > >> + struct device *dev; > >> }; > >> > >> /* SMMU private data for an IOMMU domain */ > >> @@ -618,6 +623,9 @@ struct arm_smmu_domain { > >> }; > >> > >> struct iommu_domain domain; > >> + > >> + struct list_head devices; > >> + spinlock_t devices_lock; > >> }; > >> > >> struct arm_smmu_option_prop { > >> @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) > >> } > >> > >> mutex_init(&smmu_domain->init_mutex); > >> + INIT_LIST_HEAD(&smmu_domain->devices); > >> + spin_lock_init(&smmu_domain->devices_lock); > > > > I'm wondering whether we can't take this a bit further and re-organise the > > data structures to make this a little simpler overall. Something along the > > lines of: > > > > struct arm_smmu_master_data { > > struct list_head list; // masters in the same domain > > struct arm_smmu_device *smmu; > > unsigned int num_sids; > > u32 *sids; // Points into fwspec > > struct arm_smmu_domain *domain; // NULL -> !assigned > > }; > > > > and then just add a list_head into struct arm_smmu_domain to track the > > masters that have been attached (if you're feeling brave, you could put > > this into the s1_cfg). > > I'm not sure about that last bit, shouldn't the list of masters apply to > both s1 and s2? I was assuming that (a) we were only using ATS with stage-1 and (b) we only need the masters list for ATC invalidation. Did I go wrong somewhere? > > The ATC invalidation logic would then be: > > > > - Detaching a device: walk over the sids from the master data > > - Unmapping a range from a domain: walk over the attached masters > > > > I think this would also allow us to remove struct arm_smmu_strtab_ent > > completely. > > Makes sense, it does work and simplifies the structures. It makes the > PASID and PRI patches slightly nicer as well. I'll resend once my tests > complete. Brill, thanks for giving it a go so quickly. Will
On 05/04/2019 17:39, Will Deacon wrote: >>> I'm wondering whether we can't take this a bit further and re-organise the >>> data structures to make this a little simpler overall. Something along the >>> lines of: >>> >>> struct arm_smmu_master_data { >>> struct list_head list; // masters in the same domain >>> struct arm_smmu_device *smmu; >>> unsigned int num_sids; >>> u32 *sids; // Points into fwspec >>> struct arm_smmu_domain *domain; // NULL -> !assigned >>> }; >>> >>> and then just add a list_head into struct arm_smmu_domain to track the >>> masters that have been attached (if you're feeling brave, you could put >>> this into the s1_cfg). >> >> I'm not sure about that last bit, shouldn't the list of masters apply to >> both s1 and s2? > > I was assuming that (a) we were only using ATS with stage-1 and (b) we only > need the masters list for ATC invalidation. Did I go wrong somewhere? Hmm, I messed that up in patch 3, actually. I'm still unsure about it, but in the end I was meaning to enable ATS for both stage-1 and stage-2 domains. We could make it stage-1 only, but VFIO can assign a device with a stage-1 domain to userspace. That's the case with QEMU, which doesn't use VFIO_TYPE1_NESTING_IOMMU at the moment. So I don't think limiting ATS to stage-1 protects against anything. For (b), the master list will likely also be used for PASID support, to invalidate cached CDs for each device in a domain, but that's only stage-1. Then the current patchsets for nested translation also rely on the master list to bind guest PASID tables to all devices attached to a domain. But with those patches, s1_cfg and s2_cfg aren't in an enum anymore and we could still keep the list in s1_cfg. Thanks, Jean
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index d3880010c6cf..66a29c113dbc 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -594,6 +594,11 @@ struct arm_smmu_device { struct arm_smmu_master_data { struct arm_smmu_device *smmu; struct arm_smmu_strtab_ent ste; + + struct arm_smmu_domain *domain; + struct list_head domain_head; + + struct device *dev; }; /* SMMU private data for an IOMMU domain */ @@ -618,6 +623,9 @@ struct arm_smmu_domain { }; struct iommu_domain domain; + + struct list_head devices; + spinlock_t devices_lock; }; struct arm_smmu_option_prop { @@ -1493,6 +1501,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) } mutex_init(&smmu_domain->init_mutex); + INIT_LIST_HEAD(&smmu_domain->devices); + spin_lock_init(&smmu_domain->devices_lock); + return &smmu_domain->domain; } @@ -1711,8 +1722,18 @@ static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec) static void arm_smmu_detach_dev(struct device *dev) { + unsigned long flags; struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct arm_smmu_master_data *master = fwspec->iommu_priv; + struct arm_smmu_domain *smmu_domain = master->domain; + + if (smmu_domain) { + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_del(&master->domain_head); + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + + master->domain = NULL; + } master->ste.assigned = false; arm_smmu_install_ste_for_dev(fwspec); @@ -1721,6 +1742,7 @@ static void arm_smmu_detach_dev(struct device *dev) static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) { int ret = 0; + unsigned long flags; struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct arm_smmu_device *smmu; struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); @@ -1757,6 +1779,11 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) } ste->assigned = true; + master->domain = smmu_domain; + + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_add(&master->domain_head, &smmu_domain->devices); + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); if (smmu_domain->stage == ARM_SMMU_DOMAIN_BYPASS) { ste->s1_cfg = NULL; @@ -1883,6 +1910,7 @@ static int arm_smmu_add_device(struct device *dev) return -ENOMEM; master->smmu = smmu; + master->dev = dev; fwspec->iommu_priv = master; }
When removing a mapping from a domain, we need to send an invalidation to all devices that might have stored it in their Address Translation Cache (ATC). In addition when updating the context descriptor of a live domain, we'll need to send invalidations for all devices attached to it. Maintain a list of devices in each domain, protected by a spinlock. It is updated every time we attach or detach devices to and from domains. It needs to be a spinlock because we'll invalidate ATC entries from within hardirq-safe contexts, but it may be possible to relax the read side with RCU later. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> --- drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)