
[v2,14/18] iommu/arm-smmu-v3: Support domains with shared CDs

Message ID: 20230606120854.4170244-15-mshavit@google.com
State: New, archived
Series: Add PASID support to SMMUv3 unmanaged domains

Commit Message

Michael Shavit June 6, 2023, 12:07 p.m. UTC
SVA may attach a CD to masters that have different upstream SMMU
devices. The arm_smmu_domain structure can only be attached to a single
upstream SMMU device, however. To work around this limitation, introduce
an ARM_SMMU_DOMAIN_S1_SHARED_CD domain stage for domains that attach a CD
shared across multiple arm_smmu_domains (each attached to a different
upstream SMMU device).
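For illustration, here is a minimal sketch (not part of this patch) of how
an SVA-style caller might wrap an already-refcounted CD in a domain using
the new stage; the function name and the bare kzalloc() are assumptions
for the sketch, not code from this series:

static struct arm_smmu_domain *
arm_smmu_alloc_shared_cd_domain(struct arm_smmu_ctx_desc *shared_cd)
{
	struct arm_smmu_domain *smmu_domain;

	smmu_domain = kzalloc(sizeof(*smmu_domain), GFP_KERNEL);
	if (!smmu_domain)
		return NULL;

	/* The domain references, but does not own, the shared CD. */
	smmu_domain->stage = ARM_SMMU_DOMAIN_S1_SHARED_CD;
	smmu_domain->shared_cd = shared_cd;
	return smmu_domain;
}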

Signed-off-by: Michael Shavit <mshavit@google.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 26 ++++++++++++++++-----
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 ++
 2 files changed, 22 insertions(+), 6 deletions(-)

Comments

Jason Gunthorpe June 6, 2023, 5:09 p.m. UTC | #1
On Tue, Jun 06, 2023 at 08:07:50PM +0800, Michael Shavit wrote:
> SVA may attach a CD to masters that have different upstream SMMU
> devices. The arm_smmu_domain structure can only be attached to a single
> upstream SMMU device however. 

Isn't that pretty much because we don't support replicating
invalidations to each of the different SMMU instances?

I don't really get why there would be anything like a "shared CD".

It looks kind of like the arm_smmu_ctx_desc has become a bit weird
since its main purpose seems to be to refcount the ASID.

But our general design is that the iommu_domain should be 1:1 with the
ASID, so it is really strange that we'd refcount the ASID..

I would expect different SMMU devices to be handled by allowing single
domains to be attached to a list of masters where the masters can all
be from different instances.

When an invalidation comes in we have to replicate it through all the
instances for the IOTLB and all the masters for the ATC.
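A rough sketch of that shape: arm_smmu_tlb_inv_asid() and
arm_smmu_atc_inv_master() are existing helpers in arm-smmu-v3.c, but the
list and field names here are placeholders, not current driver code:

static void sketch_replicate_inv(struct arm_smmu_domain *smmu_domain, u16 asid)
{
	struct arm_smmu_master *master;

	/*
	 * Walk every attached master: the IOTLB invalidation must reach
	 * each distinct SMMU instance, the ATC invalidation every master.
	 * (De-duplicating the per-instance TLBIs is omitted for brevity.)
	 */
	list_for_each_entry(master, &smmu_domain->attached_masters, node) {
		arm_smmu_tlb_inv_asid(master->smmu, asid);
		arm_smmu_atc_inv_master(master);
	}
}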

What we definitely shouldn't do is try to have different SVA
iommu_domain's pointing at the same ASID. That is again making SVA
special, which we are trying to get away from :)

You might try to stop your series here and get the first batch of
patches done. This latter bit seems to be a separate topic?

Jason
Michael Shavit June 6, 2023, 6:36 p.m. UTC | #2
On Tue, Jun 6, 2023 at 10:39 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Jun 06, 2023 at 08:07:50PM +0800, Michael Shavit wrote:
> > SVA may attach a CD to masters that have different upstream SMMU
> > devices. The arm_smmu_domain structure can only be attached to a single
> > upstream SMMU device however.
>
> Isn't that pretty much because we don't support replicating
> invalidations to each of the different SMMU instances?
....
> I would expect different SMMU devices to be handled by allowing single
> domains to be attached to a list of masters where the masters can all
> be from different instances.

Oh you're right! When I first looked at this, arm_smmu_domain still
held an arm_smmu_s1_cfg which depends on the SMMU device. But
attaching a single (stage 1) arm_smmu_domain to multiple SMMUs looks
much more tractable after the first half of this patch series. We can
add a handle to the smmu device in the new arm_smmu_attached_domain
struct for this purpose.

> What we definitely shouldn't do is try to have different SVA
> iommu_domain's pointing at the same ASID. That is again making SVA
> special, which we are trying to get away from :)

Fwiw, this change is preserving the status-quo in that regard;
arm-smmu-v3-sva.c is already doing this. But yes, I agree that
resolving the limitation is a better long term solution... and
something I can try to look at further.

> You might try to stop your series here and get the first batch of
> patches done. This latter bit seems to be a separate topic?

The main motivation for this part of the series is to reduce
inconsistencies with the smmu_domain->attached_domains list and
arm-smmu-v3 functions that rely on that list. Specifically to reach
the last patch in the series: "iommu/arm-smmu-v3-sva: Remove
atc_inv_domain_ssid".

Splitting this part into a follow-up patch series would definitely be
easier and helpful if you're all ok with it :) .
Jason Gunthorpe June 7, 2023, 11:59 a.m. UTC | #3
On Wed, Jun 07, 2023 at 12:06:07AM +0530, Michael Shavit wrote:
> > What we definitely shouldn't do is try to have different SVA
> > iommu_domain's pointing at the same ASID. That is again making SVA
> > special, which we are trying to get away from :)
> 
> Fwiw, this change is preserving the status-quo in that regard;
> arm-smmu-v3-sva.c is already doing this. But yes, I agree that
> resolving the limitation is a better long term solution... and
> something I can try to look at further.

I suppose we also don't really have an entirely clear picture of what
allocating multiple SVA domains should even do in the iommu driver.

The driver would like to share the ASID, but things are much cleaner
for everything if the driver model has ASID 1:1 with the iommu_domain.

It suggests we are missing some core code in iommu_sva_bind_device()
to try to re-use existing SVA iommu_domains. This would certainly be
better than trying to teach every driver how to share and refcount
its ASID concept...

Today we have this super hacky iommu_get_domain_for_dev_pasid()
thing that allows SVA domain reuse for a single device.

Possibly what we should do is convert the u32 pasid in the mm_struct to
a struct iommu_mm_data * and put a lot more stuff in there, e.g. a linked
list of all SVA domains.
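A hedged sketch of that suggestion; nothing below exists in the tree, and
the struct layout and the mm field name are assumptions:

struct iommu_mm_data {
	u32			pasid;
	/* every SVA iommu_domain created for this mm, across drivers */
	struct list_head	sva_domains;
};

/*
 * iommu_sva_bind_device() could then walk mm->iommu_mm->sva_domains and
 * reuse a compatible domain before allocating a new one, instead of each
 * driver refcounting its own ASID concept.
 */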

> Splitting this part into a follow-up patch series would definitely be
> easier and helpful if you're all ok with it :) .

I think splitting it into a series to re-organize the way ste/cd stuff
works is a nice contained topic.

Adjusting the way the ASID works with SVA is another good topic

And so on

Jason
Michael Shavit June 14, 2023, 9:17 a.m. UTC | #7
On Wed, Jun 7, 2023 at 1:09 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Jun 06, 2023 at 08:07:50PM +0800, Michael Shavit wrote:
> > SVA may attach a CD to masters that have different upstream SMMU
> > devices. The arm_smmu_domain structure can only be attached to a single
> > upstream SMMU device however.
>
> Isn't that pretty much because we don't support replicating
> invalidations to each of the different SMMU instances?

Looked into this some more, and supporting attach to multiple devices
is still very hard:
1. When an arm_smmu_domain is first attached to a master, it
initializes an io_pgtable_cfg object whose properties depend on the
master's upstream SMMU device. This io_pgtable_cfg is then used to
allocate an io_pgtable object, and the arm_smmu_ctx_desc's TTBR field
points to that io_pgtable's TTBR (and ditto for the vttbr on stage 2
domains). So then arm_smmu_domain needs to be split into two,
arm_smmu_domain and arm_smmu_device_domain with the latter containing
a per-SMMU device io_pgtable, arm_smmu_ctx_desc and arm_smmu_s2_cfg.
Each iommu_domain_ops operation now needs to loop over each
arm_smmu_device_domain.
2. Some of the iommu_domain fields also depend on the per-SMMU
io_pgtable_cfg; specifically pgsize_bitmap and geometry.aperture_end.
These need to be restricted as the domain is attached to more devices.
3. Attaching a domain to a new SMMU device must be prohibited after
any call to map_pages or if iommu_domain.pgsize_bitmap and
iommu_domain.geometry.aperture_end have been consumed by any system.
The first is something the arm-smmu-v3.c driver could feasibly enforce
on its own; the second requires changes to the iommu framework.

The arm-smmu-v3-sva.c implementation avoids all these problems because it
doesn't need to allocate an io_pgtable; the shared arm_smmu_ctx_desc's
TTBR is set to the MM's TTBR. One possible solution, sketched after the
list below, would be to make the changes proposed in point 1, but only
allow SVA domains to attach to multiple devices:
1. Domain functions only accessed by arm-smmu-v3.c (such as
arm_smmu_iova_to_phys) can assume there's a single
arm_smmu_device_domain per arm_smmu_domain.
2. Domain functions also accessed by arm-smmu-v3-sva.c (such as
invalidation commands) must iterate over all arm_smmu_device_domains.
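A sketch of the split described in point 1; the type and field names are
hypothetical, not code from this series:

/* Per-SMMU-instance half of a split domain */
struct arm_smmu_device_domain {
	struct arm_smmu_device		*smmu;
	struct io_pgtable_ops		*pgtbl_ops;
	union {
		struct arm_smmu_ctx_desc	cd;	/* stage 1 */
		struct arm_smmu_s2_cfg		s2_cfg;	/* stage 2 */
	};
	struct list_head		node;	/* on the arm_smmu_domain's list */
};

Cross-SMMU paths (e.g. TLB invalidation for an SVA domain) would iterate
that list, while single-instance paths could keep assuming one entry.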

> On Wed, Jun 7, 2023 at 8:00 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> I think splitting it into a series to re-organize the way ste/cd stuff
> works is a nice contained topic.

Should I explicitly resend this patch series as a v3 cutting off
patches 14 and onwards?
Michael Shavit June 14, 2023, 9:43 a.m. UTC | #8
On Wed, Jun 14, 2023 at 5:17 PM Michael Shavit <mshavit@google.com> wrote:
> The arm-smmu-v3-sva.c implementation avoids all these problems because it
> doesn't need to allocate an io_pgtable; the shared arm_smmu_ctx_desc's
> TTBR is set to the MM's TTBR. One possible solution would be to make
> the changes proposed in 1., but only allowing SVA domains to attach to
> multiple devices:
> 1. Domain functions only accessed by arm-smmu-v3.c (such as
> arm_smmu_iova_to_phys) can assume there's a single
> arm_smmu_device_domain per arm_smmu_domain.
> 2. Domain functions also accessed by arm-smmu-v3-sva.c (such as
> invalidations commands) must iterate over all arm_smmu_device_domains.

Or rather, arm_smmu_domain would hold a list of attached smmu devices,
but only SVA domains would be allowed to have multiple entries in that
list since the other arm_smmu_domain fields are tied to a single SMMU
device for other domain types.
Robin Murphy June 14, 2023, 9:57 a.m. UTC | #9
On 2023-06-14 10:17, Michael Shavit wrote:
> On Wed, Jun 7, 2023 at 1:09 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>>
>> On Tue, Jun 06, 2023 at 08:07:50PM +0800, Michael Shavit wrote:
>>> SVA may attach a CD to masters that have different upstream SMMU
>>> devices. The arm_smmu_domain structure can only be attached to a single
>>> upstream SMMU device however.
>>
>> Isn't that pretty much because we don't support replicating
>> invalidations to each of the different SMMU instances?
> 
> Looked into this some more, and supporting attach to multiple devices
> is still very hard:
> 1. When an arm_smmu_domain is first attached to a master, it
> initializes an io_pgtable_cfg object whose properties depend on the
> master's upstream SMMU device. This io_pgtable_cfg is then used to
> allocate an io_pgtable object, and the arm_smmu_ctx_desc's TTBR field
> points to that io_pgtable's TTBR (and ditto for the vttbr on stage 2
> domains). So then arm_smmu_domain needs to be split into two,
> arm_smmu_domain and arm_smmu_device_domain with the latter containing
> a per-SMMU device io_pgtable, arm_smmu_ctx_desc and arm_smmu_s2_cfg.
> Each iommu_domain_ops operation now needs to loop over each
> arm_smmu_device_domain.
> 2. Some of the iommu_domain fields also depend on the per-SMMU
> io_pgtable_cfg; specifically pgsize_bitmap and geometry.aperture_end.
> These need to be restricted as the domain is attached to more devices.
> 3. Attaching a domain to a new SMMU device must be prohibited after
> any call to map_pages or if iommu_domain.pgsize_bitmap and
> iommu-domain.geometry.aperture_end have been consumed by any system.
> The first is something the arm-smmu-v3.c driver could feasibly enforce
> on its own, the second requires changes to the iommu framework.

In practice it would be entirely reasonable to only support 
cross-instance attach between instances with matching capabilities such 
that they *can* share the pagetable directly.

> The arm-smmu-v3-sva.c implementation avoids all these problems because it
> doesn't need to allocate an io_pgtable;

SVA has all the same underlying problems - pagetable sharing is still 
pagetable sharing regardless of which code happens to allocate the 
physical pages - they're just avoided up-front by disallowing SVA at all 
if the relevant capabilities don't line up between SMMU and CPU.

Thanks,
Robin.
Jason Gunthorpe June 14, 2023, 12:10 p.m. UTC | #10
On Wed, Jun 14, 2023 at 05:17:03PM +0800, Michael Shavit wrote:
> On Wed, Jun 7, 2023 at 1:09 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Tue, Jun 06, 2023 at 08:07:50PM +0800, Michael Shavit wrote:
> > > SVA may attach a CD to masters that have different upstream SMMU
> > > devices. The arm_smmu_domain structure can only be attached to a single
> > > upstream SMMU device however.
> >
> > Isn't that pretty much because we don't support replicating
> > invalidations to each of the different SMMU instances?
> 
> Looked into this some more, and supporting attach to multiple devices
> is still very hard:
> 1. When an arm_smmu_domain is first attached to a master, it
> initializes an io_pgtable_cfg object whose properties depend on the
> master's upstream SMMU device.

So, this is actually kind of wrong; Robin is working on fixing it.

The mental model you should have is that when an iommu_domain is created
it represents a single, definitive IO page table format. The format is
selected based on creation arguments and the 'master' it will be used
with.

An iommu_domain has a single IO page table in a single format, and
that format doesn't change while the iommu_domain exists.

Even single-instance cases have S1 and S2 IO page table formats
co-existing.

When we talk about multi instance support, it means the iommu_domain -
in whatever fixed IO page table format it uses - can be attached to
any SMMU instance that supports it as a compatible page table format.

ARM doesn't quite reach this model, but once it passes the finalize
step it does. The goal is to move finalize to allocate. So for your
purposes you can ignore the difference.

> domains). So then arm_smmu_domain needs to be split into two,
> arm_smmu_domain and arm_smmu_device_domain with the latter containing
> a per-SMMU device io_pgtable, arm_smmu_ctx_desc and arm_smmu_s2_cfg.
> Each iommu_domain_ops operation now needs to loop over each
> arm_smmu_device_domain.

No, if the instance is not compatible with the domain then the
attachment fails.

> 2. Some of the iommu_domain fields also depend on the per-SMMU
> io_pgtable_cfg; specifically pgsize_bitmap and geometry.aperture_end.
> These need to be restricted as the domain is attached to more
> devices.

These need to be an exact match then.

> 3. Attaching a domain to a new SMMU device must be prohibited after
> any call to map_pages or if iommu_domain.pgsize_bitmap and
> iommu-domain.geometry.aperture_end have been consumed by any system.

Attach needs to fail; we don't reconfigure the domain.
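A hedged sketch of that rule; the helper and the stored fields are
illustrative, not an existing driver function:

static int sketch_domain_compatible(struct arm_smmu_domain *smmu_domain,
				    struct arm_smmu_device *smmu)
{
	/* The domain's already-finalised page table must match the instance. */
	if (smmu_domain->domain.pgsize_bitmap != smmu->pgsize_bitmap)
		return -EINVAL;
	if (smmu_domain->domain.geometry.aperture_end > (1UL << smmu->ias) - 1)
		return -EINVAL;
	return 0;	/* compatible: the attach proceeds without reconfiguring */
}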

> The arm-smmu-v3-sva.c implementation avoids all these problems
> because it doesn't need to allocate an io_pgtable; the shared
> arm_smmu_ctx_desc's

This isn't true; it has an IO page table (owned by the MM) and that
page table has all the same properties you were concerned about
above. They all still need to be checked for compatibility.

I.e. from an SMMU HW instance perspective there is no difference between
an S1 or MM-owned IO page table. From a code perspective the only
difference should be where the TTBR comes from. The SVA case should
still have a cfg describing what kind of IO pagetable the mm has
created, and that cfg should still technically be checked for
compatibility.

Keep in mind that SVA is *exactly* the same as an S1 iommu_domain with
PRI enabled, except that the TTBR comes from the MM instead of being
internally allocated. There should be no other differences between the
two operating modes; that SVA became its own special thing needs to
be undone, not doubled down on.
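As a hedged illustration of that single difference (the helper name is
made up, and a real CD setup also programs TCR, MAIR and the ASID):

static void sketch_set_cd_ttbr(struct arm_smmu_ctx_desc *cd,
			       struct mm_struct *mm,	   /* SVA case, else NULL */
			       struct io_pgtable_cfg *cfg) /* regular S1 case */
{
	if (mm)
		cd->ttbr = virt_to_phys(mm->pgd);	/* share the CPU's table */
	else
		cd->ttbr = cfg->arm_lpae_s1_cfg.ttbr;	/* driver-owned io_pgtable */
}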

> > I think splitting it into a series to re-organize the way ste/cd stuff
> > works is a nice contained topic.
> 
> Should I explicitly resend this patch series as a v3 cutting off
> patches 14 and onwards?

I think it is good to make progress, it looked to me like the first
part stood alone fairly well and was an improvement on its own.

Jason
Michael Shavit June 14, 2023, 1:30 p.m. UTC | #11
On Wed, Jun 14, 2023 at 5:57 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> In practice it would be entirely reasonable to only support
> cross-instance attach between instances with matching capabilities such
> that they *can* share the pagetable directly.

On Wed, Jun 14, 2023 at 8:10 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> When we talk about multi instance support, it means the iommu_domain -
> in whatever fixed IO page table format it uses - can be attached to
> any SMMU instance that supports it as a compatible page table format.
>
> ARM doesn't quite reach this model, but once it passes the finalize
> step it does. The goal is to move finalize to allocate. So for your
> purposes you can ignore the difference.

Got you. Failing the attach when the page table format is incompatible
with the SMMU device is a lot easier to handle. I didn't notice that
SVA was already checking this elsewhere.
I can give the multi-instance support a try (with the rest of these
patches on top) and send it out as a follow-up series to this one.

> I think it is good to make progress, it looked to me like the first
> part stood alone fairly well and was an improvement on its own.

Sorry for the noobie question; it's not 100% obvious to me what the
next step is here. Is there anything I should do to progress that
first part forward?
Jason Gunthorpe June 14, 2023, 1:35 p.m. UTC | #12
On Wed, Jun 14, 2023 at 09:30:34PM +0800, Michael Shavit wrote:
> > I think it is good to make progress, it looked to me like the first
> > part stood alone fairly well and was an improvement on its own.
> 
> Sorry for the noobie question; it's not 100% obvious to me what the
> next step is here. Is there anything I should do to progress that
> first part forward?

Reword your cover letter to cover the new subset and repost the series
as v3 with just the patches you want to go with

Jason
Zhang, Tina July 5, 2023, 9:56 a.m. UTC | #13
Hi,

> -----Original Message-----
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, June 7, 2023 8:00 PM
> To: Michael Shavit <mshavit@google.com>
> Cc: Will Deacon <will@kernel.org>; Robin Murphy <robin.murphy@arm.com>;
> Joerg Roedel <joro@8bytes.org>; jean-philippe@linaro.org;
> nicolinc@nvidia.com; baolu.lu@linux.intel.com; linux-arm-
> kernel@lists.infradead.org; iommu@lists.linux.dev; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH v2 14/18] iommu/arm-smmu-v3: Support domains with
> shared CDs
> 
> On Wed, Jun 07, 2023 at 12:06:07AM +0530, Michael Shavit wrote:
> > > What we definitely shouldn't do is try to have different SVA
> > > iommu_domain's pointing at the same ASID. That is again making SVA
> > > special, which we are trying to get away from :)
> >
> > Fwiw, this change is preserving the status-quo in that regard;
> > arm-smmu-v3-sva.c is already doing this. But yes, I agree that
> > resolving the limitation is a better long term solution... and
> > something I can try to look at further.
> 
> I suppose we also don't really have an entirely clear picture of what allocating
> multiple SVA domains should even do in the iommu driver.
> 
> The driver would like to share the ASID, but things are much cleaner for
> everything if the driver model has ASID 1:1 with the iommu_domain.
> 
> It suggests we are missing some core code in iommu_sva_bind_device() to try
> to re-use existing SVA iommu_domains. This would certainly be better than
> trying to teach every driver how to share and refcount its ASID concept...
> 
> Today we have this super hacky iommu_get_domain_for_dev_pasid() thing
> that allows SVA domain reuse for a single device.
> 
> Possibly what we should do is convert the u32 pasid in the mm_struct to a
> struct iommu_mm_data * and put a lot more stuff in there, e.g. a linked list of
> all SVA domains.
If we are going to have a 1:1 relationship between SVA domain and pasid, why do we need a linked list of all SVA domains? Would an SVA domain pointer be enough?

I've got a patch set which takes this suggestion and adds an iommu_mm_data struct field to mm_struct. I'll send it out for review soon. The motivation of that patch set is to let the invalidate_range() callback use the SVA domain referenced by mm->iommu_mm_data->sva_domain to do per-IOMMU IOTLB invalidation.

Regards,
-Tina

> 
> > Splitting this part into a follow-up patch series would definitely be
> > easier and helpful if you're all ok with it :) .
> 
> I think splitting it into a series to re-organize the way ste/cd stuff works is a
> nice contained topic.
> 
> Adjusting the way the ASID works with SVA is another good topic
> 
> And so on
> 
> Jason
Jason Gunthorpe July 10, 2023, 4:55 p.m. UTC | #14
On Wed, Jul 05, 2023 at 09:56:50AM +0000, Zhang, Tina wrote:

> > Possibly what we should do is conver the u32 pasid in the mm_struct to a
> > struct iommu_mm_data * and put alot more stuff in there. eg a linked list of
> > all SVA domains.

> If we are going to have 1:1 between SVA domain and pasid, why we
> need a linked list of all SVA domains? Would a SVA domain pointer be
> enough?

Kevin asked this; we can't assume that a single SVA domain is going to
work in a multi-iommu-driver system, which we are trying to enable at
the core level.
 
> I've got a patch-set which takes this suggestion to add an
> iommu_mm_data struct field to mm_struct. I'll send it out for review
> soon. The motivation of that patch-set is to let the
> invalidate_range() callback use the SVA domain referenced by
> mm->iommu_mm_data->sva_domain to do per-iommu IOTLB invalidation.

Huh?

You are supposed to put the struct mmu_notifier inside the sva_domain
struct and use container_of().

This is another reason why I'd prefer we de-duplicate SVA domains at
the core code level, as duplicative notifiers are expensive.

Please don't add stuff to the mm just for this reason.

Jason
Zhang, Tina July 11, 2023, 12:26 a.m. UTC | #15
> -----Original Message-----
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, July 11, 2023 12:55 AM
> To: Zhang, Tina <tina.zhang@intel.com>
> Cc: Michael Shavit <mshavit@google.com>; Will Deacon <will@kernel.org>;
> Robin Murphy <robin.murphy@arm.com>; Joerg Roedel <joro@8bytes.org>;
> jean-philippe@linaro.org; nicolinc@nvidia.com; baolu.lu@linux.intel.com;
> linux-arm-kernel@lists.infradead.org; iommu@lists.linux.dev; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH v2 14/18] iommu/arm-smmu-v3: Support domains with
> shared CDs
> 
> On Wed, Jul 05, 2023 at 09:56:50AM +0000, Zhang, Tina wrote:
> 
> > > Possibly what we should do is convert the u32 pasid in the mm_struct
> > > to a struct iommu_mm_data * and put a lot more stuff in there, e.g. a
> > > linked list of all SVA domains.
> 
> > If we are going to have 1:1 between SVA domain and pasid, why we need
> > a linked list of all SVA domains? Would a SVA domain pointer be
> > enough?
> 
> Kevin asked this, we can't assume that a single SVA domain is going to work in
> a multi-iommu-driver system, which we are trying to enable at the core level..
> 
> > I've got a patch-set which takes this suggestion to add an
> > iommu_mm_data struct field to mm_struct. I'll send it out for review
> > soon. The motivation of that patch-set is to let the
> > invalidate_range() callback use the SVA domain referenced by
> > mm->iommu_mm_data->sva_domain to do per-iommu IOTLB invalidation.
> 
> Huh?
> 
> You are supposed to put the struct mmu_notifier inside the sva_domain
> struct and use container_of().
No. Just want to use iommu_mm to replace the old pasid field.
https://lore.kernel.org/linux-iommu/20230707013441.365583-5-tina.zhang@intel.com/

The motivation is mainly performance optimization for vIOMMU, though it could also benefit the native world.

Currently, in the invalidate_range() callback implemented by the VT-d driver, intel_invalidate_range() lacks sva_domain knowledge (i.e., it doesn't know which IOMMUs' IOTLBs should be invalidated), so it performs an IOMMU IOTLB invalidation per device, which leads to superfluous IOTLB invalidations. For example, if two devices physically behind one IOMMU are used by the same user-space application, each invalidate_range() callback performs two IOTLB and two device-TLB invalidations instead of one IOTLB plus two device-TLB invalidations (i.e., one IOTLB invalidation is redundant). Things get worse in a virtualized environment, especially with one vIOMMU exposed to the guest, since IOTLB invalidation is much more expensive than in the native case, and conceptually there can be more devices behind a vIOMMU, which means more superfluous IOTLB invalidations.

So, we want to add sva_domain related info to mm to remove the superfluous IOTLB invalidations.

Thanks,
-Tina

> 
> This is another reason why I'd prefer we de-duplicate SVA domains at the
> core code level as duplicitive notifiers are expensive..
> 
> Please don't add stuff to the mm just for this reason.
> 
> Jason
Jason Gunthorpe July 11, 2023, 1:53 p.m. UTC | #16
On Tue, Jul 11, 2023 at 12:26:56AM +0000, Zhang, Tina wrote:

> The motivation is mainly performance optimization for vIOMMU,
> though it could also benefit the native world.
> 
> Currently, in invalidate_range() callback implemented by VT-d
> driver, due to lacking sva_domain knowledge (i.e.,
> intel_invalidate_range() doesn't know which IOMMUs' IOTLB should be
> invalidated), intel_invalidate_range() just performs IOMMU IOTLB per
> device and that leads to superfluous IOTLB invalidations. 

You get the sva_domain from container_of on the notifier

The sva_domain has a list of all the devices and iommu's that it is
connected to

The driver optimizes the invalidations based on the list.
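A hedged sketch of that pattern; every name below is invented for
illustration and this is not the VT-d driver's actual code:

#include <linux/device.h>
#include <linux/iommu.h>
#include <linux/mmu_notifier.h>

struct sketch_dev_entry {
	struct list_head	node;
	struct device		*dev;
	void			*iommu;		/* per-instance handle */
};

struct sketch_sva_domain {
	struct iommu_domain	domain;
	struct mmu_notifier	notifier;
	struct list_head	devices;	/* list of sketch_dev_entry */
};

static void sketch_invalidate_range(struct mmu_notifier *mn,
				    struct mm_struct *mm,
				    unsigned long start, unsigned long end)
{
	/* Recover the SVA domain from the notifier, not from the mm. */
	struct sketch_sva_domain *sva =
		container_of(mn, struct sketch_sva_domain, notifier);
	struct sketch_dev_entry *entry;

	/*
	 * The attachment list tells the driver exactly which IOMMU
	 * instances need an IOTLB invalidation; de-duplicating per
	 * instance is where the optimization happens.
	 */
	list_for_each_entry(entry, &sva->devices, node) {
		/* One IOTLB invalidation per IOMMU instance and one
		 * ATC/devTLB invalidation per device would go here. */
		pr_debug("invalidate [%#lx, %#lx) for %s\n", start, end,
			 dev_name(entry->dev));
	}
}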

The core code de-duplicates the sva domain so there is usually only
one.

If someone creates two SVA domains then the driver is less optimal, oh
well.

The driver should not look inside the mm_struct and certainly should
never try to get a "SVA domain" from it.

Jason

Patch

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b7f834dde85d1..69b1d09fd0284 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -965,6 +965,20 @@  void arm_smmu_tlb_inv_asid(struct arm_smmu_device *smmu, u16 asid)
 	arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
 }
 
+static struct arm_smmu_ctx_desc *arm_smmu_get_cd(struct arm_smmu_domain *domain)
+{
+	if (domain->stage == ARM_SMMU_DOMAIN_S1_SHARED_CD)
+		return domain->shared_cd;
+	else
+		return &domain->cd;
+}
+
+static bool arm_smmu_is_s1_domain(struct arm_smmu_domain *domain)
+{
+	return domain->stage == ARM_SMMU_DOMAIN_S1_SHARED_CD ||
+	       domain->stage == ARM_SMMU_DOMAIN_S1;
+}
+
 /* master may be null */
 static void arm_smmu_sync_cd(struct arm_smmu_master *master,
 			     int ssid, bool leaf)
@@ -1887,8 +1901,8 @@  static void arm_smmu_tlb_inv_context(void *cookie)
 	 * insertion to guarantee those are observed before the TLBI. Do be
 	 * careful, 007.
 	 */
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
-		arm_smmu_tlb_inv_asid(smmu, smmu_domain->cd.asid);
+	if (arm_smmu_is_s1_domain(smmu_domain)) {
+		arm_smmu_tlb_inv_asid(smmu, arm_smmu_get_cd(smmu_domain)->asid);
 	} else {
 		cmd.opcode	= CMDQ_OP_TLBI_S12_VMALL;
 		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
@@ -1968,10 +1982,10 @@  static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
 		},
 	};
 
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+	if (arm_smmu_is_s1_domain(smmu_domain)) {
 		cmd.opcode	= smmu_domain->smmu->features & ARM_SMMU_FEAT_E2H ?
 				  CMDQ_OP_TLBI_EL2_VA : CMDQ_OP_TLBI_NH_VA;
-		cmd.tlbi.asid	= smmu_domain->cd.asid;
+		cmd.tlbi.asid	= arm_smmu_get_cd(smmu_domain)->asid;
 	} else {
 		cmd.opcode	= CMDQ_OP_TLBI_S2_IPA;
 		cmd.tlbi.vmid	= smmu_domain->s2_cfg.vmid;
@@ -2549,7 +2563,7 @@  static int arm_smmu_set_dev_pasid(struct iommu_domain *domain,
 		return -ENODEV;
 	}
 
-	if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1) {
+	if (!arm_smmu_is_s1_domain(smmu_domain)) {
 		dev_err(dev, "set_dev_pasid only supports stage 1 domains\n");
 		return -EINVAL;
 	}
@@ -2575,7 +2589,7 @@  static int arm_smmu_set_dev_pasid(struct iommu_domain *domain,
 	 */
 	mutex_lock(&arm_smmu_asid_lock);
 	ret = arm_smmu_write_ctx_desc(master->smmu, master->s1_cfg, master,
-				      pasid, &smmu_domain->cd);
+				      pasid, arm_smmu_get_cd(smmu_domain));
 	if (ret) {
 		mutex_unlock(&arm_smmu_asid_lock);
 		kfree(attached_domain);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 3525d60668c23..4ac69427abf1c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -713,6 +713,7 @@  struct arm_smmu_master {
 /* SMMU private data for an IOMMU domain */
 enum arm_smmu_domain_stage {
 	ARM_SMMU_DOMAIN_S1 = 0,
+	ARM_SMMU_DOMAIN_S1_SHARED_CD,
 	ARM_SMMU_DOMAIN_S2,
 	ARM_SMMU_DOMAIN_NESTED,
 	ARM_SMMU_DOMAIN_BYPASS,
@@ -728,6 +729,7 @@  struct arm_smmu_domain {
 	enum arm_smmu_domain_stage		stage;
 	union {
 		struct arm_smmu_ctx_desc	cd;
+		struct arm_smmu_ctx_desc	*shared_cd;
 		struct arm_smmu_s2_cfg		s2_cfg;
 	};