Message ID | 20230621063825.268890-3-mshavit@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add PASID support to SMMUv3 unmanaged domains | expand |
Hi Michael, On Wed, Jun 21, 2023 at 02:37:14PM +0800, Michael Shavit wrote: > Except for Nested domains, arm_smmu_master will own the STEs that are > inserted into the arm_smmu_device's STE table. I think that the master still owns an STE when attached to a nested domain. Though an IOMMU_DOMAIN_NESTED iommu_domain is an opaque object to the STE in the guest, the host still has a real STE for the nested configuration somewhere -- and it's likely still to be owned by the master that's attached to the opaque NESTED iommu_domain in the host kernel. > -static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain) > +static int arm_smmu_init_s1_cfg(struct arm_smmu_master *master, > + struct arm_smmu_s1_cfg *cfg) We here pass in an s1_cfg ptr because we expect someone else rather than the master could own the s1_cfg? But the final codeline by the end of this series seems that only master owns an s1_cfg. So perhaps we could re-organize the patches to clean this away, as the cfg always comes from a master? > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h > index 68d519f21dbd8..053cc14c23969 100644 > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h > @@ -688,6 +688,7 @@ struct arm_smmu_master { > struct arm_smmu_domain *domain; > struct list_head domain_head; > struct arm_smmu_stream *streams; > + struct arm_smmu_s1_cfg owned_s1_cfg; I am a bit confused by this naming. If only master would own an s1_cfg, perhaps we can just make it "s1_cfg" and drop the s1_cfg pointer in the next patch. Thanks Nicolin
On Thu, Jul 13, 2023 at 9:22 AM Nicolin Chen <nicolinc@nvidia.com> wrote: > > > Except for Nested domains, arm_smmu_master will own the STEs that are > > inserted into the arm_smmu_device's STE table. > > I think that the master still owns an STE when attached to a > nested domain. Though an IOMMU_DOMAIN_NESTED iommu_domain is > an opaque object to the STE in the guest, the host still has > a real STE for the nested configuration somewhere -- and it's > likely still to be owned by the master that's attached to the > opaque NESTED iommu_domain in the host kernel. > I am a bit confused by this naming. If only master would own > an s1_cfg, perhaps we can just make it "s1_cfg" and drop the > s1_cfg pointer in the next patch. Could be that the naming is causing some confusion. This owned_s1_cfg is very different from the s1_cfg set-up by Nested domains in your patch series. It's better to think of it as the default s1_cfg used for DMA/SVA/UNMANAGED domains. Because stage 1 domains represent a single page table, it doesn't make sense for them to own an entire CD table. In contrast, nested domains map an entire CD table and it therefore makes sense for them to own the s1_cfg representing that table. Would renaming this as default_s1_cfg make more sense?
On Thu, Jul 13, 2023 at 04:34:56PM +0800, Michael Shavit wrote: > On Thu, Jul 13, 2023 at 9:22 AM Nicolin Chen <nicolinc@nvidia.com> wrote: > > > > > Except for Nested domains, arm_smmu_master will own the STEs that are > > > inserted into the arm_smmu_device's STE table. > > > > I think that the master still owns an STE when attached to a > > nested domain. Though an IOMMU_DOMAIN_NESTED iommu_domain is > > an opaque object to the STE in the guest, the host still has > > a real STE for the nested configuration somewhere -- and it's > > likely still to be owned by the master that's attached to the > > opaque NESTED iommu_domain in the host kernel. > > > I am a bit confused by this naming. If only master would own > > an s1_cfg, perhaps we can just make it "s1_cfg" and drop the > > s1_cfg pointer in the next patch. > > Could be that the naming is causing some confusion. This owned_s1_cfg > is very different from the s1_cfg set-up by Nested domains in your > patch series. It's better to think of it as the default s1_cfg used > for DMA/SVA/UNMANAGED domains. Because stage 1 domains represent a > single page table, it doesn't make sense for them to own an entire CD > table. In contrast, nested domains map an entire CD table and it > therefore makes sense for them to own the s1_cfg representing that > table. > Would renaming this as default_s1_cfg make more sense? It would make alot more sense if the STE value used by an unmanaged S1 domain was located in/near the unmanaged domain or called 'unmanaged S1 STE' or something if it really has to be in the master. Why does this even need to be stored, can't we compute it? Notice that we have unmanaged domains that use a CD and SVA domains typically would be on a CD, so it is a bit weird already. I'd think the basic mental model should be to extract the STE from the thing you intend to install. Either the default CD table, or from the iommu_domain. ie some 'get STE from iommu_domain' function? There shouldn't be a concept of a "default" or "owned" STE value.. Jason
On Thu, Jul 13, 2023 at 10:29 PM Jason Gunthorpe <jgg@nvidia.com> wrote: > It would make alot more sense if the STE value used by an unmanaged S1 > domain was located in/near the unmanaged domain or called 'unmanaged > S1 STE' or something if it really has to be in the master. Why does > this even need to be stored, can't we compute it? struct s1_cfg* and struct s2_cfg* are precisely what is used to compute an STE. For example, when s1_cfg is set, arm_smmu_write_strtab will write the s1_cfg's CD table dma_pointer into the STE's STRTAB_STE_0_CFG field. When neither are set, the STE fields are written to enable bypass (or abort depending on the config). > I'd think the basic mental model should be to extract the STE from the > thing you intend to install. Either the default CD table, or from the > iommu_domain. ie some 'get STE from iommu_domain' function? I don't follow this. When we attach a domain with pasid (whether through SVA or the set_dev_pasid API) , we don't want to install an entirely new CD table. We want to write something (page-table pointer) to a common CD table. Where should the s1_cfg which owns that common table live? I thought we concluded that it should be owned by the arm_smmu_master rather than any domain (to avoid dependencies between domains a-la aux-domain). With this change, even attach_dev with a DMA or UNMANAGED domain is now just preparing a single entry into this common CD table.
On Fri, Jul 14, 2023 at 12:16 AM Michael Shavit <mshavit@google.com> wrote: > domains a-la aux-domain). With this change, even attach_dev with a DMA > or UNMANAGED domain is now just preparing a single entry into this > common CD table. I have to correct this part, arm-smmu-v3.c's attach_dev() implementation does both: it writes to the pasid=0 entry of the owned_s1_cfg's CD table, and then installs that CD table to the STE.
On Fri, Jul 14, 2023 at 12:16:16AM +0800, Michael Shavit wrote: > On Thu, Jul 13, 2023 at 10:29 PM Jason Gunthorpe <jgg@nvidia.com> wrote: > > It would make alot more sense if the STE value used by an unmanaged S1 > > domain was located in/near the unmanaged domain or called 'unmanaged > > S1 STE' or something if it really has to be in the master. Why does > > this even need to be stored, can't we compute it? > > struct s1_cfg* and struct s2_cfg* are precisely what is used to > compute an STE. For example, when s1_cfg is set, arm_smmu_write_strtab > will write the s1_cfg's CD table dma_pointer into the STE's > STRTAB_STE_0_CFG field. When neither are set, the STE fields are > written to enable bypass (or abort depending on the config). I guess I never really understood why these were precomputed and stored at all. Even more confusing is why we need to keep pointers to them anywhere? Compute the STE and CDE directly from the source data when you need it? eg If I want to install an IDENITY domain into a STE then I compute the STE for identity and go ahead and do it. > > I'd think the basic mental model should be to extract the STE from the > > thing you intend to install. Either the default CD table, or from the > > iommu_domain. ie some 'get STE from iommu_domain' function? > > I don't follow this. When we attach a domain with pasid (whether > through SVA or the set_dev_pasid API) , we don't want to install an > entirely new CD table. The master object owns an optional CD table. If it is exsists it is used by every domain that is attached to that master. In the code flow there are two entry points to attach a domain, attach to a PASID or attach to a RID. For attach to PASID the code should always force the master to have a CD table and then attach the domain to the CD table. For attach to RID the code should do a bunch of checks and decide if it should force the master to have a CD table and attach the domain to that, or directly attach the domain to the STE. When the master gains a CD table then the CD table object becomes attached to the STE. In all cases we should be able to point to the object the STE points at and don't need a cfg or pointer to cfg since the object itself can provide the cfg. In all cases when you go to compute a STE you figure out what object is attached to it (CD or domain), compute the correct STE for that object, the set it. Same for he CDE, query the correct CDE from the iommu_domain when you attach it to the table. There should be no such thing as a "default" STE, and I question if it makes sense to even precompute the s1/s2_cfg values during finalize at all.. > We want to write something (page-table pointer) to a common CD > table. Where should the s1_cfg which owns that common table live? I would suggest a 'cd table struct' that as all the stuff related to the CD table, including an API to cacluate the STE this CD table requires. If not in actual code with a real struct, then in a logical sense in that a chunk of the master struct is the "CD table". > I thought we concluded that it should be owned by the > arm_smmu_master rather than any domain (to avoid dependencies > between domains a-la aux-domain). Yes, I'm not saying anything against that, just how and where the STE and CDE values flow around. Jason
On Thu, Jul 13, 2023 at 01:41:38PM -0300, Jason Gunthorpe wrote: > On Fri, Jul 14, 2023 at 12:16:16AM +0800, Michael Shavit wrote: > > On Thu, Jul 13, 2023 at 10:29 PM Jason Gunthorpe <jgg@nvidia.com> wrote: > > > It would make alot more sense if the STE value used by an unmanaged S1 > > > domain was located in/near the unmanaged domain or called 'unmanaged > > > S1 STE' or something if it really has to be in the master. Why does > > > this even need to be stored, can't we compute it? > > > > struct s1_cfg* and struct s2_cfg* are precisely what is used to > > compute an STE. For example, when s1_cfg is set, arm_smmu_write_strtab > > will write the s1_cfg's CD table dma_pointer into the STE's > > STRTAB_STE_0_CFG field. When neither are set, the STE fields are > > written to enable bypass (or abort depending on the config). > > I guess I never really understood why these were precomputed and > stored at all. Even more confusing is why we need to keep pointers to > them anywhere? Compute the STE and CDE directly from the source data > when you need it? > > eg If I want to install an IDENITY domain into a STE then I compute > the STE for identity and go ahead and do it. I think it'd be clear if the master stores STE value directly to get rid of s1_cfg/s2_cfg in the master struct. We fill in those domain-related STE fields when the domain attaches to the master and then call arm_smmu_write_strtab(). For struct arm_smmu_domain, it still has a union of a CD (for an S1 domain) or an s2_cfg (for an S2 domain). Or we could select a better naming for s2_cfg too, since s1_cfg is gone. Does this match with your expectation? > > > I'd think the basic mental model should be to extract the STE from the > > > thing you intend to install. Either the default CD table, or from the > > > iommu_domain. ie some 'get STE from iommu_domain' function? > > > > I don't follow this. When we attach a domain with pasid (whether > > through SVA or the set_dev_pasid API) , we don't want to install an > > entirely new CD table. > > The master object owns an optional CD table. If it is exsists it is > used by every domain that is attached to that master. > > In the code flow there are two entry points to attach a domain, attach > to a PASID or attach to a RID. > > For attach to PASID the code should always force the master to have a > CD table and then attach the domain to the CD table. > > For attach to RID the code should do a bunch of checks and decide if > it should force the master to have a CD table and attach the domain to > that, or directly attach the domain to the STE. I think a master own a CD table in both cases, though the table could have a single entry or multiple entries, depending on its ssid_bits/cdmax value. And since a master own the CD table, we could preallocate the CD table in arm_smmu_probe_device(), like Michael does in this series. And at the same time, the s1ctxptr field of the STE could be setup too. Then we insert the CD from a domain into the CD table during the domain attaching. Yet two cases that would waste the preallocated CD table: 1) If a master with ssid_bits=0 gets attached to an IDENITY S1 domain, it sets CONFIG=BYPASS in the STE, which wastes the single-entry CD table, unlike currently the driver bypassing the allocation of a CD table at all. 2) When enabling nested translation, we should replace all the S1 fields in the STE with guest/user values. So, the kernel- level CD table (either single-entry or multi-entry) in the master struct will not be used. Although this seems to be unchanged since currently the driver wastes this too in the default domain? With that, I still feel it clear and flexible. And I can simply add my use case of attaching an IDENITY domain to the default substream when having a multi-entry CD table. Thanks Nicolin
On Thu, Jul 13, 2023 at 12:54:47PM -0700, Nicolin Chen wrote: > On Thu, Jul 13, 2023 at 01:41:38PM -0300, Jason Gunthorpe wrote: > > On Fri, Jul 14, 2023 at 12:16:16AM +0800, Michael Shavit wrote: > > > On Thu, Jul 13, 2023 at 10:29 PM Jason Gunthorpe <jgg@nvidia.com> wrote: > > > > It would make alot more sense if the STE value used by an unmanaged S1 > > > > domain was located in/near the unmanaged domain or called 'unmanaged > > > > S1 STE' or something if it really has to be in the master. Why does > > > > this even need to be stored, can't we compute it? > > > > > > struct s1_cfg* and struct s2_cfg* are precisely what is used to > > > compute an STE. For example, when s1_cfg is set, arm_smmu_write_strtab > > > will write the s1_cfg's CD table dma_pointer into the STE's > > > STRTAB_STE_0_CFG field. When neither are set, the STE fields are > > > written to enable bypass (or abort depending on the config). > > > > I guess I never really understood why these were precomputed and > > stored at all. Even more confusing is why we need to keep pointers to > > them anywhere? Compute the STE and CDE directly from the source data > > when you need it? > > > > eg If I want to install an IDENITY domain into a STE then I compute > > the STE for identity and go ahead and do it. > > I think it'd be clear if the master stores STE value directly to > get rid of s1_cfg/s2_cfg in the master struct. We fill in those > domain-related STE fields when the domain attaches to the master > and then call arm_smmu_write_strtab(). Right, though master effectively records the STE when it writes it to the steering table. If we need another copy I don't know. > For struct arm_smmu_domain, it still has a union of a CD (for an > S1 domain) or an s2_cfg (for an S2 domain). Or we could select > a better naming for s2_cfg too, since s1_cfg is gone. By spec it should be called CD and STE value, but frankly I like CDTE and STE a little better (context descriptor table entry) as it evokes a sense of similarity. I don't care for the words 's1_cfg' and 's2_cfg' to represent the CD and STE, that is pretty confusing now that we have more uses for the CD and STE's than simple s1/s2 IO page tables. > I think a master own a CD table in both cases, though the table > could have a single entry or multiple entries Yes. > depending on its ssid_bits/cdmax value. And since a master own the > CD table, we could preallocate the CD table in > arm_smmu_probe_device(), like Michael does in this series. And at > the same time, the s1ctxptr field of the STE could be setup > too. Then we insert the CD from a domain into the CD table during > the domain attaching. We insert the CDTE from the domain into the CD table, and generate a STE for the CD table and insert it ino the Steering table. > Yet two cases that would waste the preallocated CD table: > 1) If a master with ssid_bits=0 gets attached to an IDENITY S1 > domain, it sets CONFIG=BYPASS in the STE, which wastes the > single-entry CD table, unlike currently the driver bypassing > the allocation of a CD table at all. > 2) When enabling nested translation, we should replace all the > S1 fields in the STE with guest/user values. So, the kernel- > level CD table (either single-entry or multi-entry) in the > master struct will not be used. Although this seems to be > unchanged since currently the driver wastes this too in the > default domain? As we discussed in another thread dynamic resizing of the CD table makes some sense. Eg keep it at one entry until PASID mode is enabled then resize it to the max PASID bits. But I view that as an improvement beyond here. It seems fine for now to pre allocate the full CD table and waste the memory. PASID capable devices are rare. > With that, I still feel it clear and flexible. And I can simply > add my use case of attaching an IDENITY domain to the default > substream when having a multi-entry CD table. Yes, with the above note about dynamically resizing the CD table, I think we generally have the idea that programming the CD, eg by installing an entry, can cause the CD tables's STE to (atomically) change. Eg to toggle the S1DSS bit if the PASID 0 is IDENTITY, or to resize the table if we exceed the PASID space. Longer term we'd also need a way to setup IDENTITY domains for !0 substreams as well too (eg for SIOV). It is too bad the CDTE doesn't have a bit for bypass. I suppose it will need dummy 1:1 IO page tables or something. Jason
On Thu, Jul 13, 2023 at 08:48:32PM -0300, Jason Gunthorpe wrote: > > For struct arm_smmu_domain, it still has a union of a CD (for an > > S1 domain) or an s2_cfg (for an S2 domain). Or we could select > > a better naming for s2_cfg too, since s1_cfg is gone. > > By spec it should be called CD and STE value, but frankly I like CDTE > and STE a little better (context descriptor table entry) as it evokes > a sense of similarity. I don't care for the words 's1_cfg' and > 's2_cfg' to represent the CD and STE, that is pretty confusing now > that we have more uses for the CD and STE's than simple s1/s2 IO page > tables. We have STE and PTE, so CDTE sounds legit to me, though probably it'd be safer by just following the spec? We can always add kdoc and comments to make things clear. @Michael, Would it be possible for you to update a v5, following Jason's suggestion overall? And I think we can have a smaller refactor series first without set_dev_pasid, to have a common codeline sooner for us to add new features, such as set_dev_pasid, the use case of IDENTITY default substream, and the nesting series. I will help testing with some pasid/non-pasid use cases too. > > Yet two cases that would waste the preallocated CD table: > > 1) If a master with ssid_bits=0 gets attached to an IDENITY S1 > > domain, it sets CONFIG=BYPASS in the STE, which wastes the > > single-entry CD table, unlike currently the driver bypassing > > the allocation of a CD table at all. > > 2) When enabling nested translation, we should replace all the > > S1 fields in the STE with guest/user values. So, the kernel- > > level CD table (either single-entry or multi-entry) in the > > master struct will not be used. Although this seems to be > > unchanged since currently the driver wastes this too in the > > default domain? > > As we discussed in another thread dynamic resizing of the CD table > makes some sense. Eg keep it at one entry until PASID mode is enabled > then resize it to the max PASID bits. I see. > But I view that as an improvement beyond here. It seems fine for now > to pre allocate the full CD table and waste the memory. PASID capable > devices are rare. Yea, that'd be easier for us to move forward other features :) > > With that, I still feel it clear and flexible. And I can simply > > add my use case of attaching an IDENITY domain to the default > > substream when having a multi-entry CD table. > > Yes, with the above note about dynamically resizing the CD table, I > think we generally have the idea that programming the CD, eg by > installing an entry, can cause the CD tables's STE to (atomically) > change. > > Eg to toggle the S1DSS bit if the PASID 0 is IDENTITY, or to resize > the table if we exceed the PASID space. Yea, we have enable_pasid() so probably can resize over there. > Longer term we'd also need a way to setup IDENTITY domains for !0 > substreams as well too (eg for SIOV). It is too bad the CDTE doesn't > have a bit for bypass. I suppose it will need dummy 1:1 IO page tables > or something. Oh. I haven't thought about that far. Noted it down. Thanks Nicolin
On Fri, Jul 14, 2023 at 12:41 AM Jason Gunthorpe <jgg@nvidia.com> wrote: > The master object owns an optional CD table. If it is exsists it is > used by every domain that is attached to that master. > > In the code flow there are two entry points to attach a domain, attach > to a PASID or attach to a RID. > > For attach to PASID the code should always force the master to have a > CD table and then attach the domain to the CD table. > > For attach to RID the code should do a bunch of checks and decide if > it should force the master to have a CD table and attach the domain to > that, or directly attach the domain to the STE. Yes. This is the current flow (except that we fail instead of forcing when a CD table isn't already attached in the PASID flow). owned_s1_cfg is simply a pre-allocated version of your optional CD table. > When the master gains a CD table then the CD table object becomes > attached to the STE. In all cases we should be able to point to the > object the STE points at and don't need a cfg or pointer to cfg since > the object itself can provide the cfg. Ok, practically speaking, are we just talking about reverting patch 3 and keeping a handle to the primary domain in arm_smmu_master? Instead of directly accessing the currently active CD table using arm_smmu_master->s1_cfg, you'd like set_dev_pasid() to look for it by investigating what kind of domain is attached, and reaching to the pre-alllocated/optional CD table otherwise if necessary? > > We want to write something (page-table pointer) to a common CD > > table. Where should the s1_cfg which owns that common table live? > > I would suggest a 'cd table struct' that as all the stuff related to > the CD table, including an API to cacluate the STE this CD table > requires. If not in actual code with a real struct, then in a logical > sense in that a chunk of the master struct is the "CD table". Sure, that's almost exactly what s1_cfg is today (with these patches).... * s1_cfg.arm_smmu_ctx_desc_cfg describes the CD table * s1_cfg.s1fmt and s1_cfg.s1cdmax describes attributes of that CD table. These could technically be deduced-back from arm_smmu_ctx_desc_cfg's l1_desc and num_l1_ents * s1_cfg.stall_enabled was introduced by patch 4 and in retrospect does not belong there at all. It should have gone in arm_smmu_ctx_desc instead (ignoring the whole should arm_smmu_ctx_desc even be precomputed in the first place topic for a second).
On Fri, Jul 14, 2023 at 9:14 AM Nicolin Chen <nicolinc@nvidia.com> wrote: > @Michael, > Would it be possible for you to update a v5, following Jason's > suggestion overall? And I think we can have a smaller refactor > series first without set_dev_pasid, to have a common codeline > sooner for us to add new features, such as set_dev_pasid, the > use case of IDENTITY default substream, and the nesting series. > I will help testing with some pasid/non-pasid use cases too. Want to make sure I fully understand these last few messages first. At a high level, we want: 1. arm_smmu_master is allowed to own a CD table, but not an STE-precursor (s1_cfg/s2_cfg). The s1_cfg is practically already that, and we can probably get there with minimal changes. 2. arm_smmu_master shouldn't point to the currently active CD table (which may or may not be the one it owns) or STE-precursor as a shortcut. All code should figure it out by looking at the master's currently attached domain (functionality could be provided by helper). 3. arm_smmu_domain shouldn't pre-generate any STE-precursors. The STE/CD for a domain should either be computed right when it is written, or computed ahead of time and stored as a copy in the smmu-domain. Is that right?
On Fri, Jul 14, 2023 at 05:12:35PM +0800, Michael Shavit wrote: > On Fri, Jul 14, 2023 at 9:14 AM Nicolin Chen <nicolinc@nvidia.com> wrote: > > @Michael, > > Would it be possible for you to update a v5, following Jason's > > suggestion overall? And I think we can have a smaller refactor > > series first without set_dev_pasid, to have a common codeline > > sooner for us to add new features, such as set_dev_pasid, the > > use case of IDENTITY default substream, and the nesting series. > > I will help testing with some pasid/non-pasid use cases too. > > Want to make sure I fully understand these last few messages first. At > a high level, we want: > 1. arm_smmu_master is allowed to own a CD table, but not an > STE-precursor (s1_cfg/s2_cfg). The s1_cfg is practically already that, > and we can probably get there with minimal changes. > 2. arm_smmu_master shouldn't point to the currently active CD table > (which may or may not be the one it owns) or STE-precursor as a > shortcut. All code should figure it out by looking at the master's > currently attached domain (functionality could be provided by helper). > 3. arm_smmu_domain shouldn't pre-generate any STE-precursors. The > STE/CD for a domain should either be computed right when it is > written, or computed ahead of time and stored as a copy in the > smmu-domain. > Is that right? To be honest, I'm having a hard time following the thread. I'd like to avoid renaming things "for fun", so let's try to focus on the actual restructuring initially. I'd also like to see what this "computing" of the SMMU data structures actually looks like -- the information stored in the 'arm_smmu_master' structure isn't redundant, so what's the problem? If you all grok this, then I'm happy to wait for the patches, but I don't want you to go away and spend lots of effort hacking something which doesn't end up being what folks are asking for. Will
On Fri, Jul 14, 2023 at 05:12:35PM +0800, Michael Shavit wrote: > On Fri, Jul 14, 2023 at 9:14 AM Nicolin Chen <nicolinc@nvidia.com> wrote: > > @Michael, > > Would it be possible for you to update a v5, following Jason's > > suggestion overall? And I think we can have a smaller refactor > > series first without set_dev_pasid, to have a common codeline > > sooner for us to add new features, such as set_dev_pasid, the > > use case of IDENTITY default substream, and the nesting series. > > I will help testing with some pasid/non-pasid use cases too. > > Want to make sure I fully understand these last few messages first. At > a high level, we want: > 1. arm_smmu_master is allowed to own a CD table, but not an > STE-precursor (s1_cfg/s2_cfg). The s1_cfg is practically already that, > and we can probably get there with minimal changes. Yah > 2. arm_smmu_master shouldn't point to the currently active CD table > (which may or may not be the one it owns) or STE-precursor as a > shortcut. All code should figure it out by looking at the master's > currently attached domain (functionality could be provided by > helper). I think that is close. Most likely the master should have a pointer to an STE owning iommu_domain. If that is !NULL then the master's CD table is not used and the STE is computed from that iommu_domain. Essentially it says that iommu_domain owns the entire RID. Otherwise the STE comes from the master's CD table, and this means the master is in SSID mode. > 3. arm_smmu_domain shouldn't pre-generate any STE-precursors. The > STE/CD for a domain should either be computed right when it is > written, or computed ahead of time and stored as a copy in the > smmu-domain. I think this is cleaner since alot of the STE/CD calculation is either constant or copying data from other places in the domain struct, but maybe you'll find this is wrong. At least it is not super important at this point. Jason
On Fri, Jul 14, 2023 at 04:02:50PM +0800, Michael Shavit wrote: > On Fri, Jul 14, 2023 at 12:41 AM Jason Gunthorpe <jgg@nvidia.com> wrote: > > The master object owns an optional CD table. If it is exsists it is > > used by every domain that is attached to that master. > > > > In the code flow there are two entry points to attach a domain, attach > > to a PASID or attach to a RID. > > > > For attach to PASID the code should always force the master to have a > > CD table and then attach the domain to the CD table. > > > > For attach to RID the code should do a bunch of checks and decide if > > it should force the master to have a CD table and attach the domain to > > that, or directly attach the domain to the STE. > > Yes. This is the current flow (except that we fail instead of forcing > when a CD table isn't already attached in the PASID flow). > owned_s1_cfg is simply a pre-allocated version of your optional CD > table. Really? That seems like a terrible name for the CD table. > > When the master gains a CD table then the CD table object becomes > > attached to the STE. In all cases we should be able to point to the > > object the STE points at and don't need a cfg or pointer to cfg since > > the object itself can provide the cfg. > > Ok, practically speaking, are we just talking about reverting patch 3 > and keeping a handle to the primary domain in arm_smmu_master? I think the master should have a pointer to the iommu_domain that owns the STE or if NULL the master should assign its internal CD table to the STE. The iommu_domain structs should not have any references to a CD table. > > I would suggest a 'cd table struct' that as all the stuff related to > > the CD table, including an API to cacluate the STE this CD table > > requires. If not in actual code with a real struct, then in a logical > > sense in that a chunk of the master struct is the "CD table". > > Sure, that's almost exactly what s1_cfg is today (with these patches).... > * s1_cfg.arm_smmu_ctx_desc_cfg describes the CD table > * s1_cfg.s1fmt and s1_cfg.s1cdmax describes attributes of that CD > table. These could technically be deduced-back from > arm_smmu_ctx_desc_cfg's l1_desc and num_l1_ents Yes, this makes sense, there is some redundancy here for sure. patch 1 makes sense, arm_smmu_ctx_desc is effectively the Context Descriptor Entry, and it belongs in the domain patch 2 should delete arm_smmu_s1_cfg and just put arm_smmu_ctx_desc_cfg directly in the master. arm_smmu_ctx_desc_cfg is a weird name for the contex descriptor table, but it is much less weird than s1_cfg. As you say s1fmt/s1cdmax are redundant. patch 3 I don't understand, we should not add something called s1_cfg/s2_cfg to the master. The master should have 'arm_smmu_ctx_desc_cfg cd_table' and 'arm_smmu_domain ste_domain' patch 4 should have everything working with the cd table accept a 'struct arm_smmu_ctx_desc_cfg *', eg arm_smmu_get_cd_ptr (get a pointer to a CD entry in the CD table). patch 5 makes sense, but something seems odd about the order as we somehow half moved it in patch 2? My suggestion for patch structure is to start by cleaning up the CD table object. Make arm_smmu_ctx_desc_cfg the CD table, remove the redudencies, remove arm_s1_cfg, clean the CD table APIs to only use 'struct arm_smmu_ctx_desc_cfg *', add the 'ste_domain' to the master, and then as the last step just move the arm_smmu_ctx_desc_cfg from the iommu_domain to the master. And that is a nice little series on its own - you end up with a shared CD table in the master, and no CD table in any domains. Jason
On Fri, Jul 14, 2023 at 9:21 PM Jason Gunthorpe <jgg@nvidia.com> wrote: > patch 2 should delete arm_smmu_s1_cfg and just put > arm_smmu_ctx_desc_cfg directly in the master. arm_smmu_ctx_desc_cfg is > a weird name for the contex descriptor table, but it is much less > weird than s1_cfg. As you say s1fmt/s1cdmax are redundant. s1fmt is fairly trivial to replace but s1cdmax requires inversing previous computations. I don't really buy that getting rid of it simplifies anything, even if it's technically redundant. > patch 3 I don't understand, we should not add something called > s1_cfg/s2_cfg to the master. The master should have > 'arm_smmu_ctx_desc_cfg cd_table' and 'arm_smmu_domain ste_domain' This was simply meant to be a more convenient way of finding the currently active cdtable from the arm_smmu_write_ctx_desc/write_strtab_ent functions without having to inspect the currently attached domain. But sure, that's easy enough to revert. > patch 5 makes sense, but something seems odd about the order as we > somehow half moved it in patch 2? Ack; patch 2 can be reordered to come after patch 4, or even squashed with 5 if you prefer. > My suggestion for patch structure is to start by cleaning up the CD > table object. Make arm_smmu_ctx_desc_cfg the CD table, remove the > redudencies, remove arm_s1_cfg, clean the CD table APIs to only use > 'struct arm_smmu_ctx_desc_cfg *', add the 'ste_domain' to the master, > and then as the last step just move the arm_smmu_ctx_desc_cfg from the > iommu_domain to the master. > > And that is a nice little series on its own - you end up with a shared > CD table in the master, and no CD table in any domains. I don't entirely buy that refactoring s1_cfg is worth the extra effort, nor that it should be tied to this patch series. This series already makes s1_cfg behave as the CD table; whether we want to entirely get rid of pre-computed data useful for writing an STE sounds like a separate cleanup.
On Mon, Jul 17, 2023 at 06:06:19PM +0800, Michael Shavit wrote: > On Fri, Jul 14, 2023 at 9:21 PM Jason Gunthorpe <jgg@nvidia.com> wrote: > > patch 2 should delete arm_smmu_s1_cfg and just put > > arm_smmu_ctx_desc_cfg directly in the master. arm_smmu_ctx_desc_cfg is > > a weird name for the contex descriptor table, but it is much less > > weird than s1_cfg. As you say s1fmt/s1cdmax are redundant. > > s1fmt is fairly trivial to replace but s1cdmax requires inversing > previous computations. I don't really buy that getting rid of it > simplifies anything, even if it's technically redundant. Then store s1cdmax in the arm_smmu_ctx_desc_cfg and still get rid of arm_smmu_s1_cfg The point is to get rid of "s1_cfg" entirely as language in the driver *because it makes zero sense now*. Prior to your restructuring it was sort of the STE to use for the S1 page table which had an embedded CD. After all this change it isn't realated to a S1 page table anymore at all. > > patch 3 I don't understand, we should not add something called > > s1_cfg/s2_cfg to the master. The master should have > > 'arm_smmu_ctx_desc_cfg cd_table' and 'arm_smmu_domain ste_domain' > > This was simply meant to be a more convenient way of finding the > currently active cdtable from the There is only one (meaningful) choice though, the cdtable is either the master's default CD table or it isn't. You detect that by checking for NULL ste_domain > > My suggestion for patch structure is to start by cleaning up the CD > > table object. Make arm_smmu_ctx_desc_cfg the CD table, remove the > > redudencies, remove arm_s1_cfg, clean the CD table APIs to only use > > 'struct arm_smmu_ctx_desc_cfg *', add the 'ste_domain' to the master, > > and then as the last step just move the arm_smmu_ctx_desc_cfg from the > > iommu_domain to the master. > > > > And that is a nice little series on its own - you end up with a shared > > CD table in the master, and no CD table in any domains. > > I don't entirely buy that refactoring s1_cfg is worth the extra > effort, nor that it should be tied to this patch series. This series > already makes s1_cfg behave as the CD table; whether we want to > entirely get rid of pre-computed data useful for writing an STE sounds > like a separate cleanup. s1_cfg is a terrible name to describe the cd table. Please do the tiny bit extra to get rid of it for clarity. Jason
Haven't had time to address this yet; but Ack on last message. Will address in a v5 series.
Sorry for the delay; I'm trying to refactor these patches now. > I think the master should have a pointer to the iommu_domain that owns > the STE or if NULL the master should assign its internal CD table to > the STE. Just to clarify, does the nested domain patch series require writing CDs into the user-space's CD table using arm_smmu_write_ctx_desc()? Or is there any other requirement for writing a CD into a domain-owned CD table from arm_smmu_write_ctx_desc? Writing a CD entry to the CD table involves some dependencies on the master(s) that the table is attached to: 1. The CD entries STALL bit value in arm_smmu_write_ctx_desc depends on the master (e.g. if STALL_FORCE is set on the smmu device). This could potentially be encoded in arm_smmu_ctx_desc_cfg, at which point that CD table is only attachable to masters with the same stall_enabled value. 2. arm_smmu_write_ctx_desc must sync the CD for the attached master(s) in the middle of writing CD entry. This is all easier to handle in arm_smmu_write_ctx_desc if the table is always owned by the master.
On Thu, Jul 27, 2023 at 07:22:05PM +0800, Michael Shavit wrote: > Sorry for the delay; I'm trying to refactor these patches now. > > > I think the master should have a pointer to the iommu_domain that owns > > the STE or if NULL the master should assign its internal CD table to > > the STE. > Just to clarify, does the nested domain patch series require writing > CDs into the user-space's CD table using arm_smmu_write_ctx_desc()? No, CD entries in nested CD tables are written by userspace only. > Or is there any other requirement for writing a CD into a > domain-owned CD table from arm_smmu_write_ctx_desc? Not that I know of. The only time the kernel writes a CD entry is to the shared CD table stored in the master. > 1. The CD entries STALL bit value in arm_smmu_write_ctx_desc depends > on the master (e.g. if STALL_FORCE is set on the smmu device). This > could potentially be encoded in arm_smmu_ctx_desc_cfg, at which point > that CD table is only attachable to masters with the same > stall_enabled value. For cleanness I would orgnize it like this. > 2. arm_smmu_write_ctx_desc must sync the CD for the attached master(s) > in the middle of writing CD entry. > > This is all easier to handle in arm_smmu_write_ctx_desc if the table > is always owned by the master. I think it is fine if you start with a shared CD table being 1:1 with a single master. Making the CD table shared between masters (eg for multi-device-group) is a micro-optimization, and I'm not sure we have workloads where it is worthwhile. We already block PASID support for multi-device-group. Though resizable CD table is probably a better place to put efforts. Jason
Awesome that helps a lot. These are the patches I have in mind: 1. Move ctx_desc out of s1_cfg 2. Replace domain->s1_cfg with `struct arm_smmu_ctx_desc_cfg cd_table` 3. Move stall_enabled from domain to arm_smmu_ctx_desc_cfg 4. Accept an arm_smmu_master instead of an arm_smmu_domain as the parameter for arm_smmu_write_ctx_desc --- arm_smmu_write_ctx_desc can simply get the cd_table it writes to from master->domain->cd_table; this get's replaced by master->cd_table in subsequent commits. --- SVA is weird if we cut changes here: it iterates over all masters a domain is attached to in order to call arm_smmu_write_ctx_desc(), which ends up writing to the shared cd_table since in master->domain. This becomes more sensible once masters stop sharing the cd _table in subsequent commits. --- arm_smmu_write_ctx_desc is used to sync the cd for all masters the domain was attached to. Now that SVA is calling it in a loop, and to break the dependency on arm_smmu_domain, we should only sync for the master passed in as the parameter 5. Introduce a skip_cd_sync parameter to arm_smmu_write_ctx_desc so that arm_smmu_domain_finalise_s1 can skip the sync_cd before the cd_table is attached to the master. Before the previous commit, arm_smmu_domain_finalise_s1 was calling arm_smmu_write_ctx_desc() before adding the master to the domain->devices list, which was used by arm_sync_smmu() to issue sync commands to masters. This optimization was broken by the previous commit since we stopped depending on domain->devices. 6. Move cd_table from domain to arm_smmu_master->cd_table.
On Thu, Jul 27, 2023 at 10:04:04PM +0800, Michael Shavit wrote: > Awesome that helps a lot. These are the patches I have in mind: > > 1. Move ctx_desc out of s1_cfg > 2. Replace domain->s1_cfg with `struct arm_smmu_ctx_desc_cfg cd_table` > 3. Move stall_enabled from domain to arm_smmu_ctx_desc_cfg > 4. Accept an arm_smmu_master instead of an arm_smmu_domain as the > parameter for arm_smmu_write_ctx_desc > --- arm_smmu_write_ctx_desc can simply get the cd_table it writes to > from master->domain->cd_table; this get's replaced by master->cd_table > in subsequent commits. Makes sense > --- SVA is weird if we cut changes here: it iterates over all masters > a domain is attached to in order to call arm_smmu_write_ctx_desc(), > which ends up writing to the shared cd_table since in master->domain. > This becomes more sensible once masters stop sharing the cd _table in > subsequent commits. Seems like it is OK inside the series > --- arm_smmu_write_ctx_desc is used to sync the cd for all masters the > domain was attached to. Now that SVA is calling it in a loop, and to > break the dependency on arm_smmu_domain, we should only sync for the > master passed in as the parameter Sounds correct > 5. Introduce a skip_cd_sync parameter to arm_smmu_write_ctx_desc so > that arm_smmu_domain_finalise_s1 can skip the sync_cd before the > cd_table is attached to the master. Before the previous commit, > arm_smmu_domain_finalise_s1 was calling arm_smmu_write_ctx_desc() > before adding the master to the domain->devices list, which was used > by arm_sync_smmu() to issue sync commands to masters. This > optimization was broken by the previous commit since we stopped > depending on domain->devices. Tracking if the cd table is not yet installed in a STE might be a cleaner approach than a flag? > 6. Move cd_table from domain to arm_smmu_master->cd_table. Yep Jason
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index beff04b897718..023769f5ca79a 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1126,15 +1126,16 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid, return 0; } -static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain) +static int arm_smmu_init_s1_cfg(struct arm_smmu_master *master, + struct arm_smmu_s1_cfg *cfg) { int ret; size_t l1size; size_t max_contexts; - struct arm_smmu_device *smmu = smmu_domain->smmu; - struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; + struct arm_smmu_device *smmu = master->smmu; struct arm_smmu_ctx_desc_cfg *cdcfg = &cfg->cdcfg; + cfg->s1cdmax = master->ssid_bits; max_contexts = 1 << cfg->s1cdmax; if (!(smmu->features & ARM_SMMU_FEAT_2_LVL_CDTAB) || @@ -1175,12 +1176,11 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain) return ret; } -static void arm_smmu_free_cd_tables(struct arm_smmu_domain *smmu_domain) +static void arm_smmu_free_cd_tables(struct arm_smmu_device *smmu, + struct arm_smmu_ctx_desc_cfg *cdcfg) { int i; size_t size, l1size; - struct arm_smmu_device *smmu = smmu_domain->smmu; - struct arm_smmu_ctx_desc_cfg *cdcfg = &smmu_domain->s1_cfg.cdcfg; if (cdcfg->l1_desc) { size = CTXDESC_L2_ENTRIES * (CTXDESC_CD_DWORDS << 3); @@ -2076,7 +2076,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) /* Prevent SVA from touching the CD while we're freeing it */ mutex_lock(&arm_smmu_asid_lock); if (cfg->cdcfg.cdtab) - arm_smmu_free_cd_tables(smmu_domain); + arm_smmu_free_cd_tables(smmu, &cfg->cdcfg); arm_smmu_free_asid(&smmu_domain->cd); mutex_unlock(&arm_smmu_asid_lock); } else { @@ -2108,11 +2108,9 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, if (ret) goto out_unlock; - cfg->s1cdmax = master->ssid_bits; - smmu_domain->stall_enabled = master->stall_enabled; - ret = arm_smmu_alloc_cd_tables(smmu_domain); + ret = arm_smmu_init_s1_cfg(master, cfg); if (ret) goto out_free_asid; @@ -2140,7 +2138,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain, return 0; out_free_cd_tables: - arm_smmu_free_cd_tables(smmu_domain); + arm_smmu_free_cd_tables(smmu, &cfg->cdcfg); out_free_asid: arm_smmu_free_asid(cd); out_unlock: @@ -2704,6 +2702,13 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) smmu->features & ARM_SMMU_FEAT_STALL_FORCE) master->stall_enabled = true; + ret = arm_smmu_init_s1_cfg(master, &master->owned_s1_cfg); + if (ret) { + arm_smmu_disable_pasid(master); + arm_smmu_remove_master(master); + goto err_free_master; + } + return &smmu->iommu; err_free_master: @@ -2719,6 +2724,7 @@ static void arm_smmu_release_device(struct device *dev) if (WARN_ON(arm_smmu_master_sva_enabled(master))) iopf_queue_remove_device(master->smmu->evtq.iopf, dev); arm_smmu_detach_dev(master); + arm_smmu_free_cd_tables(master->smmu, &master->owned_s1_cfg.cdcfg); arm_smmu_disable_pasid(master); arm_smmu_remove_master(master); kfree(master); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 68d519f21dbd8..053cc14c23969 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -688,6 +688,7 @@ struct arm_smmu_master { struct arm_smmu_domain *domain; struct list_head domain_head; struct arm_smmu_stream *streams; + struct arm_smmu_s1_cfg owned_s1_cfg; unsigned int num_streams; bool ats_enabled; bool stall_enabled;
Except for Nested domains, arm_smmu_master will own the STEs that are inserted into the arm_smmu_device's STE table. Signed-off-by: Michael Shavit <mshavit@google.com> --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 28 +++++++++++++-------- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 1 + 2 files changed, 18 insertions(+), 11 deletions(-)