[12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order

Message ID 12-v1-e289ca9121be+2be-smmuv3_newapi_p1_jgg@nvidia.com (mailing list archive)
State New, archived
Series Update SMMUv3 to the modern iommu API (part 1/2)

Commit Message

Jason Gunthorpe Oct. 11, 2023, 12:33 a.m. UTC
Get closer to the IOMMU API ideal that changes between domains can be
hitless. The ordering for the CD table entry is not entirely clean from
this perspective.

When switching away from a STE with a CD table programmed in it we should
write the new STE first, then clear any old data in the CD entry.

If we are programming a CD table for the first time to a STE then the CD
entry should be programmed before the STE is loaded.

If we are replacing a CD table entry when the STE already points at the CD
entry then we just need to do the make/break sequence.

Lift this code out of arm_smmu_detach_dev() so it can all be sequenced
properly. The only other caller is arm_smmu_release_device() and it is
going to free the cdtable anyhow, so it doesn't matter what is in it.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 29 ++++++++++++++-------
 1 file changed, 20 insertions(+), 9 deletions(-)
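
The ordering rules in the commit message can be sketched as a small planner.
This is only an illustrative model, not driver code: toy_plan_attach(), the
toy_domain and toy_op enums, and the cdtab_exists flag are hypothetical names,
and CD table allocation and error handling are left out. It assumes
cdtab_exists stands for "master->cd_table.cdtab is already allocated" when the
attach starts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three domain stages handled by attach. */
enum toy_domain { TOY_S1, TOY_S2, TOY_BYPASS };

/* Hypothetical operations whose relative order the patch is about. */
enum toy_op { TOY_OP_CLEAR_CD, TOY_OP_WRITE_CD, TOY_OP_WRITE_STE };

/*
 * Record the order of operations for one attach in ops[] and return how
 * many were recorded (at most 3).
 */
static size_t toy_plan_attach(bool cdtab_exists, enum toy_domain target,
			      enum toy_op ops[3])
{
	size_t n = 0;

	assert(ops);
	if (target == TOY_S1) {
		/* The CD writer needs an invalid entry, so clear any old one. */
		if (cdtab_exists)
			ops[n++] = TOY_OP_CLEAR_CD;
		/* Program the CD entry before the STE points at the table. */
		ops[n++] = TOY_OP_WRITE_CD;
		ops[n++] = TOY_OP_WRITE_STE;
	} else {
		/* Move the STE away from the CD table first ... */
		ops[n++] = TOY_OP_WRITE_STE;
		/* ... and only then clear the stale CD entry. */
		if (cdtab_exists)
			ops[n++] = TOY_OP_CLEAR_CD;
	}
	return n;
}
```

With cdtab_exists true, an S1 target yields clear, write CD, write STE, while
an S2 or bypass target yields write STE, clear.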

Comments

Michael Shavit Oct. 12, 2023, 9:01 a.m. UTC | #1
On Wed, Oct 11, 2023 at 8:33 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> If we are replacing a CD table entry when the STE already points at the CD
> entry then we just need to do the make/break sequence.

Do you mean when the STE already points at the CD table? What's the
make/break sequence?


>  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> @@ -2554,6 +2546,17 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>                                 master->domain = NULL;
>                                 goto out_list_del;
>                         }
> +               } else {
> +                       /*
> +                        * arm_smmu_write_ctx_desc() relies on the entry being
> +                        * invalid to work, clear any existing entry.
> +                        */
> +                       ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
> +                                                     NULL);
> +                       if (ret) {
> +                               master->domain = NULL;
> +                               goto out_list_del;
> +                       }
>                 }
>
>                 ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
> @@ -2563,15 +2566,23 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>                 }
>
>                 arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
> +               arm_smmu_install_ste_for_dev(master, &target);

Even if it's handled correctly under the hood by clever ste writing
logic, isn't it weird that we don't explicitly check whether the CD
table is already installed and skip arm_smmu_install_ste_for_dev in
that case?

Jason Gunthorpe Oct. 12, 2023, 12:34 p.m. UTC | #2
On Thu, Oct 12, 2023 at 05:01:16PM +0800, Michael Shavit wrote:
> On Wed, Oct 11, 2023 at 8:33 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > If we are replacing a CD table entry when the STE already points at the CD
> > entry then we just need to do the make/break sequence.
> 
> Do you mean when the STE already points at the CD table? 

Yes

> What's the make/break sequence?

When replacing a CD table entry at this point the code makes the CD
table entry non-valid then immediately makes it valid. This is because
the CD code cannot (yet, ~10 patches later it does) handle a Valid to
Valid transition.
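
As a rough illustration of that sequence, here is a minimal sketch using a toy
entry. struct toy_cd, toy_write_cd() and toy_replace_cd() are hypothetical
names and do not reflect the real CD layout or arm_smmu_write_ctx_desc(); the
sketch only assumes a writer that refuses a Valid to Valid update, so a
replace has to go through an invalid entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy context descriptor: a valid bit plus one payload word. */
struct toy_cd {
	bool valid;
	uint64_t ttb;
};

/*
 * Toy writer: a NULL new_cd clears the slot. A non-NULL new_cd is only
 * accepted when the slot is currently invalid (no Valid to Valid support).
 */
static int toy_write_cd(struct toy_cd *slot, const struct toy_cd *new_cd)
{
	assert(slot);
	if (!new_cd) {
		slot->valid = false;
		slot->ttb = 0;
		return 0;
	}
	if (slot->valid)
		return -1;
	/* Fill in the payload first, set the valid bit last. */
	slot->ttb = new_cd->ttb;
	slot->valid = true;
	return 0;
}

/* Replace an entry: make it non-valid, then immediately make it valid. */
static int toy_replace_cd(struct toy_cd *slot, const struct toy_cd *new_cd)
{
	int ret = toy_write_cd(slot, NULL);

	if (ret)
		return ret;
	return toy_write_cd(slot, new_cd);
}
```

Between the two writes the entry is invalid, which is why this kind of replace
is not hitless.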

> > +               } else {
> > +                       /*
> > +                        * arm_smmu_write_ctx_desc() relies on the entry being
> > +                        * invalid to work, clear any existing entry.
> > +                        */
> > +                       ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
> > +                                                     NULL);
> > +                       if (ret) {
> > +                               master->domain = NULL;
> > +                               goto out_list_del;
> > +                       }
> >                 }
> >
> >                 ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
> > @@ -2563,15 +2566,23 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> >                 }
> >
> >                 arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
> > +               arm_smmu_install_ste_for_dev(master, &target);
> 
> Even if it's handled correctly under the hood by clever ste writing
> logic, isn't it weird that we don't explicitly check whether the CD
> table is already installed and skip arm_smmu_install_ste_for_dev in
> that case?

There is a design logic at work here.

At this layer in the code we think in terms of 'target state'. We know
what the correct STE must be, so we compute that full value and make
the HW use that value. The lower layer computes the steps required to
put the HW into the target state, which might be a NOP.

Trying to optimize the NOP here means this layer has to keep track
of what state the STE is currently in vs only tracking what state it
should be in. Avoiding that tracking is a main point of the new
programming logic.

This is a pretty common design pattern, "desired state" or "target
state".
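
A minimal sketch of the pattern, using a toy four-word entry: struct toy_entry
and toy_install_entry() are hypothetical names, and the word-by-word compare
is an assumption made for illustration, not the STE writing logic of this
series. The caller always passes the full target value; the lower layer works
out what, if anything, needs to be written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ENTRY_WORDS 4

/* Hypothetical stand-in for a hardware table entry. */
struct toy_entry {
	uint64_t data[TOY_ENTRY_WORDS];
};

/*
 * Drive *cur to *target and return the number of words written.
 * If *cur already equals *target this is a NOP and returns 0.
 */
static unsigned int toy_install_entry(struct toy_entry *cur,
				      const struct toy_entry *target)
{
	unsigned int writes = 0;
	size_t i;

	assert(cur && target);
	for (i = 0; i < TOY_ENTRY_WORDS; i++) {
		if (cur->data[i] != target->data[i]) {
			cur->data[i] = target->data[i];
			writes++;
		}
	}
	return writes;
}
```

The caller never has to remember what the entry held before. Installing the
same target twice simply results in zero writes the second time.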

Later on this becomes more complex as the CD table may be installed to
the STE but the S1DSS or EATS is not correct for S1 operation. Coding
it this way eventually trivially corrects those things as well. That
is something like 30 patches later.

Regards,
Jason
Patch

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index b84cd91dc5e596..540f38bb44873e 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2470,14 +2470,6 @@  static void arm_smmu_detach_dev(struct arm_smmu_master *master)
 
 	master->domain = NULL;
 	master->ats_enabled = false;
-	/*
-	 * Clearing the CD entry isn't strictly required to detach the domain
-	 * since the table is uninstalled anyway, but it helps avoid confusion
-	 * in the call to arm_smmu_write_ctx_desc on the next attach (which
-	 * expects the entry to be empty).
-	 */
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1 && master->cd_table.cdtab)
-		arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
 }
 
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -2554,6 +2546,17 @@  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 				master->domain = NULL;
 				goto out_list_del;
 			}
+		} else {
+			/*
+			 * arm_smmu_write_ctx_desc() relies on the entry being
+			 * invalid to work, clear any existing entry.
+			 */
+			ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
+						      NULL);
+			if (ret) {
+				master->domain = NULL;
+				goto out_list_del;
+			}
 		}
 
 		ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
@@ -2563,15 +2566,23 @@  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		}
 
 		arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
+		arm_smmu_install_ste_for_dev(master, &target);
 		break;
 	case ARM_SMMU_DOMAIN_S2:
 		arm_smmu_make_s2_domain_ste(&target, master, smmu_domain);
+		arm_smmu_install_ste_for_dev(master, &target);
+		if (master->cd_table.cdtab)
+			arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
+						      NULL);
 		break;
 	case ARM_SMMU_DOMAIN_BYPASS:
 		arm_smmu_make_bypass_ste(&target);
+		arm_smmu_install_ste_for_dev(master, &target);
+		if (master->cd_table.cdtab)
+			arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
+						      NULL);
 		break;
 	}
-	arm_smmu_install_ste_for_dev(master, &target);
 
 	arm_smmu_enable_ats(master);
 	goto out_unlock;