[v4,06/16] iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev

Message ID 6-v4-c93b774edcc4+42d2b-smmuv3_newapi_p1_jgg@nvidia.com (mailing list archive)
State New, archived
Series Update SMMUv3 to the modern iommu API (part 1/3)

Commit Message

Jason Gunthorpe Jan. 25, 2024, 11:57 p.m. UTC
The BTM support wants to be able to change the ASID of any smmu_domain.
When it goes to do this it holds the arm_smmu_asid_lock and iterates over
the target domain's devices list.

During attach of a S1 domain we must ensure that the devices list and
CD are in sync, otherwise we could miss CD updates or a parallel CD update
could push an out of date CD.

This is pretty complicated, and almost works today because
arm_smmu_detach_dev() removes the master from the linked list before
working on the CD entries, preventing parallel update of the CD.

However, it does have an issue where the CD can remain programmed while the
domain appears to be unattached. arm_smmu_share_asid() will then not clear
any CD entries and will install its own CD entry with the same ASID
concurrently. This creates a small race window where the IOMMU can see two
CDs with the same ASID pointing to different translations.

Solve this by wrapping most of the attach flow in the
arm_smmu_asid_lock. This locks more than strictly needed to prepare for
the next patch which will reorganize the order of the linked list, STE and
CD changes.

Move arm_smmu_detach_dev() until after we have initialized the domain so
the lock can be held for less time.
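
The resulting lock scope can be pictured with a small userspace sketch. It
is not driver code: asid_lock, init_domain(), detach_old(), write_cd() and
attach_sketch() are hypothetical stand-ins, and the asid_lock_held flag
exists only so the example can be checked. It assumes nothing beyond what
the description above says: initialization happens before the lock is
taken, and every later exit goes through a single unlock label.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for arm_smmu_asid_lock. */
static pthread_mutex_t asid_lock = PTHREAD_MUTEX_INITIALIZER;
/* Bookkeeping for the example only: true while asid_lock is held. */
static bool asid_lock_held;

/* Hypothetical stand-ins for the steps of the attach flow. */
static int init_domain(bool fail) { return fail ? -1 : 0; }
static void detach_old(void) { }
static int write_cd(bool fail) { return fail ? -1 : 0; }

bool asid_lock_is_held(void)
{
	return asid_lock_held;
}

int attach_sketch(bool init_fails, bool cd_fails)
{
	int ret;

	/* Domain initialization runs before the lock is taken. */
	ret = init_domain(init_fails);
	if (ret)
		return ret;

	pthread_mutex_lock(&asid_lock);
	asid_lock_held = true;

	/* Detach of the old domain now happens under the lock. */
	detach_old();

	ret = write_cd(cd_fails);
	if (ret)
		goto out_unlock;

	/* Installing the STE would follow here, still under the lock. */

out_unlock:
	asid_lock_held = false;
	pthread_mutex_unlock(&asid_lock);
	return ret;
}
```

Whether the CD write succeeds or fails, the function leaves through
out_unlock, so the lock is never left held on an error path.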

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 22 ++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

Comments

Mostafa Saleh Feb. 1, 2024, 12:15 p.m. UTC | #1
Hi Jason,

On Thu, Jan 25, 2024 at 07:57:16PM -0400, Jason Gunthorpe wrote:
> The BTM support wants to be able to change the ASID of any smmu_domain.
> When it goes to do this it holds the arm_smmu_asid_lock and iterates over
> the target domain's devices list.
> 
> During attach of a S1 domain we must ensure that the devices list and
> CD are in sync, otherwise we could miss CD updates or a parallel CD update
> could push an out of date CD.
> 
> This is pretty complicated, and almost works today because
> arm_smmu_detach_dev() removes the master from the linked list before
> working on the CD entries, preventing parallel update of the CD.
> 
> However, it does have an issue where the CD can remain programmed while the
> domain appears to be unattached. arm_smmu_share_asid() will then not clear
> any CD entries and will install its own CD entry with the same ASID
> concurrently. This creates a small race window where the IOMMU can see two
> CDs with the same ASID pointing to different translations.

I don’t see the race condition.

The current flow is as follows,
For SVA, if the asid was used by domain_x, it will do:

lock(arm_smmu_asid_lock)
Alloc new asid and set cd->asid.
lock(domain_x->devices_lock)
Write new CD with the new asid
unlock(domain_x->devices_lock)
unlock(arm_smmu_asid_lock)

For attach_dev (domain_y), if the device was attached to domain_z
//Detach old domain
lock(domain_z->devices_lock)
Remove master from old domain
unlock(domain_z->devices_lock)
Clear CD
//Attach new domain
lock(arm_smmu_asid_lock)
Allocate ASID
unlock(arm_smmu_asid_lock)

lock(domain_y->devices_lock)
Insert new master.
unlock(domain_y->devices_lock)

lock(arm_smmu_asid_lock)
Write CD
unlock(arm_smmu_asid_lock)


In case
1) domain_x == domain_z(old domain)
Write to the CD is protected by domain_x->devices_lock, so either:
    a) The device will be removed, so SVA code will not touch it, and the
    detach will clear the CD.
    b) The device CD will be updated from the SVA with the new code, but then
    it will be removed from the domain and cleared.

I don’t see any case where we end with a programmed CD.

2) domain_x == domain_y(new domain)

Similarly the device would either see the new CD(new asid) or the old CD
then the new CD.

Can you please clarify the race condition? It seems I am missing something.

> Solve this by wrapping most of the attach flow in the
> arm_smmu_asid_lock. This locks more than strictly needed to prepare for
> the next patch which will reorganize the order of the linked list, STE and
> CD changes.
> 
> Move arm_smmu_detach_dev() till after we have initialized the domain so
> the lock can be held for less time.
> 
> Reviewed-by: Michael Shavit <mshavit@google.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Moritz Fischer <moritzf@google.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 22 ++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 9a95d0f1494223..539ef380f457fa 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2612,8 +2612,6 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  		return -EBUSY;
>  	}
>  
> -	arm_smmu_detach_dev(master);
> -
>  	mutex_lock(&smmu_domain->init_mutex);
>  
>  	if (!smmu_domain->smmu) {
> @@ -2628,6 +2626,16 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  	if (ret)
>  		return ret;
>  
> +	/*
> +	 * Prevent arm_smmu_share_asid() from trying to change the ASID
> +	 * of either the old or new domain while we are working on it.
> +	 * This allows the STE and the smmu_domain->devices list to
> +	 * be inconsistent during this routine.
> +	 */
> +	mutex_lock(&arm_smmu_asid_lock);
> +
> +	arm_smmu_detach_dev(master);
> +
>  	master->domain = smmu_domain;
>  
>  	/*
> @@ -2653,13 +2661,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  			}
>  		}
>  
> -		/*
> -		 * Prevent SVA from concurrently modifying the CD or writing to
> -		 * the CD entry
> -		 */
> -		mutex_lock(&arm_smmu_asid_lock);
>  		ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
> -		mutex_unlock(&arm_smmu_asid_lock);
>  		if (ret) {
>  			master->domain = NULL;
>  			goto out_list_del;
> @@ -2669,13 +2671,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  	arm_smmu_install_ste_for_dev(master);
>  
>  	arm_smmu_enable_ats(master);
> -	return 0;
> +	goto out_unlock;
>  
>  out_list_del:
>  	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
>  	list_del(&master->domain_head);
>  	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>  
> +out_unlock:
> +	mutex_unlock(&arm_smmu_asid_lock);
>  	return ret;
>  }
>  
> -- 
> 2.43.0
> 

Thanks,
Mostafa

Jason Gunthorpe Feb. 1, 2024, 1:24 p.m. UTC | #2
On Thu, Feb 01, 2024 at 12:15:53PM +0000, Mostafa Saleh wrote:
> Hi Jason,
> 
> On Thu, Jan 25, 2024 at 07:57:16PM -0400, Jason Gunthorpe wrote:
> > The BTM support wants to be able to change the ASID of any smmu_domain.
> > When it goes to do this it holds the arm_smmu_asid_lock and iterates over
> > the target domain's devices list.
> > 
> > During attach of a S1 domain we must ensure that the devices list and
> > CD are in sync, otherwise we could miss CD updates or a parallel CD update
> > could push an out of date CD.
> > 
> > This is pretty complicated, and almost works today because
> > arm_smmu_detach_dev() removes the master from the linked list before
> > working on the CD entries, preventing parallel update of the CD.
> > 
> > However, it does have an issue where the CD can remain programmed while the
> > domain appears to be unattached. arm_smmu_share_asid() will then not clear
> > any CD entries and will install its own CD entry with the same ASID
> > concurrently. This creates a small race window where the IOMMU can see two
> > CDs with the same ASID pointing to different translations.
> 
> I don’t see the race condition.
> 
> The current flow is as follows,
> For SVA, if the asid was used by domain_x, it will do:
> 
> lock(arm_smmu_asid_lock)
> Alloc new asid and set cd->asid.
> lock(domain_x->devices_lock)
> Write new CD with the new asid
> unlock(domain_x->devices_lock)
> unlock(arm_smmu_asid_lock)
> 
> For attach_dev (domain_y), if the device was attached to domain_z
> //Detach old domain
> lock(domain_z->devices_lock)
> Remove master from old domain
> unlock(domain_z->devices_lock)

At this moment all locks are dropped and the RID's CD entry continues
to use the ASID.

The racing BTM flow now runs and will do your above:

arm_smmu_mmu_notifier_get()
 arm_smmu_alloc_shared_cd()
  arm_smmu_share_asid():
    arm_smmu_update_ctx_desc_devices() <<-- Does nothing due to list_del above
    arm_smmu_tlb_inv_asid() <<-- Whoops, we are invalidating an ASID that is still in a CD!
 arm_smmu_write_ctx_desc() <<-- Install a new translation on a PASID's CD

Now the HW can observe two installed CDs using the same ASID but they
point to different translations. This is illegal.

> Clear CD

Now we remove the RID CD, but it is too late, the PASID CD is already
installed.
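
As a rough illustration, the interleaving above can be replayed in a toy
model. This is a sketch, not driver code: struct toy_cd, struct toy_state
and every toy_* function are hypothetical, the model tracks one RID CD and
one PASID CD only, and the ASID and translation numbers are arbitrary.

```c
#include <stdbool.h>

/* Toy model of one context descriptor; all names are hypothetical. */
struct toy_cd {
	bool valid;
	int asid;
	int translation;	/* stands in for the page table the CD points at */
};

struct toy_state {
	struct toy_cd rid_cd;	/* CD used by the device's RID */
	struct toy_cd pasid_cd;	/* CD installed by the SVA/BTM path */
	bool master_on_list;	/* is the master on the domain's devices list? */
};

/* True if two valid CDs share an ASID but point at different translations. */
bool toy_asid_conflict(const struct toy_state *s)
{
	return s->rid_cd.valid && s->pasid_cd.valid &&
	       s->rid_cd.asid == s->pasid_cd.asid &&
	       s->rid_cd.translation != s->pasid_cd.translation;
}

/* First half of detach: unlink the master from the devices list. */
void toy_detach_list_del(struct toy_state *s) { s->master_on_list = false; }

/* Second half of detach: clear the RID's CD. */
void toy_detach_clear_cd(struct toy_state *s) { s->rid_cd.valid = false; }

/*
 * Model of the share_asid path: masters still on the list are moved to
 * new_asid, then the contested ASID is reused for a PASID CD that points
 * at a different translation.
 */
void toy_share_asid(struct toy_state *s, int contested_asid, int new_asid,
		    int sva_translation)
{
	if (s->master_on_list && s->rid_cd.asid == contested_asid)
		s->rid_cd.asid = new_asid;	/* skipped once list_del has run */

	s->pasid_cd = (struct toy_cd){ true, contested_asid, sva_translation };
}

/* Unlocked order: share_asid runs between the two halves of detach. */
bool toy_replay_racy(void)
{
	struct toy_state s = { { true, 5, 100 }, { false, 0, 0 }, true };
	bool conflict;

	toy_detach_list_del(&s);
	toy_share_asid(&s, 5, 6, 200);
	conflict = toy_asid_conflict(&s);	/* sampled inside the window */
	toy_detach_clear_cd(&s);
	return conflict;
}

/* Serialized order: detach completes before share_asid can run. */
bool toy_replay_detach_first(void)
{
	struct toy_state s = { { true, 5, 100 }, { false, 0, 0 }, true };

	toy_detach_list_del(&s);
	toy_detach_clear_cd(&s);
	toy_share_asid(&s, 5, 6, 200);
	return toy_asid_conflict(&s);
}

/* Serialized order: share_asid completes before detach starts. */
bool toy_replay_share_first(void)
{
	struct toy_state s = { { true, 5, 100 }, { false, 0, 0 }, true };

	toy_share_asid(&s, 5, 6, 200);
	return toy_asid_conflict(&s);
}
```

In this model only the racy replay reports a conflict. Both serialized
orders, which are what holding a single lock across the whole detach would
permit, leave at most one translation visible for ASID 5.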

ASID/VMID lifecycle must be strictly contained to ensure the cache
remains coherent:

1. All programmed STE/CDs using the ASID/VMID must always point to the
   same translation

2. All references to an ASID/VMID must be removed from their STE/CDs
   before the ASID/VMID is flushed

3. The ASID/VMID must be flushed before it is assigned to a STE/CD
   with a new translation.

We solve this by requiring that the arm_smmu_asid_lock must be held
such that the smmu_domain->devices list AND the actual content of the
CD tables are always observed to be consistent.
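
The three rules can also be written down as a small checker. This is only a
sketch for reasoning about the lifecycle: struct toy_asid and the
toy_asid_* functions are hypothetical bookkeeping, not structures or
helpers from the driver, and the "dirty" flag is an assumed stand-in for
"the TLB may still hold entries tagged with this ASID".

```c
#include <stdbool.h>

#define NO_TRANSLATION (-1)

/* Hypothetical bookkeeping for a single ASID; not a driver structure. */
struct toy_asid {
	int users;		/* programmed STE/CDs currently referencing the ASID */
	int translation;	/* translation those users point at */
	bool dirty;		/* TLB may still cache entries tagged with the ASID */
};

/* Program one more CD with this ASID. Returns false if a rule is broken. */
bool toy_asid_install(struct toy_asid *a, int translation)
{
	/* Rule 1: every programmed user must point at the same translation. */
	if (a->users > 0 && a->translation != translation)
		return false;
	/* Rule 3: flush before reusing the ASID for a new translation. */
	if (a->users == 0 && a->dirty && a->translation != translation)
		return false;

	a->users++;
	a->translation = translation;
	a->dirty = true;
	return true;
}

/* Remove one CD reference to the ASID. */
void toy_asid_remove(struct toy_asid *a)
{
	if (a->users > 0)
		a->users--;
}

/* Invalidate the ASID. Returns false if a rule is broken. */
bool toy_asid_flush(struct toy_asid *a)
{
	/* Rule 2: all references must be gone before the flush. */
	if (a->users > 0)
		return false;

	a->dirty = false;
	a->translation = NO_TRANSLATION;
	return true;
}
```

Starting from an ASID with no users, two installs of the same translation
are accepted, an install of a different translation or a flush is rejected
while users remain, and the ASID can be handed to a new translation only
after both references are removed and the flush has run.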

Jason

Mostafa Saleh Feb. 13, 2024, 1:30 p.m. UTC | #3
On Thu, Feb 01, 2024 at 09:24:43AM -0400, Jason Gunthorpe wrote:
> On Thu, Feb 01, 2024 at 12:15:53PM +0000, Mostafa Saleh wrote:
> > Hi Jason,
> > 
> > On Thu, Jan 25, 2024 at 07:57:16PM -0400, Jason Gunthorpe wrote:
> > > The BTM support wants to be able to change the ASID of any smmu_domain.
> > > When it goes to do this it holds the arm_smmu_asid_lock and iterates over
> > > the target domain's devices list.
> > > 
> > > During attach of a S1 domain we must ensure that the devices list and
> > > CD are in sync, otherwise we could miss CD updates or a parallel CD update
> > > could push an out of date CD.
> > > 
> > > This is pretty complicated, and almost works today because
> > > arm_smmu_detach_dev() removes the master from the linked list before
> > > working on the CD entries, preventing parallel update of the CD.
> > > 
> > > However, it does have an issue where the CD can remain programmed while the
> > > domain appears to be unattached. arm_smmu_share_asid() will then not clear
> > > any CD entries and will install its own CD entry with the same ASID
> > > concurrently. This creates a small race window where the IOMMU can see two
> > > CDs with the same ASID pointing to different translations.
> > 
> > I don’t see the race condition.
> > 
> > The current flow is as follows,
> > For SVA, if the asid was used by domain_x, it will do:
> > 
> > lock(arm_smmu_asid_lock)
> > Alloc new asid and set cd->asid.
> > lock(domain_x->devices_lock)
> > Write new CD with the new asid
> > unlock(domain_x->devices_lock)
> > unlock(arm_smmu_asid_lock)
> > 
> > For attach_dev (domain_y), if the device was attached to domain_z
> > //Detach old domain
> > lock(domain_z->devices_lock)
> > Remove master from old domain
> > unlock(domain_z->devices_lock)
> 
> At this moment all locks are dropped and the RID's CD entry continues
> to use the ASID.
> 
> The racing BTM flow now runs and will do your above:
> 
> arm_smmu_mmu_notifier_get()
>  arm_smmu_alloc_shared_cd()
>   arm_smmu_share_asid():
>     arm_smmu_update_ctx_desc_devices() <<-- Does nothing due to list_del above
>     arm_smmu_tlb_inv_asid() <<-- Whoops, we are invalidating an ASID that is still in a CD!
>  arm_smmu_write_ctx_desc() <<-- Install a new translation on a PASID's CD
> 
> Now the HW can observe two installed CDs using the same ASID but they
> point to different translations. This is illegal.
> 
> > Clear CD
> 
> Now we remove the RID CD, but it is too late, the PASID CD is already
> installed.
> 
> ASID/VMID lifecycle must be strictly contained to ensure the cache
> remains coherent:
> 
> 1. All programmed STE/CDs using the ASID/VMID must always point to the
>    same translation
> 
> 2. All references to an ASID/VMID must be removed from their STE/CDs
>    before the ASID/VMID is flushed
> 
> 3. The ASID/VMID must be flushed before it is assigned to a STE/CD
>    with a new translation.
> 
> We solve this by requiring that the arm_smmu_asid_lock must be held
> such that the smmu_domain->devices list AND the actual content of the
> CD tables are always observed to be consistent.
> 
> Jason

I see, thanks a lot for the detailed explanation. 
Maybe this can be added to the change log, so it’s documented somewhere.

Also, I guess this is mainly theoretical, as it requires the detached device to
issue DMA while being detached?

Thanks,
Mostafa

Patch

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9a95d0f1494223..539ef380f457fa 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2612,8 +2612,6 @@  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		return -EBUSY;
 	}
 
-	arm_smmu_detach_dev(master);
-
 	mutex_lock(&smmu_domain->init_mutex);
 
 	if (!smmu_domain->smmu) {
@@ -2628,6 +2626,16 @@  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	if (ret)
 		return ret;
 
+	/*
+	 * Prevent arm_smmu_share_asid() from trying to change the ASID
+	 * of either the old or new domain while we are working on it.
+	 * This allows the STE and the smmu_domain->devices list to
+	 * be inconsistent during this routine.
+	 */
+	mutex_lock(&arm_smmu_asid_lock);
+
+	arm_smmu_detach_dev(master);
+
 	master->domain = smmu_domain;
 
 	/*
@@ -2653,13 +2661,7 @@  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 			}
 		}
 
-		/*
-		 * Prevent SVA from concurrently modifying the CD or writing to
-		 * the CD entry
-		 */
-		mutex_lock(&arm_smmu_asid_lock);
 		ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
-		mutex_unlock(&arm_smmu_asid_lock);
 		if (ret) {
 			master->domain = NULL;
 			goto out_list_del;
@@ -2669,13 +2671,15 @@  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	arm_smmu_install_ste_for_dev(master);
 
 	arm_smmu_enable_ats(master);
-	return 0;
+	goto out_unlock;
 
 out_list_del:
 	spin_lock_irqsave(&smmu_domain->devices_lock, flags);
 	list_del(&master->domain_head);
 	spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
 
+out_unlock:
+	mutex_unlock(&arm_smmu_asid_lock);
 	return ret;
 }