iommu/io-pgtable-arm: Fix race handling in split_blk_unmap()

Message ID f6700817286f60597f2a93835bf658f3ef3585ef.1535026499.git.robin.murphy@arm.com (mailing list archive)
State New, archived
Series iommu/io-pgtable-arm: Fix race handling in split_blk_unmap()

Commit Message

Robin Murphy Aug. 23, 2018, 12:14 p.m. UTC
In removing the pagetable-wide lock, we gained the possibility of the
vanishingly unlikely case where we have a race between two concurrent
unmappers splitting the same block entry. The logic to handle this is
fairly straightforward - whoever loses the race frees their partial
next-level table and instead dereferences the winner's newly-installed
entry in order to fall back to a regular unmap, which intentionally
echoes the pre-existing case of recursively splitting a 1GB block down
to 4KB pages by installing a full table of 2MB blocks first.

Unfortunately, the chump who implemented that logic failed to update the
condition check for that fallback, meaning that if said race occurs at
the last level (where the loser's unmap_idx is valid) then the unmap
won't actually happen. Fix that to properly account for both the race
and recursive cases.

Fixes: 2c3d273eabe8 ("iommu/io-pgtable-arm: Support lockless operation")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/io-pgtable-arm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
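
To make the failure mode in the commit message concrete, here is a minimal user-space sketch. It is a toy model, not the kernel code: `toy_table`, `toy_split_unmap()`, `TOY_ENTRIES` and the `fixed` flag are all hypothetical names, a NULL slot stands in for "still a block entry", and a C11 compare-exchange stands in for the table install. It only illustrates that the loser of the race still has to clear its own entry in the winner's table.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_ENTRIES 4 /* toy table size; real tables are much larger */

/* Hypothetical next-level table: valid[i] means "entry i is still mapped". */
typedef struct {
	bool valid[TOY_ENTRIES];
} toy_table;

/*
 * Toy stand-in for splitting a block and unmapping one entry of it.
 * *slot == NULL encodes "still a block entry". `fixed` selects whether the
 * loser of the install race falls back to a regular unmap.
 */
void toy_split_unmap(_Atomic(toy_table *) *slot, int unmap_idx, bool fixed)
{
	toy_table *expected = NULL;
	toy_table *mine;

	assert(unmap_idx >= 0 && unmap_idx < TOY_ENTRIES);

	mine = malloc(sizeof(*mine));
	if (!mine)
		return;

	/* Build a replacement table that already omits the entry to unmap. */
	for (int i = 0; i < TOY_ENTRIES; i++)
		mine->valid[i] = (i != unmap_idx);

	/* Try to install it in place of the block entry. */
	if (atomic_compare_exchange_strong(slot, &expected, mine))
		return; /* Won the race: nothing left to unmap. */

	/* Lost the race: drop our table; `expected` now holds the winner's. */
	free(mine);

	/*
	 * The winner's table still maps our entry. Without the fix the loser
	 * stops here; with it, the loser clears the entry (a plain store
	 * standing in for the regular unmap path).
	 */
	if (fixed)
		expected->valid[unmap_idx] = false;
}
```

Calling `toy_split_unmap()` twice on the same slot with different indices reproduces the outcome of the race deterministically, because the second caller always loses the compare-exchange. With `fixed` false, the second caller's entry stays mapped; with `fixed` true, both entries end up cleared.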

Comments

Will Deacon Sept. 6, 2018, 10:05 a.m. UTC | #1
Hi Robin,

On Thu, Aug 23, 2018 at 01:14:59PM +0100, Robin Murphy wrote:
> In removing the pagetable-wide lock, we gained the possibility of the
> vanishingly unlikely case where we have a race between two concurrent
> unmappers splitting the same block entry. The logic to handle this is
> fairly straightforward - whoever loses the race frees their partial
> next-level table and instead dereferences the winner's newly-installed
> entry in order to fall back to a regular unmap, which intentionally
> echoes the pre-existing case of recursively splitting a 1GB block down
> to 4KB pages by installing a full table of 2MB blocks first.
> 
> Unfortunately, the chump who implemented that logic failed to update the
> condition check for that fallback, meaning that if said race occurs at
> the last level (where the loser's unmap_idx is valid) then the unmap
> won't actually happen. Fix that to properly account for both the race
> and recursive cases.
> 
> Fixes: 2c3d273eabe8 ("iommu/io-pgtable-arm: Support lockless operation")
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/io-pgtable-arm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Well spotted! Did you just find this by inspection?

> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 010a254305dd..93b4833cef73 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -575,7 +575,7 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
>  		tablep = iopte_deref(pte, data);
>  	}
>  
> -	if (unmap_idx < 0)
> +	if (unmap_idx < 0 || pte != blk_pte)
>  		return __arm_lpae_unmap(data, iova, size, lvl, tablep);

Can we tidy up the control flow a bit here to avoid re-checking the status
of the cmpxchg? See below.

Will

--->8

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 88641b4560bc..2f79efd16a05 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -574,13 +574,12 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
 			return 0;
 
 		tablep = iopte_deref(pte, data);
+	} else if (unmap_idx >= 0) {
+		io_pgtable_tlb_add_flush(&data->iop, iova, size, size, true);
+		return size;
 	}
 
-	if (unmap_idx < 0)
-		return __arm_lpae_unmap(data, iova, size, lvl, tablep);
-
-	io_pgtable_tlb_add_flush(&data->iop, iova, size, size, true);
-	return size;
+	return __arm_lpae_unmap(data, iova, size, lvl, tablep);
 }
 
 static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,

Robin Murphy Sept. 6, 2018, 11:14 a.m. UTC | #2
On 06/09/18 11:05, Will Deacon wrote:
> Hi Robin,
> 
> On Thu, Aug 23, 2018 at 01:14:59PM +0100, Robin Murphy wrote:
>> In removing the pagetable-wide lock, we gained the possibility of the
>> vanishingly unlikely case where we have a race between two concurrent
>> unmappers splitting the same block entry. The logic to handle this is
>> fairly straightforward - whoever loses the race frees their partial
>> next-level table and instead dereferences the winner's newly-installed
>> entry in order to fall back to a regular unmap, which intentionally
>> echoes the pre-existing case of recursively splitting a 1GB block down
>> to 4KB pages by installing a full table of 2MB blocks first.
>>
>> Unfortunately, the chump who implemented that logic failed to update the
>> condition check for that fallback, meaning that if said race occurs at
>> the last level (where the loser's unmap_idx is valid) then the unmap
>> won't actually happen. Fix that to properly account for both the race
>> and recursive cases.
>>
>> Fixes: 2c3d273eabe8 ("iommu/io-pgtable-arm: Support lockless operation")
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>   drivers/iommu/io-pgtable-arm.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Well spotted! Did you just find this by inspection?

Indeed - I just kept seeing this bit in the context of the other patches 
and thinking "hang on, how do we actually use that new tablep after the 
deref?".

IIRC I actually implemented the new logic for v7s first, so clearly made 
the blunder in adding the recursive case when porting the change to LPAE.

>> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
>> index 010a254305dd..93b4833cef73 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -575,7 +575,7 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
>>   		tablep = iopte_deref(pte, data);
>>   	}
>>   
>> -	if (unmap_idx < 0)
>> +	if (unmap_idx < 0 || pte != blk_pte)
>>   		return __arm_lpae_unmap(data, iova, size, lvl, tablep);
> 
> Can we tidy up the control flow a bit here to avoid re-checking the status
> of the cmpxchg? See below.

Sure, I just made the minimal change for correctness so as not to 
overcomplicate matters, but I'm pretty sure that diff looks correct. 
Feel free to squash it in if you're planning on taking this (yes it's 
nominally a fix, but it's absolutely non-urgent, since nobody will ever
realistically hit this).

Robin.

> 
> Will
> 
> --->8
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 88641b4560bc..2f79efd16a05 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -574,13 +574,12 @@ static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
>   			return 0;
>   
>   		tablep = iopte_deref(pte, data);
> +	} else if (unmap_idx >= 0) {
> +		io_pgtable_tlb_add_flush(&data->iop, iova, size, size, true);
> +		return size;
>   	}
>   
> -	if (unmap_idx < 0)
> -		return __arm_lpae_unmap(data, iova, size, lvl, tablep);
> -
> -	io_pgtable_tlb_add_flush(&data->iop, iova, size, size, true);
> -	return size;
> +	return __arm_lpae_unmap(data, iova, size, lvl, tablep);
>   }
>   
>   static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>
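
The claim that the restructured diff behaves the same as the one-line fix can be checked with a small truth-table sketch. This is a hypothetical model rather than kernel code: `lost_race` stands in for `pte != blk_pte`, on the assumption that the `if` block ending in `tablep = iopte_deref(pte, data);` is exactly the branch taken when the install loses the race, and the enum and function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Which way the tail of the function exits (hypothetical labels). */
enum tail_path { FLUSH_AND_RETURN_SIZE, FALL_BACK_TO_UNMAP };

/* Original condition: only the recursive case (unmap_idx < 0) falls back. */
enum tail_path tail_original(int unmap_idx, bool lost_race)
{
	(void)lost_race;
	return unmap_idx < 0 ? FALL_BACK_TO_UNMAP : FLUSH_AND_RETURN_SIZE;
}

/* One-line fix: also fall back when the race was lost. */
enum tail_path tail_patched(int unmap_idx, bool lost_race)
{
	if (unmap_idx < 0 || lost_race)
		return FALL_BACK_TO_UNMAP;
	return FLUSH_AND_RETURN_SIZE;
}

/* Restructured flow: the early return hangs off the "won the race" branch. */
enum tail_path tail_restructured(int unmap_idx, bool lost_race)
{
	if (lost_race) {
		/* tablep would be re-pointed at the winner's table here */
	} else if (unmap_idx >= 0) {
		return FLUSH_AND_RETURN_SIZE;
	}
	return FALL_BACK_TO_UNMAP;
}

/* Count the inputs on which two variants choose different paths. */
int count_mismatches(enum tail_path (*a)(int, bool),
		     enum tail_path (*b)(int, bool))
{
	static const int idx_cases[] = { -1, 0, 3 };
	int mismatches = 0;

	for (int i = 0; i < 3; i++)
		for (int lost = 0; lost <= 1; lost++)
			if (a(idx_cases[i], lost) != b(idx_cases[i], lost))
				mismatches++;
	return mismatches;
}

/* The two fixes agree everywhere; the original differs from them. */
void check_tail_variants(void)
{
	assert(count_mismatches(tail_patched, tail_restructured) == 0);
	assert(count_mismatches(tail_original, tail_patched) > 0);
}
```

The only inputs on which the original condition differs from both fixes are those with a valid `unmap_idx` and a lost race, which is the last-level case described in the commit message.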

Joerg Roedel Sept. 25, 2018, 9:01 a.m. UTC | #3
Hey Robin,

On Thu, Aug 23, 2018 at 01:14:59PM +0100, Robin Murphy wrote:
> Fixes: 2c3d273eabe8 ("iommu/io-pgtable-arm: Support lockless operation")
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

I can't find a newer version of this in my inbox, do you plan to send
one or has it been addressed differently?


Regards,

	Joerg

Robin Murphy Sept. 25, 2018, 10:48 a.m. UTC | #4
Hi Joerg,

On 25/09/18 10:01, Joerg Roedel wrote:
> Hey Robin,
> 
> On Thu, Aug 23, 2018 at 01:14:59PM +0100, Robin Murphy wrote:
>> Fixes: 2c3d273eabe8 ("iommu/io-pgtable-arm: Support lockless operation")
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> 
> I can't find a newer version of this in my inbox, do you plan to send
> one or has it been addressed differently?

Will has the cleaned-up version in his tree along with a couple of other 
SMMU fixes, which I assume he's planning to send you a pull for (he's 
off at yet another conference just now so I can't confirm offline). We 
were planning for the non-strict stuff to come in as SMMU updates on top 
of that (p.s. please shout if you have any objections on that series so 
I can get on them ASAP!)

Thanks,
Robin.

Joerg Roedel Sept. 25, 2018, 1:03 p.m. UTC | #5
Hi Robin,

On Tue, Sep 25, 2018 at 11:48:17AM +0100, Robin Murphy wrote:
> Will has the cleaned-up version in his tree along with a couple of other
> SMMU fixes, which I assume he's planning to send you a pull for (he's off at
> yet another conference just now so I can't confirm offline).

Okay, less work for me, fine.

> We were planning for the non-strict stuff to come in as SMMU updates
> on top of that (p.s. please shout if you have any objections on that
> series so I can get on them ASAP!)

No objections, I have suggested this change for quite some time now.
Cool to see you implemented it :) I also looked at the patches and they
look good to me.


Regards,

	Joerg
Patch

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 010a254305dd..93b4833cef73 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -575,7 +575,7 @@  static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
 		tablep = iopte_deref(pte, data);
 	}
 
-	if (unmap_idx < 0)
+	if (unmap_idx < 0 || pte != blk_pte)
 		return __arm_lpae_unmap(data, iova, size, lvl, tablep);
 
 	io_pgtable_tlb_add_flush(&data->iop, iova, size, size, true);