[1/2] iommu/mediatek: Always tlb_flush_all when each PM resume

Message ID 20211122104400.4160-2-dafna.hirschfeld@collabora.com (mailing list archive)
State New, archived
Series iommu/mediatek: fix tlb flush logic

Commit Message

Dafna Hirschfeld Nov. 22, 2021, 10:43 a.m. UTC
From: Yong Wu <yong.wu@mediatek.com>

Prepare for two HWs that share a pgtable but sit in different power domains.

When there are two M4U HWs, flush_range may have a problem: it gets the
pm_status via the m4u dev, BUT that status doesn't reflect the real
power-domain status of the HW, since other HW may also use that
power domain.

The function dma_alloc_attrs helps allocate the iommu buffer, which
needs the corresponding power domain because a tlb flush is needed when
preparing the iova. BUT this function only allocates a buffer, so we
have no good reason to require the user to always call pm_runtime_get
before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
in pm_runtime_resume to make sure the tlb is always clean.

Another solution is to always call pm_runtime_get in tlb_flush_range.
This would trigger pm runtime resume/backup very often whenever the
iommu power is not active (meaning the user didn't call pm_runtime_get
before calling dma_alloc_xxx), which may cause a performance drop,
so we don't use it.

In the other cases, the iommu's power should always be active via the
device link with smi.

The previous SoCs don't have PM, except mt8192. The mt8192 IOMMU is in
the display's power domain, which is nearly always enabled, so no Fixes
tag is needed here. Prepare for mt8195.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
[improve inline doc]
Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
---
 drivers/iommu/mtk_iommu.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Yong Wu (吴勇) Nov. 27, 2021, 2:46 a.m. UTC | #1
Hi Dafna,

Sorry for the late reply.

On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote:
> From: Yong Wu <yong.wu@mediatek.com>
> 
> Prepare for 2 HWs that sharing pgtable in different power-domains.
> 
> When there are 2 M4U HWs, it may has problem in the flush_range in
> which
> we get the pm_status via the m4u dev, BUT that function don't reflect
> the
> real power-domain status of the HW since there may be other HW also
> use
> that power-domain.
> 
> The function dma_alloc_attrs help allocate the iommu buffer which
> need the corresponding power domain since tlb flush is needed when
> preparing iova. BUT this function only is for allocating buffer,
> we have no good reason to request the user always call pm_runtime_get
> before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
> in the pm_runtime_resume to make sure the tlb always is clean.
> 
> Another solution is always call pm_runtime_get in the
> tlb_flush_range.
> This will trigger pm runtime resume/backup so often when the iommu
> power is not active at some time(means user don't call pm_runtime_get
> before calling dma_alloc_xxx), This may cause the performance drop.
> thus we don't use this.
> 
> In other case, the iommu's power should always be active via device
> link with smi.
> 
> The previous SoC don't have PM except mt8192. the mt8192 IOMMU is
> display's
> power-domain which nearly always is enabled. thus no need fix tags
> here.
> Prepare for mt8195.

In this patchset, this message is not appropriate. I think you could
add a comment on why this patch is needed for mt8173.

> 
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> [imporvie inline doc]
> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
> ---
>  drivers/iommu/mtk_iommu.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 25b834104790..28dc4b95b6d9 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -964,6 +964,13 @@ static int __maybe_unused
> mtk_iommu_runtime_resume(struct device *dev)
>  		return ret;
>  	}
>  
> +	/*
> +	 * Users may allocate dma buffer before they call
> pm_runtime_get,
> +	 * in which case it will lack the necessary tlb flush.
> +	 * Thus, make sure to update the tlb after each PM resume.
> +	 */
> +	mtk_iommu_tlb_flush_all(data);

This should not work, since currently *_tlb_flush_all calls
pm_runtime_get_if_in_use, which always returns 0 when it is called from
this runtime_cb in my test. Thus, it won't actually do the
tlb_flush_all.
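
The skipped flush described above can be illustrated with a small user-space model. This is only a sketch, not driver code: every `model_*` name and `flushes_after_resume` are hypothetical, and the model assumes (as the runtime-PM core is generally understood to behave) that the device is still marked as resuming, not active, while its resume callback runs, so a "get if in use" check made from inside that callback reports 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for runtime-PM states (model only, not kernel types). */
enum model_rpm_status { MODEL_SUSPENDED, MODEL_RESUMING, MODEL_ACTIVE };

struct model_dev {
	enum model_rpm_status status;
	int usage_count;
	int flushes;	/* TLB flushes that actually reached the "hardware" */
};

/* Models pm_runtime_get_if_in_use(): succeeds only if active and in use. */
static int model_get_if_in_use(struct model_dev *dev)
{
	if (dev->status != MODEL_ACTIVE || dev->usage_count == 0)
		return 0;
	dev->usage_count++;
	return 1;
}

/* Models a flush_all that is guarded by a power-status check. */
static void model_tlb_flush_all(struct model_dev *dev)
{
	if (model_get_if_in_use(dev) <= 0)
		return;			/* flush silently skipped */
	dev->flushes++;
	assert(dev->usage_count > 0);
	dev->usage_count--;		/* models the matching pm_runtime_put() */
}

/* Models a flush_all variant without any PM operation. */
static void model_tlb_do_flush_all(struct model_dev *dev)
{
	dev->flushes++;
}

/* Models the resume path: the status is RESUMING while the callback runs. */
static void model_runtime_resume(struct model_dev *dev, bool guarded)
{
	dev->status = MODEL_RESUMING;
	if (guarded)
		model_tlb_flush_all(dev);
	else
		model_tlb_do_flush_all(dev);
	dev->status = MODEL_ACTIVE;	/* set only after the callback returns */
}

/* Returns how many flushes one resume performed. */
static int flushes_after_resume(bool guarded)
{
	struct model_dev dev = { MODEL_SUSPENDED, 1, 0 };

	model_runtime_resume(&dev, guarded);
	return dev.flushes;
}
```

In this model the guarded call made from the resume path performs no flush, while the variant without the PM check does, which matches the behaviour reported in the test above.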

I guess this also depends on these two patches of mt8195 v3:
[PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
[PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all

As in [10/33], I added a mtk_iommu_tlb_do_flush_all which doesn't have
the pm operation.

This looks like it has a dependency. Let me know if I can help with this.

> +
>  	/*
>  	 * Uppon first resume, only enable the clk and return, since
> the values of the
>  	 * registers are not yet set.
Dafna Hirschfeld Dec. 7, 2021, 8:31 a.m. UTC | #2
On 27.11.21 04:46, Yong Wu wrote:
> Hi Dafna,
> 
> Sorry for reply late.
> 
> On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote:
>> From: Yong Wu <yong.wu@mediatek.com>
>>
>> Prepare for 2 HWs that sharing pgtable in different power-domains.
>>
>> When there are 2 M4U HWs, it may has problem in the flush_range in
>> which
>> we get the pm_status via the m4u dev, BUT that function don't reflect
>> the
>> real power-domain status of the HW since there may be other HW also
>> use
>> that power-domain.
>>
>> The function dma_alloc_attrs help allocate the iommu buffer which
>> need the corresponding power domain since tlb flush is needed when
>> preparing iova. BUT this function only is for allocating buffer,
>> we have no good reason to request the user always call pm_runtime_get
>> before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
>> in the pm_runtime_resume to make sure the tlb always is clean.
>>
>> Another solution is always call pm_runtime_get in the
>> tlb_flush_range.
>> This will trigger pm runtime resume/backup so often when the iommu
>> power is not active at some time(means user don't call pm_runtime_get
>> before calling dma_alloc_xxx), This may cause the performance drop.
>> thus we don't use this.
>>
>> In other case, the iommu's power should always be active via device
>> link with smi.
>>
>> The previous SoC don't have PM except mt8192. the mt8192 IOMMU is
>> display's
>> power-domain which nearly always is enabled. thus no need fix tags
>> here.
>> Prepare for mt8195.
> 
> In this patchset, this message should be not proper. I think you could
> add the comment why this patch is needed in mt8173.
> 
>>
>> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
>> [imporvie inline doc]
>> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
>> ---
>>   drivers/iommu/mtk_iommu.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
>> index 25b834104790..28dc4b95b6d9 100644
>> --- a/drivers/iommu/mtk_iommu.c
>> +++ b/drivers/iommu/mtk_iommu.c
>> @@ -964,6 +964,13 @@ static int __maybe_unused
>> mtk_iommu_runtime_resume(struct device *dev)
>>   		return ret;
>>   	}
>>   
>> +	/*
>> +	 * Users may allocate dma buffer before they call
>> pm_runtime_get,
>> +	 * in which case it will lack the necessary tlb flush.
>> +	 * Thus, make sure to update the tlb after each PM resume.
>> +	 */
>> +	mtk_iommu_tlb_flush_all(data);
> 
> This should not work. since current the *_tlb_flush_all call
> pm_runtime_get_if_in_use which will always return 0 when it called from
> this runtime_cb in my test. thus, It won't do the tlb_flush_all
> actually.
> 
> I guess this also depend on these two patches of mt8195 v3.
> [PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
> [PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all
> 
> like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have
> the pm operation.
> 
> This looks has a dependence. Let me know if I can help this.

It did work for me when testing on an elm device. I'll check that again.


> 
>> +
>>   	/*
>>   	 * Uppon first resume, only enable the clk and return, since
>> the values of the
>>   	 * registers are not yet set.
Dafna Hirschfeld Dec. 8, 2021, 9:50 a.m. UTC | #3
On 07.12.21 10:31, Dafna Hirschfeld wrote:
> 
> 
> On 27.11.21 04:46, Yong Wu wrote:
>> Hi Dafna,
>>
>> Sorry for reply late.
>>
>> On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote:
>>> From: Yong Wu <yong.wu@mediatek.com>
>>>
>>> Prepare for 2 HWs that sharing pgtable in different power-domains.
>>>
>>> When there are 2 M4U HWs, it may has problem in the flush_range in
>>> which
>>> we get the pm_status via the m4u dev, BUT that function don't reflect
>>> the
>>> real power-domain status of the HW since there may be other HW also
>>> use
>>> that power-domain.
>>>
>>> The function dma_alloc_attrs help allocate the iommu buffer which
>>> need the corresponding power domain since tlb flush is needed when
>>> preparing iova. BUT this function only is for allocating buffer,
>>> we have no good reason to request the user always call pm_runtime_get
>>> before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
>>> in the pm_runtime_resume to make sure the tlb always is clean.
>>>
>>> Another solution is always call pm_runtime_get in the
>>> tlb_flush_range.
>>> This will trigger pm runtime resume/backup so often when the iommu
>>> power is not active at some time(means user don't call pm_runtime_get
>>> before calling dma_alloc_xxx), This may cause the performance drop.
>>> thus we don't use this.
>>>
>>> In other case, the iommu's power should always be active via device
>>> link with smi.
>>>
>>> The previous SoC don't have PM except mt8192. the mt8192 IOMMU is
>>> display's
>>> power-domain which nearly always is enabled. thus no need fix tags
>>> here.
>>> Prepare for mt8195.
>>
>> In this patchset, this message should be not proper. I think you could
>> add the comment why this patch is needed in mt8173.
>>
>>>
>>> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
>>> [imporvie inline doc]
>>> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
>>> ---
>>>   drivers/iommu/mtk_iommu.c | 7 +++++++
>>>   1 file changed, 7 insertions(+)
>>>
>>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
>>> index 25b834104790..28dc4b95b6d9 100644
>>> --- a/drivers/iommu/mtk_iommu.c
>>> +++ b/drivers/iommu/mtk_iommu.c
>>> @@ -964,6 +964,13 @@ static int __maybe_unused
>>> mtk_iommu_runtime_resume(struct device *dev)
>>>           return ret;
>>>       }
>>> +    /*
>>> +     * Users may allocate dma buffer before they call
>>> pm_runtime_get,
>>> +     * in which case it will lack the necessary tlb flush.
>>> +     * Thus, make sure to update the tlb after each PM resume.
>>> +     */
>>> +    mtk_iommu_tlb_flush_all(data);
>>
>> This should not work. since current the *_tlb_flush_all call
>> pm_runtime_get_if_in_use which will always return 0 when it called from
>> this runtime_cb in my test. thus, It won't do the tlb_flush_all
>> actually.

Heh, indeed, my mistake. The encoder works more or less fine even
without the full flush, so I didn't catch that.

>>
>> I guess this also depend on these two patches of mt8195 v3.
>> [PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
>> [PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all

I'll add those two.

>>
>> like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have
>> the pm operation.

Yes, I need to remove the pm_runtime_get_if_in_use call in the 'flush_all' func.
I see there is also a patch for that in the mt8195 v3 series: "[PATCH v3 13/33] iommu/mediatek: Remove the power status checking in tlb flush all"

So I'll send v2, adding all three of those patches, but I think adding mtk_iommu_tlb_do_flush_all
in patch 9 and removing it again in patch 13 is confusing, so I'll avoid that.

Thanks,
Dafna



>>
>> This looks has a dependence. Let me know if I can help this.
> 
> It did work for me, testing on elm device. I'll check that again.
> 
> 
>>
>>> +
>>>       /*
>>>        * Uppon first resume, only enable the clk and return, since
>>> the values of the
>>>        * registers are not yet set.
>
Dafna Hirschfeld Dec. 8, 2021, 10:18 a.m. UTC | #4
On 08.12.21 11:50, Dafna Hirschfeld wrote:
> 
> 
> On 07.12.21 10:31, Dafna Hirschfeld wrote:
>>
>>
>> On 27.11.21 04:46, Yong Wu wrote:
>>> Hi Dafna,
>>>
>>> Sorry for reply late.
>>>
>>> On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote:
>>>> From: Yong Wu <yong.wu@mediatek.com>
>>>>
>>>> Prepare for 2 HWs that sharing pgtable in different power-domains.
>>>>
>>>> When there are 2 M4U HWs, it may has problem in the flush_range in
>>>> which
>>>> we get the pm_status via the m4u dev, BUT that function don't reflect
>>>> the
>>>> real power-domain status of the HW since there may be other HW also
>>>> use
>>>> that power-domain.
>>>>
>>>> The function dma_alloc_attrs help allocate the iommu buffer which
>>>> need the corresponding power domain since tlb flush is needed when
>>>> preparing iova. BUT this function only is for allocating buffer,
>>>> we have no good reason to request the user always call pm_runtime_get
>>>> before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
>>>> in the pm_runtime_resume to make sure the tlb always is clean.
>>>>
>>>> Another solution is always call pm_runtime_get in the
>>>> tlb_flush_range.
>>>> This will trigger pm runtime resume/backup so often when the iommu
>>>> power is not active at some time(means user don't call pm_runtime_get
>>>> before calling dma_alloc_xxx), This may cause the performance drop.
>>>> thus we don't use this.
>>>>
>>>> In other case, the iommu's power should always be active via device
>>>> link with smi.
>>>>
>>>> The previous SoC don't have PM except mt8192. the mt8192 IOMMU is
>>>> display's
>>>> power-domain which nearly always is enabled. thus no need fix tags
>>>> here.
>>>> Prepare for mt8195.
>>>
>>> In this patchset, this message should be not proper. I think you could
>>> add the comment why this patch is needed in mt8173.
>>>
>>>>
>>>> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
>>>> [imporvie inline doc]
>>>> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
>>>> ---
>>>>   drivers/iommu/mtk_iommu.c | 7 +++++++
>>>>   1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
>>>> index 25b834104790..28dc4b95b6d9 100644
>>>> --- a/drivers/iommu/mtk_iommu.c
>>>> +++ b/drivers/iommu/mtk_iommu.c
>>>> @@ -964,6 +964,13 @@ static int __maybe_unused
>>>> mtk_iommu_runtime_resume(struct device *dev)
>>>>           return ret;
>>>>       }
>>>> +    /*
>>>> +     * Users may allocate dma buffer before they call
>>>> pm_runtime_get,
>>>> +     * in which case it will lack the necessary tlb flush.
>>>> +     * Thus, make sure to update the tlb after each PM resume.
>>>> +     */
>>>> +    mtk_iommu_tlb_flush_all(data);
>>>
>>> This should not work. since current the *_tlb_flush_all call
>>> pm_runtime_get_if_in_use which will always return 0 when it called from
>>> this runtime_cb in my test. thus, It won't do the tlb_flush_all
>>> actually.
> 
> He, indeed, my mistake, although the encoder works more or less fine even
> without the full flush so I didn't catch that.
> 
>>>
>>> I guess this also depend on these two patches of mt8195 v3.
>>> [PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
>>> [PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all
> 
> I'll add those two
> 
>>>
>>> like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have
>>> the pm operation.
> 
> yes, I need to remove the pm_runtime_get_if_in_use call in the 'flush_all' func
> I see there is also a patch for that in the mt8195 v3 series "[PATCH v3 13/33] iommu/mediatek: Remove the power status checking in tlb flush all"
> 
> So I'll send v2, adding all those 3 patches, but I think adding mtk_iommu_tlb_do_flush_all
> on patch 9 and removing it again on patch 13 is confusing so I'll avoid that.
> 

In addition, the call to mtk_iommu_tlb_flush_all from mtk_iommu_runtime_resume should move to the bottom of the function,
after all values are updated.
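
The ordering suggested here can be sketched with a small user-space model. Every name in it (`model_iommu`, `model_flush_all`, `model_resume`) is hypothetical, and the rule that a flush only counts once the clock is on and the registers are restored is an assumption of the model, not something taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state touched by the resume callback. */
struct model_iommu {
	bool clk_on;
	bool regs_saved;	/* false until a suspend has saved the registers */
	bool regs_restored;
	int valid_flushes;	/* flushes issued with clock on and registers restored */
};

/* Model assumption: a flush is only meaningful on fully restored hardware. */
static void model_flush_all(struct model_iommu *m)
{
	if (m->clk_on && m->regs_restored)
		m->valid_flushes++;
}

static int model_resume(struct model_iommu *m)
{
	m->clk_on = true;

	/* First resume: register values are not set yet, only enable the clock. */
	if (!m->regs_saved)
		return 0;

	/* Stands in for writing the saved values back to the registers. */
	assert(m->clk_on);
	m->regs_restored = true;

	/* Flush last, once every value has been updated. */
	model_flush_all(m);
	return 0;
}
```

With the flush at the bottom, the first resume returns early without flushing, and every later resume flushes only after the register state is back in place.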

> Thanks,
> Dafna
> 
> 
> 
>>>
>>> This looks has a dependence. Let me know if I can help this.
>>
>> It did work for me, testing on elm device. I'll check that again.
>>
>>
>>>
>>>> +
>>>>       /*
>>>>        * Uppon first resume, only enable the clk and return, since
>>>> the values of the
>>>>        * registers are not yet set.
>>
Patch

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 25b834104790..28dc4b95b6d9 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -964,6 +964,13 @@  static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
 		return ret;
 	}
 
+	/*
+	 * Users may allocate dma buffer before they call pm_runtime_get,
+	 * in which case it will lack the necessary tlb flush.
+	 * Thus, make sure to update the tlb after each PM resume.
+	 */
+	mtk_iommu_tlb_flush_all(data);
+
 	/*
 	 * Uppon first resume, only enable the clk and return, since the values of the
 	 * registers are not yet set.