Message ID | 20211122104400.4160-2-dafna.hirschfeld@collabora.com (mailing list archive) |
---|---|
State | New, archived |
Series | iommu/mediatek: fix tlb flush logic |
Hi Dafna,

Sorry for reply late.

On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote:
> From: Yong Wu <yong.wu@mediatek.com>
>
> Prepare for 2 HWs that sharing pgtable in different power-domains.
>
> When there are 2 M4U HWs, it may has problem in the flush_range in which
> we get the pm_status via the m4u dev, BUT that function don't reflect the
> real power-domain status of the HW since there may be other HW also use
> that power-domain.
>
> The function dma_alloc_attrs help allocate the iommu buffer which
> need the corresponding power domain since tlb flush is needed when
> preparing iova. BUT this function only is for allocating buffer,
> we have no good reason to request the user always call pm_runtime_get
> before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
> in the pm_runtime_resume to make sure the tlb always is clean.
>
> Another solution is always call pm_runtime_get in the tlb_flush_range.
> This will trigger pm runtime resume/backup so often when the iommu
> power is not active at some time(means user don't call pm_runtime_get
> before calling dma_alloc_xxx), This may cause the performance drop.
> thus we don't use this.
>
> In other case, the iommu's power should always be active via device
> link with smi.
>
> The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's
> power-domain which nearly always is enabled. thus no need fix tags here.
> Prepare for mt8195.

In this patchset, this message should be not proper. I think you could
add the comment why this patch is needed in mt8173.

>
> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
> [imporvie inline doc]
> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
> ---
>  drivers/iommu/mtk_iommu.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 25b834104790..28dc4b95b6d9 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -964,6 +964,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
>  		return ret;
>  	}
>
> +	/*
> +	 * Users may allocate dma buffer before they call pm_runtime_get,
> +	 * in which case it will lack the necessary tlb flush.
> +	 * Thus, make sure to update the tlb after each PM resume.
> +	 */
> +	mtk_iommu_tlb_flush_all(data);

This should not work. since current the *_tlb_flush_all call
pm_runtime_get_if_in_use which will always return 0 when it called from
this runtime_cb in my test. thus, It won't do the tlb_flush_all
actually.

I guess this also depend on these two patches of mt8195 v3.
[PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
[PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all

like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have
the pm operation.

This looks has a dependence. Let me know if I can help this.

> +
>  	/*
>  	 * Uppon first resume, only enable the clk and return, since the values of the
>  	 * registers are not yet set.
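The objection in this review can be illustrated with a small user-space model. This is only a sketch, not kernel code: every `model_*` name is hypothetical, and it assumes that the runtime PM status is still "resuming" rather than "active" while the resume callback runs, which is consistent with the test result reported above (pm_runtime_get_if_in_use returning 0 from the runtime callback).

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for runtime PM states; a model, not the kernel enum. */
enum model_rpm_status { MODEL_SUSPENDED, MODEL_RESUMING, MODEL_ACTIVE };

struct model_dev {
	enum model_rpm_status status;
	int usage_count;
	int tlb_flushes;	/* times the modelled flush reached the hardware */
};

/*
 * Models the behaviour the thread describes for pm_runtime_get_if_in_use():
 * take a reference and return 1 only if the device is already active and
 * in use; otherwise return 0 and do nothing.
 */
static int model_get_if_in_use(struct model_dev *dev)
{
	if (dev->status != MODEL_ACTIVE || dev->usage_count == 0)
		return 0;
	dev->usage_count++;
	return 1;
}

static void model_put(struct model_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* Flush guarded by the PM check, like the flush_all the patch calls. */
static void model_tlb_flush_all(struct model_dev *dev)
{
	if (model_get_if_in_use(dev) <= 0)
		return;		/* skipped: device is not (yet) active */
	dev->tlb_flushes++;
	model_put(dev);
}

/* Flush with no PM operation, in the spirit of mtk_iommu_tlb_do_flush_all. */
static void model_tlb_do_flush_all(struct model_dev *dev)
{
	dev->tlb_flushes++;
}

/*
 * Runs one modelled resume: a user takes a reference on a suspended device,
 * the resume callback performs the flush, and only afterwards is the device
 * marked active. Returns how many flushes actually happened.
 */
int model_flushes_after_resume(bool check_pm)
{
	struct model_dev dev = { MODEL_SUSPENDED, 0, 0 };

	dev.usage_count++;		/* the caller that triggered the resume */
	dev.status = MODEL_RESUMING;	/* state while the callback runs */
	if (check_pm)
		model_tlb_flush_all(&dev);
	else
		model_tlb_do_flush_all(&dev);
	dev.status = MODEL_ACTIVE;

	return dev.tlb_flushes;
}
```

In this model, `model_flushes_after_resume(true)` returns 0 because the PM check fails while the device is still resuming, and `model_flushes_after_resume(false)` returns 1. That mirrors why a flush helper without the PM operation is suggested for use inside the resume callback.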
On 27.11.21 04:46, Yong Wu wrote:
> Hi Dafna,
>
> Sorry for reply late.
>
> On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote:
>> From: Yong Wu <yong.wu@mediatek.com>
>>
>> Prepare for 2 HWs that sharing pgtable in different power-domains.
>>
>> When there are 2 M4U HWs, it may has problem in the flush_range in which
>> we get the pm_status via the m4u dev, BUT that function don't reflect the
>> real power-domain status of the HW since there may be other HW also use
>> that power-domain.
>>
>> The function dma_alloc_attrs help allocate the iommu buffer which
>> need the corresponding power domain since tlb flush is needed when
>> preparing iova. BUT this function only is for allocating buffer,
>> we have no good reason to request the user always call pm_runtime_get
>> before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
>> in the pm_runtime_resume to make sure the tlb always is clean.
>>
>> Another solution is always call pm_runtime_get in the tlb_flush_range.
>> This will trigger pm runtime resume/backup so often when the iommu
>> power is not active at some time(means user don't call pm_runtime_get
>> before calling dma_alloc_xxx), This may cause the performance drop.
>> thus we don't use this.
>>
>> In other case, the iommu's power should always be active via device
>> link with smi.
>>
>> The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's
>> power-domain which nearly always is enabled. thus no need fix tags here.
>> Prepare for mt8195.
>
> In this patchset, this message should be not proper. I think you could
> add the comment why this patch is needed in mt8173.
>
>>
>> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
>> [imporvie inline doc]
>> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
>> ---
>>  drivers/iommu/mtk_iommu.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
>> index 25b834104790..28dc4b95b6d9 100644
>> --- a/drivers/iommu/mtk_iommu.c
>> +++ b/drivers/iommu/mtk_iommu.c
>> @@ -964,6 +964,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
>>  		return ret;
>>  	}
>>
>> +	/*
>> +	 * Users may allocate dma buffer before they call pm_runtime_get,
>> +	 * in which case it will lack the necessary tlb flush.
>> +	 * Thus, make sure to update the tlb after each PM resume.
>> +	 */
>> +	mtk_iommu_tlb_flush_all(data);
>
> This should not work. since current the *_tlb_flush_all call
> pm_runtime_get_if_in_use which will always return 0 when it called from
> this runtime_cb in my test. thus, It won't do the tlb_flush_all
> actually.
>
> I guess this also depend on these two patches of mt8195 v3.
> [PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
> [PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all
>
> like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have
> the pm operation.
>
> This looks has a dependence. Let me know if I can help this.

It did work for me, testing on elm device. I'll check that again.

>
>> +
>>  	/*
>>  	 * Uppon first resume, only enable the clk and return, since the values of the
>>  	 * registers are not yet set.
On 07.12.21 10:31, Dafna Hirschfeld wrote:
>
>
> On 27.11.21 04:46, Yong Wu wrote:
>> Hi Dafna,
>>
>> Sorry for reply late.
>>
>> On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote:
>>> From: Yong Wu <yong.wu@mediatek.com>
>>>
>>> Prepare for 2 HWs that sharing pgtable in different power-domains.
>>>
>>> When there are 2 M4U HWs, it may has problem in the flush_range in which
>>> we get the pm_status via the m4u dev, BUT that function don't reflect the
>>> real power-domain status of the HW since there may be other HW also use
>>> that power-domain.
>>>
>>> The function dma_alloc_attrs help allocate the iommu buffer which
>>> need the corresponding power domain since tlb flush is needed when
>>> preparing iova. BUT this function only is for allocating buffer,
>>> we have no good reason to request the user always call pm_runtime_get
>>> before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
>>> in the pm_runtime_resume to make sure the tlb always is clean.
>>>
>>> Another solution is always call pm_runtime_get in the tlb_flush_range.
>>> This will trigger pm runtime resume/backup so often when the iommu
>>> power is not active at some time(means user don't call pm_runtime_get
>>> before calling dma_alloc_xxx), This may cause the performance drop.
>>> thus we don't use this.
>>>
>>> In other case, the iommu's power should always be active via device
>>> link with smi.
>>>
>>> The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's
>>> power-domain which nearly always is enabled. thus no need fix tags here.
>>> Prepare for mt8195.
>>
>> In this patchset, this message should be not proper. I think you could
>> add the comment why this patch is needed in mt8173.
>>
>>>
>>> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
>>> [imporvie inline doc]
>>> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
>>> ---
>>>  drivers/iommu/mtk_iommu.c | 7 +++++++
>>>  1 file changed, 7 insertions(+)
>>>
>>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
>>> index 25b834104790..28dc4b95b6d9 100644
>>> --- a/drivers/iommu/mtk_iommu.c
>>> +++ b/drivers/iommu/mtk_iommu.c
>>> @@ -964,6 +964,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
>>>  		return ret;
>>>  	}
>>> +	/*
>>> +	 * Users may allocate dma buffer before they call pm_runtime_get,
>>> +	 * in which case it will lack the necessary tlb flush.
>>> +	 * Thus, make sure to update the tlb after each PM resume.
>>> +	 */
>>> +	mtk_iommu_tlb_flush_all(data);
>>
>> This should not work. since current the *_tlb_flush_all call
>> pm_runtime_get_if_in_use which will always return 0 when it called from
>> this runtime_cb in my test. thus, It won't do the tlb_flush_all
>> actually.

He, indeed, my mistake, although the encoder works more or less fine even
without the full flush so I didn't catch that.

>>
>> I guess this also depend on these two patches of mt8195 v3.
>> [PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
>> [PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all

I'll add those two

>>
>> like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have
>> the pm operation.

yes, I need to remove the pm_runtime_get_if_in_use call in the 'flush_all' func
I see there is also a patch for that in the mt8195 v3 series
"[PATCH v3 13/33] iommu/mediatek: Remove the power status checking in tlb flush all"

So I'll send v2, adding all those 3 patches, but I think adding mtk_iommu_tlb_do_flush_all
on patch 9 and removing it again on patch 13 is confusing so I'll avoid that.

Thanks,
Dafna

>>
>> This looks has a dependence. Let me know if I can help this.
>
> It did work for me, testing on elm device. I'll check that again.
>
>
>>
>>> +
>>>  	/*
>>>  	 * Uppon first resume, only enable the clk and return, since the values of the
>>>  	 * registers are not yet set.
>
On 08.12.21 11:50, Dafna Hirschfeld wrote:
>
>
> On 07.12.21 10:31, Dafna Hirschfeld wrote:
>>
>>
>> On 27.11.21 04:46, Yong Wu wrote:
>>> Hi Dafna,
>>>
>>> Sorry for reply late.
>>>
>>> On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote:
>>>> From: Yong Wu <yong.wu@mediatek.com>
>>>>
>>>> Prepare for 2 HWs that sharing pgtable in different power-domains.
>>>>
>>>> When there are 2 M4U HWs, it may has problem in the flush_range in which
>>>> we get the pm_status via the m4u dev, BUT that function don't reflect the
>>>> real power-domain status of the HW since there may be other HW also use
>>>> that power-domain.
>>>>
>>>> The function dma_alloc_attrs help allocate the iommu buffer which
>>>> need the corresponding power domain since tlb flush is needed when
>>>> preparing iova. BUT this function only is for allocating buffer,
>>>> we have no good reason to request the user always call pm_runtime_get
>>>> before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
>>>> in the pm_runtime_resume to make sure the tlb always is clean.
>>>>
>>>> Another solution is always call pm_runtime_get in the tlb_flush_range.
>>>> This will trigger pm runtime resume/backup so often when the iommu
>>>> power is not active at some time(means user don't call pm_runtime_get
>>>> before calling dma_alloc_xxx), This may cause the performance drop.
>>>> thus we don't use this.
>>>>
>>>> In other case, the iommu's power should always be active via device
>>>> link with smi.
>>>>
>>>> The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's
>>>> power-domain which nearly always is enabled. thus no need fix tags here.
>>>> Prepare for mt8195.
>>>
>>> In this patchset, this message should be not proper. I think you could
>>> add the comment why this patch is needed in mt8173.
>>>
>>>>
>>>> Signed-off-by: Yong Wu <yong.wu@mediatek.com>
>>>> [imporvie inline doc]
>>>> Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
>>>> ---
>>>>  drivers/iommu/mtk_iommu.c | 7 +++++++
>>>>  1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
>>>> index 25b834104790..28dc4b95b6d9 100644
>>>> --- a/drivers/iommu/mtk_iommu.c
>>>> +++ b/drivers/iommu/mtk_iommu.c
>>>> @@ -964,6 +964,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
>>>>  		return ret;
>>>>  	}
>>>> +	/*
>>>> +	 * Users may allocate dma buffer before they call pm_runtime_get,
>>>> +	 * in which case it will lack the necessary tlb flush.
>>>> +	 * Thus, make sure to update the tlb after each PM resume.
>>>> +	 */
>>>> +	mtk_iommu_tlb_flush_all(data);
>>>
>>> This should not work. since current the *_tlb_flush_all call
>>> pm_runtime_get_if_in_use which will always return 0 when it called from
>>> this runtime_cb in my test. thus, It won't do the tlb_flush_all
>>> actually.
>
> He, indeed, my mistake, although the encoder works more or less fine even
> without the full flush so I didn't catch that.
>
>>>
>>> I guess this also depend on these two patches of mt8195 v3.
>>> [PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
>>> [PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all
>
> I'll add those two
>
>>>
>>> like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have
>>> the pm operation.
>
> yes, I need to remove the pm_runtime_get_if_in_use call in the 'flush_all' func
> I see there is also a patch for that in the mt8195 v3 series
> "[PATCH v3 13/33] iommu/mediatek: Remove the power status checking in tlb flush all"
>
> So I'll send v2, adding all those 3 patches, but I think adding mtk_iommu_tlb_do_flush_all
> on patch 9 and removing it again on patch 13 is confusing so I'll avoid that.
>

In addition, the call to mtk_iommu_tlb_flush_all from mtk_iommu_runtime_resume
should move to the bottom of the function after all values are updated

> Thanks,
> Dafna
>
>
>
>>>
>>> This looks has a dependence. Let me know if I can help this.
>>
>> It did work for me, testing on elm device. I'll check that again.
>>
>>
>>>
>>>> +
>>>>  	/*
>>>>  	 * Uppon first resume, only enable the clk and return, since the values of the
>>>>  	 * registers are not yet set.
>>
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 25b834104790..28dc4b95b6d9 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -964,6 +964,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev)
 		return ret;
 	}
 
+	/*
+	 * Users may allocate dma buffer before they call pm_runtime_get,
+	 * in which case it will lack the necessary tlb flush.
+	 * Thus, make sure to update the tlb after each PM resume.
+	 */
+	mtk_iommu_tlb_flush_all(data);
+
 	/*
 	 * Uppon first resume, only enable the clk and return, since the values of the
 	 * registers are not yet set.
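
The ordering suggested at the end of the thread (flush at the bottom of the resume function, after all values are updated, and without the power status check) might look roughly like the user-space sketch below. It is not the v2 patch and not driver code: `model_iommu`, its fields, and both functions are hypothetical, and a single `saved_pt_base` value stands in for whatever registers the real resume path restores. The sketch also assumes that the first resume returns early before any flush, as the existing comment about unset register values implies.

```c
#include <assert.h>
#include <stdbool.h>

struct model_iommu {
	bool clk_on;
	bool regs_saved;		/* assumption: false until register values exist */
	unsigned int saved_pt_base;	/* value to restore on resume */
	unsigned int hw_pt_base;	/* what the modelled hardware holds now */
	unsigned int pt_base_at_flush;	/* hardware value seen by the last flush */
	int tlb_flushes;
};

/* Flush with no power status check; the model treats a flush with the clock off as a bug. */
static void model_do_flush_all(struct model_iommu *m)
{
	assert(m->clk_on);
	m->pt_base_at_flush = m->hw_pt_base;
	m->tlb_flushes++;
}

int model_runtime_resume(struct model_iommu *m)
{
	m->clk_on = true;

	/* First resume: nothing to restore yet, so only enable the clock. */
	if (!m->regs_saved)
		return 0;

	/* Restore the register values first ... */
	m->hw_pt_base = m->saved_pt_base;

	/* ... and flush last, so the flush sees the restored state. */
	model_do_flush_all(m);
	return 0;
}
```

With this ordering, a first resume on a zero-initialised `model_iommu` enables the clock and performs no flush, while a later resume with `regs_saved` set performs exactly one flush and records the restored `saved_pt_base` in `pt_base_at_flush`.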