diff mbox series

[v2,05/19] PM / devfreq: tegra: Replace write memory barrier with the read barrier

Message ID 20190415145505.18397-6-digetx@gmail.com (mailing list archive)
State Not Applicable, archived
Headers show
Series NVIDIA Tegra devfreq improvements and Tegra20/30 support | expand

Commit Message

Dmitry Osipenko April 15, 2019, 2:54 p.m. UTC
The write memory barrier isn't needed because the BUS buffer is flushed
by read after write that happens after the removed wmb(), we will also
use readl() instead of the relaxed version to ensure that read is indeed
completed.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/devfreq/tegra-devfreq.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Chanwoo Choi April 16, 2019, 8 a.m. UTC | #1
Hi,

On 19. 4. 15. 오후 11:54, Dmitry Osipenko wrote:
> The write memory barrier isn't needed because the BUS buffer is flushed
> by read after write that happens after the removed wmb(), we will also
> use readl() instead of the relaxed version to ensure that read is indeed
> completed.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/devfreq/tegra-devfreq.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
> index d62fb1b0d9bb..f0f0d78f6cbf 100644
> --- a/drivers/devfreq/tegra-devfreq.c
> +++ b/drivers/devfreq/tegra-devfreq.c
> @@ -243,8 +243,7 @@ static void tegra_devfreq_update_wmark(struct tegra_devfreq *tegra,
>  static void actmon_write_barrier(struct tegra_devfreq *tegra)
>  {
>  	/* ensure the update has reached the ACTMON */
> -	wmb();
> -	actmon_readl(tegra, ACTMON_GLB_STATUS);
> +	readl(tegra->regs + ACTMON_GLB_STATUS);

I think that this meaning of actmon_write_barrier() keeps
the order of 'store' assembly command without the execution change
from compiler optimization by using the wmb().

But, this patch edits it as following:
The result of the following two cases are same?

[original code]
	wmb()
	read_relaxed()

[new code by this patch]
	readl_relaxed()
	rmb()


>  }
>  
>  static void actmon_isr_device(struct tegra_devfreq *tegra,
>
Dmitry Osipenko April 16, 2019, 1:57 p.m. UTC | #2
16.04.2019 11:00, Chanwoo Choi пишет:
> Hi,
> 
> On 19. 4. 15. 오후 11:54, Dmitry Osipenko wrote:
>> The write memory barrier isn't needed because the BUS buffer is flushed
>> by read after write that happens after the removed wmb(), we will also
>> use readl() instead of the relaxed version to ensure that read is indeed
>> completed.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/devfreq/tegra-devfreq.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
>> index d62fb1b0d9bb..f0f0d78f6cbf 100644
>> --- a/drivers/devfreq/tegra-devfreq.c
>> +++ b/drivers/devfreq/tegra-devfreq.c
>> @@ -243,8 +243,7 @@ static void tegra_devfreq_update_wmark(struct tegra_devfreq *tegra,
>>  static void actmon_write_barrier(struct tegra_devfreq *tegra)
>>  {
>>  	/* ensure the update has reached the ACTMON */
>> -	wmb();
>> -	actmon_readl(tegra, ACTMON_GLB_STATUS);
>> +	readl(tegra->regs + ACTMON_GLB_STATUS);
> 
> I think that this meaning of actmon_write_barrier() keeps
> the order of 'store' assembly command without the execution change
> from compiler optimization by using the wmb().

The IO mapped memory is strongly-ordered on ARM, hence all readl/writel accesses are guaranteed to be ordered by default. I think wmb() here is just a cut-n-pasted relic from old downstream driver.

> But, this patch edits it as following:
> The result of the following two cases are same?
> 
> [original code]
> 	wmb()
> 	read_relaxed()
> 
> [new code by this patch]
> 	readl_relaxed()
> 	rmb()

Yes, the result is the same. The wmb() is not just about IO accesses, but about all kind of memory accesses and at least on Tegra30 it is quite expensive operation because it translates into L2XO cache syncing (arm_heavy_mb) + dsb(). It should be more efficient to flush out writes with a read-back and then wait for that read operation to be completed.
Chanwoo Choi April 17, 2019, 12:55 a.m. UTC | #3
On 19. 4. 16. 오후 10:57, Dmitry Osipenko wrote:
> 16.04.2019 11:00, Chanwoo Choi пишет:
>> Hi,
>>
>> On 19. 4. 15. 오후 11:54, Dmitry Osipenko wrote:
>>> The write memory barrier isn't needed because the BUS buffer is flushed
>>> by read after write that happens after the removed wmb(), we will also
>>> use readl() instead of the relaxed version to ensure that read is indeed
>>> completed.
>>>
>>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>>> ---
>>>  drivers/devfreq/tegra-devfreq.c | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
>>> index d62fb1b0d9bb..f0f0d78f6cbf 100644
>>> --- a/drivers/devfreq/tegra-devfreq.c
>>> +++ b/drivers/devfreq/tegra-devfreq.c
>>> @@ -243,8 +243,7 @@ static void tegra_devfreq_update_wmark(struct tegra_devfreq *tegra,
>>>  static void actmon_write_barrier(struct tegra_devfreq *tegra)
>>>  {
>>>  	/* ensure the update has reached the ACTMON */
>>> -	wmb();
>>> -	actmon_readl(tegra, ACTMON_GLB_STATUS);
>>> +	readl(tegra->regs + ACTMON_GLB_STATUS);
>>
>> I think that this meaning of actmon_write_barrier() keeps
>> the order of 'store' assembly command without the execution change
>> from compiler optimization by using the wmb().
> 
> The IO mapped memory is strongly-ordered on ARM, hence all readl/writel accesses are guaranteed to be ordered by default. I think wmb() here is just a cut-n-pasted relic from old downstream driver.

OK.

> 
>> But, this patch edits it as following:
>> The result of the following two cases are same?
>>
>> [original code]
>> 	wmb()
>> 	read_relaxed()
>>
>> [new code by this patch]
>> 	readl_relaxed()
>> 	rmb()
> 
> Yes, the result is the same. The wmb() is not just about IO accesses, but about all kind of memory accesses and at least on Tegra30 it is quite expensive operation because it translates into L2XO cache syncing (arm_heavy_mb) + dsb(). It should be more efficient to flush out writes with a read-back and then wait for that read operation to be completed.
> 
> 

OK. Thanks for explanation.

Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
diff mbox series

Patch

diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
index d62fb1b0d9bb..f0f0d78f6cbf 100644
--- a/drivers/devfreq/tegra-devfreq.c
+++ b/drivers/devfreq/tegra-devfreq.c
@@ -243,8 +243,7 @@  static void tegra_devfreq_update_wmark(struct tegra_devfreq *tegra,
 static void actmon_write_barrier(struct tegra_devfreq *tegra)
 {
 	/* ensure the update has reached the ACTMON */
-	wmb();
-	actmon_readl(tegra, ACTMON_GLB_STATUS);
+	readl(tegra->regs + ACTMON_GLB_STATUS);
 }
 
 static void actmon_isr_device(struct tegra_devfreq *tegra,