Message ID | 20190415145505.18397-6-digetx@gmail.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
Series | NVIDIA Tegra devfreq improvements and Tegra20/30 support | expand |
Hi, On 19. 4. 15. 오후 11:54, Dmitry Osipenko wrote: > The write memory barrier isn't needed because the BUS buffer is flushed > by read after write that happens after the removed wmb(), we will also > use readl() instead of the relaxed version to ensure that read is indeed > completed. > > Signed-off-by: Dmitry Osipenko <digetx@gmail.com> > --- > drivers/devfreq/tegra-devfreq.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c > index d62fb1b0d9bb..f0f0d78f6cbf 100644 > --- a/drivers/devfreq/tegra-devfreq.c > +++ b/drivers/devfreq/tegra-devfreq.c > @@ -243,8 +243,7 @@ static void tegra_devfreq_update_wmark(struct tegra_devfreq *tegra, > static void actmon_write_barrier(struct tegra_devfreq *tegra) > { > /* ensure the update has reached the ACTMON */ > - wmb(); > - actmon_readl(tegra, ACTMON_GLB_STATUS); > + readl(tegra->regs + ACTMON_GLB_STATUS); I think that this meaning of actmon_write_barrier() keeps the order of 'store' assembly command without the execution change from compiler optimization by using the wmb(). But, this patch edits it as following: The result of the following two cases are same? [original code] wmb() read_relaxed() [new code by this patch] readl_relaxed() rmb() > } > > static void actmon_isr_device(struct tegra_devfreq *tegra, >
16.04.2019 11:00, Chanwoo Choi пишет: > Hi, > > On 19. 4. 15. 오후 11:54, Dmitry Osipenko wrote: >> The write memory barrier isn't needed because the BUS buffer is flushed >> by read after write that happens after the removed wmb(), we will also >> use readl() instead of the relaxed version to ensure that read is indeed >> completed. >> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> >> --- >> drivers/devfreq/tegra-devfreq.c | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c >> index d62fb1b0d9bb..f0f0d78f6cbf 100644 >> --- a/drivers/devfreq/tegra-devfreq.c >> +++ b/drivers/devfreq/tegra-devfreq.c >> @@ -243,8 +243,7 @@ static void tegra_devfreq_update_wmark(struct tegra_devfreq *tegra, >> static void actmon_write_barrier(struct tegra_devfreq *tegra) >> { >> /* ensure the update has reached the ACTMON */ >> - wmb(); >> - actmon_readl(tegra, ACTMON_GLB_STATUS); >> + readl(tegra->regs + ACTMON_GLB_STATUS); > > I think that this meaning of actmon_write_barrier() keeps > the order of 'store' assembly command without the execution change > from compiler optimization by using the wmb(). The IO mapped memory is strongly-ordered on ARM, hence all readl/writel accesses are guaranteed to be ordered by default. I think wmb() here is just a cut-n-pasted relic from old downstream driver. > But, this patch edits it as following: > The result of the following two cases are same? > > [original code] > wmb() > read_relaxed() > > [new code by this patch] > readl_relaxed() > rmb() Yes, the result is the same. The wmb() is not just about IO accesses, but about all kind of memory accesses and at least on Tegra30 it is quite expensive operation because it translates into L2XO cache syncing (arm_heavy_mb) + dsb(). It should be more efficient to flush out writes with a read-back and then wait for that read operation to be completed.
On 19. 4. 16. 오후 10:57, Dmitry Osipenko wrote: > 16.04.2019 11:00, Chanwoo Choi пишет: >> Hi, >> >> On 19. 4. 15. 오후 11:54, Dmitry Osipenko wrote: >>> The write memory barrier isn't needed because the BUS buffer is flushed >>> by read after write that happens after the removed wmb(), we will also >>> use readl() instead of the relaxed version to ensure that read is indeed >>> completed. >>> >>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> >>> --- >>> drivers/devfreq/tegra-devfreq.c | 3 +-- >>> 1 file changed, 1 insertion(+), 2 deletions(-) >>> >>> diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c >>> index d62fb1b0d9bb..f0f0d78f6cbf 100644 >>> --- a/drivers/devfreq/tegra-devfreq.c >>> +++ b/drivers/devfreq/tegra-devfreq.c >>> @@ -243,8 +243,7 @@ static void tegra_devfreq_update_wmark(struct tegra_devfreq *tegra, >>> static void actmon_write_barrier(struct tegra_devfreq *tegra) >>> { >>> /* ensure the update has reached the ACTMON */ >>> - wmb(); >>> - actmon_readl(tegra, ACTMON_GLB_STATUS); >>> + readl(tegra->regs + ACTMON_GLB_STATUS); >> >> I think that this meaning of actmon_write_barrier() keeps >> the order of 'store' assembly command without the execution change >> from compiler optimization by using the wmb(). > > The IO mapped memory is strongly-ordered on ARM, hence all readl/writel accesses are guaranteed to be ordered by default. I think wmb() here is just a cut-n-pasted relic from old downstream driver. OK. > >> But, this patch edits it as following: >> The result of the following two cases are same? >> >> [original code] >> wmb() >> read_relaxed() >> >> [new code by this patch] >> readl_relaxed() >> rmb() > > Yes, the result is the same. The wmb() is not just about IO accesses, but about all kind of memory accesses and at least on Tegra30 it is quite expensive operation because it translates into L2XO cache syncing (arm_heavy_mb) + dsb(). It should be more efficient to flush out writes with a read-back and then wait for that read operation to be completed. > > OK. Thanks for explanation. Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c index d62fb1b0d9bb..f0f0d78f6cbf 100644 --- a/drivers/devfreq/tegra-devfreq.c +++ b/drivers/devfreq/tegra-devfreq.c @@ -243,8 +243,7 @@ static void tegra_devfreq_update_wmark(struct tegra_devfreq *tegra, static void actmon_write_barrier(struct tegra_devfreq *tegra) { /* ensure the update has reached the ACTMON */ - wmb(); - actmon_readl(tegra, ACTMON_GLB_STATUS); + readl(tegra->regs + ACTMON_GLB_STATUS); } static void actmon_isr_device(struct tegra_devfreq *tegra,
The write memory barrier isn't needed because the BUS buffer is flushed by read after write that happens after the removed wmb(), we will also use readl() instead of the relaxed version to ensure that read is indeed completed. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> --- drivers/devfreq/tegra-devfreq.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)