Message ID | 20240625-adreno_barriers-v2-1-c01f2ef4b62a@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Series | Clean up barriers |
On Tue, Jun 25, 2024 at 08:54:41PM +0200, Konrad Dybcio wrote:
> Memory barriers help ensure instruction ordering, NOT time and order
> of actual write arrival at other observers (e.g. memory-mapped IP).
> On architectures employing weak memory ordering, the latter can be a
> giant pain point, and it has been as part of this driver.
>
> Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
> readl/writel, which include r/w (respectively) barriers.
>
> Replace the barriers with a readback (or drop altogether where possible)
> that ensures the previous writes have exited the write buffer (as the CPU
> must flush the write to the register it's trying to read back).
>
> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>

In PCI, these readbacks are actually part of the spec and are called
posting reads. I'd very much recommend drivers create a small wrapper
function for these cases with a void return value, because it makes the
code so much more legible and easier to understand.
-Sima
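
As an illustration of the wrapper idea suggested in this reply, here is a minimal userspace sketch. It assumes a mocked register array rather than real MMIO, and every name in it (`mock_regs`, `reg_write`, `reg_read`, `reg_posting_read`, `MOCK_REG_GBIF_HALT`, `clear_gbif_halt`) is hypothetical and not part of the msm driver. In a real driver the accessors would sit on top of the existing register helpers instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mock of a small register window; stands in for real MMIO. */
#define NUM_REGS 16u
#define MOCK_REG_GBIF_HALT 3u /* made-up offset, for illustration only */

static volatile uint32_t mock_regs[NUM_REGS];
static unsigned int mock_read_count; /* counts reads so the demo is observable */

static void reg_write(uint32_t reg, uint32_t val)
{
	assert(reg < NUM_REGS);
	mock_regs[reg] = val;
}

static uint32_t reg_read(uint32_t reg)
{
	assert(reg < NUM_REGS);
	mock_read_count++;
	return mock_regs[reg];
}

/*
 * Read a register back only to make sure earlier writes to the device
 * have landed. The return type is void on purpose: the value is not
 * interesting, and the distinct name makes these call sites easy to
 * spot in review.
 */
static void reg_posting_read(uint32_t reg)
{
	(void)reg_read(reg);
}

/* Example call site, loosely mirroring the GBIF_HALT case in the patch. */
static void clear_gbif_halt(void)
{
	reg_write(MOCK_REG_GBIF_HALT, 0);
	/* Ensure the unhalt request has reached the hardware before going on. */
	reg_posting_read(MOCK_REG_GBIF_HALT);
}
```

The design point is purely about readability: a discarded `reg_read()` looks like leftover debugging, whereas a named helper with a comment states the intent.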
On Wed, Jun 26, 2024 at 09:59:39AM +0200, Daniel Vetter wrote:
> On Tue, Jun 25, 2024 at 08:54:41PM +0200, Konrad Dybcio wrote:
> > Memory barriers help ensure instruction ordering, NOT time and order
> > of actual write arrival at other observers (e.g. memory-mapped IP).
> > On architectures employing weak memory ordering, the latter can be a
> > giant pain point, and it has been as part of this driver.
> >
> > Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
> > readl/writel, which include r/w (respectively) barriers.
> >
> > Replace the barriers with a readback (or drop altogether where possible)
> > that ensures the previous writes have exited the write buffer (as the CPU
> > must flush the write to the register it's trying to read back).
> >
> > Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
>
> In PCI, these readbacks are actually part of the spec and are called
> posting reads. I'd very much recommend drivers create a small wrapper
> function for these cases with a void return value, because it makes the
> code so much more legible and easier to understand.

For Adreno, which is configured via MMIO, we don't need to do this often.
GBIF_HALT is a scenario where we need to be extra careful, as it can
potentially cause an internal lockup. Another scenario I can think of is GPU
soft reset, where we need to keep a delay on the CPU side after triggering it.
We should closely scrutinize any other instance that comes up. So I feel a
good justification in a comment here would be enough to remind the reader.
Think of it as a way to discourage the use by making it hard.

This is a bit subjective; I am fine if you have a strong opinion on this.

-Akhil.
> -Sima
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
On Thu, Jun 27, 2024 at 02:54:57AM +0530, Akhil P Oommen wrote:
> On Wed, Jun 26, 2024 at 09:59:39AM +0200, Daniel Vetter wrote:
> > On Tue, Jun 25, 2024 at 08:54:41PM +0200, Konrad Dybcio wrote:
> > > Memory barriers help ensure instruction ordering, NOT time and order
> > > of actual write arrival at other observers (e.g. memory-mapped IP).
> > > On architectures employing weak memory ordering, the latter can be a
> > > giant pain point, and it has been as part of this driver.
> > >
> > > Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
> > > readl/writel, which include r/w (respectively) barriers.
> > >
> > > Replace the barriers with a readback (or drop altogether where possible)
> > > that ensures the previous writes have exited the write buffer (as the CPU
> > > must flush the write to the register it's trying to read back).
> > >
> > > Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
> >
> > In PCI, these readbacks are actually part of the spec and are called
> > posting reads. I'd very much recommend drivers create a small wrapper
> > function for these cases with a void return value, because it makes the
> > code so much more legible and easier to understand.
>
> For Adreno, which is configured via MMIO, we don't need to do this often.
> GBIF_HALT is a scenario where we need to be extra careful, as it can
> potentially cause an internal lockup. Another scenario I can think of is GPU
> soft reset, where we need to keep a delay on the CPU side after triggering it.
> We should closely scrutinize any other instance that comes up. So I feel a
> good justification in a comment here would be enough to remind the reader.
> Think of it as a way to discourage the use by making it hard.
>
> This is a bit subjective; I am fine if you have a strong opinion on this.

Eh, it's up to you, but "we don't do this often" is a reason to make them
stand out even more. It's for similar reasons that CPU memory barriers must
all have a comment explaining what they're synchronizing against.

Up to you whether you just want a comment rule, or to make them stand out
even more with an explicit name that's different from normal reads (and
still have the comment rule).

Again comparing to CPU barriers, the nice thing is that they're (in most
cases at least, unless you do really scary stuff) very easy to spot in the
code, and they ring alarm bells when doing reviews.

-Sima
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 0e3dfd4c2bc8..09d640165b18 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -466,9 +466,7 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu)
 	int ret;
 	u32 val;
 
-	gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 1 << 1);
-	/* Wait for the register to finish posting */
-	wmb();
+	gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, BIT(1));
 
 	ret = gmu_poll_timeout(gmu, REG_A6XX_GMU_RSCC_CONTROL_ACK, val,
 		val & (1 << 1), 100, 10000);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index c98cdb1e9326..4083d0cad782 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -855,14 +855,16 @@ static int hw_init(struct msm_gpu *gpu)
 	/* Clear GBIF halt in case GX domain was not collapsed */
 	if (adreno_is_a619_holi(adreno_gpu)) {
 		gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
+		gpu_read(gpu, REG_A6XX_GBIF_HALT);
+
 		gpu_write(gpu, REG_A6XX_RBBM_GPR0_CNTL, 0);
-		/* Let's make extra sure that the GPU can access the memory.. */
-		mb();
+		gpu_read(gpu, REG_A6XX_RBBM_GPR0_CNTL);
 	} else if (a6xx_has_gbif(adreno_gpu)) {
 		gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
+		gpu_read(gpu, REG_A6XX_GBIF_HALT);
+
 		gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
-		/* Let's make extra sure that the GPU can access the memory.. */
-		mb();
+		gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT);
 	}
 
 	/* Some GPUs are stubborn and take their sweet time to unhalt GBIF! */
Memory barriers help ensure instruction ordering, NOT time and order
of actual write arrival at other observers (e.g. memory-mapped IP).
On architectures employing weak memory ordering, the latter can be a
giant pain point, and it has been as part of this driver.

Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
readl/writel, which include r/w (respectively) barriers.

Replace the barriers with a readback (or drop altogether where possible)
that ensures the previous writes have exited the write buffer (as the CPU
must flush the write to the register it's trying to read back).

Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  4 +---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 10 ++++++----
 2 files changed, 7 insertions(+), 7 deletions(-)
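
The distinction drawn in the commit message (ordering versus arrival) can be made concrete with a deliberately simplified toy model. This is only a sketch under stated assumptions: it models a single posted-write queue in plain C, it does not describe the Adreno interconnect or any real CPU, and all names (`toy_bus`, `toy_write`, `toy_barrier`, `toy_read`) are hypothetical. In the model, a barrier does not drain the queue, while a read from the device forces pending writes to land first.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 8u
#define QUEUE_DEPTH 8u

/* Toy bus: writes are queued ("posted") before the device sees them. */
struct toy_bus {
	uint32_t device_regs[NUM_REGS]; /* state as observed by the device */
	struct {
		uint32_t reg;
		uint32_t val;
	} pending[QUEUE_DEPTH];         /* posted writes, in issue order */
	unsigned int npending;
};

/* Deliver all queued writes to the device, oldest first. */
static void toy_drain(struct toy_bus *bus)
{
	for (unsigned int i = 0; i < bus->npending; i++)
		bus->device_regs[bus->pending[i].reg] = bus->pending[i].val;
	bus->npending = 0;
}

/* Post a write: it is queued and may not have reached the device yet. */
static void toy_write(struct toy_bus *bus, uint32_t reg, uint32_t val)
{
	assert(reg < NUM_REGS);
	if (bus->npending == QUEUE_DEPTH)
		toy_drain(bus); /* queue full: oldest writes must land first */
	bus->pending[bus->npending].reg = reg;
	bus->pending[bus->npending].val = val;
	bus->npending++;
}

/*
 * Barrier in this toy model: it only constrains the order in which
 * accesses are issued. The queue already preserves order, so there is
 * nothing to do, and in particular nothing is drained.
 */
static void toy_barrier(struct toy_bus *bus)
{
	(void)bus;
}

/* A read from the device cannot complete until earlier writes have landed. */
static uint32_t toy_read(struct toy_bus *bus, uint32_t reg)
{
	assert(reg < NUM_REGS);
	toy_drain(bus);
	return bus->device_regs[reg];
}
```

With this model, calling `toy_write()` followed by `toy_barrier()` leaves the value sitting in the queue, whereas a following `toy_read()` of the same register guarantees the device has seen it, which is the behaviour the patch relies on when it swaps `mb()`/`wmb()` for a readback.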