diff mbox series

drm/gem_shmem: Use a writecombine mapping for ->vaddr

Message ID 20190529065121.13485-1-boris.brezillon@collabora.com (mailing list archive)
State New, archived
Headers show
Series drm/gem_shmem: Use a writecombine mapping for ->vaddr | expand

Commit Message

Boris Brezillon May 29, 2019, 6:51 a.m. UTC
Right now, the BO is mapped as a cached region when ->vmap() is called
and the underlying object is not a dmabuf.
Doing that makes cache management a bit more complicated (you'd need
to call dma_map/unmap_sg() on the ->sgt field everytime the BO is about
to be passed to the GPU/CPU), so let's map the BO with writecombine
attributes instead (as done in most drivers).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
Found this issue while working on panfrost perfcnt where the GPU dumps
perf counter values in memory and the CPU reads them back in
kernel-space. This patch seems to solve the unpredictable behavior I
had.

I can also go for the other option (call dma_map/unmap/_sg() when
needed) if you think that's more appropriate.
---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Daniel Vetter May 29, 2019, 6:58 a.m. UTC | #1
On Wed, May 29, 2019 at 08:51:21AM +0200, Boris Brezillon wrote:
> Right now, the BO is mapped as a cached region when ->vmap() is called
> and the underlying object is not a dmabuf.
> Doing that makes cache management a bit more complicated (you'd need
> to call dma_map/unmap_sg() on the ->sgt field everytime the BO is about
> to be passed to the GPU/CPU), so let's map the BO with writecombine
> attributes instead (as done in most drivers).
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> Found this issue while working on panfrost perfcnt where the GPU dumps
> perf counter values in memory and the CPU reads them back in
> kernel-space. This patch seems to solve the unpredictable behavior I
> had.
> 
> I can also go for the other option (call dma_map/unmap/_sg() when
> needed) if you think that's more appropriate.

Uh, I guess shmem helpers (or gem helpers in general) need some concept
about what kind of cpu mapping is desired. Since some cpus (like e.g.
i915) do actually want cached mode for everything.

Same is needed for the userspace mmap, those should all agree.

Default probably best if we go with uncached.
-Daniel

> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 1ee208c2c85e..472ea5d81f82 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -255,7 +255,8 @@ static void *drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem)
>  	if (obj->import_attach)
>  		shmem->vaddr = dma_buf_vmap(obj->import_attach->dmabuf);
>  	else
> -		shmem->vaddr = vmap(shmem->pages, obj->size >> PAGE_SHIFT, VM_MAP, PAGE_KERNEL);
> +		shmem->vaddr = vmap(shmem->pages, obj->size >> PAGE_SHIFT,
> +				    VM_MAP, pgprot_writecombine(PAGE_KERNEL));
>  
>  	if (!shmem->vaddr) {
>  		DRM_DEBUG_KMS("Failed to vmap pages\n");
> -- 
> 2.20.1
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Eric Anholt May 31, 2019, 3:46 p.m. UTC | #2
Boris Brezillon <boris.brezillon@collabora.com> writes:

> Right now, the BO is mapped as a cached region when ->vmap() is called
> and the underlying object is not a dmabuf.
> Doing that makes cache management a bit more complicated (you'd need
> to call dma_map/unmap_sg() on the ->sgt field everytime the BO is about
> to be passed to the GPU/CPU), so let's map the BO with writecombine
> attributes instead (as done in most drivers).
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> Found this issue while working on panfrost perfcnt where the GPU dumps
> perf counter values in memory and the CPU reads them back in
> kernel-space. This patch seems to solve the unpredictable behavior I
> had.
>
> I can also go for the other option (call dma_map/unmap/_sg() when
> needed) if you think that's more appropriate.

writecombined was the intent, and this makes kernel vmap match the
userspace mmap path.

Reviewed-by: Eric Anholt <eric@anholt.net>
Daniel Vetter June 3, 2019, 7:21 a.m. UTC | #3
On Fri, May 31, 2019 at 08:46:58AM -0700, Eric Anholt wrote:
> Boris Brezillon <boris.brezillon@collabora.com> writes:
> 
> > Right now, the BO is mapped as a cached region when ->vmap() is called
> > and the underlying object is not a dmabuf.
> > Doing that makes cache management a bit more complicated (you'd need
> > to call dma_map/unmap_sg() on the ->sgt field everytime the BO is about
> > to be passed to the GPU/CPU), so let's map the BO with writecombine
> > attributes instead (as done in most drivers).
> >
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > ---
> > Found this issue while working on panfrost perfcnt where the GPU dumps
> > perf counter values in memory and the CPU reads them back in
> > kernel-space. This patch seems to solve the unpredictable behavior I
> > had.
> >
> > I can also go for the other option (call dma_map/unmap/_sg() when
> > needed) if you think that's more appropriate.
> 
> writecombined was the intent, and this makes kernel vmap match the
> userspace mmap path.

Since I missed that obviously: Where do the shmem helpers set write
combined mode for userspace mmap?
-Daniel

> 
> Reviewed-by: Eric Anholt <eric@anholt.net>



> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Eric Anholt June 3, 2019, 3:43 p.m. UTC | #4
Daniel Vetter <daniel@ffwll.ch> writes:

> On Fri, May 31, 2019 at 08:46:58AM -0700, Eric Anholt wrote:
>> Boris Brezillon <boris.brezillon@collabora.com> writes:
>> 
>> > Right now, the BO is mapped as a cached region when ->vmap() is called
>> > and the underlying object is not a dmabuf.
>> > Doing that makes cache management a bit more complicated (you'd need
>> > to call dma_map/unmap_sg() on the ->sgt field everytime the BO is about
>> > to be passed to the GPU/CPU), so let's map the BO with writecombine
>> > attributes instead (as done in most drivers).
>> >
>> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
>> > ---
>> > Found this issue while working on panfrost perfcnt where the GPU dumps
>> > perf counter values in memory and the CPU reads them back in
>> > kernel-space. This patch seems to solve the unpredictable behavior I
>> > had.
>> >
>> > I can also go for the other option (call dma_map/unmap/_sg() when
>> > needed) if you think that's more appropriate.
>> 
>> writecombined was the intent, and this makes kernel vmap match the
>> userspace mmap path.
>
> Since I missed that obviously: Where do the shmem helpers set write
> combined mode for userspace mmap?

That was the trick when I asked the question, too.  drm_gem.c is what
sets WC by default.
Daniel Vetter June 3, 2019, 3:57 p.m. UTC | #5
On Mon, Jun 3, 2019 at 5:43 PM Eric Anholt <eric@anholt.net> wrote:
>
> Daniel Vetter <daniel@ffwll.ch> writes:
>
> > On Fri, May 31, 2019 at 08:46:58AM -0700, Eric Anholt wrote:
> >> Boris Brezillon <boris.brezillon@collabora.com> writes:
> >>
> >> > Right now, the BO is mapped as a cached region when ->vmap() is called
> >> > and the underlying object is not a dmabuf.
> >> > Doing that makes cache management a bit more complicated (you'd need
> >> > to call dma_map/unmap_sg() on the ->sgt field everytime the BO is about
> >> > to be passed to the GPU/CPU), so let's map the BO with writecombine
> >> > attributes instead (as done in most drivers).
> >> >
> >> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> >> > ---
> >> > Found this issue while working on panfrost perfcnt where the GPU dumps
> >> > perf counter values in memory and the CPU reads them back in
> >> > kernel-space. This patch seems to solve the unpredictable behavior I
> >> > had.
> >> >
> >> > I can also go for the other option (call dma_map/unmap/_sg() when
> >> > needed) if you think that's more appropriate.
> >>
> >> writecombined was the intent, and this makes kernel vmap match the
> >> userspace mmap path.
> >
> > Since I missed that obviously: Where do the shmem helpers set write
> > combined mode for userspace mmap?
>
> That was the trick when I asked the question, too.  drm_gem.c is what
> sets WC by default.

TIL. And looks like this has been the case forever, it laned in

commit a2c0a97b784f837300f7b0869c82ab712c600952
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Wed Nov 5 10:31:53 2008 -0800

    drm: GEM mmap support

I'll retract my concern, making this wc everywhere is the right thing to do.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Cheers, Daniel
Rob Herring June 10, 2019, 3:40 p.m. UTC | #6
On Wed, May 29, 2019 at 12:51 AM Boris Brezillon
<boris.brezillon@collabora.com> wrote:
>
> Right now, the BO is mapped as a cached region when ->vmap() is called
> and the underlying object is not a dmabuf.
> Doing that makes cache management a bit more complicated (you'd need
> to call dma_map/unmap_sg() on the ->sgt field everytime the BO is about
> to be passed to the GPU/CPU), so let's map the BO with writecombine
> attributes instead (as done in most drivers).
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
> Found this issue while working on panfrost perfcnt where the GPU dumps
> perf counter values in memory and the CPU reads them back in
> kernel-space. This patch seems to solve the unpredictable behavior I
> had.
>
> I can also go for the other option (call dma_map/unmap/_sg() when
> needed) if you think that's more appropriate.

Using writecombine everywhere was the intention, but I missed this
spot. I've applied this adding a Fixes tag.

Rob
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 1ee208c2c85e..472ea5d81f82 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -255,7 +255,8 @@  static void *drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem)
 	if (obj->import_attach)
 		shmem->vaddr = dma_buf_vmap(obj->import_attach->dmabuf);
 	else
-		shmem->vaddr = vmap(shmem->pages, obj->size >> PAGE_SHIFT, VM_MAP, PAGE_KERNEL);
+		shmem->vaddr = vmap(shmem->pages, obj->size >> PAGE_SHIFT,
+				    VM_MAP, pgprot_writecombine(PAGE_KERNEL));
 
 	if (!shmem->vaddr) {
 		DRM_DEBUG_KMS("Failed to vmap pages\n");