[2/2] drm/vgem: use normal cached mmap'ings

Message ID 20190716164221.15436-2-robdclark@gmail.com (mailing list archive)
State New, archived
Series [1/2] drm/gem: don't force writecombine mmap'ing

Commit Message

Rob Clark July 16, 2019, 4:42 p.m. UTC
From: Rob Clark <robdclark@chromium.org>

Since there is no real device associated with vgem, it is impossible to
end up with appropriate dev->dma_ops, meaning that we have no way to
invalidate the shmem pages allocated by vgem.  So, at least on platforms
without drm_clflush_pages(), we end up with corruption when cache lines
from previous usage of vgem bo pages get evicted to memory.

The only sane option is to use cached mappings.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
Possibly we could dma_sync_*_for_{device,cpu}() on dmabuf attach/detach,
although the ->gem_prime_{pin,unpin}() API isn't quite ideal for that as
it is.  And that doesn't really help for drivers that don't attach/
detach for each use.

But AFAICT vgem is mainly used for dmabuf testing, so maybe we don't
need to care too much about use of cached mmap'ings.
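
For illustration, a rough and untested sketch of the attach-time half
of that idea (assuming vgem's existing vgem_pin_pages() helper;
unpin/cleanup paths omitted):

static int vgem_dmabuf_attach(struct dma_buf *dmabuf,
			      struct dma_buf_attachment *attach)
{
	struct drm_vgem_gem_object *bo = to_vgem_bo(dmabuf->priv);
	unsigned int n_pages = dmabuf->size >> PAGE_SHIFT;
	struct page **pages;
	struct sg_table *sgt;

	pages = vgem_pin_pages(bo);
	if (IS_ERR(pages))
		return PTR_ERR(pages);

	sgt = drm_prime_pages_to_sg(pages, n_pages);
	if (IS_ERR(sgt))
		return PTR_ERR(sgt);

	/* Flush dirty CPU cache lines to memory before the importer
	 * DMAs from the pages.  vgem has no device of its own, so the
	 * importer's device has to stand in.
	 */
	dma_sync_sg_for_device(attach->dev, sgt->sgl, sgt->nents,
			       DMA_BIDIRECTIONAL);

	sg_free_table(sgt);
	kfree(sgt);
	return 0;
}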

 drivers/gpu/drm/vgem/vgem_drv.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

Comments

Chris Wilson July 16, 2019, 4:59 p.m. UTC | #1
Quoting Rob Clark (2019-07-16 17:42:15)
> From: Rob Clark <robdclark@chromium.org>
> 
> Since there is no real device associated with vgem, it is impossible to
> end up with appropriate dev->dma_ops, meaning that we have no way to
> invalidate the shmem pages allocated by vgem.  So, at least on platforms
> without drm_clflush_pages(), we end up with corruption when cache lines
> from previous usage of vgem bo pages get evicted to memory.
> 
> The only sane option is to use cached mappings.
> 
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
> Possibly we could dma_sync_*_for_{device,cpu}() on dmabuf attach/detach,
> although the ->gem_prime_{pin,unpin}() API isn't quite ideal for that as
> it is.  And that doesn't really help for drivers that don't attach/
> detach for each use.
> 
> But AFAICT vgem is mainly used for dmabuf testing, so maybe we don't
> need to care too much about use of cached mmap'ings.

Sadly this regresses with i915 interop.

Starting subtest: 4KiB-tiny-vgem-blt-early-read-child
(gem_concurrent_blit:8309) CRITICAL: Test assertion failure function dmabuf_cmp_bo, file ../tests/i915/gem_concurrent_all.c:408:
(gem_concurrent_blit:8309) CRITICAL: Failed assertion: v[((y)*(b->width) + (((y) + pass)%(b->width)))] == val
(gem_concurrent_blit:8309) CRITICAL: error: 0 != 0xdeadbeef

and igt/prime_vgem

Can you please cc intel-gfx so CI can pick up these changes?
-Chris
Rob Clark July 16, 2019, 5:03 p.m. UTC | #2
On Tue, Jul 16, 2019 at 10:01 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Rob Clark (2019-07-16 17:42:15)
> > From: Rob Clark <robdclark@chromium.org>
> >
> > Since there is no real device associated with vgem, it is impossible to
> > end up with appropriate dev->dma_ops, meaning that we have no way to
> > invalidate the shmem pages allocated by vgem.  So, at least on platforms
> > without drm_clflush_pages(), we end up with corruption when cache lines
> > from previous usage of vgem bo pages get evicted to memory.
> >
> > The only sane option is to use cached mappings.
> >
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > ---
> > Possibly we could dma_sync_*_for_{device,cpu}() on dmabuf attach/detach,
> > although the ->gem_prime_{pin,unpin}() API isn't quite ideal for that as
> > it is.  And that doesn't really help for drivers that don't attach/
> > detach for each use.
> >
> > But AFAICT vgem is mainly used for dmabuf testing, so maybe we don't
> > need to care too much about use of cached mmap'ings.
>
> Sadly this regresses with i915 interop.
>
> Starting subtest: 4KiB-tiny-vgem-blt-early-read-child
> (gem_concurrent_blit:8309) CRITICAL: Test assertion failure function dmabuf_cmp_bo, file ../tests/i915/gem_concurrent_all.c:408:
> (gem_concurrent_blit:8309) CRITICAL: Failed assertion: v[((y)*(b->width) + (((y) + pass)%(b->width)))] == val
> (gem_concurrent_blit:8309) CRITICAL: error: 0 != 0xdeadbeef
>
> and igt/prime_vgem
>
> Can you please cc intel-gfx so CI can pick up these changes?
> -Chris

I suppose CI is actually reading the imported VGEM bo from the GPU?  I
can try to wire up the attach/detach dma_sync, which might help...
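
The detach side would be the mirror image, roughly (untested; sgt and
attach as in the attach-time sketch above):

	/* Invalidate CPU cache lines after the importer is done, so
	 * the CPU doesn't read back stale data that DMA overwrote.
	 */
	dma_sync_sg_for_cpu(attach->dev, sgt->sgl, sgt->nents,
			    DMA_BIDIRECTIONAL);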

BR,
-R
Daniel Vetter July 19, 2019, 9:09 a.m. UTC | #3
On Tue, Jul 16, 2019 at 09:42:15AM -0700, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
> 
> Since there is no real device associated with vgem, it is impossible to
> end up with appropriate dev->dma_ops, meaning that we have no way to
> invalidate the shmem pages allocated by vgem.  So, at least on platforms
> without drm_clflush_pages(), we end up with corruption when cache lines
> from previous usage of vgem bo pages get evicted to memory.
> 
> The only sane option is to use cached mappings.
> 
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
> Possibly we could dma_sync_*_for_{device,cpu}() on dmabuf attach/detach,
> although the ->gem_prime_{pin,unpin}() API isn't quite ideal for that as
> it is.  And that doesn't really help for drivers that don't attach/
> detach for each use.
> 
> But AFAICT vgem is mainly used for dmabuf testing, so maybe we don't
> need to care too much about use of cached mmap'ings.

Isn't this going to horribly break testing buffer sharing with SoC
devices? I'd assume they all expect writecombining mode to make sure stuff
is coherent?

Also could we get away with this by simply extending drm_clflush_pages for
those arm platforms where we do have a clflush instruction? Yes I know
that'll get people screaming, I'll shrug :-)
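
A hand-wavy, untested sketch of what that could look like on arm64,
where __flush_dcache_area() does a clean+invalidate to the point of
coherency:

void drm_clflush_pages(struct page *pages[], unsigned long num_pages)
{
#if defined(CONFIG_X86)
	/* ... existing clflush/wbinvd paths ... */
#elif defined(CONFIG_ARM64)
	unsigned long i;

	for (i = 0; i < num_pages; i++) {
		void *virt = kmap_atomic(pages[i]);

		/* Clean + invalidate to PoC so DMA sees what the CPU
		 * wrote, and the CPU won't read back stale lines.
		 */
		__flush_dcache_area(virt, PAGE_SIZE);
		kunmap_atomic(virt);
	}
#endif
}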

If all we need patch 1/2 for is this vgem patch then the auditing needed for
patch 1 doesn't look appealing ...
-Daniel

> 
>  drivers/gpu/drm/vgem/vgem_drv.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
> index 11a8f99ba18c..ccf0c3fbd586 100644
> --- a/drivers/gpu/drm/vgem/vgem_drv.c
> +++ b/drivers/gpu/drm/vgem/vgem_drv.c
> @@ -259,9 +259,6 @@ static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
>  	if (ret)
>  		return ret;
>  
> -	/* Keep the WC mmaping set by drm_gem_mmap() but our pages
> -	 * are ordinary and not special.
> -	 */
>  	vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
>  	return 0;
>  }
> @@ -382,7 +379,7 @@ static void *vgem_prime_vmap(struct drm_gem_object *obj)
>  	if (IS_ERR(pages))
>  		return NULL;
>  
> -	return vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
> +	return vmap(pages, n_pages, 0, PAGE_KERNEL);
>  }
>  
>  static void vgem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
> @@ -411,7 +408,7 @@ static int vgem_prime_mmap(struct drm_gem_object *obj,
>  	fput(vma->vm_file);
>  	vma->vm_file = get_file(obj->filp);
>  	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> -	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
> +	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
>  
>  	return 0;
>  }
> -- 
> 2.21.0
>
Rob Clark July 19, 2019, 3:04 p.m. UTC | #4
On Fri, Jul 19, 2019 at 2:09 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Tue, Jul 16, 2019 at 09:42:15AM -0700, Rob Clark wrote:
> > From: Rob Clark <robdclark@chromium.org>
> >
> > Since there is no real device associated with vgem, it is impossible to
> > end up with appropriate dev->dma_ops, meaning that we have no way to
> > invalidate the shmem pages allocated by vgem.  So, at least on platforms
> > without drm_clflush_pages(), we end up with corruption when cache lines
> > from previous usage of vgem bo pages get evicted to memory.
> >
> > The only sane option is to use cached mappings.
> >
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > ---
> > Possibly we could dma_sync_*_for_{device,cpu}() on dmabuf attach/detach,
> > although the ->gem_prime_{pin,unpin}() API isn't quite ideal for that as
> > it is.  And that doesn't really help for drivers that don't attach/
> > detach for each use.
> >
> > But AFAICT vgem is mainly used for dmabuf testing, so maybe we don't
> > need to care too much about use of cached mmap'ings.
>
> Isn't this going to horribly break testing buffer sharing with SoC
> devices? I'd assume they all expect writecombining mode to make sure stuff
> is coherent?
>
> Also could we get away with this by simply extending drm_clflush_pages for
> those arm platforms where we do have a clflush instruction? Yes I know
> that'll get people screaming, I'll shrug :-)
>
> If all we need patch 1/2 for is this vgem patch then the auditing needed for
> patch 1 doesn't look appealing ...

I think we should go with the simpler approach that keeps WC (though it
kinda relies on an implementation detail of dma-mapping, i.e.
dev->dma_ops==NULL => dma_direct).

IMO the first patch in this series is probably something we should try
to do somehow; it is a bit rude that core helpers force WC.  But I'm
not sure how to land that smoothly.  Perhaps it's worth adding to the
TODO list at any rate.
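
For reference, the implementation detail in question, paraphrased from
the 5.2-era dma-mapping core (kernel/dma/mapping.c):

dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page,
		size_t offset, size_t size, enum dma_data_direction dir,
		unsigned long attrs)
{
	const struct dma_map_ops *ops = get_dma_ops(dev);

	/* No per-device dma_ops installed => take the dma-direct
	 * path, which includes the arch cache maintenance we want.
	 * Nothing guarantees this fallback stays this way, hence
	 * "relies on an implementation detail".
	 */
	if (dma_is_direct(ops))
		return dma_direct_map_page(dev, page, offset, size,
					   dir, attrs);

	return ops->map_page(dev, page, offset, size, dir, attrs);
}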

BR,
-R

> -Daniel
>
> >
> >  drivers/gpu/drm/vgem/vgem_drv.c | 7 ++-----
> >  1 file changed, 2 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
> > index 11a8f99ba18c..ccf0c3fbd586 100644
> > --- a/drivers/gpu/drm/vgem/vgem_drv.c
> > +++ b/drivers/gpu/drm/vgem/vgem_drv.c
> > @@ -259,9 +259,6 @@ static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
> >       if (ret)
> >               return ret;
> >
> > -     /* Keep the WC mmaping set by drm_gem_mmap() but our pages
> > -      * are ordinary and not special.
> > -      */
> >       vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
> >       return 0;
> >  }
> > @@ -382,7 +379,7 @@ static void *vgem_prime_vmap(struct drm_gem_object *obj)
> >       if (IS_ERR(pages))
> >               return NULL;
> >
> > -     return vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
> > +     return vmap(pages, n_pages, 0, PAGE_KERNEL);
> >  }
> >
> >  static void vgem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
> > @@ -411,7 +408,7 @@ static int vgem_prime_mmap(struct drm_gem_object *obj,
> >       fput(vma->vm_file);
> >       vma->vm_file = get_file(obj->filp);
> >       vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> > -     vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
> > +     vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
> >
> >       return 0;
> >  }
> > --
> > 2.21.0
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

Patch

diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index 11a8f99ba18c..ccf0c3fbd586 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -259,9 +259,6 @@  static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
 	if (ret)
 		return ret;
 
-	/* Keep the WC mmaping set by drm_gem_mmap() but our pages
-	 * are ordinary and not special.
-	 */
 	vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
 	return 0;
 }
@@ -382,7 +379,7 @@  static void *vgem_prime_vmap(struct drm_gem_object *obj)
 	if (IS_ERR(pages))
 		return NULL;
 
-	return vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
+	return vmap(pages, n_pages, 0, PAGE_KERNEL);
 }
 
 static void vgem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
@@ -411,7 +408,7 @@  static int vgem_prime_mmap(struct drm_gem_object *obj,
 	fput(vma->vm_file);
 	vma->vm_file = get_file(obj->filp);
 	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
-	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
 
 	return 0;
 }