diff mbox

[2/2] drm/i915: Do not force non-caching copies for pwrite along shmem path

Message ID 1394181037-30480-2-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson March 7, 2014, 8:30 a.m. UTC
We don't always want to write into main memory with pwrite. The shmem
fast path in particular is used for memory that is cacheable - under
such circumstances forcing the cache eviction is undesirable. As we will
always flush the cache when targeting incoherent buffers, we can rely on
that second pass to apply the cache coherency rules and so benefit from
in-cache copies otherwise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

Daniel Vetter March 7, 2014, 8:39 a.m. UTC | #1
On Fri, Mar 07, 2014 at 08:30:37AM +0000, Chris Wilson wrote:
> We don't always want to write into main memory with pwrite. The shmem
> fast path in particular is used for memory that is cacheable - under
> such circumstances forcing the cache eviction is undesirable. As we will
> always flush the cache when targeting incoherent buffers, we can rely on
> that second pass to apply the cache coherency rules and so benefit from
> in-cache copies otherwise.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Do you have some numbers on this? Looks good otherwise.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 877afb2c576d..e0ca6d6be2ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
>  	if (needs_clflush_before)
>  		drm_clflush_virt_range(vaddr + shmem_page_offset,
>  				       page_length);
> -	ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
> -						user_data,
> -						page_length);
> +	ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
> +					user_data, page_length);
>  	if (needs_clflush_after)
>  		drm_clflush_virt_range(vaddr + shmem_page_offset,
>  				       page_length);
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Chris Wilson March 7, 2014, 9:50 a.m. UTC | #2
On Fri, Mar 07, 2014 at 09:39:44AM +0100, Daniel Vetter wrote:
> On Fri, Mar 07, 2014 at 08:30:37AM +0000, Chris Wilson wrote:
> > We don't always want to write into main memory with pwrite. The shmem
> > fast path in particular is used for memory that is cacheable - under
> > such circumstances forcing the cache eviction is undesirable. As we will
> > always flush the cache when targeting incoherent buffers, we can rely on
> > that second pass to apply the cache coherency rules and so benefit from
> > in-cache copies otherwise.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Do you have some numbers on this? Looks good otherwise.

Comparative figures with 1333MHz DDR3 on crw:
0: Time to snooped copy 16384 bytes x 131072:       19.520µs, 839.4MiB/s
1: Time to snooped copy 16384 bytes x 131072:       19.444µs, 842.6MiB/s
2: Time to snooped copy 16384 bytes x 131072:       18.808µs, 871.1MiB/s

Oddly enough though, it was the removing the page flag accesses that
made the most impact at a higher level.

It will take a while longer to complete checks on pnv.
-Chris
bradley.d.volkin@intel.com March 7, 2014, 6:14 p.m. UTC | #3
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>

On Fri, Mar 07, 2014 at 12:30:37AM -0800, Chris Wilson wrote:
> We don't always want to write into main memory with pwrite. The shmem
> fast path in particular is used for memory that is cacheable - under
> such circumstances forcing the cache eviction is undesirable. As we will
> always flush the cache when targeting incoherent buffers, we can rely on
> that second pass to apply the cache coherency rules and so benefit from
> in-cache copies otherwise.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 877afb2c576d..e0ca6d6be2ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
>  	if (needs_clflush_before)
>  		drm_clflush_virt_range(vaddr + shmem_page_offset,
>  				       page_length);
> -	ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
> -						user_data,
> -						page_length);
> +	ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
> +					user_data, page_length);
>  	if (needs_clflush_after)
>  		drm_clflush_virt_range(vaddr + shmem_page_offset,
>  				       page_length);
> -- 
> 1.9.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Daniel Vetter March 7, 2014, 11:03 p.m. UTC | #4
On Fri, Mar 07, 2014 at 10:14:58AM -0800, Volkin, Bradley D wrote:
> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
> 
> On Fri, Mar 07, 2014 at 12:30:37AM -0800, Chris Wilson wrote:
> > We don't always want to write into main memory with pwrite. The shmem
> > fast path in particular is used for memory that is cacheable - under
> > such circumstances forcing the cache eviction is undesirable. As we will
> > always flush the cache when targeting incoherent buffers, we can rely on
> > that second pass to apply the cache coherency rules and so benefit from
> > in-cache copies otherwise.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Both patches merged, thanks.
-Daniel

> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 877afb2c576d..e0ca6d6be2ae 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
> >  	if (needs_clflush_before)
> >  		drm_clflush_virt_range(vaddr + shmem_page_offset,
> >  				       page_length);
> > -	ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
> > -						user_data,
> > -						page_length);
> > +	ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
> > +					user_data, page_length);
> >  	if (needs_clflush_after)
> >  		drm_clflush_virt_range(vaddr + shmem_page_offset,
> >  				       page_length);
> > -- 
> > 1.9.0
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 877afb2c576d..e0ca6d6be2ae 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -810,9 +810,8 @@  shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
 	if (needs_clflush_before)
 		drm_clflush_virt_range(vaddr + shmem_page_offset,
 				       page_length);
-	ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
-						user_data,
-						page_length);
+	ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
+					user_data, page_length);
 	if (needs_clflush_after)
 		drm_clflush_virt_range(vaddr + shmem_page_offset,
 				       page_length);