Message ID | 1394181037-30480-2-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
---|---
State | New, archived
On Fri, Mar 07, 2014 at 08:30:37AM +0000, Chris Wilson wrote:
> We don't always want to write into main memory with pwrite. The shmem
> fast path in particular is used for memory that is cacheable - under
> such circumstances forcing the cache eviction is undesirable. As we will
> always flush the cache when targeting incoherent buffers, we can rely on
> that second pass to apply the cache coherency rules and so benefit from
> in-cache copies otherwise.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Do you have some numbers on this? Looks good otherwise.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 877afb2c576d..e0ca6d6be2ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
>  	if (needs_clflush_before)
>  		drm_clflush_virt_range(vaddr + shmem_page_offset,
>  				       page_length);
> -	ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
> -						user_data,
> -						page_length);
> +	ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
> +					user_data, page_length);
>  	if (needs_clflush_after)
>  		drm_clflush_virt_range(vaddr + shmem_page_offset,
>  				       page_length);
> --
> 1.9.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Fri, Mar 07, 2014 at 09:39:44AM +0100, Daniel Vetter wrote:
> On Fri, Mar 07, 2014 at 08:30:37AM +0000, Chris Wilson wrote:
> > We don't always want to write into main memory with pwrite. The shmem
> > fast path in particular is used for memory that is cacheable - under
> > such circumstances forcing the cache eviction is undesirable. As we will
> > always flush the cache when targeting incoherent buffers, we can rely on
> > that second pass to apply the cache coherency rules and so benefit from
> > in-cache copies otherwise.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> Do you have some numbers on this? Looks good otherwise.

Comparative figures with 1333MHz DDR3 on crw:

0: Time to snooped copy 16384 bytes x 131072: 19.520µs, 839.4MiB/s
1: Time to snooped copy 16384 bytes x 131072: 19.444µs, 842.6MiB/s
2: Time to snooped copy 16384 bytes x 131072: 18.808µs, 871.1MiB/s

Oddly enough though, it was removing the page flag accesses that made
the most impact at a higher level. It will take a while longer to
complete checks on pnv.
-Chris
Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>

On Fri, Mar 07, 2014 at 12:30:37AM -0800, Chris Wilson wrote:
> We don't always want to write into main memory with pwrite. The shmem
> fast path in particular is used for memory that is cacheable - under
> such circumstances forcing the cache eviction is undesirable. As we will
> always flush the cache when targeting incoherent buffers, we can rely on
> that second pass to apply the cache coherency rules and so benefit from
> in-cache copies otherwise.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 877afb2c576d..e0ca6d6be2ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
>  	if (needs_clflush_before)
>  		drm_clflush_virt_range(vaddr + shmem_page_offset,
>  				       page_length);
> -	ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
> -						user_data,
> -						page_length);
> +	ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
> +					user_data, page_length);
>  	if (needs_clflush_after)
>  		drm_clflush_virt_range(vaddr + shmem_page_offset,
>  				       page_length);
> --
> 1.9.0
On Fri, Mar 07, 2014 at 10:14:58AM -0800, Volkin, Bradley D wrote:
> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com>
>
> On Fri, Mar 07, 2014 at 12:30:37AM -0800, Chris Wilson wrote:
> > We don't always want to write into main memory with pwrite. The shmem
> > fast path in particular is used for memory that is cacheable - under
> > such circumstances forcing the cache eviction is undesirable. As we will
> > always flush the cache when targeting incoherent buffers, we can rely on
> > that second pass to apply the cache coherency rules and so benefit from
> > in-cache copies otherwise.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Both patches merged, thanks.
-Daniel

> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 877afb2c576d..e0ca6d6be2ae 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
> >  	if (needs_clflush_before)
> >  		drm_clflush_virt_range(vaddr + shmem_page_offset,
> >  				       page_length);
> > -	ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
> > -						user_data,
> > -						page_length);
> > +	ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
> > +					user_data, page_length);
> >  	if (needs_clflush_after)
> >  		drm_clflush_virt_range(vaddr + shmem_page_offset,
> >  				       page_length);
> > --
> > 1.9.0
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 877afb2c576d..e0ca6d6be2ae 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -810,9 +810,8 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
 	if (needs_clflush_before)
 		drm_clflush_virt_range(vaddr + shmem_page_offset,
 				       page_length);
-	ret = __copy_from_user_inatomic_nocache(vaddr + shmem_page_offset,
-						user_data,
-						page_length);
+	ret = __copy_from_user_inatomic(vaddr + shmem_page_offset,
+					user_data, page_length);
 	if (needs_clflush_after)
 		drm_clflush_virt_range(vaddr + shmem_page_offset,
 				       page_length);
We don't always want to write into main memory with pwrite. The shmem
fast path in particular is used for memory that is cacheable - under
such circumstances forcing the cache eviction is undesirable. As we will
always flush the cache when targeting incoherent buffers, we can rely on
that second pass to apply the cache coherency rules and so benefit from
in-cache copies otherwise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)