diff mbox

[1/8] drm/i915: Synchronize pread/pwrite with wait_rendering

Message ID 1377906241-8463-2-git-send-email-benjamin.widawsky@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ben Widawsky Aug. 30, 2013, 11:43 p.m. UTC
lifted from Daniel:
pread/pwrite isn't about the object's domain at all, but purely about
synchronizing for outstanding rendering. Replacing the call to
set_to_gtt_domain with a wait_rendering would imo improve code
readability. Furthermore we could pimp pread to only block for
outstanding writes and not for reads.

Since you're not the first one to trip over this: Can I volunteer you
for a follow-up patch to fix this?

Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

Comments

Chris Wilson Aug. 30, 2013, 11:50 p.m. UTC | #1
On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> lifted from Daniel:
> pread/pwrite isn't about the object's domain at all, but purely about
> synchronizing for outstanding rendering. Replacing the call to
> set_to_gtt_domain with a wait_rendering would imo improve code
> readability. Furthermore we could pimp pread to only block for
> outstanding writes and not for reads.
> 
> Since you're not the first one to trip over this: Can I volunteer you
> for a follow-up patch to fix this?
> 
> Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

This should fail i-g-t...
-Chris
Ben Widawsky Aug. 31, 2013, 3:39 a.m. UTC | #2
On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > lifted from Daniel:
> > pread/pwrite isn't about the object's domain at all, but purely about
> > synchronizing for outstanding rendering. Replacing the call to
> > set_to_gtt_domain with a wait_rendering would imo improve code
> > readability. Furthermore we could pimp pread to only block for
> > outstanding writes and not for reads.
> > 
> > Since you're not the first one to trip over this: Can I volunteer you
> > for a follow-up patch to fix this?
> > 
> > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> This should fail i-g-t...
> -Chris
> 

Daniel, how have I failed your plan?
Daniel Vetter Sept. 2, 2013, 6:32 a.m. UTC | #3
On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote:
> On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > > lifted from Daniel:
> > > pread/pwrite isn't about the object's domain at all, but purely about
> > > synchronizing for outstanding rendering. Replacing the call to
> > > set_to_gtt_domain with a wait_rendering would imo improve code
> > > readability. Furthermore we could pimp pread to only block for
> > > outstanding writes and not for reads.
> > > 
> > > Since you're not the first one to trip over this: Can I volunteer you
> > > for a follow-up patch to fix this?
> > > 
> > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > 
> > This should fail i-g-t...
> > -Chris
> > 
> 
> Daniel, how have I failed your plan?

It should work ... Since the enclosing if-block checks for !cpu domain
(for either reads or writes) that implies that going into the gtt domain
is a noop (or better should be) wrt clflushing and we only wait for
outstanding gpu rendering. wait_rendering is an interface that's been
added afterwards. Unfortunately I've failed to explain this trickery in
either a comment or the commit message. Bad me ;-)

What does QA's patch test system say on a non-llc machine here?
-Daniel
Chris Wilson Sept. 2, 2013, 1:14 p.m. UTC | #4
On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote:
> On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote:
> > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > > > lifted from Daniel:
> > > > pread/pwrite isn't about the object's domain at all, but purely about
> > > > synchronizing for outstanding rendering. Replacing the call to
> > > > set_to_gtt_domain with a wait_rendering would imo improve code
> > > > readability. Furthermore we could pimp pread to only block for
> > > > outstanding writes and not for reads.
> > > > 
> > > > Since you're not the first one to trip over this: Can I volunteer you
> > > > for a follow-up patch to fix this?
> > > > 
> > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > 
> > > This should fail i-g-t...
> > > -Chris
> > > 
> > 
> > Daniel, how have I failed your plan?
> 
> It should work ... Since the enclosing if-block checks for !cpu domain
> (for either reads or writes) that implies that going into the gtt domain
> is a noop (or better should be) wrt clflushing and we only wait for
> outstanding gpu rendering. wait_rendering is an interface that's been
> added afterwards. Unfortunately I've failed to explain this trickery in
> either a comment or the commit message. Bad me ;-)

The issue is that in the patch pwrite is not waiting for any outstanding
GPU reads.
-Chris
Daniel Vetter Sept. 2, 2013, 2:12 p.m. UTC | #5
On Mon, Sep 02, 2013 at 02:14:12PM +0100, Chris Wilson wrote:
> On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote:
> > On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote:
> > > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> > > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > > > > lifted from Daniel:
> > > > > pread/pwrite isn't about the object's domain at all, but purely about
> > > > > synchronizing for outstanding rendering. Replacing the call to
> > > > > set_to_gtt_domain with a wait_rendering would imo improve code
> > > > > readability. Furthermore we could pimp pread to only block for
> > > > > outstanding writes and not for reads.
> > > > > 
> > > > > Since you're not the first one to trip over this: Can I volunteer you
> > > > > for a follow-up patch to fix this?
> > > > > 
> > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > 
> > > > This should fail i-g-t...
> > > > -Chris
> > > > 
> > > 
> > > Daniel, how have I failed your plan?
> > 
> > It should work ... Since the enclosing if-block checks for !cpu domain
> > (for either reads or writes) that implies that going into the gtt domain
> > is a noop (or better should be) wrt clflushing and we only wait for
> > outstanding gpu rendering. wait_rendering is an interface that's been
> > added afterwards. Unfortunately I've failed to explain this trickery in
> > either a comment or the commit message. Bad me ;-)
> 
> The issue is that in the patch pwrite is not waiting for any outstanding
> GPU reads.

Oh right, silly me didn't spot the s/true/false/ switch Ben sneaked in.
This /should/ have been caught by the gem_concurrent_blt subtests that
exercise pwrites ...

Ben can you please check that this indeed blew up on igt? Should fail on
any platform, no special caching mode required.
-Daniel
Daniel Vetter Sept. 3, 2013, 4:08 p.m. UTC | #6
On Mon, Sep 02, 2013 at 04:12:36PM +0200, Daniel Vetter wrote:
> On Mon, Sep 02, 2013 at 02:14:12PM +0100, Chris Wilson wrote:
> > On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote:
> > > On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote:
> > > > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> > > > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > > > > > lifted from Daniel:
> > > > > > pread/pwrite isn't about the object's domain at all, but purely about
> > > > > > synchronizing for outstanding rendering. Replacing the call to
> > > > > > set_to_gtt_domain with a wait_rendering would imo improve code
> > > > > > readability. Furthermore we could pimp pread to only block for
> > > > > > outstanding writes and not for reads.
> > > > > > 
> > > > > > Since you're not the first one to trip over this: Can I volunteer you
> > > > > > for a follow-up patch to fix this?
> > > > > > 
> > > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > > 
> > > > > This should fail i-g-t...
> > > > > -Chris
> > > > > 
> > > > 
> > > > Daniel, how have I failed your plan?
> > > 
> > > It should work ... Since the enclosing if-block checks for !cpu domain
> > > (for either reads or writes) that implies that going into the gtt domain
> > > is a noop (or better should be) wrt clflushing and we only wait for
> > > outstanding gpu rendering. wait_rendering is an interface that's been
> > > added afterwards. Unfortunately I've failed to explain this trickery in
> > > either a comment or the commit message. Bad me ;-)
> > 
> > The issue is that in the patch pwrite is not waiting for any outstanding
> > GPU reads.
> 
> Oh right, silly me didn't spot the s/true/false/ switch Ben sneaked in.
> This /should/ have been caught by the gem_concurrent_blt subtests that
> exercise pwrites ...
> 
> Ben can you please check that this indeed blew up on igt? Should fail on
> any platform, no special caching mode required.

Actually it won't blow up since you always set readonly = false. But it'll
kill the neat read-read optimization ...
-Daniel
Ben Widawsky Sept. 3, 2013, 11:53 p.m. UTC | #7
On Tue, Sep 03, 2013 at 06:08:19PM +0200, Daniel Vetter wrote:
> On Mon, Sep 02, 2013 at 04:12:36PM +0200, Daniel Vetter wrote:
> > On Mon, Sep 02, 2013 at 02:14:12PM +0100, Chris Wilson wrote:
> > > On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote:
> > > > On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote:
> > > > > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote:
> > > > > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote:
> > > > > > > lifted from Daniel:
> > > > > > > pread/pwrite isn't about the object's domain at all, but purely about
> > > > > > > synchronizing for outstanding rendering. Replacing the call to
> > > > > > > set_to_gtt_domain with a wait_rendering would imo improve code
> > > > > > > readability. Furthermore we could pimp pread to only block for
> > > > > > > outstanding writes and not for reads.
> > > > > > > 
> > > > > > > Since you're not the first one to trip over this: Can I volunteer you
> > > > > > > for a follow-up patch to fix this?
> > > > > > > 
> > > > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > > > 
> > > > > > This should fail i-g-t...
> > > > > > -Chris
> > > > > > 
> > > > > 
> > > > > Daniel, how have I failed your plan?
> > > > 
> > > > It should work ... Since the enclosing if-block checks for !cpu domain
> > > > (for either reads or writes) that implies that going into the gtt domain
> > > > is a noop (or better should be) wrt clflushing and we only wait for
> > > > outstanding gpu rendering. wait_rendering is an interface that's been
> > > > added afterwards. Unfortunately I've failed to explain this trickery in
> > > > either a comment or the commit message. Bad me ;-)
> > > 
> > > The issue is that in the patch pwrite is not waiting for any outstanding
> > > GPU reads.
> > 
> > Oh right, silly me didn't spot the s/true/false/ switch Ben sneaked in.
> > This /should/ have been caught by the gem_concurrent_blt subtests that
> > exercise pwrites ...
> > 
> > Ben can you please check that this indeed blew up on igt? Should fail on
> > any platform, no special caching mode required.
> 
> Actually it won't blow up since you always set readonly = false. But it'll
> kill the neat read-read optimization ...
> -Daniel

Doh! Sorry about this. Fixed locally.
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f026153..a839bcb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -41,6 +41,9 @@  static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *o
 static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj,
 						   bool force);
 static __must_check int
+i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
+			       bool readonly);
+static __must_check int
 i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 			   struct i915_address_space *vm,
 			   unsigned alignment,
@@ -430,11 +433,9 @@  i915_gem_shmem_pread(struct drm_device *dev,
 		 * optimizes for the case when the gpu will dirty the data
 		 * anyway again before the next pread happens. */
 		needs_clflush = !cpu_cache_is_coherent(dev, obj->cache_level);
-		if (i915_gem_obj_bound_any(obj)) {
-			ret = i915_gem_object_set_to_gtt_domain(obj, false);
-			if (ret)
-				return ret;
-		}
+		ret = i915_gem_object_wait_rendering(obj, false);
+		if (ret)
+			return ret;
 	}
 
 	ret = i915_gem_object_get_pages(obj);
@@ -746,11 +747,9 @@  i915_gem_shmem_pwrite(struct drm_device *dev,
 		 * optimizes for the case when the gpu will use the data
 		 * right away and we therefore have to clflush anyway. */
 		needs_clflush_after = cpu_write_needs_clflush(obj);
-		if (i915_gem_obj_bound_any(obj)) {
-			ret = i915_gem_object_set_to_gtt_domain(obj, true);
-			if (ret)
-				return ret;
-		}
+		ret = i915_gem_object_wait_rendering(obj, false);
+		if (ret)
+			return ret;
 	}
 	/* Same trick applies to invalidate partially written cachelines read
 	 * before writing. */