Message ID | 1377906241-8463-2-git-send-email-benjamin.widawsky@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote: > lifted from Daniel: > pread/pwrite isn't about the object's domain at all, but purely about > synchronizing for outstanding rendering. Replacing the call to > set_to_gtt_domain with a wait_rendering would imo improve code > readability. Furthermore we could pimp pread to only block for > outstanding writes and not for reads. > > Since you're not the first one to trip over this: Can I volunteer you > for a follow-up patch to fix this? > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> This should fail i-g-t... -Chris
On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote: > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote: > > lifted from Daniel: > > pread/pwrite isn't about the object's domain at all, but purely about > > synchronizing for outstanding rendering. Replacing the call to > > set_to_gtt_domain with a wait_rendering would imo improve code > > readability. Furthermore we could pimp pread to only block for > > outstanding writes and not for reads. > > > > Since you're not the first one to trip over this: Can I volunteer you > > for a follow-up patch to fix this? > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > This should fail i-g-t... > -Chris > Daniel, how have I failed your plan?
On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote: > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote: > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote: > > > lifted from Daniel: > > > pread/pwrite isn't about the object's domain at all, but purely about > > > synchronizing for outstanding rendering. Replacing the call to > > > set_to_gtt_domain with a wait_rendering would imo improve code > > > readability. Furthermore we could pimp pread to only block for > > > outstanding writes and not for reads. > > > > > > Since you're not the first one to trip over this: Can I volunteer you > > > for a follow-up patch to fix this? > > > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > > > This should fail i-g-t... > > -Chris > > > > Daniel, how have I failed your plan? It should work ... Since the enclosing if-block checks for !cpu domain (for either reads or writes) that implies that going into the gtt domain is a noop (or better should be) wrt clflushing and we only wait for outstanding gpu rendering. wait_rendering is an interface that's been added afterwards. Unfortunately I've failed to explain this trickery in either a comment or the commit message. Bad me ;-) What does QA's patch test system say on a non-llc machine here? -Daniel
On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote: > On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote: > > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote: > > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote: > > > > lifted from Daniel: > > > > pread/pwrite isn't about the object's domain at all, but purely about > > > > synchronizing for outstanding rendering. Replacing the call to > > > > set_to_gtt_domain with a wait_rendering would imo improve code > > > > readability. Furthermore we could pimp pread to only block for > > > > outstanding writes and not for reads. > > > > > > > > Since you're not the first one to trip over this: Can I volunteer you > > > > for a follow-up patch to fix this? > > > > > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > > > > > This should fail i-g-t... > > > -Chris > > > > > > > Daniel, how have I failed your plan? > > It should work ... Since the enclosing if-block checks for !cpu domain > (for either reads or writes) that implies that going into the gtt domain > is a noop (or better should be) wrt clflushing and we only wait for > outstanding gpu rendering. wait_rendering is an interface that's been > added afterwards. Unfortunately I've failed to explain this trickery in > either a comment or the commit message. Bad me ;-) The issue is that in the patch pwrite is not waiting for any outstanding GPU reads. -Chris
On Mon, Sep 02, 2013 at 02:14:12PM +0100, Chris Wilson wrote: > On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote: > > On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote: > > > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote: > > > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote: > > > > > lifted from Daniel: > > > > > pread/pwrite isn't about the object's domain at all, but purely about > > > > > synchronizing for outstanding rendering. Replacing the call to > > > > > set_to_gtt_domain with a wait_rendering would imo improve code > > > > > readability. Furthermore we could pimp pread to only block for > > > > > outstanding writes and not for reads. > > > > > > > > > > Since you're not the first one to trip over this: Can I volunteer you > > > > > for a follow-up patch to fix this? > > > > > > > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > > > > > > > This should fail i-g-t... > > > > -Chris > > > > > > > > > > Daniel, how have I failed your plan? > > > > It should work ... Since the enclosing if-block checks for !cpu domain > > (for either reads or writes) that implies that going into the gtt domain > > is a noop (or better should be) wrt clflushing and we only wait for > > outstanding gpu rendering. wait_rendering is an interface that's been > > added afterwards. Unfortunately I've failed to explain this trickery in > > either a comment or the commit message. Bad me ;-) > > The issue is that in the patch pwrite is not waiting for any outstanding > GPU reads. Oh right, silly me didn't spot the s/true/false/ switch Ben sneaked in. This /should/ have been caught by the gem_concurrent_blt subtests that exercise pwrites ... Ben can you please check that this indeed blew up on igt? Should fail on any platform, no special caching mode required. -Daniel
On Mon, Sep 02, 2013 at 04:12:36PM +0200, Daniel Vetter wrote: > On Mon, Sep 02, 2013 at 02:14:12PM +0100, Chris Wilson wrote: > > On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote: > > > On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote: > > > > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote: > > > > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote: > > > > > > lifted from Daniel: > > > > > > pread/pwrite isn't about the object's domain at all, but purely about > > > > > > synchronizing for outstanding rendering. Replacing the call to > > > > > > set_to_gtt_domain with a wait_rendering would imo improve code > > > > > > readability. Furthermore we could pimp pread to only block for > > > > > > outstanding writes and not for reads. > > > > > > > > > > > > Since you're not the first one to trip over this: Can I volunteer you > > > > > > for a follow-up patch to fix this? > > > > > > > > > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > > > > > > > > > This should fail i-g-t... > > > > > -Chris > > > > > > > > > > > > > Daniel, how have I failed your plan? > > > > > > It should work ... Since the enclosing if-block checks for !cpu domain > > > (for either reads or writes) that implies that going into the gtt domain > > > is a noop (or better should be) wrt clflushing and we only wait for > > > outstanding gpu rendering. wait_rendering is an interface that's been > > > added afterwards. Unfortunately I've failed to explain this trickery in > > > either a comment or the commit message. Bad me ;-) > > > > The issue is that in the patch pwrite is not waiting for any outstanding > > GPU reads. > > Oh right, silly me didn't spot the s/true/false/ switch Ben sneaked in. > This /should/ have been caught by the gem_concurrent_blt subtests that > exercise pwrites ... > > Ben can you please check that this indeed blew up on igt? Should fail on > any platform, no special caching mode required. Actually it won't blow up since you always set readonly = false. But it'll kill the neat read-read optimization ... -Daniel
On Tue, Sep 03, 2013 at 06:08:19PM +0200, Daniel Vetter wrote: > On Mon, Sep 02, 2013 at 04:12:36PM +0200, Daniel Vetter wrote: > > On Mon, Sep 02, 2013 at 02:14:12PM +0100, Chris Wilson wrote: > > > On Mon, Sep 02, 2013 at 08:32:24AM +0200, Daniel Vetter wrote: > > > > On Fri, Aug 30, 2013 at 08:39:46PM -0700, Ben Widawsky wrote: > > > > > On Sat, Aug 31, 2013 at 12:50:30AM +0100, Chris Wilson wrote: > > > > > > On Fri, Aug 30, 2013 at 04:43:54PM -0700, Ben Widawsky wrote: > > > > > > > lifted from Daniel: > > > > > > > pread/pwrite isn't about the object's domain at all, but purely about > > > > > > > synchronizing for outstanding rendering. Replacing the call to > > > > > > > set_to_gtt_domain with a wait_rendering would imo improve code > > > > > > > readability. Furthermore we could pimp pread to only block for > > > > > > > outstanding writes and not for reads. > > > > > > > > > > > > > > Since you're not the first one to trip over this: Can I volunteer you > > > > > > > for a follow-up patch to fix this? > > > > > > > > > > > > > > Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > > > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > > > > > > > > > > > This should fail i-g-t... > > > > > > -Chris > > > > > > > > > > > > > > > > Daniel, how have I failed your plan? > > > > > > > > It should work ... Since the enclosing if-block checks for !cpu domain > > > > (for either reads or writes) that implies that going into the gtt domain > > > > is a noop (or better should be) wrt clflushing and we only wait for > > > > outstanding gpu rendering. wait_rendering is an interface that's been > > > > added afterwards. Unfortunately I've failed to explain this trickery in > > > > either a comment or the commit message. Bad me ;-) > > > > > > The issue is that in the patch pwrite is not waiting for any outstanding > > > GPU reads. > > > > Oh right, silly me didn't spot the s/true/false/ switch Ben sneaked in. > > This /should/ have been caught by the gem_concurrent_blt subtests that > > exercise pwrites ... > > > > Ben can you please check that this indeed blew up on igt? Should fail on > > any platform, no special caching mode required. > > Actually it won't blow up since you always set readonly = false. But it'll > kill the neat read-read optimization ... > -Daniel Doh! Sorry about this. Fixed locally.
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index f026153..a839bcb 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -41,6 +41,9 @@ static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *o static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj, bool force); static __must_check int +i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj, + bool readonly); +static __must_check int i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj, struct i915_address_space *vm, unsigned alignment, @@ -430,11 +433,9 @@ i915_gem_shmem_pread(struct drm_device *dev, * optimizes for the case when the gpu will dirty the data * anyway again before the next pread happens. */ needs_clflush = !cpu_cache_is_coherent(dev, obj->cache_level); - if (i915_gem_obj_bound_any(obj)) { - ret = i915_gem_object_set_to_gtt_domain(obj, false); - if (ret) - return ret; - } + ret = i915_gem_object_wait_rendering(obj, false); + if (ret) + return ret; } ret = i915_gem_object_get_pages(obj); @@ -746,11 +747,9 @@ i915_gem_shmem_pwrite(struct drm_device *dev, * optimizes for the case when the gpu will use the data * right away and we therefore have to clflush anyway. */ needs_clflush_after = cpu_write_needs_clflush(obj); - if (i915_gem_obj_bound_any(obj)) { - ret = i915_gem_object_set_to_gtt_domain(obj, true); - if (ret) - return ret; - } + ret = i915_gem_object_wait_rendering(obj, false); + if (ret) + return ret; } /* Same trick applies to invalidate partially written cachelines read * before writing. */
lifted from Daniel: pread/pwrite isn't about the object's domain at all, but purely about synchronizing for outstanding rendering. Replacing the call to set_to_gtt_domain with a wait_rendering would imo improve code readability. Furthermore we could pimp pread to only block for outstanding writes and not for reads. Since you're not the first one to trip over this: Can I volunteer you for a follow-up patch to fix this? Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> --- drivers/gpu/drm/i915/i915_gem.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-)