Message ID | 1376304377-11695-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
On Mon, Aug 12, 2013 at 11:46:17AM +0100, Chris Wilson wrote:
> By our earlier reckoning, move from a snooped/llc setting to an uncached
> setting, leaves the CPU cache in a consistent state irrespective of our
> domain tracking - so we can forgo the warning about the lack of
> invalidation. Similarly for any writes posted to the snooped CPU domain,
> we know will be safely clflushed to the uncached PTEs after forcing the
> domain change.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>

I ran into this several times while doing the PPGTT development, and was
always scared to just remove it. Does it make sense to keep the
write_domain assertion with this gone?

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 925c77d..1d3e57e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3520,7 +3520,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>                   * Just set it to the CPU cache for now.
>                   */
>                  WARN_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
> -                WARN_ON(obj->base.read_domains & ~I915_GEM_DOMAIN_CPU);
>
>                  old_read_domains = obj->base.read_domains;
>                  old_write_domain = obj->base.write_domain;
> --
> 1.8.4.rc2
On Mon, Aug 12, 2013 at 02:02:09PM -0700, Ben Widawsky wrote:
> On Mon, Aug 12, 2013 at 11:46:17AM +0100, Chris Wilson wrote:
> > By our earlier reckoning, move from a snooped/llc setting to an uncached
> > setting, leaves the CPU cache in a consistent state irrespective of our
> > domain tracking - so we can forgo the warning about the lack of
> > invalidation. Similarly for any writes posted to the snooped CPU domain,
> > we know will be safely clflushed to the uncached PTEs after forcing the
> > domain change.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> I ran into this several times while doing the PPGTT development, and was
> always scared to just remove it. Does it make sense to keep the
> write_domain assertion with this gone?

I think we've justified in the earlier series why we can drop the
WARN_ON(write) with impunity. As we don't need to do so immediately, I'd
like to sleep on it for a while.
-Chris
On Mon, Aug 12, 2013 at 11:46:17AM +0100, Chris Wilson wrote:
> By our earlier reckoning, move from a snooped/llc setting to an uncached
> setting, leaves the CPU cache in a consistent state irrespective of our
> domain tracking - so we can forgo the warning about the lack of
> invalidation. Similarly for any writes posted to the snooped CPU domain,
> we know will be safely clflushed to the uncached PTEs after forcing the
> domain change.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68040
Tested-by: cancan,feng <cancan.feng@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 925c77d..1d3e57e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3520,7 +3520,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>                   * Just set it to the CPU cache for now.
>                   */
>                  WARN_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
> -                WARN_ON(obj->base.read_domains & ~I915_GEM_DOMAIN_CPU);
>
>                  old_read_domains = obj->base.read_domains;
>                  old_write_domain = obj->base.write_domain;
> --
> 1.8.4.rc2
On Tue, Aug 13, 2013 at 09:54:48AM +0200, Daniel Vetter wrote:
> On Mon, Aug 12, 2013 at 11:46:17AM +0100, Chris Wilson wrote:
> > By our earlier reckoning, move from a snooped/llc setting to an uncached
> > setting, leaves the CPU cache in a consistent state irrespective of our
> > domain tracking - so we can forgo the warning about the lack of
> > invalidation. Similarly for any writes posted to the snooped CPU domain,
> > we know will be safely clflushed to the uncached PTEs after forcing the
> > domain change.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68040
> Tested-by: cancan,feng <cancan.feng@intel.com>

Oh and QA blames that this WARN newly pops up on

commit d46f1c3f1372e3a72fab97c60480aa4a1084387f
Author:     Chris Wilson <chris@chris-wilson.co.uk>
AuthorDate: Thu Aug 8 14:41:06 2013 +0100
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Sat Aug 10 11:24:18 2013 +0200

    drm/i915: Allow the GPU to cache stolen memory
On Mon, Aug 12, 2013 at 11:46:17AM +0100, Chris Wilson wrote:
> By our earlier reckoning, move from a snooped/llc setting to an uncached
> setting, leaves the CPU cache in a consistent state irrespective of our
> domain tracking - so we can forgo the warning about the lack of
> invalidation. Similarly for any writes posted to the snooped CPU domain,
> we know will be safely clflushed to the uncached PTEs after forcing the
> domain change.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 925c77d..1d3e57e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3520,7 +3520,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>                   * Just set it to the CPU cache for now.
>                   */
>                  WARN_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
> -                WARN_ON(obj->base.read_domains & ~I915_GEM_DOMAIN_CPU);

AFAICS this can only be reached by stolen objs starting in GTT read
domain.

Normally set_cache_level checks if the object is bound and then calls
finish_gtt, and unbind also calls finish_gtt, and GPU domain is handled
in a similar way. So I don't see that we can end up here any other way.

Based on that, both WARNs seem rather pointless actually. Then again
I'm not really sure what we gain from setting stolen objs to GTT read
domain initially. The write domain check might make a bit of sense,
except for the fact that finish_gtt/gpu clears it just before.

Thinking about this stuff a bit, I think I actually came up with a
scenario where we would currently fail to invalidate the CPU cache
between non-snooped GPU/GTT access and CPU access:

1. make bo non-snooped w/ pin_display=true (wd=0, rd|=gtt)
2. set to CPU read domain (wd=0 rd|=cpu)
3. set to GTT (or GPU) write domain (wd=gtt, rd=gtt) -> CPU cache is stale after this point
4. make bo snooped -> pin_display=true still so we directly set (wd=cpu, rd=cpu)
5. set to CPU domain -> CPU cache is still stale

>
>                  old_read_domains = obj->base.read_domains;
>                  old_write_domain = obj->base.write_domain;
> --
> 1.8.4.rc2
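The five steps in the scenario above can be traced with a small toy model. This is only a sketch and not driver code: every `toy_` and `TOY_` name is hypothetical, the domain bit values are arbitrary, and "stale" is reduced to a single flag that a non-snooped GTT write sets and an invalidation clears. It shows why the final CPU access skips the invalidation: step 4 has already put the CPU bit into the tracked read domains.

```c
#include <stdbool.h>

/* Toy domain bits; the values are illustrative, not taken from the driver. */
#define TOY_DOMAIN_CPU 0x1u
#define TOY_DOMAIN_GTT 0x2u

struct toy_bo {
    unsigned read_domains;
    unsigned write_domain;
    bool snooped;
    bool pin_display;
    bool cpu_cache_stale;   /* ground truth the tracking is meant to protect */
};

/* Step 1: non-snooped scanout buffer (wd=0, rd|=gtt). */
static void toy_pin_for_display(struct toy_bo *bo)
{
    bo->snooped = false;
    bo->pin_display = true;
    bo->write_domain = 0;
    bo->read_domains |= TOY_DOMAIN_GTT;
}

/* Steps 2 and 5: CPU read access; invalidate only if tracking says we must. */
static void toy_set_to_cpu_read(struct toy_bo *bo)
{
    if (!(bo->read_domains & TOY_DOMAIN_CPU)) {
        bo->cpu_cache_stale = false;    /* stands in for a cache invalidation */
        bo->read_domains |= TOY_DOMAIN_CPU;
    }
}

/* Step 3: GTT write; without snooping it bypasses the CPU cache. */
static void toy_set_to_gtt_write(struct toy_bo *bo)
{
    bo->read_domains = TOY_DOMAIN_GTT;
    bo->write_domain = TOY_DOMAIN_GTT;
    if (!bo->snooped)
        bo->cpu_cache_stale = true;
}

/* Step 4: made snooped while still pinned; domains forced, no invalidation. */
static void toy_make_snooped(struct toy_bo *bo)
{
    bo->snooped = true;
    bo->read_domains = TOY_DOMAIN_CPU;
    bo->write_domain = TOY_DOMAIN_CPU;
}

/* Runs steps 1-5; returns true if the CPU would read stale data at the end. */
static bool toy_scenario_ends_stale(bool skip_step4)
{
    struct toy_bo bo = {0};

    toy_pin_for_display(&bo);
    toy_set_to_cpu_read(&bo);
    toy_set_to_gtt_write(&bo);
    if (!skip_step4)
        toy_make_snooped(&bo);
    toy_set_to_cpu_read(&bo);
    return bo.cpu_cache_stale;
}
```

In this model `toy_scenario_ends_stale(false)` reports a stale cache, while skipping step 4 lets the final CPU access invalidate as usual.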
On Tue, Aug 13, 2013 at 03:12:59PM +0300, Ville Syrjälä wrote:
> Thinking about this stuff a bit, I think I actually came up with a
> scenario where we would currently fail to invalidate the CPU cache
> between non-snooped GPU/GTT access and CPU access:
>
> 1. make bo non-snooped w/ pin_display=true (wd=0, rd|=gtt)
> 2. set to CPU read domain (wd=0 rd|=cpu)
> 3. set to GTT (or GPU) write domain (wd=gtt, rd=gtt) -> CPU cache is stale after this point
> 4. make bo snooped -> pin_display=true still so we directly set (wd=cpu, rd=cpu)
> 5. set to CPU domain -> CPU cache is still stale

You will also find the scanout reads stale data as well. You've managed
to shoot yourself in both feet. The kernel can't fix that, so should we
care about the other foot?
-Chris
On Tue, Aug 13, 2013 at 01:20:13PM +0100, Chris Wilson wrote:
> On Tue, Aug 13, 2013 at 03:12:59PM +0300, Ville Syrjälä wrote:
> > Thinking about this stuff a bit, I think I actually came up with a
> > scenario where we would currently fail to invalidate the CPU cache
> > between non-snooped GPU/GTT access and CPU access:
> >
> > 1. make bo non-snooped w/ pin_display=true (wd=0, rd|=gtt)
> > 2. set to CPU read domain (wd=0 rd|=cpu)
> > 3. set to GTT (or GPU) write domain (wd=gtt, rd=gtt) -> CPU cache is stale after this point
> > 4. make bo snooped -> pin_display=true still so we directly set (wd=cpu, rd=cpu)
> > 5. set to CPU domain -> CPU cache is still stale
>
> You will also find the scanout reads stale data as well.

Well, assuming you actually write something to the bo w/ the CPU. If
not, then it keeps scanning out the correct data.

> You've managed
> to shoot yourself in both feet. The kernel can't fix that, so should we
> care about the other foot?

Yeah, I suppose we shouldn't care too much about problems the user
created for himself.
On Tue, Aug 13, 2013 at 03:37:56PM +0300, Ville Syrjälä wrote:
> On Tue, Aug 13, 2013 at 01:20:13PM +0100, Chris Wilson wrote:
> > On Tue, Aug 13, 2013 at 03:12:59PM +0300, Ville Syrjälä wrote:
> > > Thinking about this stuff a bit, I think I actually came up with a
> > > scenario where we would currently fail to invalidate the CPU cache
> > > between non-snooped GPU/GTT access and CPU access:
> > >
> > > 1. make bo non-snooped w/ pin_display=true (wd=0, rd|=gtt)
> > > 2. set to CPU read domain (wd=0 rd|=cpu)
> > > 3. set to GTT (or GPU) write domain (wd=gtt, rd=gtt) -> CPU cache is stale after this point
> > > 4. make bo snooped -> pin_display=true still so we directly set (wd=cpu, rd=cpu)
> > > 5. set to CPU domain -> CPU cache is still stale
> >
> > You will also find the scanout reads stale data as well.
>
> Well, assuming you actually write something to the bo w/ the CPU. If
> not, then it keeps scanning out the correct data.

I think an if (obj->pin_display) return -EBUSY; in the set_caching ioctl
would be good to fix this.
-Daniel
On Wed, Aug 14, 2013 at 10:49:11AM +0200, Daniel Vetter wrote:
> On Tue, Aug 13, 2013 at 03:37:56PM +0300, Ville Syrjälä wrote:
> > On Tue, Aug 13, 2013 at 01:20:13PM +0100, Chris Wilson wrote:
> > > On Tue, Aug 13, 2013 at 03:12:59PM +0300, Ville Syrjälä wrote:
> > > > Thinking about this stuff a bit, I think I actually came up with a
> > > > scenario where we would currently fail to invalidate the CPU cache
> > > > between non-snooped GPU/GTT access and CPU access:
> > > >
> > > > 1. make bo non-snooped w/ pin_display=true (wd=0, rd|=gtt)
> > > > 2. set to CPU read domain (wd=0 rd|=cpu)
> > > > 3. set to GTT (or GPU) write domain (wd=gtt, rd=gtt) -> CPU cache is stale after this point
> > > > 4. make bo snooped -> pin_display=true still so we directly set (wd=cpu, rd=cpu)
> > > > 5. set to CPU domain -> CPU cache is still stale
> > >
> > > You will also find the scanout reads stale data as well.
> >
> > Well, assuming you actually write something to the bo w/ the CPU. If
> > not, then it keeps scanning out the correct data.
>
> I think an if (obj->pin_display) return -EBUSY; in the set_caching ioctl
> would be good to fix this.

And we already do that check (as a result of obj->pin_count).
Sorted.
-Chris
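To make the point about the existing check concrete, here is a minimal standalone sketch. It is not the driver's code: the `toy_` names are hypothetical, and it assumes (as the reply above implies) that pinning an object for display also raises its general pin count, so a guard on the pin count rejects scanout buffers without needing a separate `pin_display` test.

```c
#include <errno.h>
#include <stdbool.h>

struct toy_obj {
    unsigned pin_count;   /* assumed to count every pin, scanout included */
    bool pin_display;
    int cache_level;
};

/* Hypothetical display pin: takes a general pin and marks the object as scanout. */
static void toy_pin_to_display(struct toy_obj *obj)
{
    obj->pin_count++;
    obj->pin_display = true;
}

/* Hypothetical set_caching path: refuse to change caching on any pinned object. */
static int toy_set_caching(struct toy_obj *obj, int level)
{
    if (obj->pin_count)
        return -EBUSY;

    obj->cache_level = level;
    return 0;
}
```

Under that assumption, step 4 of the scenario cannot happen through this path: the request fails with `-EBUSY` and the cache level stays unchanged while the object is being scanned out.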
On Wed, Aug 14, 2013 at 09:54:05AM +0100, Chris Wilson wrote:
> On Wed, Aug 14, 2013 at 10:49:11AM +0200, Daniel Vetter wrote:
> > On Tue, Aug 13, 2013 at 03:37:56PM +0300, Ville Syrjälä wrote:
> > > On Tue, Aug 13, 2013 at 01:20:13PM +0100, Chris Wilson wrote:
> > > > On Tue, Aug 13, 2013 at 03:12:59PM +0300, Ville Syrjälä wrote:
> > > > > Thinking about this stuff a bit, I think I actually came up with a
> > > > > scenario where we would currently fail to invalidate the CPU cache
> > > > > between non-snooped GPU/GTT access and CPU access:
> > > > >
> > > > > 1. make bo non-snooped w/ pin_display=true (wd=0, rd|=gtt)
> > > > > 2. set to CPU read domain (wd=0 rd|=cpu)
> > > > > 3. set to GTT (or GPU) write domain (wd=gtt, rd=gtt) -> CPU cache is stale after this point
> > > > > 4. make bo snooped -> pin_display=true still so we directly set (wd=cpu, rd=cpu)
> > > > > 5. set to CPU domain -> CPU cache is still stale
> > > >
> > > > You will also find the scanout reads stale data as well.
> > >
> > > Well, assuming you actually write something to the bo w/ the CPU. If
> > > not, then it keeps scanning out the correct data.
> >
> > I think an if (obj->pin_display) return -EBUSY; in the set_caching ioctl
> > would be good to fix this.
>
> And we already do that check (as a result of obj->pin_count).
> Sorted.

Indeed. Patch merged to dinq (with a pimped commit message), thanks.
-Daniel
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 925c77d..1d3e57e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3520,7 +3520,6 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
                  * Just set it to the CPU cache for now.
                  */
                 WARN_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
-                WARN_ON(obj->base.read_domains & ~I915_GEM_DOMAIN_CPU);
 
                 old_read_domains = obj->base.read_domains;
                 old_write_domain = obj->base.write_domain;
By our earlier reckoning, a move from a snooped/llc setting to an uncached
setting leaves the CPU cache in a consistent state irrespective of our
domain tracking, so we can forgo the warning about the lack of
invalidation. Similarly, any writes posted to the snooped CPU domain
will be safely clflushed to the uncached PTEs after forcing the
domain change.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 1 -
 1 file changed, 1 deletion(-)
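For readers less used to the domain masks, the effect of the one-line change can be sketched in isolation. This is a toy illustration only: the `toy_` and `TOY_` names and the bit values are hypothetical stand-ins, and the counter merely mirrors the two `WARN_ON()` conditions in the hunk (a mask "warns" when any bit other than the CPU bit is set).

```c
#include <stdbool.h>

/* Toy domain bits; the values are illustrative, not taken from the driver. */
#define TOY_DOMAIN_CPU 0x1u
#define TOY_DOMAIN_GTT 0x2u

/* True when the mask contains any domain other than the CPU domain. */
static bool toy_has_non_cpu_domain(unsigned domains)
{
    return (domains & ~TOY_DOMAIN_CPU) != 0;
}

/* Number of warnings the hunk would raise before and after the patch. */
static int toy_warn_count(unsigned read_domains, unsigned write_domain,
                          bool patched)
{
    int warns = 0;

    if (toy_has_non_cpu_domain(write_domain))
        warns++;        /* check kept by the patch */
    if (!patched && toy_has_non_cpu_domain(read_domains))
        warns++;        /* check removed by the patch */
    return warns;
}
```

An object that is only in the GTT read domain (rd=gtt, wd=0), the stolen-object case discussed in the thread, trips one warning in the unpatched model and none in the patched one; an outstanding GTT write still warns either way.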