Message ID | 1439555937-8016-3-git-send-email-imre.deak@intel.com (mailing list archive) |
---|---|
State | New, archived |
On Fri, Aug 14, 2015 at 03:38:57PM +0300, Imre Deak wrote:
> Due to a coherency issue on BXT A steppings we can't guarantee a
> coherent view of cached GPU mappings, so fall back to uncached mappings.
> Note that this still won't fix cases where userspace expects a coherent
> view without synchronizing (via a set domain call). It still makes sense
> to limit the kernel's notion of the mapping to be uncached, for example
> for relocations to work properly during execbuffer time. Also in case
> user space does synchronize the buffer, this will still guarantee that
> we'll do the proper clflushing for the buffer.
>
> v2:
> - limit the WA to A steppings, on later stepping this HW issue is fixed

This has to report the failure, ENODEV otherwise userspace will be
terribly confused (it will try to CPU coherent access assuming it will
be fast, when it is better to use alternative paths).
-Chris
On Fri, 2015-08-14 at 14:11 +0100, Chris Wilson wrote:
> On Fri, Aug 14, 2015 at 03:38:57PM +0300, Imre Deak wrote:
> > Due to a coherency issue on BXT A steppings we can't guarantee a
> > coherent view of cached GPU mappings, so fall back to uncached mappings.
> > Note that this still won't fix cases where userspace expects a coherent
> > view without synchronizing (via a set domain call). It still makes sense
> > to limit the kernel's notion of the mapping to be uncached, for example
> > for relocations to work properly during execbuffer time. Also in case
> > user space does synchronize the buffer, this will still guarantee that
> > we'll do the proper clflushing for the buffer.
> >
> > v2:
> > - limit the WA to A steppings, on later stepping this HW issue is fixed
>
> This has to report the failure, ENODEV otherwise userspace will be
> terribly confused (it will try to CPU coherent access assuming it will
> be fast, when it is better to use alternative paths).

Ok, I was not sure how existing user space would handle the failure, but
if it has the fall-back logic then ENODEV is the better solution. Will
change this.

> -Chris
>
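The ENODEV variant discussed above could look roughly like the following. This is only a self-contained sketch, not the follow-up patch: `struct device_info`, `set_caching_cached()`, the `CACHE_*` names and the `REVID_B0` value are hypothetical stand-ins for `struct drm_device`, the `I915_CACHING_CACHED` case of the ioctl, `I915_CACHE_*` and `BXT_REVID_B0`.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's cache levels. */
enum cache_level { CACHE_NONE, CACHE_LLC };

/* Assumed revision ID of the first fixed stepping, for illustration only. */
#define REVID_B0 0x3

/* Hypothetical, reduced view of the device: platform and revision only. */
struct device_info {
	bool is_broxton;
	int revid;
};

/*
 * Model of the "cached" request: on affected steppings report -ENODEV
 * instead of silently downgrading the mapping to uncached, so that the
 * caller can pick an alternative path.
 */
static int set_caching_cached(const struct device_info *dev,
			      enum cache_level *level)
{
	if (dev->is_broxton && dev->revid < REVID_B0)
		return -ENODEV;

	*level = CACHE_LLC;
	return 0;
}
```

With this shape the caller sees the failure explicitly and `*level` is left untouched, whereas the v2 patch succeeds while quietly treating the mapping as uncached.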
Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 7201
-------------------------------------Summary-------------------------------------
Platform  Delta  drm-intel-nightly  Series Applied
ILK       -1     302/302            301/302
SNB              315/315            315/315
IVB              336/336            336/336
BYT              283/283            283/283
HSW              378/378            378/378
-------------------------------------Detailed-------------------------------------
Platform  Test                                        drm-intel-nightly  Series Applied
*ILK      igt@kms_flip@flip-vs-dpms-interruptible     PASS(1)            DMESG_WARN(1)
Note: Pay particular attention to lines that start with '*'.
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 407b6b3..987ffa8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3742,7 +3742,16 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 		level = I915_CACHE_NONE;
 		break;
 	case I915_CACHING_CACHED:
-		level = I915_CACHE_LLC;
+		/*
+		 * Due to a HW issue on BXT A stepping, GPU stores via a
+		 * snooped mapping may leave stale data in a corresponding CPU
+		 * cacheline, whereas normally such cachelines would get
+		 * invalidated. As a workaround assume that these stores are
+		 * not coherent, which means we'll flush the CPU cache manually
+		 * whenever doing a CPU/GPU sync operation.
+		 */
+		level = IS_BROXTON(dev) && INTEL_REVID(dev) < BXT_REVID_B0 ?
+			I915_CACHE_NONE : I915_CACHE_LLC;
 		break;
 	case I915_CACHING_DISPLAY:
 		level = HAS_WT(dev) ? I915_CACHE_WT : I915_CACHE_NONE;
Due to a coherency issue on BXT A steppings we can't guarantee a
coherent view of cached GPU mappings, so fall back to uncached mappings.
Note that this still won't fix cases where userspace expects a coherent
view without synchronizing (via a set domain call). It still makes sense
to limit the kernel's notion of the mapping to be uncached, for example
for relocations to work properly during execbuffer time. Also in case
user space does synchronize the buffer, this will still guarantee that
we'll do the proper clflushing for the buffer.

v2:
- limit the WA to A steppings, on later stepping this HW issue is fixed

Testcase: igt/gem_store_dword_batches_loop/cached-mapping
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
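To illustrate why limiting the kernel's notion of the mapping to uncached leads to the proper clflushing, here is a simplified, self-contained model. It is a sketch under assumptions, not driver code: `struct buffer`, `sync_buffer()` and the `clflush_count` counter are hypothetical, and the coherency rule (coherent if the platform has an LLC or the mapping is not uncached) is an assumed simplification of what the driver checks.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's cache levels. */
enum cache_level { CACHE_NONE, CACHE_LLC };

/* Hypothetical buffer object, reduced to what the model needs. */
struct buffer {
	enum cache_level level;	/* the kernel's notion of the mapping */
	int clflush_count;	/* how many manual flushes were issued */
};

/*
 * Assumed rule: the CPU cache is treated as coherent with the GPU if the
 * platform shares an LLC, or if the mapping is anything but uncached.
 */
static bool cpu_cache_is_coherent(bool has_llc, enum cache_level level)
{
	return has_llc || level != CACHE_NONE;
}

/* Model of a CPU/GPU sync point, such as a set domain call. */
static void sync_buffer(struct buffer *buf, bool has_llc)
{
	if (!cpu_cache_is_coherent(has_llc, buf->level))
		buf->clflush_count++;	/* stands in for a real clflush */
}
```

In this model, on a platform without an LLC, a buffer recorded as `CACHE_LLC` is never flushed at a sync point, while the same buffer recorded as `CACHE_NONE` is flushed on every sync. That is the effect the workaround relies on.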