[v2,2/2] drm/i915/bxt: work around HW coherency issue for cached GEM mappings

Message ID 1439555937-8016-3-git-send-email-imre.deak@intel.com (mailing list archive)
State New, archived

Commit Message

Imre Deak Aug. 14, 2015, 12:38 p.m. UTC
Due to a coherency issue on BXT A steppings we can't guarantee a
coherent view of cached GPU mappings, so fall back to uncached mappings.
Note that this still won't fix cases where userspace expects a coherent
view without synchronizing (via a set domain call). It still makes sense
to limit the kernel's notion of the mapping to be uncached, for example
for relocations to work properly during execbuffer time. Also in case
user space does synchronize the buffer, this will still guarantee that
we'll do the proper clflushing for the buffer.

v2:
- limit the WA to A steppings; on later steppings this HW issue is fixed

Testcase: igt/gem_store_dword_batches_loop/cached-mapping
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
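
The commit message ties the cache level to whether the kernel flushes the CPU cache at sync points. A minimal standalone sketch of that relationship follows. It is a simplified model, not the driver's code: the enum, the function names and the has_llc parameter are hypothetical stand-ins, and it assumes a platform that does not share a last-level cache with the GPU.

```c
#include <stdbool.h>

/* Illustrative stand-in for the driver's cache levels. */
enum cache_level { CACHE_NONE, CACHE_LLC };

/*
 * Simplified model: CPU access is treated as coherent if the platform
 * shares a last-level cache with the GPU, or if the object uses a
 * snooped (cached) mode.
 */
bool cpu_access_is_coherent(bool has_llc, enum cache_level level)
{
	return has_llc || level != CACHE_NONE;
}

/* A CPU/GPU sync point needs a manual flush when access is not coherent. */
bool sync_needs_clflush(bool has_llc, enum cache_level level)
{
	return !cpu_access_is_coherent(has_llc, level);
}
```

Under this model, downgrading a "cached" request to CACHE_NONE on a platform without a shared LLC makes sync_needs_clflush() return true, which is the manual flushing the workaround relies on.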

Comments

Chris Wilson Aug. 14, 2015, 1:11 p.m. UTC | #1
On Fri, Aug 14, 2015 at 03:38:57PM +0300, Imre Deak wrote:
> Due to a coherency issue on BXT A steppings we can't guarantee a
> coherent view of cached GPU mappings, so fall back to uncached mappings.
> Note that this still won't fix cases where userspace expects a coherent
> view without synchronizing (via a set domain call). It still makes sense
> to limit the kernel's notion of the mapping to be uncached, for example
> for relocations to work properly during execbuffer time. Also in case
> user space does synchronize the buffer, this will still guarantee that
> we'll do the proper clflushing for the buffer.
> 
> v2:
> - limit the WA to A steppings, on later stepping this HW issue is fixed

This has to report the failure (ENODEV), otherwise userspace will be
terribly confused (it will try CPU coherent access assuming it will
be fast, when it is better to use alternative paths).
-Chris
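
The fall-back behaviour described in this reply could look roughly like the sketch below on the userspace side. Everything here is hypothetical: set_caching_fn stands in for whatever wrapper a client uses to request cached mode, the access_path enum is invented for illustration, and the two fake backends exist only to exercise the logic.

```c
#include <errno.h>

/* Hypothetical access paths a userspace client might choose between. */
enum access_path {
	PATH_CPU_CACHED,  /* direct CPU access through a cached mapping */
	PATH_ALTERNATIVE, /* any other upload/readback path */
};

/*
 * Hypothetical stand-in for the set-caching request: returns 0 on
 * success or a negative errno value on failure.
 */
typedef int (*set_caching_fn)(unsigned int handle);

/* Pick an access path based on whether cached mode was actually granted. */
enum access_path choose_access_path(set_caching_fn set_cached,
				    unsigned int handle)
{
	if (set_cached(handle) == 0)
		return PATH_CPU_CACHED;

	/* -ENODEV or any other failure: do not assume fast coherent access. */
	return PATH_ALTERNATIVE;
}

/* Fake backends, used only to exercise the logic above. */
int fake_set_cached_ok(unsigned int handle)
{
	(void)handle;
	return 0;
}

int fake_set_cached_enodev(unsigned int handle)
{
	(void)handle;
	return -ENODEV;
}
```

The point of the sketch is that a silent downgrade gives the client no signal to leave PATH_CPU_CACHED, whereas an explicit error does.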
Imre Deak Aug. 14, 2015, 1:29 p.m. UTC | #2
On Fri, 2015-08-14 at 14:11 +0100, Chris Wilson wrote:
> On Fri, Aug 14, 2015 at 03:38:57PM +0300, Imre Deak wrote:
> > Due to a coherency issue on BXT A steppings we can't guarantee a
> > coherent view of cached GPU mappings, so fall back to uncached mappings.
> > Note that this still won't fix cases where userspace expects a coherent
> > view without synchronizing (via a set domain call). It still makes sense
> > to limit the kernel's notion of the mapping to be uncached, for example
> > for relocations to work properly during execbuffer time. Also in case
> > user space does synchronize the buffer, this will still guarantee that
> > we'll do the proper clflushing for the buffer.
> > 
> > v2:
> > - limit the WA to A steppings, on later stepping this HW issue is fixed
> 
> This has to report the failure (ENODEV), otherwise userspace will be
> terribly confused (it will try CPU coherent access assuming it will
> be fast, when it is better to use alternative paths).

Ok, I was not sure how existing user space would handle the failure, but
if it has the fall-back logic then ENODEV is the better solution. Will
change this.

> -Chris
>
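
One way the agreed change could be shaped is sketched below as a standalone model. This is not the follow-up patch: struct fake_device, FAKE_BXT_REVID_B0 and pick_cached_level() are hypothetical stand-ins for the driver's device checks, and the revision value is a placeholder.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-in for the driver's cache levels. */
enum cache_level { CACHE_NONE, CACHE_LLC };

/* Hypothetical device description replacing the driver's platform checks. */
struct fake_device {
	bool is_broxton;
	int revid;
};

#define FAKE_BXT_REVID_B0 3 /* placeholder value, for illustration only */

/*
 * Handle a "cached" request: refuse it with -ENODEV on the affected
 * steppings instead of silently downgrading to an uncached level.
 * On success, store the chosen level in *level and return 0.
 */
int pick_cached_level(const struct fake_device *dev, enum cache_level *level)
{
	if (dev->is_broxton && dev->revid < FAKE_BXT_REVID_B0)
		return -ENODEV;

	*level = CACHE_LLC;
	return 0;
}
```

On failure *level is left untouched, so the caller keeps whatever caching mode the object already had.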
Shuang He Aug. 16, 2015, 11:45 a.m. UTC | #3
Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 7201
-------------------------------------Summary-------------------------------------
Platform  Delta  drm-intel-nightly  Series Applied
ILK       -1     302/302            301/302
SNB              315/315            315/315
IVB              336/336            336/336
BYT              283/283            283/283
HSW              378/378            378/378
-------------------------------------Detailed-------------------------------------
Platform  Test                                     drm-intel-nightly  Series Applied
*ILK      igt@kms_flip@flip-vs-dpms-interruptible  PASS(1)            DMESG_WARN(1)
Note: Pay particular attention to lines starting with '*'.

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 407b6b3..987ffa8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3742,7 +3742,16 @@  int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 		level = I915_CACHE_NONE;
 		break;
 	case I915_CACHING_CACHED:
-		level = I915_CACHE_LLC;
+		/*
+		 * Due to a HW issue on BXT A stepping, GPU stores via a
+		 * snooped mapping may leave stale data in a corresponding CPU
+		 * cacheline, whereas normally such cachelines would get
+		 * invalidated. As a workaround assume that these stores are
+		 * not coherent, which means we'll flush the CPU cache manually
+		 * whenever doing a CPU/GPU sync operation.
+		 */
+		level = IS_BROXTON(dev) && INTEL_REVID(dev) < BXT_REVID_B0 ?
+			I915_CACHE_NONE : I915_CACHE_LLC;
 		break;
 	case I915_CACHING_DISPLAY:
 		level = HAS_WT(dev) ? I915_CACHE_WT : I915_CACHE_NONE;