drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.

Message ID 20170108073137.18665-1-currojerez@riseup.net (mailing list archive)
State New, archived

Commit Message

Francisco Jerez Jan. 8, 2017, 7:31 a.m. UTC
The WaDisableLSQCROPERFforOCL workaround has the side effect of
disabling an L3SQ optimization that has huge performance implications,
and it is unlikely to be necessary for the correct functioning of usual
graphics workloads.  Userspace is free to re-enable the workaround on
demand, and is generally in a better position than the DRM to
determine whether the workaround is necessary (e.g. only during the
execution of compute kernels that rely on both L3 fences and HDC R/W
requests).

The same workaround seems to apply to BDW (at least to production
stepping G1) and SKL as well (the internal workaround database claims
that it does for all steppings, while the BSpec workaround table only
mentions pre-production steppings), but the DRM doesn't do anything
beyond whitelisting the L3SQCREG4 register so userspace can enable it
when it sees fit.  Do the same on KBL platforms.
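
As a rough illustration of what "userspace can enable it" might look like,
the sketch below builds a three-dword MI_LOAD_REGISTER_IMM packet that
rewrites L3SQCREG4 with the RO-perf-disable bit set.  Everything named
sketch_* or SKETCH_* is hypothetical.  The register offset, the bit
position and the command encoding are assumptions modelled on i915_reg.h
and should be checked against the kernel tree and the hardware docs; the
0x40400000 base value is the one the WA batch in the diff below uses.  It
is not taken from any real userspace driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings, modelled on i915_reg.h; verify before relying on them. */
#define SKETCH_MI_LOAD_REGISTER_IMM(n) ((0x22u << 23) | (2u * (n) - 1u))
#define SKETCH_L3SQCREG4_OFFSET  0xb118u      /* assumed MMIO offset */
#define SKETCH_LQSC_RO_PERF_DIS  (1u << 27)   /* assumed bit position */
#define SKETCH_L3SQCREG4_BASE    0x40400000u  /* base value used by the WA batch */

/*
 * Hypothetical helper: append one LRI packet that writes L3SQCREG4.
 * The register is written whole (it is not a masked register in this
 * sketch), so the base value is always included.
 * Returns the index of the next free dword in the batch.
 */
size_t sketch_emit_l3sqcreg4(uint32_t *batch, size_t index, int ro_perf_dis)
{
	uint32_t value = SKETCH_L3SQCREG4_BASE;

	assert(batch != NULL);

	if (ro_perf_dis)
		value |= SKETCH_LQSC_RO_PERF_DIS;

	batch[index++] = SKETCH_MI_LOAD_REGISTER_IMM(1); /* header: one register */
	batch[index++] = SKETCH_L3SQCREG4_OFFSET;        /* register to write */
	batch[index++] = value;                          /* new register value */

	return index;
}
```

A compute runtime could emit this packet with ro_perf_dis set before
kernels that rely on both L3 fences and HDC R/W requests, and emit it
again with ro_perf_dis cleared afterwards.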

Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
and gl_4 (AKA car chase) by 14%, on a KBL GT2 running Mesa master.
This follows a regression of 35% and 10% respectively for the same
benchmarks and platform, caused by my recent patch series switching
userspace to use the dataport constant cache instead of the sampler to
implement uniform pull constant loads.  That series made us hit the L3
cache more heavily (and on platforms other than KBL had the opposite
effect of improving performance of the same two benchmarks).  The
overall effect on KBL of this change combined with the recent
userspace change is a net improvement of 4.6% and 2.6% respectively.
SynMark2 OglShMapPcf was affected by the constant cache changes (it
improved, as it did on other platforms, rather than regressing), but
is not significantly affected by this patch (at a 5% significance
level with a sample size of 20).
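
The combined figures come from composing the two relative changes
multiplicatively rather than adding the percentages.  A minimal sketch
of that arithmetic, using a hypothetical helper name:

```c
#include <assert.h>

/*
 * Hypothetical helper: compose two successive relative changes, each
 * given in percent (negative for a regression), into one net change
 * in percent.
 */
double sketch_combined_change_pct(double first_pct, double second_pct)
{
	/* A change of -100% or worse has no meaningful ratio. */
	assert(first_pct > -100.0 && second_pct > -100.0);

	return ((1.0 + first_pct / 100.0) *
		(1.0 + second_pct / 100.0) - 1.0) * 100.0;
}
```

With the rounded numbers quoted above, a 35% regression followed by a
60% improvement gives about +4.0%, and a 10% regression followed by a
14% improvement gives about +2.6%.  The 4.6% quoted for gl_manhattan31
presumably reflects the unrounded measurements; that is an assumption,
since the raw data is not part of this message.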

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: beignet@lists.freedesktop.org
---
 drivers/gpu/drm/i915/intel_lrc.c        | 9 ---------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 8 --------
 2 files changed, 17 deletions(-)

Comments

kernel test robot Jan. 8, 2017, 11:57 a.m. UTC | #1
Hi Francisco,

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.10-rc2 next-20170106]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Francisco-Jerez/drm-i915-Remove-WaDisableLSQCROPERFforOCL-KBL-workaround/20170108-193533
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x012-201702 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/intel_lrc.c: In function 'gen8_emit_flush_coherentl3_wa':
>> drivers/gpu/drm/i915/intel_lrc.c:972:27: error: unused variable 'dev_priv' [-Werror=unused-variable]
     struct drm_i915_private *dev_priv = engine->i915;
                              ^~~~~~~~
   cc1: all warnings being treated as errors

vim +/dev_priv +972 drivers/gpu/drm/i915/intel_lrc.c

9e000847 Arun Siluvery  2015-07-03  966   * code duplication.
9e000847 Arun Siluvery  2015-07-03  967   */
0bc40be8 Tvrtko Ursulin 2016-03-16  968  static inline int gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine,
6e5248b5 Daniel Vetter  2016-07-15  969  						uint32_t *batch,
9e000847 Arun Siluvery  2015-07-03  970  						uint32_t index)
9e000847 Arun Siluvery  2015-07-03  971  {
5e580523 Dave Airlie    2016-07-26 @972  	struct drm_i915_private *dev_priv = engine->i915;
9e000847 Arun Siluvery  2015-07-03  973  	uint32_t l3sqc4_flush = (0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES);
9e000847 Arun Siluvery  2015-07-03  974  
f1afe24f Arun Siluvery  2015-08-04  975  	wa_ctx_emit(batch, index, (MI_STORE_REGISTER_MEM_GEN8 |

:::::: The code at line 972 was first introduced by commit
:::::: 5e580523d9128a4d8364fe89d36c38fc7819c8dd Backmerge tag 'v4.7' into drm-next

:::::: TO: Dave Airlie <airlied@redhat.com>
:::::: CC: Dave Airlie <airlied@redhat.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6db246a..b1ac74e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -973,15 +973,6 @@  static inline int gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine,
 	struct drm_i915_private *dev_priv = engine->i915;
 	uint32_t l3sqc4_flush = (0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES);
 
-	/*
-	 * WaDisableLSQCROPERFforOCL:kbl
-	 * This WA is implemented in skl_init_clock_gating() but since
-	 * this batch updates GEN8_L3SQCREG4 with default value we need to
-	 * set this bit here to retain the WA during flush.
-	 */
-	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_E0))
-		l3sqc4_flush |= GEN8_LQSC_RO_PERF_DIS;
-
 	wa_ctx_emit(batch, index, (MI_STORE_REGISTER_MEM_GEN8 |
 				   MI_SRM_LRM_GLOBAL_GTT));
 	wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 0971ac3..7cb2ab4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1095,14 +1095,6 @@  static int kbl_init_workarounds(struct intel_engine_cs *engine)
 		WA_SET_BIT_MASKED(HDC_CHICKEN0,
 				  HDC_FENCE_DEST_SLM_DISABLE);
 
-	/* GEN8_L3SQCREG4 has a dependency with WA batch so any new changes
-	 * involving this register should also be added to WA batch as required.
-	 */
-	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_E0))
-		/* WaDisableLSQCROPERFforOCL:kbl */
-		I915_WRITE(GEN8_L3SQCREG4, I915_READ(GEN8_L3SQCREG4) |
-			   GEN8_LQSC_RO_PERF_DIS);
-
 	/* WaToEnableHwFixForPushConstHWBug:kbl */
 	if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
 		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,