[v3] drm/i915: Speed up idle detection by kicking the tasklets

Message ID 20180507093548.30487-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson May 7, 2018, 9:35 a.m. UTC
We rely on ksoftirqd to run in a timely fashion in order to drain the
execlists queue. Quite frequently, it does not. In some cases we may see
latencies of over 200ms triggering our idle timeouts and forcing us to
declare the driver wedged!

Thus we can speed up idle detection by bypassing ksoftirqd in these
cases and flushing our tasklet to confirm whether we are indeed still
waiting for the ELSP to drain.

v2: Put the execlists.first check back; it is required for handling
reset!
v3: Follow Mika's suggestion to try and limit kicking the tasklet to
only when we expect it to make a difference, i.e. in catch up after a CS
interrupt, and not just execute it every time as that is likely just to
cover over our own bugs.

References: https://bugs.freedesktop.org/show_bug.cgi?id=106373
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
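
The behaviour described in the commit message can be modelled in userspace. The sketch below is only an illustration under stated assumptions: `model_engine`, `model_tasklet` and `model_engine_is_idle` are hypothetical names, the counters stand in for the ELSP and the request queue, and none of it is the i915 code. It shows the v3 rule: the idle check runs the deferred handler itself only when an event has been posted, then re-reads the state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an engine; not the i915 structures. */
struct model_engine {
	bool irq_posted; /* an event arrived but has not been processed yet */
	int active;      /* entries still occupying the model submission port */
	int queued;      /* ready requests that have not been submitted */
	int kicks;       /* times the idle check ran the handler itself */
};

/* Deferred handler: in the real system a softirq thread would run this later. */
void model_tasklet(struct model_engine *e)
{
	assert(e->active >= 0);
	if (!e->irq_posted)
		return;
	e->irq_posted = false;
	if (e->active > 0)
		e->active--; /* consume one completion event */
}

/* Idle check that kicks the handler only when an event is pending. */
bool model_engine_is_idle(struct model_engine *e)
{
	if (e->irq_posted) {
		model_tasklet(e); /* do not wait for the deferred run */
		e->kicks++;
	}

	if (e->active)
		return false; /* still waiting for the port to drain */

	if (e->queued)
		return false; /* port empty, but work remains to submit */

	return true;
}
```

With one active entry and a posted event, the first call processes the event and reports idle. Without a posted event the handler is never run from the idle check, so a stuck entry is still reported as busy rather than hidden.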

Comments

kernel test robot May 7, 2018, 2:08 p.m. UTC | #1
Hi Chris,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.17-rc4 next-20180504]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Speed-up-idle-detection-by-kicking-the-tasklets/20180507-184057
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x0-05071422 (attached as .config)
compiler: gcc-5 (Debian 5.5.0-3) 5.4.1 20171010
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/gpu//drm/i915/intel_engine_cs.c: In function 'intel_engine_is_idle':
>> drivers/gpu//drm/i915/intel_engine_cs.c:954:3: error: implicit declaration of function 'execlists_tasklet' [-Werror=implicit-function-declaration]
      execlists_tasklet(&engine->execlists);
      ^
   cc1: some warnings being treated as errors

vim +/execlists_tasklet +954 drivers/gpu//drm/i915/intel_engine_cs.c

   923	
   924	/**
   925	 * intel_engine_is_idle() - Report if the engine has finished process all work
   926	 * @engine: the intel_engine_cs
   927	 *
   928	 * Return true if there are no requests pending, nothing left to be submitted
   929	 * to hardware, and that the engine is idle.
   930	 */
   931	bool intel_engine_is_idle(struct intel_engine_cs *engine)
   932	{
   933		struct drm_i915_private *dev_priv = engine->i915;
   934	
   935		/* More white lies, if wedged, hw state is inconsistent */
   936		if (i915_terminally_wedged(&dev_priv->gpu_error))
   937			return true;
   938	
   939		/* Any inflight/incomplete requests? */
   940		if (!i915_seqno_passed(intel_engine_get_seqno(engine),
   941				       intel_engine_last_submit(engine)))
   942			return false;
   943	
   944		if (I915_SELFTEST_ONLY(engine->breadcrumbs.mock))
   945			return true;
   946	
   947		/*
   948		 * ksoftirqd has notorious latency that may cause us to
   949		 * timeout while waiting for the engine to idle as we wait for
   950		 * ksoftirqd to run the execlists tasklet to drain the ELSP.
   951		 * If we are expecting a context switch from the GPU, check now.
   952		 */
   953		if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
 > 954			execlists_tasklet(&engine->execlists);
   955	
   956		/* Waiting to drain ELSP? */
   957		if (READ_ONCE(engine->execlists.active))
   958			return false;
   959	
   960		/* ELSP is empty, but there are ready requests? E.g. after reset */
   961		if (READ_ONCE(engine->execlists.first))
   962			return false;
   963	
   964		/* Ring stopped? */
   965		if (!ring_is_idle(engine))
   966			return false;
   967	
   968		return true;
   969	}
   970	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

Patch

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 70325e0824e3..27f6b30e032f 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -944,11 +944,20 @@  bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	if (I915_SELFTEST_ONLY(engine->breadcrumbs.mock))
 		return true;
 
+	/*
+	 * ksoftirqd has notorious latency that may cause us to
+	 * timeout while waiting for the engine to idle as we wait for
+	 * ksoftirqd to run the execlists tasklet to drain the ELSP.
+	 * If we are expecting a context switch from the GPU, check now.
+	 */
+	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted))
+		execlists_tasklet(&engine->execlists);
+
 	/* Waiting to drain ELSP? */
 	if (READ_ONCE(engine->execlists.active))
 		return false;
 
-	/* ELSP is empty, but there are ready requests? */
+	/* ELSP is empty, but there are ready requests? E.g. after reset */
 	if (READ_ONCE(engine->execlists.first))
 		return false;