[06/71] drm/i915: Detect if we missed kicking the execlists tasklet

Message ID 20180503063757.22238-6-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson May 3, 2018, 6:36 a.m. UTC
If inside hangcheck we see that the engine has paused, but there is an
execlists interrupt still pending, we know that the tasklet did not
fire. Dump the GEM trace along with the current engine state, and kick
the tasklet to recover without having to go through a GPU reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_hangcheck.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
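
To make the control flow described in the commit message easier to follow, here is a minimal userspace sketch of the decision the patch adds. It is not the driver code: the `model_*` types and helpers are hypothetical stand-ins, the handler takes a `void *` rather than the kernel's `unsigned long` data argument, and the checks that precede this point in `engine_stuck()` are left out. Only the three return-value names come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return values named as in the patch; everything else is a model. */
enum hangcheck_action { ENGINE_WAIT, ENGINE_WAIT_KICK, ENGINE_DEAD };

struct model_tasklet {
	bool running;            /* models the tasklet run lock */
	bool scheduled;          /* models a pending tasklet_hi_schedule() */
	void (*func)(void *);    /* handler to invoke */
	void *data;              /* argument passed to the handler */
};

struct model_engine {
	bool irq_posted;         /* models the ENGINE_IRQ_EXECLIST bit */
	unsigned int handled;    /* number of times the handler ran */
	struct model_tasklet tasklet;
};

/* Hypothetical handler: consume the pending interrupt. */
static void model_submission_tasklet(void *data)
{
	struct model_engine *engine = data;

	engine->irq_posted = false;
	engine->handled++;
}

static bool model_tasklet_trylock(struct model_tasklet *t)
{
	if (t->running)
		return false;
	t->running = true;
	return true;
}

static void model_tasklet_unlock(struct model_tasklet *t)
{
	t->running = false;
}

void model_engine_init(struct model_engine *engine, bool irq_posted)
{
	engine->irq_posted = irq_posted;
	engine->handled = 0;
	engine->tasklet.running = false;
	engine->tasklet.scheduled = false;
	engine->tasklet.func = model_submission_tasklet;
	engine->tasklet.data = engine;
}

/* Models only the block added by the patch, not the earlier checks. */
enum hangcheck_action model_engine_stuck(struct model_engine *engine)
{
	assert(engine != NULL);

	if (engine->irq_posted) {
		struct model_tasklet *t = &engine->tasklet;
		enum hangcheck_action ret = ENGINE_WAIT;

		/* Run the handler directly if nobody else is running it. */
		if (model_tasklet_trylock(t)) {
			t->func(t->data);
			model_tasklet_unlock(t);
			ret = ENGINE_WAIT_KICK;
		}

		/* Always request another run, whether or not we kicked. */
		t->scheduled = true;
		return ret;
	}

	return ENGINE_DEAD;
}
```

In this model, an engine with a pending interrupt and an idle tasklet yields `ENGINE_WAIT_KICK`, a pending interrupt with the tasklet already running yields `ENGINE_WAIT`, and no pending interrupt falls through to `ENGINE_DEAD`.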

Comments

Chris Wilson May 3, 2018, 1:08 p.m. UTC | #1
Quoting Chris Wilson (2018-05-03 07:36:52)
> If inside hangcheck we see that the engine has paused, but there is an
> execlists interrupt still pending, we know that the tasklet did not
> fire. Dump the GEM trace along with the current engine state, and kick
> the tasklet to recover without having to go through a GPU reset.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>

Oh I thought I sent this earlier, but I appear not to have.
Please review as it can go in all by itself...

>  drivers/gpu/drm/i915/intel_hangcheck.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c
> index 309e38b00e95..2d7f10492e35 100644
> --- a/drivers/gpu/drm/i915/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/intel_hangcheck.c
> @@ -267,6 +267,29 @@ engine_stuck(struct intel_engine_cs *engine, u64 acthd)
>                 }
>         }
>  
> +       if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
> +               struct intel_engine_execlists *execlists = &engine->execlists;
> +               enum intel_engine_hangcheck_action ret = ENGINE_WAIT;
> +
> +               if (GEM_SHOW_DEBUG()) {
> +                       struct drm_printer p = drm_debug_printer("hangcheck");
> +
> +                       GEM_TRACE_DUMP();
> +                       intel_engine_dump(engine, &p,
> +                                         "%s stuck\n", engine->name);
> +               }
> +
> +               if (tasklet_trylock(&execlists->tasklet)) {
> +                       execlists->tasklet.func(execlists->tasklet.data);
> +                       tasklet_unlock(&execlists->tasklet);
> +
> +                       ret = ENGINE_WAIT_KICK;
> +               }
> +
> +               tasklet_hi_schedule(&execlists->tasklet);
> +               return ret;
> +       }
> +
>         return ENGINE_DEAD;
>  }
>  
> -- 
> 2.17.0
>

Patch

diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c
index 309e38b00e95..2d7f10492e35 100644
--- a/drivers/gpu/drm/i915/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/intel_hangcheck.c
@@ -267,6 +267,29 @@ engine_stuck(struct intel_engine_cs *engine, u64 acthd)
 		}
 	}
 
+	if (test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted)) {
+		struct intel_engine_execlists *execlists = &engine->execlists;
+		enum intel_engine_hangcheck_action ret = ENGINE_WAIT;
+
+		if (GEM_SHOW_DEBUG()) {
+			struct drm_printer p = drm_debug_printer("hangcheck");
+
+			GEM_TRACE_DUMP();
+			intel_engine_dump(engine, &p,
+					  "%s stuck\n", engine->name);
+		}
+
+		if (tasklet_trylock(&execlists->tasklet)) {
+			execlists->tasklet.func(execlists->tasklet.data);
+			tasklet_unlock(&execlists->tasklet);
+
+			ret = ENGINE_WAIT_KICK;
+		}
+
+		tasklet_hi_schedule(&execlists->tasklet);
+		return ret;
+	}
+
 	return ENGINE_DEAD;
 }
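
The patch calls the tasklet function directly only after `tasklet_trylock()` succeeds, so the handler cannot run concurrently with a softirq invocation on another CPU. As a rough illustration of that trylock, run, unlock pattern, the sketch below models the run lock with a C11 `atomic_flag`. The `run_lock`, `run_trylock`, `run_unlock`, `run_exclusive` and `count_call` names are hypothetical and not kernel APIs; the real lock is a bit in the tasklet state word.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the tasklet run lock. */
struct run_lock {
	atomic_flag run;
};

/* Test-and-set returns the previous value: false means we took the lock. */
bool run_trylock(struct run_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->run,
						  memory_order_acquire);
}

void run_unlock(struct run_lock *l)
{
	atomic_flag_clear_explicit(&l->run, memory_order_release);
}

/* Run fn(arg) only if nobody else holds the lock; report whether it ran. */
bool run_exclusive(struct run_lock *l, void (*fn)(void *), void *arg)
{
	assert(fn != NULL);

	if (!run_trylock(l))
		return false;   /* someone else is running the handler */

	fn(arg);
	run_unlock(l);
	return true;
}

/* Example callback: count how many times it was invoked. */
void count_call(void *arg)
{
	int *calls = arg;

	(*calls)++;
}
```

When `run_exclusive()` returns false the caller has not run the handler itself, which corresponds to the patch leaving `ret` at `ENGINE_WAIT` and relying on the unconditional `tasklet_hi_schedule()` that follows.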