[v3,6/6] drm/i915: Watchdog timeout: Blindly trust watchdog timeout for reset?

Message ID 20190214025713.34150-7-carlos.santa@intel.com (mailing list archive)
State New, archived
Series GEN8+ GPU Watchdog Reset Support

Commit Message

Santa, Carlos Feb. 14, 2019, 2:57 a.m. UTC
From: Michel Thierry <michel.thierry@intel.com>

XXX: What should we do when the watchdog irq has fired twice but our
hangcheck logic thinks the engine is not hung? For example, what if the
active-head has moved since the irq handler ran?

One option is to just ignore the watchdog: if the engine is really hung,
the driver will detect the hang by itself later on (I'm inclined towards
this).

But the other option is to blindly trust the HW, which is what this patch
does...

v1: Rebase.

Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Carlos Santa <carlos.santa@intel.com>
---
 drivers/gpu/drm/i915/intel_hangcheck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Patch

diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c
index bc10acb24d9a..223b79001854 100644
--- a/drivers/gpu/drm/i915/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/intel_hangcheck.c
@@ -288,7 +288,8 @@  static void i915_hangcheck_elapsed(struct work_struct *work)
 		hangcheck_accumulate_sample(engine, &hc);
 		hangcheck_store_sample(engine, &hc);
 
-		if (hc.stalled) {
+		if (hc.stalled ||
+		    engine->hangcheck.watchdog == intel_engine_get_hangcheck_seqno(engine)) {
 			hung |= engine->mask;
 			if (hc.action != ENGINE_DEAD)
 				stuck |= engine->mask;
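
For reference, the two options weighed in the commit message can be
compared in a small standalone sketch. This is not driver code: struct
engine_sample and both helpers are hypothetical names. It assumes that
engine->hangcheck.watchdog holds the seqno recorded when the watchdog irq
fired (that assignment is not part of this hunk) and that
intel_engine_get_hangcheck_seqno() returns the engine's current seqno.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-engine state the check consults. */
struct engine_sample {
	bool stalled;            /* software hangcheck verdict (hc.stalled) */
	uint32_t watchdog_seqno; /* assumed: seqno saved when the watchdog irq fired */
	uint32_t current_seqno;  /* assumed: value of intel_engine_get_hangcheck_seqno() */
};

/* Option 1: ignore the watchdog and rely on hangcheck alone. */
static bool hung_ignore_watchdog(const struct engine_sample *s)
{
	return s->stalled;
}

/*
 * Option 2 (what this patch does): also treat the engine as hung when the
 * seqno has not advanced since the watchdog fired, even if hangcheck
 * itself does not consider the engine stalled.
 */
static bool hung_trust_watchdog(const struct engine_sample *s)
{
	return s->stalled || s->watchdog_seqno == s->current_seqno;
}
```

The helpers only disagree when hangcheck reports no stall and the seqno
still matches the one saved by the watchdog. Once the seqno has advanced,
both fall back to the plain hc.stalled verdict.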