Message ID | 20180919195544.1511-12-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/40] drm: Use default dma_fence hooks where possible for null syncobj | expand |
On 19/09/2018 20:55, Chris Wilson wrote: > In the sequence > > <0>[ 531.960431] drv_self-4806 7.... 527402570us : intel_gpu_reset: engine_mask=1, ret=0, retry=0 > <0>[ 531.960431] drv_self-4806 7.... 527402571us : execlists_reset: rcs0 request global=115de, current=71133 How such a unbelievably huge delta between the guilty request and hws? > <0>[ 531.960431] drv_self-4806 7d..1 527402571us : execlists_cancel_port_requests: rcs0:port0 global=71134 (fence 826b:198), (current 71133) > <0>[ 531.960431] drv_self-4806 7d..1 527402572us : execlists_cancel_port_requests: rcs0:port1 global=71135 (fence 826c:53), (current 71133) > <0>[ 531.960431] drv_self-4806 7d..1 527402572us : __i915_request_unsubmit: rcs0 fence 826c:53 <- global=71135, current 71133 > <0>[ 531.960431] drv_self-4806 7d..1 527402579us : __i915_request_unsubmit: rcs0 fence 826b:198 <- global=71134, current 71133 > <0>[ 531.960431] drv_self-4806 7.... 527402613us : intel_engine_cancel_stop_cs: rcs0 > <0>[ 531.960431] drv_self-4806 7.... 527402624us : execlists_reset_finish: rcs0 > > we are missing the execlists_submission_tasklet() invocation before the > execlists_reset_fini() implying that either the queue is empty, or we > failed to schedule and run the tasklet on finish. Add an assert so we > are sure that on unsubmitting the incomplete request after reset, the > queue is indeed populated. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/intel_lrc.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index 3edb417caa7b..e8de250c3413 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -344,6 +344,7 @@ static void __unwind_incomplete_requests(struct intel_engine_cs *engine) > last_prio = rq_prio(rq); > p = lookup_priolist(engine, last_prio); > } > + GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root)); > > GEM_BUG_ON(p->priority != rq_prio(rq)); > list_add(&rq->sched.link, &p->requests); > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Regards, Tvrtko
Quoting Tvrtko Ursulin (2018-09-24 10:07:11) > > On 19/09/2018 20:55, Chris Wilson wrote: > > In the sequence > > > > <0>[ 531.960431] drv_self-4806 7.... 527402570us : intel_gpu_reset: engine_mask=1, ret=0, retry=0 > > <0>[ 531.960431] drv_self-4806 7.... 527402571us : execlists_reset: rcs0 request global=115de, current=71133 > > How such a unbelievably huge delta between the guilty request and hws? 1? ;) -Chris
On 25/09/2018 08:41, Chris Wilson wrote: > Quoting Tvrtko Ursulin (2018-09-24 10:07:11) >> >> On 19/09/2018 20:55, Chris Wilson wrote: >>> In the sequence >>> >>> <0>[ 531.960431] drv_self-4806 7.... 527402570us : intel_gpu_reset: engine_mask=1, ret=0, retry=0 >>> <0>[ 531.960431] drv_self-4806 7.... 527402571us : execlists_reset: rcs0 request global=115de, current=71133 >> >> How such a unbelievably huge delta between the guilty request and hws? > > 1? > > ;) Oh bloody dec vs hex again! :D Regards, Tvrtko
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 3edb417caa7b..e8de250c3413 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -344,6 +344,7 @@ static void __unwind_incomplete_requests(struct intel_engine_cs *engine) last_prio = rq_prio(rq); p = lookup_priolist(engine, last_prio); } + GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root)); GEM_BUG_ON(p->priority != rq_prio(rq)); list_add(&rq->sched.link, &p->requests);
In the sequence <0>[ 531.960431] drv_self-4806 7.... 527402570us : intel_gpu_reset: engine_mask=1, ret=0, retry=0 <0>[ 531.960431] drv_self-4806 7.... 527402571us : execlists_reset: rcs0 request global=115de, current=71133 <0>[ 531.960431] drv_self-4806 7d..1 527402571us : execlists_cancel_port_requests: rcs0:port0 global=71134 (fence 826b:198), (current 71133) <0>[ 531.960431] drv_self-4806 7d..1 527402572us : execlists_cancel_port_requests: rcs0:port1 global=71135 (fence 826c:53), (current 71133) <0>[ 531.960431] drv_self-4806 7d..1 527402572us : __i915_request_unsubmit: rcs0 fence 826c:53 <- global=71135, current 71133 <0>[ 531.960431] drv_self-4806 7d..1 527402579us : __i915_request_unsubmit: rcs0 fence 826b:198 <- global=71134, current 71133 <0>[ 531.960431] drv_self-4806 7.... 527402613us : intel_engine_cancel_stop_cs: rcs0 <0>[ 531.960431] drv_self-4806 7.... 527402624us : execlists_reset_finish: rcs0 we are missing the execlists_submission_tasklet() invocation before the execlists_reset_fini() implying that either the queue is empty, or we failed to schedule and run the tasklet on finish. Add an assert so we are sure that on unsubmitting the incomplete request after reset, the queue is indeed populated. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/intel_lrc.c | 1 + 1 file changed, 1 insertion(+)