diff mbox series

[12/40] drm/i915/execlists: Assert the queue is non-empty on unsubmitting

Message ID 20180919195544.1511-12-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show
Series [01/40] drm: Use default dma_fence hooks where possible for null syncobj | expand

Commit Message

Chris Wilson Sept. 19, 2018, 7:55 p.m. UTC
In the sequence

<0>[  531.960431] drv_self-4806    7.... 527402570us : intel_gpu_reset: engine_mask=1, ret=0, retry=0
<0>[  531.960431] drv_self-4806    7.... 527402571us : execlists_reset: rcs0 request global=115de, current=71133
<0>[  531.960431] drv_self-4806    7d..1 527402571us : execlists_cancel_port_requests: rcs0:port0 global=71134 (fence 826b:198), (current 71133)
<0>[  531.960431] drv_self-4806    7d..1 527402572us : execlists_cancel_port_requests: rcs0:port1 global=71135 (fence 826c:53), (current 71133)
<0>[  531.960431] drv_self-4806    7d..1 527402572us : __i915_request_unsubmit: rcs0 fence 826c:53 <- global=71135, current 71133
<0>[  531.960431] drv_self-4806    7d..1 527402579us : __i915_request_unsubmit: rcs0 fence 826b:198 <- global=71134, current 71133
<0>[  531.960431] drv_self-4806    7.... 527402613us : intel_engine_cancel_stop_cs: rcs0
<0>[  531.960431] drv_self-4806    7.... 527402624us : execlists_reset_finish: rcs0

we are missing the execlists_submission_tasklet() invocation before the
execlists_reset_fini() implying that either the queue is empty, or we
failed to schedule and run the tasklet on finish. Add an assert so we
are sure that on unsubmitting the incomplete request after reset, the
queue is indeed populated.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_lrc.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Tvrtko Ursulin Sept. 24, 2018, 9:07 a.m. UTC | #1
On 19/09/2018 20:55, Chris Wilson wrote:
> In the sequence
> 
> <0>[  531.960431] drv_self-4806    7.... 527402570us : intel_gpu_reset: engine_mask=1, ret=0, retry=0
> <0>[  531.960431] drv_self-4806    7.... 527402571us : execlists_reset: rcs0 request global=115de, current=71133

How such a unbelievably huge delta between the guilty request and hws?

> <0>[  531.960431] drv_self-4806    7d..1 527402571us : execlists_cancel_port_requests: rcs0:port0 global=71134 (fence 826b:198), (current 71133)
> <0>[  531.960431] drv_self-4806    7d..1 527402572us : execlists_cancel_port_requests: rcs0:port1 global=71135 (fence 826c:53), (current 71133)
> <0>[  531.960431] drv_self-4806    7d..1 527402572us : __i915_request_unsubmit: rcs0 fence 826c:53 <- global=71135, current 71133
> <0>[  531.960431] drv_self-4806    7d..1 527402579us : __i915_request_unsubmit: rcs0 fence 826b:198 <- global=71134, current 71133
> <0>[  531.960431] drv_self-4806    7.... 527402613us : intel_engine_cancel_stop_cs: rcs0
> <0>[  531.960431] drv_self-4806    7.... 527402624us : execlists_reset_finish: rcs0
> 
> we are missing the execlists_submission_tasklet() invocation before the
> execlists_reset_fini() implying that either the queue is empty, or we
> failed to schedule and run the tasklet on finish. Add an assert so we
> are sure that on unsubmitting the incomplete request after reset, the
> queue is indeed populated.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 3edb417caa7b..e8de250c3413 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -344,6 +344,7 @@ static void __unwind_incomplete_requests(struct intel_engine_cs *engine)
>   			last_prio = rq_prio(rq);
>   			p = lookup_priolist(engine, last_prio);
>   		}
> +		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
>   
>   		GEM_BUG_ON(p->priority != rq_prio(rq));
>   		list_add(&rq->sched.link, &p->requests);
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
Chris Wilson Sept. 25, 2018, 7:41 a.m. UTC | #2
Quoting Tvrtko Ursulin (2018-09-24 10:07:11)
> 
> On 19/09/2018 20:55, Chris Wilson wrote:
> > In the sequence
> > 
> > <0>[  531.960431] drv_self-4806    7.... 527402570us : intel_gpu_reset: engine_mask=1, ret=0, retry=0
> > <0>[  531.960431] drv_self-4806    7.... 527402571us : execlists_reset: rcs0 request global=115de, current=71133
> 
> How such a unbelievably huge delta between the guilty request and hws?

1?

;)
-Chris
Tvrtko Ursulin Sept. 25, 2018, 8:51 a.m. UTC | #3
On 25/09/2018 08:41, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-09-24 10:07:11)
>>
>> On 19/09/2018 20:55, Chris Wilson wrote:
>>> In the sequence
>>>
>>> <0>[  531.960431] drv_self-4806    7.... 527402570us : intel_gpu_reset: engine_mask=1, ret=0, retry=0
>>> <0>[  531.960431] drv_self-4806    7.... 527402571us : execlists_reset: rcs0 request global=115de, current=71133
>>
>> How such a unbelievably huge delta between the guilty request and hws?
> 
> 1?
> 
> ;)

Oh bloody dec vs hex again! :D

Regards,

Tvrtko
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3edb417caa7b..e8de250c3413 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -344,6 +344,7 @@  static void __unwind_incomplete_requests(struct intel_engine_cs *engine)
 			last_prio = rq_prio(rq);
 			p = lookup_priolist(engine, last_prio);
 		}
+		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
 
 		GEM_BUG_ON(p->priority != rq_prio(rq));
 		list_add(&rq->sched.link, &p->requests);