[v2,3/5] drm/i915: Allow error capture of a pending request

Message ID 20230117213630.2897570-4-John.C.Harrison@Intel.com (mailing list archive)
State New, archived
Series Allow error capture without a request / on reset failure

Commit Message

John Harrison Jan. 17, 2023, 9:36 p.m. UTC
From: John Harrison <John.C.Harrison@Intel.com>

A hang situation has been observed where the only requests on the
context were either completed or not yet started according to the
breadcrumbs. However, the register state claimed a batch was (maybe)
in progress. So, allow capture of the pending request on the grounds
that this might be better than nothing.

v2: Reword 'not started' warning message (Tvrtko)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)
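
For illustration only, the behavioural change described above can be modelled outside the driver. The sketch below is a minimal standalone C model, not the i915 code: `struct mock_request`, `mock_request_put()`, `pick_request_old()` and `pick_request_new()` are hypothetical names, and the `started` flag is an assumed stand-in for whatever i915_request_started() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a request; not the real struct i915_request. */
struct mock_request {
	unsigned long long fence_context;
	unsigned long long fence_seqno;
	bool started;	/* assumed model of the "has the request started" check */
	int refcount;	/* models the reference taken when the request was found */
};

/* Drop one reference on the mock request. */
static void mock_request_put(struct mock_request *rq)
{
	assert(rq->refcount > 0);
	rq->refcount--;
}

/* Old behaviour: a request that has not started is released and not captured. */
static struct mock_request *pick_request_old(struct mock_request *rq,
					     const char *engine_name)
{
	if (rq && !rq->started) {
		printf("Got hung context on %s with no active request!\n",
		       engine_name);
		mock_request_put(rq);
		rq = NULL;
	}
	return rq;
}

/* New behaviour: warn, but keep the pending request (and its reference). */
static struct mock_request *pick_request_new(struct mock_request *rq,
					     const char *engine_name,
					     unsigned int guc_id)
{
	if (rq && !rq->started)
		printf("Got hung context on %s with active request %llu:%llu [0x%04X] not yet started\n",
		       engine_name, rq->fence_context, rq->fence_seqno, guc_id);
	return rq;
}
```

With a request that has not yet started, pick_request_old() returns NULL and drops the reference, whereas pick_request_new() logs the warning and hands the request back for capture with its reference intact.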

Comments

Tvrtko Ursulin Jan. 18, 2023, 4:35 p.m. UTC | #1
On 17/01/2023 21:36, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> A hang situation has been observed where the only requests on the
> context were either completed or not yet started according to the
> breadcrumbs. However, the register state claimed a batch was (maybe)
> in progress. So, allow capture of the pending request on the grounds
> that this might be better than nothing.
> 
> v2: Reword 'not started' warning message (Tvrtko)
> 
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gpu_error.c | 9 +++------
>   1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 461489d599a7e..1d33822a8ca23 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1629,12 +1629,9 @@ capture_engine(struct intel_engine_cs *engine,
>   		intel_engine_clear_hung_context(engine);
>   		/* This will reference count the request (if found) */
>   		rq = intel_context_find_active_request(ce);
> -		if (rq && !i915_request_started(rq)) {
> -			drm_info(&engine->gt->i915->drm, "Got hung context on %s with no active request!\n",
> -				 engine->name);
> -			i915_request_put(rq);
> -			rq = NULL;
> -		}
> +		if (rq && !i915_request_started(rq))
> +			drm_info(&engine->gt->i915->drm, "Got hung context on %s with active request %lld:%lld [0x%04X] not yet started\n",
> +				 engine->name, rq->fence.context, rq->fence.seqno, ce->guc_id.id);
>   	} else {
>   		/*
>   		 * Getting here with GuC enabled means it is a forced error capture

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
Patch

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 461489d599a7e..1d33822a8ca23 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1629,12 +1629,9 @@  capture_engine(struct intel_engine_cs *engine,
 		intel_engine_clear_hung_context(engine);
 		/* This will reference count the request (if found) */
 		rq = intel_context_find_active_request(ce);
-		if (rq && !i915_request_started(rq)) {
-			drm_info(&engine->gt->i915->drm, "Got hung context on %s with no active request!\n",
-				 engine->name);
-			i915_request_put(rq);
-			rq = NULL;
-		}
+		if (rq && !i915_request_started(rq))
+			drm_info(&engine->gt->i915->drm, "Got hung context on %s with active request %lld:%lld [0x%04X] not yet started\n",
+				 engine->name, rq->fence.context, rq->fence.seqno, ce->guc_id.id);
 	} else {
 		/*
 		 * Getting here with GuC enabled means it is a forced error capture