diff mbox series

drm/i915/resets: Don't set / test for per-engine reset bits with GuC submission

Message ID 20211028224224.32693-1-matthew.brost@intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915/resets: Don't set / test for per-engine reset bits with GuC submission | expand

Commit Message

Matthew Brost Oct. 28, 2021, 10:42 p.m. UTC
Don't set, test for, or clear per-engine reset bits with GuC submission
as the GuC owns the per engine resets not the i915. Setting, testing
for, and clearing these bits is causing issues with the hangcheck
selftest. Rather than change to test to not use these bits, rip the use
of these bits out from the reset code.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_reset.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

Comments

John Harrison Oct. 28, 2021, 11:48 p.m. UTC | #1
On 10/28/2021 15:42, Matthew Brost wrote:
> Don't set, test for, or clear per-engine reset bits with GuC submission
> as the GuC owns the per engine resets not the i915. Setting, testing
> for, and clearing these bits is causing issues with the hangcheck
> selftest. Rather than change to test to not use these bits, rip the use
> of these bits out from the reset code.
To be clear, there are other tests poking these bits in addition to 
hangcheck. Not sure if they would suffer from the same problems but I 
don't see why they wouldn't.

Reviewed-by: John Harrison <John.C.Harrison@Intel.com>


>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_reset.c | 27 +++++++++++++++++----------
>   1 file changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index 91200c43951f..51b56b8e5003 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1367,20 +1367,27 @@ void intel_gt_handle_error(struct intel_gt *gt,
>   	/* Make sure i915_reset_trylock() sees the I915_RESET_BACKOFF */
>   	synchronize_rcu_expedited();
>   
> -	/* Prevent any other reset-engine attempt. */
> -	for_each_engine(engine, gt, tmp) {
> -		while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
> -					&gt->reset.flags))
> -			wait_on_bit(&gt->reset.flags,
> -				    I915_RESET_ENGINE + engine->id,
> -				    TASK_UNINTERRUPTIBLE);
> +	/*
> +	 * Prevent any other reset-engine attempt. We don't do this for GuC
> +	 * submission the GuC owns the per-engine reset, not the i915.
> +	 */
> +	if (!intel_uc_uses_guc_submission(&gt->uc)) {
> +		for_each_engine(engine, gt, tmp) {
> +			while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
> +						&gt->reset.flags))
> +				wait_on_bit(&gt->reset.flags,
> +					    I915_RESET_ENGINE + engine->id,
> +					    TASK_UNINTERRUPTIBLE);
> +		}
>   	}
>   
>   	intel_gt_reset_global(gt, engine_mask, msg);
>   
> -	for_each_engine(engine, gt, tmp)
> -		clear_bit_unlock(I915_RESET_ENGINE + engine->id,
> -				 &gt->reset.flags);
> +	if (!intel_uc_uses_guc_submission(&gt->uc)) {
> +		for_each_engine(engine, gt, tmp)
> +			clear_bit_unlock(I915_RESET_ENGINE + engine->id,
> +					 &gt->reset.flags);
> +	}
>   	clear_bit_unlock(I915_RESET_BACKOFF, &gt->reset.flags);
>   	smp_mb__after_atomic();
>   	wake_up_all(&gt->reset.queue);
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 91200c43951f..51b56b8e5003 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1367,20 +1367,27 @@  void intel_gt_handle_error(struct intel_gt *gt,
 	/* Make sure i915_reset_trylock() sees the I915_RESET_BACKOFF */
 	synchronize_rcu_expedited();
 
-	/* Prevent any other reset-engine attempt. */
-	for_each_engine(engine, gt, tmp) {
-		while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
-					&gt->reset.flags))
-			wait_on_bit(&gt->reset.flags,
-				    I915_RESET_ENGINE + engine->id,
-				    TASK_UNINTERRUPTIBLE);
+	/*
+	 * Prevent any other reset-engine attempt. We don't do this for GuC
+	 * submission the GuC owns the per-engine reset, not the i915.
+	 */
+	if (!intel_uc_uses_guc_submission(&gt->uc)) {
+		for_each_engine(engine, gt, tmp) {
+			while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
+						&gt->reset.flags))
+				wait_on_bit(&gt->reset.flags,
+					    I915_RESET_ENGINE + engine->id,
+					    TASK_UNINTERRUPTIBLE);
+		}
 	}
 
 	intel_gt_reset_global(gt, engine_mask, msg);
 
-	for_each_engine(engine, gt, tmp)
-		clear_bit_unlock(I915_RESET_ENGINE + engine->id,
-				 &gt->reset.flags);
+	if (!intel_uc_uses_guc_submission(&gt->uc)) {
+		for_each_engine(engine, gt, tmp)
+			clear_bit_unlock(I915_RESET_ENGINE + engine->id,
+					 &gt->reset.flags);
+	}
 	clear_bit_unlock(I915_RESET_BACKOFF, &gt->reset.flags);
 	smp_mb__after_atomic();
 	wake_up_all(&gt->reset.queue);