Message ID | 20211028224224.32693-1-matthew.brost@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915/resets: Don't set / test for per-engine reset bits with GuC submission |
On 10/28/2021 15:42, Matthew Brost wrote:
> Don't set, test for, or clear per-engine reset bits with GuC submission
> as the GuC owns the per engine resets not the i915. Setting, testing
> for, and clearing these bits is causing issues with the hangcheck
> selftest. Rather than change to test to not use these bits, rip the use
> of these bits out from the reset code.
To be clear, there are other tests poking these bits in addition to
hangcheck. Not sure if they would suffer from the same problems but I
don't see why they wouldn't.

Reviewed-by: John Harrison <John.C.Harrison@Intel.com>

>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_reset.c | 27 +++++++++++++++++----------
>  1 file changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index 91200c43951f..51b56b8e5003 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1367,20 +1367,27 @@ void intel_gt_handle_error(struct intel_gt *gt,
>  	/* Make sure i915_reset_trylock() sees the I915_RESET_BACKOFF */
>  	synchronize_rcu_expedited();
>
> -	/* Prevent any other reset-engine attempt. */
> -	for_each_engine(engine, gt, tmp) {
> -		while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
> -					&gt->reset.flags))
> -			wait_on_bit(&gt->reset.flags,
> -				    I915_RESET_ENGINE + engine->id,
> -				    TASK_UNINTERRUPTIBLE);
> +	/*
> +	 * Prevent any other reset-engine attempt. We don't do this for GuC
> +	 * submission the GuC owns the per-engine reset, not the i915.
> +	 */
> +	if (!intel_uc_uses_guc_submission(&gt->uc)) {
> +		for_each_engine(engine, gt, tmp) {
> +			while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
> +						&gt->reset.flags))
> +				wait_on_bit(&gt->reset.flags,
> +					    I915_RESET_ENGINE + engine->id,
> +					    TASK_UNINTERRUPTIBLE);
> +		}
>  	}
>
>  	intel_gt_reset_global(gt, engine_mask, msg);
>
> -	for_each_engine(engine, gt, tmp)
> -		clear_bit_unlock(I915_RESET_ENGINE + engine->id,
> -				 &gt->reset.flags);
> +	if (!intel_uc_uses_guc_submission(&gt->uc)) {
> +		for_each_engine(engine, gt, tmp)
> +			clear_bit_unlock(I915_RESET_ENGINE + engine->id,
> +					 &gt->reset.flags);
> +	}
>  	clear_bit_unlock(I915_RESET_BACKOFF, &gt->reset.flags);
>  	smp_mb__after_atomic();
>  	wake_up_all(&gt->reset.queue);
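For readers unfamiliar with the pattern the patch guards, the following is a minimal userspace sketch of per-engine bit locking, not the i915 code itself. It assumes C11 atomics in place of the kernel's test_and_set_bit() and clear_bit_unlock(), and it busy-waits where the kernel sleeps in wait_on_bit(). The struct, the function names, the bit offsets, and the engine count are all hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag layout; the real i915 values may differ. */
enum { RESET_ENGINE = 2, NUM_ENGINES = 4 };

struct reset_state {
	atomic_ulong flags;
	bool uses_guc_submission; /* stand-in for intel_uc_uses_guc_submission() */
};

/* Atomically set a bit and report whether it was already set. */
bool test_and_set_flag(struct reset_state *rs, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	assert(bit < 8 * sizeof(unsigned long));
	return atomic_fetch_or_explicit(&rs->flags, mask,
					memory_order_acquire) & mask;
}

/* Clear a bit with release ordering, loosely mirroring clear_bit_unlock(). */
void clear_flag_unlock(struct reset_state *rs, unsigned int bit)
{
	atomic_fetch_and_explicit(&rs->flags, ~(1UL << bit),
				  memory_order_release);
}

/* Take every per-engine bit, unless the firmware owns per-engine resets. */
void lock_engine_bits(struct reset_state *rs)
{
	if (rs->uses_guc_submission)
		return;

	for (unsigned int id = 0; id < NUM_ENGINES; id++)
		while (test_and_set_flag(rs, RESET_ENGINE + id))
			; /* the kernel would sleep here; this sketch spins */
}

/* Release every per-engine bit under the same condition. */
void unlock_engine_bits(struct reset_state *rs)
{
	if (rs->uses_guc_submission)
		return;

	for (unsigned int id = 0; id < NUM_ENGINES; id++)
		clear_flag_unlock(rs, RESET_ENGINE + id);
}

/* Return only the per-engine portion of the flags word. */
unsigned long engine_bits(struct reset_state *rs)
{
	unsigned long mask = ((1UL << NUM_ENGINES) - 1) << RESET_ENGINE;

	return atomic_load(&rs->flags) & mask;
}
```

A caller would bracket the global reset with lock_engine_bits() and unlock_engine_bits(). With uses_guc_submission set, both calls leave the flags word untouched, which is the behaviour the patch introduces for GuC submission.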