Message ID | 20190213224805.32021-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915/selftests: Always use an active engine while resetting | expand |
Chris Wilson <chris@chris-wilson.co.uk> writes: > Currently, we only try to reset a live engine for checking the whitelist > retention across a per-engine reset. For safety, it appears we need to > prime the system with a hanging spinner before performing a full-device > reset. (Figuring out the root cause behind the instability with handling > a reset during a no-op request is a challenge for another test, the > whitelist test has its own purpose.) > Agreed on the sentiment and it does what it says on the tin. Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109626 > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > --- > .../drm/i915/selftests/intel_workarounds.c | 27 ++++++------------- > 1 file changed, 8 insertions(+), 19 deletions(-) > > diff --git a/drivers/gpu/drm/i915/selftests/intel_workarounds.c b/drivers/gpu/drm/i915/selftests/intel_workarounds.c > index b15c4f26c593..d6bb2005024d 100644 > --- a/drivers/gpu/drm/i915/selftests/intel_workarounds.c > +++ b/drivers/gpu/drm/i915/selftests/intel_workarounds.c > @@ -237,14 +237,8 @@ switch_to_scratch_context(struct intel_engine_cs *engine, > return PTR_ERR(ctx); > > rq = ERR_PTR(-ENODEV); > - with_intel_runtime_pm(engine->i915, wakeref) { > - if (spin) > - rq = igt_spinner_create_request(spin, > - ctx, engine, > - MI_NOOP); > - else > - rq = i915_request_alloc(engine, ctx); > - } > + with_intel_runtime_pm(engine->i915, wakeref) > + rq = igt_spinner_create_request(spin, ctx, engine, MI_NOOP); > > kernel_context_close(ctx); > > @@ -273,7 +267,6 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine, > const char *name) > { > struct drm_i915_private *i915 = engine->i915; > - bool want_spin = reset == do_engine_reset; > struct i915_gem_context *ctx; > struct igt_spinner spin; > intel_wakeref_t wakeref; > @@ -282,11 +275,9 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine, > pr_info("Checking %d whitelisted registers (RING_NONPRIV) [%s]\n", > engine->whitelist.count, name); > > - if (want_spin) { > - err = igt_spinner_init(&spin, i915); > - if (err) > - return err; > - } > + err = igt_spinner_init(&spin, i915); > + if (err) > + return err; > > ctx = kernel_context(i915); > if (IS_ERR(ctx)) > @@ -298,17 +289,15 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine, > goto out; > } > > - err = switch_to_scratch_context(engine, want_spin ? &spin : NULL); > + err = switch_to_scratch_context(engine, &spin); > if (err) > goto out; > > with_intel_runtime_pm(i915, wakeref) > err = reset(engine); > > - if (want_spin) { > - igt_spinner_end(&spin); > - igt_spinner_fini(&spin); > - } > + igt_spinner_end(&spin); > + igt_spinner_fini(&spin); > > if (err) { > pr_err("%s reset failed\n", name); > -- > 2.20.1 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Quoting Mika Kuoppala (2019-02-15 09:09:46) > Chris Wilson <chris@chris-wilson.co.uk> writes: > > > Currently, we only try to reset a live engine for checking the whitelist > > retention across a per-engine reset. For safety, it appears we need to > > prime the system with a hanging spinner before performing a full-device > > reset. (Figuring out the root cause behind the instability with handling > > a reset during a no-op request is a challenge for another test, the > > whitelist test has its own purpose.) > > > > Agreed on the sentiment and it does what it says on the tin. > Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Fwiw, I'm working on a live_hangcheck pass to try and hit the same issue (although CI hints that we may have fixed it already with "drm/i915: Only try to park engines after a failed reset") -Chris
diff --git a/drivers/gpu/drm/i915/selftests/intel_workarounds.c b/drivers/gpu/drm/i915/selftests/intel_workarounds.c index b15c4f26c593..d6bb2005024d 100644 --- a/drivers/gpu/drm/i915/selftests/intel_workarounds.c +++ b/drivers/gpu/drm/i915/selftests/intel_workarounds.c @@ -237,14 +237,8 @@ switch_to_scratch_context(struct intel_engine_cs *engine, return PTR_ERR(ctx); rq = ERR_PTR(-ENODEV); - with_intel_runtime_pm(engine->i915, wakeref) { - if (spin) - rq = igt_spinner_create_request(spin, - ctx, engine, - MI_NOOP); - else - rq = i915_request_alloc(engine, ctx); - } + with_intel_runtime_pm(engine->i915, wakeref) + rq = igt_spinner_create_request(spin, ctx, engine, MI_NOOP); kernel_context_close(ctx); @@ -273,7 +267,6 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine, const char *name) { struct drm_i915_private *i915 = engine->i915; - bool want_spin = reset == do_engine_reset; struct i915_gem_context *ctx; struct igt_spinner spin; intel_wakeref_t wakeref; @@ -282,11 +275,9 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine, pr_info("Checking %d whitelisted registers (RING_NONPRIV) [%s]\n", engine->whitelist.count, name); - if (want_spin) { - err = igt_spinner_init(&spin, i915); - if (err) - return err; - } + err = igt_spinner_init(&spin, i915); + if (err) + return err; ctx = kernel_context(i915); if (IS_ERR(ctx)) @@ -298,17 +289,15 @@ static int check_whitelist_across_reset(struct intel_engine_cs *engine, goto out; } - err = switch_to_scratch_context(engine, want_spin ? &spin : NULL); + err = switch_to_scratch_context(engine, &spin); if (err) goto out; with_intel_runtime_pm(i915, wakeref) err = reset(engine); - if (want_spin) { - igt_spinner_end(&spin); - igt_spinner_fini(&spin); - } + igt_spinner_end(&spin); + igt_spinner_fini(&spin); if (err) { pr_err("%s reset failed\n", name);
Currently, we only try to reset a live engine for checking the whitelist retention across a per-engine reset. For safety, it appears we need to prime the system with a hanging spinner before performing a full-device reset. (Figuring out the root cause behind the instability with handling a reset during a no-op request is a challenge for another test, the whitelist test has its own purpose.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109626 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- .../drm/i915/selftests/intel_workarounds.c | 27 ++++++------------- 1 file changed, 8 insertions(+), 19 deletions(-)