Message ID | 20190128010245.20148-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Series | [01/28] drm/i915: Wait for a moment before forcibly resetting the device |
Chris Wilson <chris@chris-wilson.co.uk> writes:

> During igt, we ask to reset the device if any requests are still
> outstanding at the end of a test, as this quickly kills off any
> erroneous hanging request streams that may escape a test. However, since
> it may take the device a few milliseconds to flush itself after the end
> of a normal test, *cough* guc *cough*, we may accidentally tell the
> device to reset itself after it idles. If we wait a moment, our usual
> I915_IDLE_ENGINES_TIMEOUT of 200ms (seems a bit high, but still better
> than umpteen hangchecks!), we can differentiate better between a stuck
> engine and a healthy one, and so avoid prematurely forcing the reset and
> any extra complications that may entail.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 3b995f9fdc06..e46de507fea2 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -4051,7 +4051,8 @@ i915_drop_caches_set(void *data, u64 val)
>  		  val, val & DROP_ALL);
>  	wakeref = intel_runtime_pm_get(i915);
>
> -	if (val & DROP_RESET_ACTIVE && !intel_engines_are_idle(i915))
> +	if (val & DROP_RESET_ACTIVE &&
> +	    wait_for(intel_engines_are_idle(i915), I915_IDLE_ENGINES_TIMEOUT))
>  		i915_gem_set_wedged(i915);

Some of the complications have been welcomed. But it is still
better to try to entail them into tests explicitly rather
than using indirect test harness stress.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

>
>  	/* No need to check and wait for gpu resets, only libdrm auto-restarts
> --
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Quoting Mika Kuoppala (2019-01-28 09:24:12)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > During igt, we ask to reset the device if any requests are still
> > outstanding at the end of a test, as this quickly kills off any
> > erroneous hanging request streams that may escape a test. However, since
> > it may take the device a few milliseconds to flush itself after the end
> > of a normal test, *cough* guc *cough*, we may accidentally tell the
> > device to reset itself after it idles. If we wait a moment, our usual
> > I915_IDLE_ENGINES_TIMEOUT of 200ms (seems a bit high, but still better
> > than umpteen hangchecks!), we can differentiate better between a stuck
> > engine and a healthy one, and so avoid prematurely forcing the reset and
> > any extra complications that may entail.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index 3b995f9fdc06..e46de507fea2 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -4051,7 +4051,8 @@ i915_drop_caches_set(void *data, u64 val)
> >  		  val, val & DROP_ALL);
> >  	wakeref = intel_runtime_pm_get(i915);
> >
> > -	if (val & DROP_RESET_ACTIVE && !intel_engines_are_idle(i915))
> > +	if (val & DROP_RESET_ACTIVE &&
> > +	    wait_for(intel_engines_are_idle(i915), I915_IDLE_ENGINES_TIMEOUT))
> >  		i915_gem_set_wedged(i915);
>
> Some of the complications have been welcomed. But it is still
> better to try to entail them into tests explicitly rather
> than using indirect test harness stress.
>
> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

Pushed to mask potential problems in *-guc BAT.
-Chris
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 3b995f9fdc06..e46de507fea2 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4051,7 +4051,8 @@ i915_drop_caches_set(void *data, u64 val)
 		  val, val & DROP_ALL);
 	wakeref = intel_runtime_pm_get(i915);
 
-	if (val & DROP_RESET_ACTIVE && !intel_engines_are_idle(i915))
+	if (val & DROP_RESET_ACTIVE &&
+	    wait_for(intel_engines_are_idle(i915), I915_IDLE_ENGINES_TIMEOUT))
 		i915_gem_set_wedged(i915);
 
 	/* No need to check and wait for gpu resets, only libdrm auto-restarts
During igt, we ask to reset the device if any requests are still
outstanding at the end of a test, as this quickly kills off any
erroneous hanging request streams that may escape a test. However, since
it may take the device a few milliseconds to flush itself after the end
of a normal test, *cough* guc *cough*, we may accidentally tell the
device to reset itself after it idles. If we wait a moment, our usual
I915_IDLE_ENGINES_TIMEOUT of 200ms (seems a bit high, but still better
than umpteen hangchecks!), we can differentiate better between a stuck
engine and a healthy one, and so avoid prematurely forcing the reset and
any extra complications that may entail.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
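
For readers unfamiliar with the idiom, the patched condition can look inverted at first glance: the device is wedged when wait_for() yields a nonzero result, that is, when the engines did not become idle before the timeout expired. The following is only a userspace sketch of that poll-until-timeout pattern, not the kernel's wait_for() macro. The names wait_for_cond, fake_gpu, fake_engines_are_idle, should_force_reset and IDLE_TIMEOUT_MS are hypothetical and chosen for illustration; the 200ms value is taken from the commit message.

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for I915_IDLE_ENGINES_TIMEOUT (200ms per the commit message). */
#define IDLE_TIMEOUT_MS 200

static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Poll cond(ctx) until it reports true or timeout_ms elapses.
 * Returns 0 if the condition became true, -ETIMEDOUT otherwise.
 */
static int wait_for_cond(bool (*cond)(void *), void *ctx, int timeout_ms)
{
	const struct timespec nap = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1ms */
	long long deadline;

	assert(timeout_ms >= 0);
	deadline = now_ms() + timeout_ms;

	for (;;) {
		/* Sample the clock first so the condition gets one final check after expiry. */
		bool expired = now_ms() >= deadline;

		if (cond(ctx))
			return 0;
		if (expired)
			return -ETIMEDOUT;
		nanosleep(&nap, NULL);
	}
}

/* Hypothetical device model: reports idle only after a number of polls. */
struct fake_gpu {
	int polls_until_idle;
};

static bool fake_engines_are_idle(void *ctx)
{
	struct fake_gpu *gpu = ctx;

	if (gpu->polls_until_idle > 0) {
		gpu->polls_until_idle--;
		return false;
	}
	return true;
}

/* Mirrors the shape of the patched check: force a reset only if the wait timed out. */
static bool should_force_reset(struct fake_gpu *gpu, int timeout_ms)
{
	return wait_for_cond(fake_engines_are_idle, gpu, timeout_ms) != 0;
}
```

Under this model, a device that needs a few extra polls to flush (the "healthy but slow" case from the commit message) is left alone, while one that never idles within the timeout is the only case that triggers the forced reset. The old check corresponds to sampling the condition exactly once with no grace period.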