Message ID | 20180709104237.21880-2-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Chris Wilson <chris@chris-wilson.co.uk> writes: > With a broken GPU we expect it to fail during the initial > GPU setup where do a couple of context switches to record the defaults. > This is a task that takes a few milliseconds even on the slowest of > devices, but we may have to wait 60s for hangcheck to give in and > declare the machine inoperable. In this a case where any gpu hang is > unacceptable, both from a timeliness and practical standpoint. > > We can therefore set a timeout on our wait-for-idle that is shorter than > the hangcheck (which may be up to 60s for a declaring a wedged driver) > and so detect the broken GPU much more quickly during driver load (and > so prevent stalling userspace for ages). > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 093d98ed7755..05a9486dd4cf 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -5361,11 +5361,10 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915) if (err) goto err_active; - err = i915_gem_wait_for_idle(i915, - I915_WAIT_LOCKED, - MAX_SCHEDULE_TIMEOUT); - if (err) + if (i915_gem_wait_for_idle(i915, I915_WAIT_LOCKED, HZ / 5)) { + err = -EIO; /* Caller will declare us wedged */ goto err_active; + } assert_kernel_context_is_current(i915);
With a broken GPU we expect it to fail during the initial GPU setup where do a couple of context switches to record the defaults. This is a task that takes a few milliseconds even on the slowest of devices, but we may have to wait 60s for hangcheck to give in and declare the machine inoperable. In this a case where any gpu hang is unacceptable, both from a timeliness and practical standpoint. We can therefore set a timeout on our wait-for-idle that is shorter than the hangcheck (which may be up to 60s for a declaring a wedged driver) and so detect the broken GPU much more quickly during driver load (and so prevent stalling userspace for ages). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> --- drivers/gpu/drm/i915/i915_gem.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)