Message ID | 20180409094905.4516-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Quoting Chris Wilson (2018-04-09 10:49:05) > If we try to suspend a wedged device following a GPU reset failure, we > will also fail to turn off the rc6 powerwells (on vlv), leading to a > *ERROR*. This is quite expected in this case, so the best we can do is > shake our heads and reduce the *ERROR* to a debug so CI stops > complaining. > > Testcase: igt/gem_eio/in-flight-suspend #vlv Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105583 -Chris
Quoting Chris Wilson (2018-04-09 10:54:46) > Quoting Chris Wilson (2018-04-09 10:49:05) > > If we try to suspend a wedged device following a GPU reset failure, we > > will also fail to turn off the rc6 powerwells (on vlv), leading to a > > *ERROR*. This is quite expected in this case, so the best we can do is > > shake our heads and reduce the *ERROR* to a debug so CI stops > > complaining. > > > > Testcase: igt/gem_eio/in-flight-suspend #vlv > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105583 Does anyone want to keep this *ERROR* that we can trigger at will? -Chris
On Wed, 02 May 2018, Chris Wilson <chris@chris-wilson.co.uk> wrote: > Quoting Chris Wilson (2018-04-09 10:54:46) >> Quoting Chris Wilson (2018-04-09 10:49:05) >> > If we try to suspend a wedged device following a GPU reset failure, we >> > will also fail to turn off the rc6 powerwells (on vlv), leading to a >> > *ERROR*. This is quite expected in this case, so the best we can do is >> > shake our heads and reduce the *ERROR* to a debug so CI stops >> > complaining. >> > >> > Testcase: igt/gem_eio/in-flight-suspend #vlv >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105583 > > Does anyone want to keep this *ERROR* that we can trigger at will? No strong feelings. I guess my question is, can this happen when there's no GPU reset failures, and the change would silence that? Anyway, if it makes CI less noisy, Acked-by: Jani Nikula <jani.nikula@intel.com>
Quoting Jani Nikula (2018-05-03 09:58:17) > On Wed, 02 May 2018, Chris Wilson <chris@chris-wilson.co.uk> wrote: > > Quoting Chris Wilson (2018-04-09 10:54:46) > >> Quoting Chris Wilson (2018-04-09 10:49:05) > >> > If we try to suspend a wedged device following a GPU reset failure, we > >> > will also fail to turn off the rc6 powerwells (on vlv), leading to a > >> > *ERROR*. This is quite expected in this case, so the best we can do is > >> > shake our heads and reduce the *ERROR* to a debug so CI stops > >> > complaining. > >> > > >> > Testcase: igt/gem_eio/in-flight-suspend #vlv > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105583 > > > > Does anyone want to keep this *ERROR* that we can trigger at will? > > No strong feelings. I guess my question is, can this happen when there's > no GPU reset failures, and the change would silence that? It hasn't occurred yet, but we see reset failures in the wild (just not on byt afair). From the code pov, it is marked as a debug feature and called before system suspend, so should be harmless (famous last words). > Anyway, if it makes CI less noisy, > > Acked-by: Jani Nikula <jani.nikula@intel.com> Thanks, -Chris
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index f770be18b2d7..db6fc176ec3c 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -2462,10 +2462,13 @@ static void vlv_wait_for_gt_wells(struct drm_i915_private *dev_priv, /* * RC6 transitioning can be delayed up to 2 msec (see * valleyview_enable_rps), use 3 msec for safety. + * + * This can fail to turn off the rc6 if the GPU is stuck after a failed + * reset and we are trying to force the machine to sleep. */ if (vlv_wait_for_pw_status(dev_priv, mask, val)) - DRM_ERROR("timeout waiting for GT wells to go %s\n", - onoff(wait_for_on)); + DRM_DEBUG_DRIVER("timeout waiting for GT wells to go %s\n", + onoff(wait_for_on)); } static void vlv_check_no_gt_access(struct drm_i915_private *dev_priv)
If we try to suspend a wedged device following a GPU reset failure, we will also fail to turn off the rc6 powerwells (on vlv), leading to a *ERROR*. This is quite expected in this case, so the best we can do is shake our heads and reduce the *ERROR* to a debug so CI stops complaining. Testcase: igt/gem_eio/in-flight-suspend #vlv Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> --- drivers/gpu/drm/i915/i915_drv.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)