Message ID | 20180717084121.28185-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 17/07/2018 09:41, Chris Wilson wrote: > If the driver is wedged, we skip idling the GPU. However, we may still > have a few requests still not retired following the wedging (since they > will be waiting for a background worker trying to acquire struct_mutex). > As we hold the struct_mutex, always do a quick request retirement in > order to flush the wedged path. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107257 > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/i915_gem.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 42d24410a98c..cc875e1dc7f6 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -5074,6 +5074,8 @@ int i915_gem_suspend(struct drm_i915_private *i915) > > assert_kernel_context_is_current(i915); > } > + i915_retire_requests(i915); /* ensure we flush after wedging */ > + We cannot do this in i915_gem_set_wedged due not having the mutex? I think it should go in an else block of the !terminally_wedged block to signify the alternative idling method for that case. And also to make sure the !terminally_wedged case does not start relying on this extra retire pass. Or alternative teach i915_gem_wait_for_idle how to handle the wedged case and only make switching to kernel context dependant on terminally_wedged status in i915_gem_suspend? Simple else block sounds good enough to me. For that option: Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Regards, Tvrtko > mutex_unlock(&i915->drm.struct_mutex); > > intel_uc_suspend(i915); >
Quoting Tvrtko Ursulin (2018-07-18 13:53:16) > > On 17/07/2018 09:41, Chris Wilson wrote: > > If the driver is wedged, we skip idling the GPU. However, we may still > > have a few requests still not retired following the wedging (since they > > will be waiting for a background worker trying to acquire struct_mutex). > > As we hold the struct_mutex, always do a quick request retirement in > > order to flush the wedged path. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107257 > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > --- > > drivers/gpu/drm/i915/i915_gem.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > > index 42d24410a98c..cc875e1dc7f6 100644 > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -5074,6 +5074,8 @@ int i915_gem_suspend(struct drm_i915_private *i915) > > > > assert_kernel_context_is_current(i915); > > } > > + i915_retire_requests(i915); /* ensure we flush after wedging */ > > + > > We cannot do this in i915_gem_set_wedged due not having the mutex? Correct. > I think it should go in an else block of the !terminally_wedged block to > signify the alternative idling method for that case. And also to make > sure the !terminally_wedged case does not start relying on this extra > retire pass. I liked the safety net and clarity of not making it conditional. > Or alternative teach i915_gem_wait_for_idle how to handle the wedged > case and only make switching to kernel context dependant on > terminally_wedged status in i915_gem_suspend? Right, it's the checks that actually worry here. Those depend on the gpu being in a fairly sane state... -Chris
On 18/07/2018 14:25, Chris Wilson wrote: > Quoting Tvrtko Ursulin (2018-07-18 13:53:16) >> >> On 17/07/2018 09:41, Chris Wilson wrote: >>> If the driver is wedged, we skip idling the GPU. However, we may still >>> have a few requests still not retired following the wedging (since they >>> will be waiting for a background worker trying to acquire struct_mutex). >>> As we hold the struct_mutex, always do a quick request retirement in >>> order to flush the wedged path. >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107257 >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >>> --- >>> drivers/gpu/drm/i915/i915_gem.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c >>> index 42d24410a98c..cc875e1dc7f6 100644 >>> --- a/drivers/gpu/drm/i915/i915_gem.c >>> +++ b/drivers/gpu/drm/i915/i915_gem.c >>> @@ -5074,6 +5074,8 @@ int i915_gem_suspend(struct drm_i915_private *i915) >>> >>> assert_kernel_context_is_current(i915); >>> } >>> + i915_retire_requests(i915); /* ensure we flush after wedging */ >>> + >> >> We cannot do this in i915_gem_set_wedged due not having the mutex? > > Correct. > >> I think it should go in an else block of the !terminally_wedged block to >> signify the alternative idling method for that case. And also to make >> sure the !terminally_wedged case does not start relying on this extra >> retire pass. > > I liked the safety net and clarity of not making it conditional. And I disliked the sweeping under the carpet potential of it. :) >> Or alternative teach i915_gem_wait_for_idle how to handle the wedged >> case and only make switching to kernel context dependant on >> terminally_wedged status in i915_gem_suspend? > > Right, it's the checks that actually worry here. Those depend on the gpu > being in a fairly sane state... What checks? i915_terminally_wedged cannot be. Request retirement itself? Regards, Tvrtko
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 42d24410a98c..cc875e1dc7f6 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -5074,6 +5074,8 @@ int i915_gem_suspend(struct drm_i915_private *i915) assert_kernel_context_is_current(i915); } + i915_retire_requests(i915); /* ensure we flush after wedging */ + mutex_unlock(&i915->drm.struct_mutex); intel_uc_suspend(i915);
If the driver is wedged, we skip idling the GPU. However, we may still have a few requests still not retired following the wedging (since they will be waiting for a background worker trying to acquire struct_mutex). As we hold the struct_mutex, always do a quick request retirement in order to flush the wedged path. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107257 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_gem.c | 2 ++ 1 file changed, 2 insertions(+)