Message ID | 152301754847.28413.2797600546019323260@mail.alporthouse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, 06 Apr 2018 14:25:48 +0200, Chris Wilson <chris@chris-wilson.co.uk> wrote: > Quoting Sagar Arun Kamble (2018-04-05 12:54:38) >> >> >> On 4/5/2018 4:32 PM, Chris Wilson wrote: >> > As different backends may have different park/unpark callbacks, we >> > should only ever switch backends (reset_default_submission on wedge >> > recovery, or on enabling the guc) while parked. >> > >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >> > Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> >> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> >> > --- >> > drivers/gpu/drm/i915/i915_gem.c | 11 +++++++++++ >> > drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++ >> > drivers/gpu/drm/i915/intel_guc_submission.c | 1 + >> > 3 files changed, 15 insertions(+) >> > >> > diff --git a/drivers/gpu/drm/i915/i915_gem.c >> b/drivers/gpu/drm/i915/i915_gem.c >> > index e148db310ea6..e2880de2fc7e 100644 >> > --- a/drivers/gpu/drm/i915/i915_gem.c >> > +++ b/drivers/gpu/drm/i915/i915_gem.c >> > @@ -3380,6 +3380,17 @@ bool i915_gem_unset_wedged(struct >> drm_i915_private *i915) >> > i915_retire_requests(i915); >> > GEM_BUG_ON(i915->gt.active_requests); >> > >> > + /* >> > + * Park before disengaging the old submit mechanism as different >> > + * backends may have different park/unpack callbacks. >> > + * >> > + * We are idle; the idle-worker will be queued, but we need to >> run >> > + * it now. As we already hold the struct mutex, we can get park >> > + * the GPU right away, letting the lazy worker see that we are >> > + * already active again by the time it acquires the mutex. >> > + */ >> > + i915_gem_park(i915); >> I think we should be calling this before gem_unset_wedged in i915_reset. >> With GuC, hitting the GEM_BUG_ON(awake) in guc_submission_enable. > > Right, we really do need to restore guc submission before restarting. So > how can we fit in something like > > diff --git a/drivers/gpu/drm/i915/i915_gem.c > b/drivers/gpu/drm/i915/i915_gem.c > index 9650a7b10c5f..95fa30d9aec6 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -3132,6 +3132,9 @@ void i915_gem_reset(struct drm_i915_private > *dev_priv) > i915_retire_requests(dev_priv); > + if (USES_GUC_SUBMISSION(dev_priv)) > + (void)intel_guc_submission_enable(guc); > + > for_each_engine(engine, dev_priv, id) { > struct i915_gem_context *ctx; > ? In series [1] I was trying to fix symmetry in calls to uc_init_hw/fini_hw where we are enabling/disabling GuC submission. In particular, patch [2] fixes reset path. So maybe we should try to merge* both series ? Michal *) without my patch 6/12 [3] [1] https://patchwork.freedesktop.org/series/41159/ [2] https://patchwork.freedesktop.org/patch/214967/ [3] https://patchwork.freedesktop.org/patch/214976/
Quoting Michal Wajdeczko (2018-04-06 14:39:28) > On Fri, 06 Apr 2018 14:25:48 +0200, Chris Wilson > <chris@chris-wilson.co.uk> wrote: > > > Quoting Sagar Arun Kamble (2018-04-05 12:54:38) > >> > >> > >> On 4/5/2018 4:32 PM, Chris Wilson wrote: > >> > As different backends may have different park/unpark callbacks, we > >> > should only ever switch backends (reset_default_submission on wedge > >> > recovery, or on enabling the guc) while parked. > >> > > >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > >> > Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> > >> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> > >> > --- > >> > drivers/gpu/drm/i915/i915_gem.c | 11 +++++++++++ > >> > drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++ > >> > drivers/gpu/drm/i915/intel_guc_submission.c | 1 + > >> > 3 files changed, 15 insertions(+) > >> > > >> > diff --git a/drivers/gpu/drm/i915/i915_gem.c > >> b/drivers/gpu/drm/i915/i915_gem.c > >> > index e148db310ea6..e2880de2fc7e 100644 > >> > --- a/drivers/gpu/drm/i915/i915_gem.c > >> > +++ b/drivers/gpu/drm/i915/i915_gem.c > >> > @@ -3380,6 +3380,17 @@ bool i915_gem_unset_wedged(struct > >> drm_i915_private *i915) > >> > i915_retire_requests(i915); > >> > GEM_BUG_ON(i915->gt.active_requests); > >> > > >> > + /* > >> > + * Park before disengaging the old submit mechanism as different > >> > + * backends may have different park/unpack callbacks. > >> > + * > >> > + * We are idle; the idle-worker will be queued, but we need to > >> run > >> > + * it now. As we already hold the struct mutex, we can get park > >> > + * the GPU right away, letting the lazy worker see that we are > >> > + * already active again by the time it acquires the mutex. > >> > + */ > >> > + i915_gem_park(i915); > >> I think we should be calling this before gem_unset_wedged in i915_reset. > >> With GuC, hitting the GEM_BUG_ON(awake) in guc_submission_enable. > > > > Right, we really do need to restore guc submission before restarting. So > > how can we fit in something like > > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c > > b/drivers/gpu/drm/i915/i915_gem.c > > index 9650a7b10c5f..95fa30d9aec6 100644 > > --- a/drivers/gpu/drm/i915/i915_gem.c > > +++ b/drivers/gpu/drm/i915/i915_gem.c > > @@ -3132,6 +3132,9 @@ void i915_gem_reset(struct drm_i915_private > > *dev_priv) > > i915_retire_requests(dev_priv); > > + if (USES_GUC_SUBMISSION(dev_priv)) > > + (void)intel_guc_submission_enable(guc); > > + > > for_each_engine(engine, dev_priv, id) { > > struct i915_gem_context *ctx; > > ? > > In series [1] I was trying to fix symmetry in calls to uc_init_hw/fini_hw > where we are enabling/disabling GuC submission. In particular, patch [2] > fixes reset path. Tbh, I first misread that as i915_gem_fini(). Hmm, but we still should be re-enabling guc prior to submitting our first requests here in i915_gem_reset() prior to calling i915_gem_init_hw(). We want to start from the repopulated request lists, so sticking the current init_hw() is overkill. I guess I have this sketch in mind: i915_reset: i915_gem_reset_prepare(); i915_gem_disable_hw(); // placeholder! i915_gem_fini_hw(); // revert to default state intel_gpu_reset(); i915_gem_init_hw(); // just the HW setup i915_gem_reset(); i915_gem_enable_hw(); // tell the HW to run i915_gem_reset_finish(); That i915_gem_reset() sticks out like a sore thumb. I think that's what we need to break up and rearrange. Hmm, there's some of that in the RPS series. Looks like we're too early to try and do this neatly. > So maybe we should try to merge* both series ? Hmm, looking over that series doesn't patch 5 hit the GEM_BUG_ON removed in patch 6? Otherwise, merging the first 5 with Sagar's blessing would be on the cards as they look sane otherwise. Maybe start with the first 4 then? -Chris
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 9650a7b10c5f..95fa30d9aec6 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3132,6 +3132,9 @@ void i915_gem_reset(struct drm_i915_private *dev_priv) i915_retire_requests(dev_priv); + if (USES_GUC_SUBMISSION(dev_priv)) + (void)intel_guc_submission_enable(guc); + for_each_engine(engine, dev_priv, id) { struct i915_gem_context *ctx;