Message ID | 20180516064741.30912-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Chris Wilson <chris@chris-wilson.co.uk> writes:

> The idea was to try and let the existing tasklet run to completion
> before we began the reset, but it involves a racy check against anything
> else that tries to run the tasklet. Rather than acknowledge and ignore
> the race, let it be and don't try and be too clever.
>
> The tasklet will resume execution after reset (after spinning a bit
> during reset), but before we allow it to resume we will have cleared all
> the pending state.

The disable works only on all future reschedules and
the dequeue is behind timeline lock. But what guards against
the tasklet being currently reading the ports?

-Mika

>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 9 ---------
>  1 file changed, 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0a2070112b66..0dc369a9ec4d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3035,16 +3035,7 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
>  	 * calling engine->init_hw() and also writing the ELSP.
>  	 * Turning off the execlists->tasklet until the reset is over
>  	 * prevents the race.
> -	 *
> -	 * Note that this needs to be a single atomic operation on the
> -	 * tasklet (flush existing tasks, prevent new tasks) to prevent
> -	 * a race between reset and set-wedged. It is not, so we do the best
> -	 * we can atm and make sure we don't lock the machine up in the more
> -	 * common case of recursively being called from set-wedged from inside
> -	 * i915_reset.
>  	 */
> -	if (!atomic_read(&engine->execlists.tasklet.count))
> -		tasklet_kill(&engine->execlists.tasklet);
>  	tasklet_disable(&engine->execlists.tasklet);
>
>  	/*
> --
> 2.17.0
Quoting Mika Kuoppala (2018-05-16 09:39:14)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > The idea was to try and let the existing tasklet run to completion
> > before we began the reset, but it involves a racy check against anything
> > else that tries to run the tasklet. Rather than acknowledge and ignore
> > the race, let it be and don't try and be too clever.
> >
> > The tasklet will resume execution after reset (after spinning a bit
> > during reset), but before we allow it to resume we will have cleared all
> > the pending state.
>
> The disable works only on all future reschedules and
> the dequeue is behind timeline lock. But what guards against
> the tasklet being currently reading the ports?

tasklet_disable() itself is synchronous and waits for completion of the
current execution before returning. See tasklet_disable_nosync() for the
complementary function just to prevent future schedules from executing.
-Chris
Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2018-05-16 09:39:14)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>>
>> > The idea was to try and let the existing tasklet run to completion
>> > before we began the reset, but it involves a racy check against anything
>> > else that tries to run the tasklet. Rather than acknowledge and ignore
>> > the race, let it be and don't try and be too clever.
>> >
>> > The tasklet will resume execution after reset (after spinning a bit
>> > during reset), but before we allow it to resume we will have cleared all
>> > the pending state.
>>
>> The disable works only on all future reschedules and
>> the dequeue is behind timeline lock. But what guards against
>> the tasklet being currently reading the ports?
>
> tasklet_disable() itself is synchronous and waits for completion of the
> current execution before returning. See tasklet_disable_nosync() for the
> complementary function just to prevent future schedules from executing.

Ok, so the spinning is on the tasklet infra when it spins
on scheduling, not being able to call the tasklet->func().

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
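To make the exchange above concrete, here is a minimal single-threaded userspace sketch of the disable count that tasklet_disable() and tasklet_disable_nosync() manipulate. It is not kernel code: every model_* name is hypothetical, the fields only stand in for tasklet_struct.count and the SCHED/RUN state bits, and the wait loop in model_disable() never actually spins because nothing runs concurrently here.

```c
/* Userspace model of the tasklet disable count. NOT kernel code:
 * all model_* names are hypothetical and exist only for illustration. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_tasklet {
	atomic_int count;      /* > 0 means disabled, like tasklet_struct.count */
	atomic_bool running;   /* stands in for TASKLET_STATE_RUN */
	atomic_bool scheduled; /* stands in for TASKLET_STATE_SCHED */
	void (*func)(struct model_tasklet *t);
	int runs;              /* how many times func has been called */
};

static void model_schedule(struct model_tasklet *t)
{
	atomic_store(&t->scheduled, true);
}

/* Like tasklet_disable_nosync(): only prevents future executions. */
static void model_disable_nosync(struct model_tasklet *t)
{
	atomic_fetch_add(&t->count, 1);
}

/* Like tasklet_disable(): also waits for a running callback to finish. */
static void model_disable(struct model_tasklet *t)
{
	model_disable_nosync(t);
	while (atomic_load(&t->running))
		; /* never spins in this single-threaded model */
}

static void model_enable(struct model_tasklet *t)
{
	int old = atomic_fetch_sub(&t->count, 1);

	assert(old > 0); /* enable must pair with an earlier disable */
}

/* One pass of the softirq handler. Returns true if func was called. */
static bool model_softirq_pass(struct model_tasklet *t)
{
	if (!atomic_load(&t->scheduled))
		return false;
	if (atomic_load(&t->count) > 0)
		return false; /* disabled: stays scheduled, retried later */

	atomic_store(&t->scheduled, false);
	atomic_store(&t->running, true);
	t->func(t);
	atomic_store(&t->running, false);
	return true;
}

/* Example callback: just counts invocations. */
static void model_count_run(struct model_tasklet *t)
{
	t->runs++;
}
```

In this model, a tasklet that is scheduled and then disabled stays scheduled without running its callback until model_enable() drops the count back to zero, which matches the "spinning on scheduling" described in the review. It also suggests why the removed code checked the count first: a kill-style wait for the scheduled flag to clear would not finish while the count is non-zero.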
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0a2070112b66..0dc369a9ec4d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3035,16 +3035,7 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 	 * calling engine->init_hw() and also writing the ELSP.
 	 * Turning off the execlists->tasklet until the reset is over
 	 * prevents the race.
-	 *
-	 * Note that this needs to be a single atomic operation on the
-	 * tasklet (flush existing tasks, prevent new tasks) to prevent
-	 * a race between reset and set-wedged. It is not, so we do the best
-	 * we can atm and make sure we don't lock the machine up in the more
-	 * common case of recursively being called from set-wedged from inside
-	 * i915_reset.
 	 */
-	if (!atomic_read(&engine->execlists.tasklet.count))
-		tasklet_kill(&engine->execlists.tasklet);
 	tasklet_disable(&engine->execlists.tasklet);
 
 	/*
The idea was to try and let the existing tasklet run to completion
before we began the reset, but it involves a racy check against anything
else that tries to run the tasklet. Rather than acknowledge and ignore
the race, let it be and don't try and be too clever.

The tasklet will resume execution after reset (after spinning a bit
during reset), but before we allow it to resume we will have cleared all
the pending state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 9 ---------
 1 file changed, 9 deletions(-)