[1/7] drm/i915: Remove tasklet flush before disable

Message ID 20180516064741.30912-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson May 16, 2018, 6:47 a.m. UTC
The idea was to try and let the existing tasklet run to completion
before we began the reset, but it involves a racy check against anything
else that tries to run the tasklet. Rather than acknowledge and ignore
the race, let it be and don't try and be too clever.

The tasklet will resume execution after reset (after spinning a bit
during reset), but before we allow it to resume we will have cleared all
the pending state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 9 ---------
 1 file changed, 9 deletions(-)

Comments

Mika Kuoppala May 16, 2018, 8:39 a.m. UTC | #1
Chris Wilson <chris@chris-wilson.co.uk> writes:

> The idea was to try and let the existing tasklet run to completion
> before we began the reset, but it involves a racy check against anything
> else that tries to run the tasklet. Rather than acknowledge and ignore
> the race, let it be and don't try and be too clever.
>
> The tasklet will resume execution after reset (after spinning a bit
> during reset), but before we allow it to resume we will have cleared all
> the pending state.

The disable works only on all future reschedules and
the dequeue is behind timeline lock. But what guards against
the tasklet currently reading the ports?

-Mika

>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 9 ---------
>  1 file changed, 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 0a2070112b66..0dc369a9ec4d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3035,16 +3035,7 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
>  	 * calling engine->init_hw() and also writing the ELSP.
>  	 * Turning off the execlists->tasklet until the reset is over
>  	 * prevents the race.
> -	 *
> -	 * Note that this needs to be a single atomic operation on the
> -	 * tasklet (flush existing tasks, prevent new tasks) to prevent
> -	 * a race between reset and set-wedged. It is not, so we do the best
> -	 * we can atm and make sure we don't lock the machine up in the more
> -	 * common case of recursively being called from set-wedged from inside
> -	 * i915_reset.
>  	 */
> -	if (!atomic_read(&engine->execlists.tasklet.count))
> -		tasklet_kill(&engine->execlists.tasklet);
>  	tasklet_disable(&engine->execlists.tasklet);
>  
>  	/*
> -- 
> 2.17.0
Chris Wilson May 16, 2018, 8:41 a.m. UTC | #2
Quoting Mika Kuoppala (2018-05-16 09:39:14)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > The idea was to try and let the existing tasklet run to completion
> > before we began the reset, but it involves a racy check against anything
> > else that tries to run the tasklet. Rather than acknowledge and ignore
> > the race, let it be and don't try and be too clever.
> >
> > The tasklet will resume execution after reset (after spinning a bit
> > during reset), but before we allow it to resume we will have cleared all
> > the pending state.
> 
> The disable works only on all future reschedules and
> the dequeue is behind timeline lock. But what guards against
> the tasklet currently reading the ports?

tasklet_disable() itself is synchronous and waits for completion of the
current execution before returning. See tasklet_disable_nosync() for the
complementary function that just prevents future schedules from executing.
-Chris
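The distinction drawn in the reply above can be illustrated with a minimal userspace sketch. It is not the kernel implementation and not part of the patch: every `model_*` name is hypothetical, `count` and `running` only stand in for the tasklet's disable count and its TASKLET_STATE_RUN bit, and C11 atomics replace the kernel's own primitives. Under those assumptions, the synchronous disable is the non-synchronous one followed by a wait for any in-flight execution to finish.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a tasklet; not a kernel type. */
struct model_tasklet {
	atomic_int count;     /* non-zero: disabled, func must not be called */
	atomic_bool running;  /* stands in for the TASKLET_STATE_RUN bit */
	void (*func)(void *);
	void *data;
};

static void model_tasklet_init(struct model_tasklet *t,
			       void (*func)(void *), void *data)
{
	atomic_init(&t->count, 0);
	atomic_init(&t->running, false);
	t->func = func;
	t->data = data;
}

/* Blocks future executions only; an in-flight execution may still be running. */
static void model_tasklet_disable_nosync(struct model_tasklet *t)
{
	atomic_fetch_add(&t->count, 1);
}

/* Blocks future executions, then waits until no execution is in flight. */
static void model_tasklet_disable(struct model_tasklet *t)
{
	model_tasklet_disable_nosync(t);
	while (atomic_load(&t->running))
		; /* busy-wait for the current execution to complete */
}

static void model_tasklet_enable(struct model_tasklet *t)
{
	int old = atomic_fetch_sub(&t->count, 1);

	assert(old > 0); /* enable must pair with an earlier disable */
	(void)old;
}

/* One attempt to execute the tasklet; returns true only if func was called. */
static bool model_tasklet_run_once(struct model_tasklet *t)
{
	bool expected = false;
	bool ran = false;

	if (!atomic_compare_exchange_strong(&t->running, &expected, true))
		return false; /* already executing elsewhere */
	if (atomic_load(&t->count) == 0) {
		t->func(t->data);
		ran = true;
	}
	atomic_store(&t->running, false);
	return ran;
}

/* Example callback: counts how many times the tasklet body ran. */
static void model_count_call(void *data)
{
	++*(int *)data;
}
```

In this model, once `model_tasklet_disable()` returns, no caller can be inside `func`, which is the guarantee the reset path relies on. `model_tasklet_disable_nosync()` gives no such guarantee.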
Mika Kuoppala May 16, 2018, 9:57 a.m. UTC | #3
Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2018-05-16 09:39:14)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > The idea was to try and let the existing tasklet run to completion
>> > before we began the reset, but it involves a racy check against anything
>> > else that tries to run the tasklet. Rather than acknowledge and ignore
>> > the race, let it be and don't try and be too clever.
>> >
>> > The tasklet will resume execution after reset (after spinning a bit
>> > during reset), but before we allow it to resume we will have cleared all
>> > the pending state.
>> 
>> The disable works only on all future reschedules and
>> the dequeue is behind timeline lock. But what guards against
>> the tasklet currently reading the ports?
>
> tasklet_disable() itself is synchronous and waits for completion of the
> current execution before returning. See tasklet_disable_nosync() for the
> complementary function that just prevents future schedules from executing.

Ok, so the spinning happens in the tasklet infra: it keeps spinning on
rescheduling while it is not able to call tasklet->func().

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
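The "spinning on scheduling" behaviour summarised above can be sketched as a single-threaded model. This is an illustration under stated assumptions, not kernel code: the `sketch_*` names are hypothetical, `scheduled` stands in for the tasklet's pending state, and one call to `sketch_softirq_pass()` stands in for one pass of softirq processing. A scheduled but disabled tasklet stays pending across passes without its body being called, and runs once after it is enabled again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model of a tasklet's scheduling state. */
struct sketch_tasklet {
	int disable_count;  /* non-zero: the body must not be called */
	bool scheduled;     /* pending execution */
	int runs;           /* how many times the body has run */
};

static void sketch_schedule(struct sketch_tasklet *t)
{
	t->scheduled = true;
}

static void sketch_disable(struct sketch_tasklet *t)
{
	t->disable_count++;
}

static void sketch_enable(struct sketch_tasklet *t)
{
	assert(t->disable_count > 0); /* must pair with a disable */
	t->disable_count--;
}

/*
 * One pass of softirq processing over this tasklet.
 * Returns true if the tasklet is still pending afterwards, meaning
 * another pass is needed (the "spin").
 */
static bool sketch_softirq_pass(struct sketch_tasklet *t)
{
	if (!t->scheduled)
		return false;      /* nothing to do */
	if (t->disable_count > 0)
		return true;       /* disabled: leave it queued, body not called */
	t->scheduled = false;      /* consume the pending request */
	t->runs++;                 /* stands in for calling the tasklet body */
	return false;
}
```

While the tasklet is disabled, repeated passes keep returning true and `runs` does not change. The first pass after `sketch_enable()` executes the body once and clears the pending state.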

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0a2070112b66..0dc369a9ec4d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3035,16 +3035,7 @@  i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
 	 * calling engine->init_hw() and also writing the ELSP.
 	 * Turning off the execlists->tasklet until the reset is over
 	 * prevents the race.
-	 *
-	 * Note that this needs to be a single atomic operation on the
-	 * tasklet (flush existing tasks, prevent new tasks) to prevent
-	 * a race between reset and set-wedged. It is not, so we do the best
-	 * we can atm and make sure we don't lock the machine up in the more
-	 * common case of recursively being called from set-wedged from inside
-	 * i915_reset.
 	 */
-	if (!atomic_read(&engine->execlists.tasklet.count))
-		tasklet_kill(&engine->execlists.tasklet);
 	tasklet_disable(&engine->execlists.tasklet);
 
 	/*