
drm/i915/execlists: Clear user-active flag on preemption completion

Message ID 20180324125829.27026-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson March 24, 2018, 12:58 p.m. UTC
When cancelling the requests and clearing out the ports following a
successful preemption completion, also clear the active flag. I had
assumed that all preemptions would be followed by an immediate dequeue
(preserving the active user flag), but under rare circumstances we may
be triggering a preemption for the second port only for it to have
completed before the preemption kicks in, leaving execlists->active set
even though the system is now idle.

We can clear the flag inside the common execlists_cancel_port_requests()
as the other users also expect the semantics of active being cleared.

Fixes: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 9 +++++++++
 1 file changed, 9 insertions(+)
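
For context: the patch relies on EXECLISTS_ACTIVE_USER being a plain bit in
execlists->active. Below is a minimal sketch of the two helpers the diff calls,
assuming the bit-operation implementation in intel_ringbuffer.h of this era
(reconstructed for illustration only, not part of the patch):

static inline void
execlists_clear_active(struct intel_engine_execlists *execlists,
		       unsigned int bit)
{
	/* Drop the bit, e.g. EXECLISTS_ACTIVE_USER, from execlists->active */
	__clear_bit(bit, (unsigned long *)&execlists->active);
}

static inline bool
execlists_is_active(const struct intel_engine_execlists *execlists,
		    unsigned int bit)
{
	/* Report whether the given bit is currently set */
	return test_bit(bit, (unsigned long *)&execlists->active);
}

With the flag being just a bit, clearing it in the common
execlists_cancel_port_requests() marks the engine as having no user requests
in the ELSP ports, even when no dequeue immediately follows the preemption.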

Comments

Mika Kuoppala March 27, 2018, 8:18 a.m. UTC | #1
Chris Wilson <chris@chris-wilson.co.uk> writes:

> When cancelling the requests and clearing out the ports following a
> successful preemption completion, also clear the active flag. I had
> assumed that all preemptions would be followed by an immediate dequeue
> (preserving the active user flag), but under rare circumstances we may
> be triggering a preemption for the second port only for it to have
> completed before the preemption kicks in, leaving execlists->active set
> even though the system is now idle.
>
> We can clear the flag inside the common execlists_cancel_port_requests()
> as the other users also expect the semantics of active being cleared.
>
> Fixes: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 08d8ac9d1f8f..f9edfe4540e2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -577,6 +577,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  		 * know the next preemption status we see corresponds
>  		 * to this ELSP update.
>  		 */
> +		GEM_BUG_ON(!execlists_is_active(execlists,
> +						EXECLISTS_ACTIVE_USER));

We have a similar type of check at function exit.
But that would trigger only if we are lite restoring to port[0]. 

So more coverage with this and being explicit...

>  		GEM_BUG_ON(!port_count(&port[0]));
>  		if (port_count(&port[0]) > 1)
>  			goto unlock;
> @@ -738,6 +740,8 @@ execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
>  		memset(port, 0, sizeof(*port));
>  		port++;
>  	}
> +
> +	execlists_clear_active(execlists, EXECLISTS_ACTIVE_USER);
>  }
>  
>  static void clear_gtiir(struct intel_engine_cs *engine)
> @@ -1042,6 +1046,11 @@ static void execlists_submission_tasklet(unsigned long data)
>  
>  	if (fw)
>  		intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
> +
> +	/* If the engine is now idle, so should be the flag; and vice versa. */
> +	GEM_BUG_ON(execlists_is_active(&engine->execlists,
> +				       EXECLISTS_ACTIVE_USER) ==
> +		   !port_isset(engine->execlists.port));

But with this here it looks like we could get rid of the
	GEM_BUG_ON(port_isset(execlists->port) &&
		   !execlists_is_active(execlists, EXECLISTS_ACTIVE_USER));

at the end of dequeue and trust this master check you added here.
-Mika

>  }
>  
>  static void queue_request(struct intel_engine_cs *engine,
> -- 
> 2.16.3
Chris Wilson March 27, 2018, 8:47 a.m. UTC | #2
Quoting Mika Kuoppala (2018-03-27 09:18:55)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > When cancelling the requests and clearing out the ports following a
> > successful preemption completion, also clear the active flag. I had
> > assumed that all preemptions would be followed by an immediate dequeue
> > (preserving the active user flag), but under rare circumstances we may
> > be triggering a preemption for the second port only for it to have
> completed before the preemption kicks in, leaving execlists->active set
> > even though the system is now idle.
> >
> > We can clear the flag inside the common execlists_cancel_port_requests()
> > as the other users also expect the semantics of active being cleared.
> >
> > Fixes: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> > Cc: Michel Thierry <michel.thierry@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_lrc.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index 08d8ac9d1f8f..f9edfe4540e2 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -577,6 +577,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> >                * know the next preemption status we see corresponds
> >                * to this ELSP update.
> >                */
> > +             GEM_BUG_ON(!execlists_is_active(execlists,
> > +                                             EXECLISTS_ACTIVE_USER));
> 
> We have a similar type of check at function exit.

Yes.

> But that would trigger only if we are lite restoring to port[0]. 
> 
> So more coverage with this and being explicit...

And my purpose here was to reinforce the notion that execlists *must* be
active if we still have an active ELSP[0] (same level as asserting
port_count).

> 
> >               GEM_BUG_ON(!port_count(&port[0]));
> >               if (port_count(&port[0]) > 1)
> >                       goto unlock;
> > @@ -738,6 +740,8 @@ execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
> >               memset(port, 0, sizeof(*port));
> >               port++;
> >       }
> > +
> > +     execlists_clear_active(execlists, EXECLISTS_ACTIVE_USER);
> >  }
> >  
> >  static void clear_gtiir(struct intel_engine_cs *engine)
> > @@ -1042,6 +1046,11 @@ static void execlists_submission_tasklet(unsigned long data)
> >  
> >       if (fw)
> >               intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
> > +
> > +     /* If the engine is now idle, so should be the flag; and vice versa. */
> > +     GEM_BUG_ON(execlists_is_active(&engine->execlists,
> > +                                    EXECLISTS_ACTIVE_USER) ==
> > +                !port_isset(engine->execlists.port));
> 
> But with this here it looks like we could get rid of the
>         GEM_BUG_ON(port_isset(execlists->port) &&
>                    !execlists_is_active(execlists, EXECLISTS_ACTIVE_USER));
> 
> at the end of dequeue and trust this master check you added here.

We could, but there's a plan to split this path up a bit, and I want to
move that earlier check around.
-Chris
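
For readers following the assertion discussion above, here are the two checks
side by side, copied from the diff with comments added to spell out which
direction of the invariant each one enforces:

	/* End of execlists_dequeue(): one-way check only, a busy ELSP
	 * port must imply EXECLISTS_ACTIVE_USER. It cannot catch the
	 * flag being left set while the ports are already empty, which
	 * is the case this patch fixes.
	 */
	GEM_BUG_ON(port_isset(execlists->port) &&
		   !execlists_is_active(execlists, EXECLISTS_ACTIVE_USER));

	/* End of execlists_submission_tasklet(): two-way check, the
	 * flag must be set if and only if a port is still occupied, so
	 * it also catches a stale EXECLISTS_ACTIVE_USER on an otherwise
	 * idle engine.
	 */
	GEM_BUG_ON(execlists_is_active(&engine->execlists,
				       EXECLISTS_ACTIVE_USER) ==
		   !port_isset(engine->execlists.port));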
Mika Kuoppala March 27, 2018, 9:22 a.m. UTC | #3
Chris Wilson <chris@chris-wilson.co.uk> writes:

> When cancelling the requests and clearing out the ports following a
> successful preemption completion, also clear the active flag. I had
> assumed that all preemptions would be followed by an immediate dequeue
> (preserving the active user flag), but under rare circumstances we may
> be triggering a preemption for the second port only for it to have
> completed before the preemption kicks in, leaving execlists->active set
> even though the system is now idle.
>
> We can clear the flag inside the common execlists_cancel_port_requests()
> as the other users also expect the semantics of active being cleared.
>
> Fixes: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 08d8ac9d1f8f..f9edfe4540e2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -577,6 +577,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  		 * know the next preemption status we see corresponds
>  		 * to this ELSP update.
>  		 */
> +		GEM_BUG_ON(!execlists_is_active(execlists,
> +						EXECLISTS_ACTIVE_USER));
>  		GEM_BUG_ON(!port_count(&port[0]));
>  		if (port_count(&port[0]) > 1)
>  			goto unlock;
> @@ -738,6 +740,8 @@ execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
>  		memset(port, 0, sizeof(*port));
>  		port++;
>  	}
> +
> +	execlists_clear_active(execlists, EXECLISTS_ACTIVE_USER);
>  }
>  
>  static void clear_gtiir(struct intel_engine_cs *engine)
> @@ -1042,6 +1046,11 @@ static void execlists_submission_tasklet(unsigned long data)
>  
>  	if (fw)
>  		intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
> +
> +	/* If the engine is now idle, so should be the flag; and vice versa. */
> +	GEM_BUG_ON(execlists_is_active(&engine->execlists,
> +				       EXECLISTS_ACTIVE_USER) ==
> +		   !port_isset(engine->execlists.port));
>  }
>  
>  static void queue_request(struct intel_engine_cs *engine,
> -- 
> 2.16.3
Chris Wilson March 27, 2018, 11:16 a.m. UTC | #4
Quoting Mika Kuoppala (2018-03-27 10:22:11)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > When cancelling the requests and clearing out the ports following a
> > successful preemption completion, also clear the active flag. I had
> > assumed that all preemptions would be followed by an immediate dequeue
> > (preserving the active user flag), but under rare circumstances we may
> > be triggering a preemption for the second port only for it to have
> > completed before the preemption kicks in, leaving execlists->active set
> > even though the system is now idle.
> >
> > We can clear the flag inside the common execlists_cancel_port_requests()
> > as the other users also expect the semantics of active being cleared.
> >
> > Fixes: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> > Cc: Michel Thierry <michel.thierry@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

Applied, to have one less random failure around preemption. Unlikely that CI
will hit it, as we simply don't apply enough stress.
-Chris

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 08d8ac9d1f8f..f9edfe4540e2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -577,6 +577,8 @@  static void execlists_dequeue(struct intel_engine_cs *engine)
 		 * know the next preemption status we see corresponds
 		 * to this ELSP update.
 		 */
+		GEM_BUG_ON(!execlists_is_active(execlists,
+						EXECLISTS_ACTIVE_USER));
 		GEM_BUG_ON(!port_count(&port[0]));
 		if (port_count(&port[0]) > 1)
 			goto unlock;
@@ -738,6 +740,8 @@  execlists_cancel_port_requests(struct intel_engine_execlists * const execlists)
 		memset(port, 0, sizeof(*port));
 		port++;
 	}
+
+	execlists_clear_active(execlists, EXECLISTS_ACTIVE_USER);
 }
 
 static void clear_gtiir(struct intel_engine_cs *engine)
@@ -1042,6 +1046,11 @@  static void execlists_submission_tasklet(unsigned long data)
 
 	if (fw)
 		intel_uncore_forcewake_put(dev_priv, execlists->fw_domains);
+
+	/* If the engine is now idle, so should be the flag; and vice versa. */
+	GEM_BUG_ON(execlists_is_active(&engine->execlists,
+				       EXECLISTS_ACTIVE_USER) ==
+		   !port_isset(engine->execlists.port));
 }
 
 static void queue_request(struct intel_engine_cs *engine,