[03/12] drm/i915/execlists: Suppress redundant preemption

Message ID 20190204084116.3013-3-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Series [01/12] drm/i915: Allow normal clients to always preempt idle priority clients

Commit Message

Chris Wilson Feb. 4, 2019, 8:41 a.m. UTC
On unwinding the active request we give it a small (limited to internal
priority levels) boost to prevent it from being gazumped a second time.
However, this means that it can be promoted to above the request that
triggered the preemption request, causing a preempt-to-idle cycle for no
change. We can avoid this if we take the boost into account when
checking if the preemption request is valid.

v2: After preemption the active request will be after the preemptee if
they end up with equal priority.

v3: Tvrtko pointed out that this, the existing logic, makes
I915_PRIORITY_WAIT non-preemptible. Document this interesting quirk!

v4: Prove Tvrtko was right about WAIT being non-preemptible and test it.
v5: Except not all priorities were made equal, and the WAIT not preempting
is only if we start off as !NEWCLIENT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 38 ++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)
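
For reference, a minimal sketch of where effective_prio() is consumed. This is not part of the patch; the function below is illustrative and only paraphrases the need_preempt() check in intel_lrc.c, so treat its name and the exact comparison as approximations.

/* Illustrative only: approximate shape of the preemption decision. */
static bool would_preempt(const struct intel_engine_cs *engine,
			  const struct i915_request *active)
{
	/*
	 * Compare the best queued priority against the priority the
	 * active request would hold *after* being unwound (i.e. with
	 * the ACTIVE_PRIORITY boost folded in). If the queue cannot
	 * beat that, a preempt-to-idle cycle would only resubmit the
	 * same request, so skip the preemption.
	 */
	return queue_prio(&engine->execlists) > effective_prio(active);
}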

Comments

Tvrtko Ursulin Feb. 4, 2019, 12:05 p.m. UTC | #1
On 04/02/2019 08:41, Chris Wilson wrote:
> On unwinding the active request we give it a small (limited to internal
> priority levels) boost to prevent it from being gazumped a second time.
> However, this means that it can be promoted to above the request that
> triggered the preemption request, causing a preempt-to-idle cycle for no
> change. We can avoid this if we take the boost into account when
> checking if the preemption request is valid.

After the previous patch to not preempt on wait this is not true any 
more right?

> 
> v2: After preemption the active request will be after the preemptee if
> they end up with equal priority.
> 
> v3: Tvrtko pointed out that this, the existing logic, makes
> I915_PRIORITY_WAIT non-preemptible. Document this interesting quirk!
> 
> v4: Prove Tvrtko was right about WAIT being non-preemptible and test it.
> v5: Except not all priorities were made equal, and the WAIT not preempting
> is only if we start off as !NEWCLIENT.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c | 38 ++++++++++++++++++++++++++++----
>   1 file changed, 34 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 773df0bd685b..9b6b3acb9070 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -164,6 +164,8 @@
>   #define WA_TAIL_DWORDS 2
>   #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
>   
> +#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT)
> +
>   static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
>   					    struct intel_engine_cs *engine,
>   					    struct intel_context *ce);
> @@ -190,8 +192,30 @@ static inline int rq_prio(const struct i915_request *rq)
>   
>   static int effective_prio(const struct i915_request *rq)
>   {
> +	int prio = rq_prio(rq);
> +
> +	/*
> +	 * On unwinding the active request, we give it a priority bump
> +	 * equivalent to a freshly submitted request. This protects it from
> +	 * being gazumped again, but it would be preferable if we didn't
> +	 * let it be gazumped in the first place!
> +	 *
> +	 * See __unwind_incomplete_requests()
> +	 */
> +	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(rq)) {

So effectively a started request cannot be preempted by a new client, on 
the same base priority level.

I am still not sure that we should give special meaning to a started 
request. We don't know how long it would execute or anything. It may be 
the only request, or it may be the last in a coalesced context. And in 
both cases it can be either short or long.

> +		/*
> +		 * After preemption, we insert the active request at the
> +		 * end of the new priority level. This means that we will be
> +		 * _lower_ priority than the preemptee all things equal (and
> +		 * so the preemption is valid), so adjust our comparison
> +		 * accordingly.
> +		 */
> +		prio |= ACTIVE_PRIORITY;
> +		prio--;
> +	}
> +
>   	/* Restrict mere WAIT boosts from triggering preemption */
> -	return rq_prio(rq) | __NO_PREEMPTION;
> +	return prio | __NO_PREEMPTION;
>   }
>   
>   static int queue_prio(const struct intel_engine_execlists *execlists)
> @@ -360,7 +384,7 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
>   {
>   	struct i915_request *rq, *rn, *active = NULL;
>   	struct list_head *uninitialized_var(pl);
> -	int prio = I915_PRIORITY_INVALID | I915_PRIORITY_NEWCLIENT;
> +	int prio = I915_PRIORITY_INVALID | ACTIVE_PRIORITY;
>   
>   	lockdep_assert_held(&engine->timeline.lock);
>   
> @@ -391,9 +415,15 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
>   	 * The active request is now effectively the start of a new client
>   	 * stream, so give it the equivalent small priority bump to prevent
>   	 * it being gazumped a second time by another peer.
> +	 *
> +	 * One consequence of this preemption boost is that we may jump
> +	 * over lesser priorities (such as I915_PRIORITY_WAIT), effectively
> +	 * making those priorities non-preemptible. They will be moved forward

Not fully preemptible, just in the same user prio bucket.

> +	 * in the priority queue, but they will not gain immediate access to
> +	 * the GPU.
>   	 */
> -	if (!(prio & I915_PRIORITY_NEWCLIENT)) {
> -		prio |= I915_PRIORITY_NEWCLIENT;
> +	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(active)) {
> +		prio |= ACTIVE_PRIORITY;
>   		active->sched.attr.priority = prio;
>   		list_move_tail(&active->sched.link,
>   			       i915_sched_lookup_priolist(engine, prio));
> 

And here is a big change as well: we would stop giving a boost to 
preempted requests if they haven't started executing yet. And we have no 
accounting of how many times something has been preempted, to maybe keep 
bumping its priority in those cases. That would probably move 
towards a different priority-management scheme, like:

	if (preempted || new_client || wait)
		rq->effective_prio++;

#define USER_PRIO_SHIFT 16 // 65536 prio bumps before overflow?
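
(One possible reading of this suggestion, purely as an illustration; the mask and helper below are hypothetical, not existing i915 code:)

/* Reserve the low bits for accumulated bumps, so each preemption,
 * new-client or wait event nudges a request up within its user-priority
 * bucket without ever crossing into the next bucket.
 */
#define BUMP_MASK ((1 << USER_PRIO_SHIFT) - 1)

static void bump_within_bucket(struct i915_request *rq)
{
	if ((rq->sched.attr.priority & BUMP_MASK) < BUMP_MASK)
		rq->sched.attr.priority++; /* saturate inside the bucket */
}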

I don't know; I find the total logic (all these patches combined) quite 
hard to understand, so I started to wonder whether we could have something 
more generic and simpler.

Regards,

Tvrtko
Chris Wilson Feb. 4, 2019, 12:25 p.m. UTC | #2
Quoting Tvrtko Ursulin (2019-02-04 12:05:34)
> 
> On 04/02/2019 08:41, Chris Wilson wrote:
> > On unwinding the active request we give it a small (limited to internal
> > priority levels) boost to prevent it from being gazumped a second time.
> > However, this means that it can be promoted to above the request that
> > triggered the preemption request, causing a preempt-to-idle cycle for no
> > change. We can avoid this if we take the boost into account when
> > checking if the preemption request is valid.
> 
> After the previous patch to not preempt on wait this is not true any 
> more right?
> 
> > 
> > v2: After preemption the active request will be after the preemptee if
> > they end up with equal priority.
> > 
> > v3: Tvrtko pointed out that this, the existing logic, makes
> > I915_PRIORITY_WAIT non-preemptible. Document this interesting quirk!
> > 
> > v4: Prove Tvrtko was right about WAIT being non-preemptible and test it.
> > v5: Except not all priorities were made equal, and the WAIT not preempting
> > is only if we start off as !NEWCLIENT.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/intel_lrc.c | 38 ++++++++++++++++++++++++++++----
> >   1 file changed, 34 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index 773df0bd685b..9b6b3acb9070 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -164,6 +164,8 @@
> >   #define WA_TAIL_DWORDS 2
> >   #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
> >   
> > +#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT)
> > +
> >   static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
> >                                           struct intel_engine_cs *engine,
> >                                           struct intel_context *ce);
> > @@ -190,8 +192,30 @@ static inline int rq_prio(const struct i915_request *rq)
> >   
> >   static int effective_prio(const struct i915_request *rq)
> >   {
> > +     int prio = rq_prio(rq);
> > +
> > +     /*
> > +      * On unwinding the active request, we give it a priority bump
> > +      * equivalent to a freshly submitted request. This protects it from
> > +      * being gazumped again, but it would be preferable if we didn't
> > +      * let it be gazumped in the first place!
> > +      *
> > +      * See __unwind_incomplete_requests()
> > +      */
> > +     if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(rq)) {
> 
> So effectively a started request cannot be preempted by a new client, on 
> the same base priority level.
> 
> I am still not sure that we should give special meaning to a started 
> request. We don't know how long it would execute or anything. It may be 
> the only request, or it may be the last in a coalesced context. And in 
> both cases it can be either short or long.

It's a mere reflection of unwind_incomplete_requests. We don't do the
preemption in this case, so why even start.
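
(For reference, and purely as an illustration rather than anything in the patch: both sites test the same condition, which could in principle be a single helper:)

static bool missing_active_boost(const struct i915_request *rq, int prio)
{
	/* A started request that has not yet been given the unwind boost. */
	return (~prio & ACTIVE_PRIORITY) && __i915_request_has_started(rq);
}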

> > +             /*
> > +              * After preemption, we insert the active request at the
> > +              * end of the new priority level. This means that we will be
> > +              * _lower_ priority than the preemptee all things equal (and
> > +              * so the preemption is valid), so adjust our comparison
> > +              * accordingly.
> > +              */
> > +             prio |= ACTIVE_PRIORITY;
> > +             prio--;
> > +     }
> > +
> >       /* Restrict mere WAIT boosts from triggering preemption */
> > -     return rq_prio(rq) | __NO_PREEMPTION;
> > +     return prio | __NO_PREEMPTION;
> >   }
> >   
> >   static int queue_prio(const struct intel_engine_execlists *execlists)
> > @@ -360,7 +384,7 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
> >   {
> >       struct i915_request *rq, *rn, *active = NULL;
> >       struct list_head *uninitialized_var(pl);
> > -     int prio = I915_PRIORITY_INVALID | I915_PRIORITY_NEWCLIENT;
> > +     int prio = I915_PRIORITY_INVALID | ACTIVE_PRIORITY;
> >   
> >       lockdep_assert_held(&engine->timeline.lock);
> >   
> > @@ -391,9 +415,15 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
> >        * The active request is now effectively the start of a new client
> >        * stream, so give it the equivalent small priority bump to prevent
> >        * it being gazumped a second time by another peer.
> > +      *
> > +      * One consequence of this preemption boost is that we may jump
> > +      * over lesser priorities (such as I915_PRIORITY_WAIT), effectively
> > +      * making those priorities non-preemptible. They will be moved forward
> 
> Not fully preemptible, just in the same user prio bucket.

It means that the preemption request was ignored; it was unable to
preempt, non-preemptible.
 
> > +      * in the priority queue, but they will not gain immediate access to
> > +      * the GPU.
> >        */
> > -     if (!(prio & I915_PRIORITY_NEWCLIENT)) {
> > -             prio |= I915_PRIORITY_NEWCLIENT;
> > +     if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(active)) {
> > +             prio |= ACTIVE_PRIORITY;
> >               active->sched.attr.priority = prio;
> >               list_move_tail(&active->sched.link,
> >                              i915_sched_lookup_priolist(engine, prio));
> > 
> 
> And here is a big change as well: we would stop giving a boost to 
> preempted requests if they haven't started executing yet. And we have no 
> accounting of how many times something has been preempted, to maybe keep 
> bumping its priority in those cases. That would probably move 
> towards a different priority-management scheme, like:

What? There's no change here. The logic is important though, as it can
only apply if the request wasn't waiting.
-Chris

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 773df0bd685b..9b6b3acb9070 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -164,6 +164,8 @@ 
 #define WA_TAIL_DWORDS 2
 #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
 
+#define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT)
+
 static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 					    struct intel_engine_cs *engine,
 					    struct intel_context *ce);
@@ -190,8 +192,30 @@  static inline int rq_prio(const struct i915_request *rq)
 
 static int effective_prio(const struct i915_request *rq)
 {
+	int prio = rq_prio(rq);
+
+	/*
+	 * On unwinding the active request, we give it a priority bump
+	 * equivalent to a freshly submitted request. This protects it from
+	 * being gazumped again, but it would be preferable if we didn't
+	 * let it be gazumped in the first place!
+	 *
+	 * See __unwind_incomplete_requests()
+	 */
+	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(rq)) {
+		/*
+		 * After preemption, we insert the active request at the
+		 * end of the new priority level. This means that we will be
+		 * _lower_ priority than the preemptee all things equal (and
+		 * so the preemption is valid), so adjust our comparison
+		 * accordingly.
+		 */
+		prio |= ACTIVE_PRIORITY;
+		prio--;
+	}
+
 	/* Restrict mere WAIT boosts from triggering preemption */
-	return rq_prio(rq) | __NO_PREEMPTION;
+	return prio | __NO_PREEMPTION;
 }
 
 static int queue_prio(const struct intel_engine_execlists *execlists)
@@ -360,7 +384,7 @@  __unwind_incomplete_requests(struct intel_engine_cs *engine)
 {
 	struct i915_request *rq, *rn, *active = NULL;
 	struct list_head *uninitialized_var(pl);
-	int prio = I915_PRIORITY_INVALID | I915_PRIORITY_NEWCLIENT;
+	int prio = I915_PRIORITY_INVALID | ACTIVE_PRIORITY;
 
 	lockdep_assert_held(&engine->timeline.lock);
 
@@ -391,9 +415,15 @@  __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	 * The active request is now effectively the start of a new client
 	 * stream, so give it the equivalent small priority bump to prevent
 	 * it being gazumped a second time by another peer.
+	 *
+	 * One consequence of this preemption boost is that we may jump
+	 * over lesser priorities (such as I915_PRIORITY_WAIT), effectively
+	 * making those priorities non-preemptible. They will be moved forward
+	 * in the priority queue, but they will not gain immediate access to
+	 * the GPU.
 	 */
-	if (!(prio & I915_PRIORITY_NEWCLIENT)) {
-		prio |= I915_PRIORITY_NEWCLIENT;
+	if (~prio & ACTIVE_PRIORITY && __i915_request_has_started(active)) {
+		prio |= ACTIVE_PRIORITY;
 		active->sched.attr.priority = prio;
 		list_move_tail(&active->sched.link,
 			       i915_sched_lookup_priolist(engine, prio));
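
A worked example of the arithmetic above, under hedged assumptions about the internal priority layout of this era (user priority shifted left by I915_USER_PRIORITY_SHIFT, I915_PRIORITY_WAIT == 1, I915_PRIORITY_NEWCLIENT == 2, __NO_PREEMPTION masking in the WAIT bit; see i915_scheduler.h for the authoritative values), and assuming the caller preempts when the queue priority strictly exceeds effective_prio():

	Running request: user priority 0 plus a WAIT boost -> rq_prio == 0 | 1 == 1
	Queued request:  new client at user priority 0     -> queue prio == 0 | 2 == 2

	Before this patch: effective == rq_prio | __NO_PREEMPTION == 1, and 2 > 1,
	so we preempt. But on unwinding, the running request gains NEWCLIENT,
	1 | 2 == 3 > 2, so it is placed above the request that triggered the
	preemption and is immediately resubmitted: a preempt-to-idle cycle for
	no change.

	With this patch: effective == ((1 | ACTIVE_PRIORITY) - 1) | __NO_PREEMPTION
	== 2 | 1 == 3, and 2 > 3 is false, so the redundant preemption is
	suppressed up front.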