[01/18] drm/i915/execlists: Set queue priority from secondary port

Message ID 20180409111413.6352-2-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson April 9, 2018, 11:13 a.m. UTC
We can refine our current execlists->queue_priority if we inspect
ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
the unsubmitted queue and say that if a subsequent request is more
important than the current queue, we will rerun the submission tasklet
to evaluate the need for preemption. However, we only want to preempt if
we need to jump ahead of a currently executing request in ELSP. The
second reason for running the submission tasklet is to amalgamate
requests into the active context on ELSP[0] to avoid a stall when
ELSP[0] drains. (Though repeatedly amalgamating requests into the active
context and triggering many lite-restores is of questionable gain, the
goal really is to put a context into ELSP[1] to cover the interrupt.) So
if, instead of looking at the head of the queue, we look at the context
in ELSP[1], we can answer both of the questions more accurately -- we
don't need to rerun the submission tasklet unless our new request is
important enough to feed into, at least, ELSP[1].
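
For reference, queue_priority is consumed on the submission side, where
we only kick the tasklet if the new request outranks it. A rough sketch
of that check (helper naming approximate; shown for context, not part
of this patch):

	static void submit_queue(struct intel_engine_cs *engine, int prio)
	{
		/*
		 * Only wake the tasklet if the new request outranks
		 * queue_priority; a kick that we already know will
		 * find nothing to do is just wasted work.
		 */
		if (prio > engine->execlists.queue_priority) {
			engine->execlists.queue_priority = prio;
			tasklet_hi_schedule(&engine->execlists.tasklet);
		}
	}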

References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Tvrtko Ursulin April 10, 2018, 2:05 p.m. UTC | #1
On 09/04/2018 12:13, Chris Wilson wrote:
> We can refine our current execlists->queue_priority if we inspect
> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
> the unsubmitted queue and say that if a subsequent request is more than
> important than the current queue, we will rerun the submission tasklet

s/more than important/more important/

> to evaluate the need for preemption. However, we only want to preempt if
> we need to jump ahead of a currently executing request in ELSP. The
> second reason for running the submission tasklet is amalgamate requests
> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
> (Though repeatedly amalgamating requests into the active context and
> triggering many lite-restore is off question gain, the goal really is to
> put a context into ELSP[1] to cover the interrupt.) So if instead of
> looking at the head of the queue, we look at the context in ELSP[1] we
> can answer both of the questions more accurately -- we don't need to
> rerun the submission tasklet unless our new request is important enough
> to feed into, at least, ELSP[1].
> 
> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 3592288e4696..b47d53d284ca 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   			kmem_cache_free(engine->i915->priorities, p);
>   	}
>   done:
> -	execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
> +	execlists->queue_priority =
> +		port != execlists->port ? rq_prio(last) : INT_MIN;

Please bear with my questions since I am not 100% up to date with 
preemption.

Should this be rq_prio("port[1]") for future proofing? Or you really 
mean last port? In which case it wouldn't be the highest pending 
priority as kerneldoc for execlists->queue_priority says.

So this patch changes the meaning of "pending". From pending == "not 
submitted to ELSP" to pending == "not submitted to ELSP[0]". Which seems 
to make sense, although it is not the easiest job to figure out the 
consequences.

It even feels like a bugfix since it prevents tasklet scheduling when 
all ports are filled with higher priority requests than the new one.

Although I failed to understand what do we do in both cases if a new 
request arrives of higher prio than the one in ELSP[1]. Looks like 
nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so 
because we can't safely or I misread something?

Also, don't you need to manage execlists->queue_priority after CSB 
processing now? So that it correctly reflects the priority of request in 
ELSP[1] after ELSP[0] gets completed? It seems that without it would get 
stuck at the previous value and then submission could decide to skip 
scheduling the tasklet if new priority is lower than what was in ELSP[1] 
before, and so would fail to fill ELSP[1].

>   	execlists->first = rb;
>   	if (submit)
>   		port_assign(port, last);
> 

Regards,

Tvrtko
Chris Wilson April 10, 2018, 2:24 p.m. UTC | #2
Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> 
> On 09/04/2018 12:13, Chris Wilson wrote:
> > We can refine our current execlists->queue_priority if we inspect
> > ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
> > the unsubmitted queue and say that if a subsequent request is more than
> > important than the current queue, we will rerun the submission tasklet
> 
> s/more than important/more important/
> 
> > to evaluate the need for preemption. However, we only want to preempt if
> > we need to jump ahead of a currently executing request in ELSP. The
> > second reason for running the submission tasklet is amalgamate requests
> > into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
> > (Though repeatedly amalgamating requests into the active context and
> > triggering many lite-restore is off question gain, the goal really is to
> > put a context into ELSP[1] to cover the interrupt.) So if instead of
> > looking at the head of the queue, we look at the context in ELSP[1] we
> > can answer both of the questions more accurately -- we don't need to
> > rerun the submission tasklet unless our new request is important enough
> > to feed into, at least, ELSP[1].
> > 
> > References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> > Cc: Michel Thierry <michel.thierry@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index 3592288e4696..b47d53d284ca 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> >                       kmem_cache_free(engine->i915->priorities, p);
> >       }
> >   done:
> > -     execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
> > +     execlists->queue_priority =
> > +             port != execlists->port ? rq_prio(last) : INT_MIN;
> 
> Please bear with my questions since I am not 100% up to date with 
> preemption.
> 
> Should this be rq_prio("port[1]") for future proofing? Or you really 
> mean last port? In which case it wouldn't be the highest pending 
> priority as kerneldoc for execlists->queue_priority says.

I mean "secondary" port, so yes using last executing port under the
assumption that we grow into a ring of many useless submission ports.
The kerneldoc is no more or no less accurate. :)

> So this patch changes the meaning of "pending". From pending == "not 
> submitted to ELSP" to pending == "not submitted to ELSP[0]". Which seems 
> to make sense, although it is not the easiest job to figure out the 
> consequences.

Yes.
 
> It even feels like a bugfix since it prevents tasklet scheduling when 
> all ports are filled with higher priority requests than the new one.

It won't fix any bugs, since we just reduce the number of kicks. Kicking
and evaluating that we have nothing to do is just wasted work. So yes I do
agree that it is a bug fix, just not enough to merit a Fixes.
 
> Although I failed to understand what do we do in both cases if a new 
> request arrives of higher prio than the one in ELSP[1]. Looks like 
> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so 
> because we can't safely or I misread something?

This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
request higher than ELSP[1], we start a preemption as

	if (need_preempt(engine, last, execlists->queue_priority)) {

will evaluate to true. It's either the lowest executing priority (new
code), or lowest pending priority (old code). In either case, if the new
request is more important than the queue_priority, we preempt.

We won't evaluate preemption if we are still awaiting the HWACK from the
last ELSP write, or if we are still preempting. In both of those cases,
we expect to receive an interrupt promptly, upon which we then redo our
evaluations.
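
Condensed from execlists_dequeue() of this period (an abbreviated
sketch, unrelated paths omitted), the ordering of those checks is
roughly:

	if (last) {
		/*
		 * Don't touch the ELSP while the HW has yet to ack the
		 * previous write, and don't start a second preemption
		 * while one is in flight; in both cases an interrupt
		 * is imminent and we re-evaluate from the tasklet.
		 */
		if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_HWACK))
			goto unlock;
		if (execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
			goto unlock;

		if (need_preempt(engine, last, execlists->queue_priority)) {
			inject_preempt_context(engine);
			goto unlock;
		}
	}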

> Also, don't you need to manage execlists->queue_priority after CSB 
> processing now? So that it correctly reflects the priority of request in 
> ELSP[1] after ELSP[0] gets completed? It seems that without it would get 
> stuck at the previous value and then submission could decide to skip 
> scheduling the tasklet if new priority is lower than what was in ELSP[1] 
> before, and so would fail to fill ELSP[1].

Yes, that is also done here. Since we are always looking one request
ahead, we either update the priority based on the queue following the
resubmission on interrupt, or it is left as INT_MIN on completion.
Indeed, making sure we reset back to INT_MIN is essential so that we
don't miss any future submissions from idle.

We can add GEM_BUG_ON(queue_priority != INT_MIN) in engines_park to
check this condition.
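
Something like this sketch (hypothetical; it assumes an engine->park
callback is wired up for execlists):

	static void execlists_park(struct intel_engine_cs *engine)
	{
		/* With no requests in flight, no ELSP slot is claimed. */
		GEM_BUG_ON(engine->execlists.queue_priority != INT_MIN);
	}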
-Chris
Tvrtko Ursulin April 10, 2018, 2:42 p.m. UTC | #3
On 10/04/2018 15:24, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
>>
>> On 09/04/2018 12:13, Chris Wilson wrote:
>>> We can refine our current execlists->queue_priority if we inspect
>>> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
>>> the unsubmitted queue and say that if a subsequent request is more than
>>> important than the current queue, we will rerun the submission tasklet
>>
>> s/more than important/more important/
>>
>>> to evaluate the need for preemption. However, we only want to preempt if
>>> we need to jump ahead of a currently executing request in ELSP. The
>>> second reason for running the submission tasklet is amalgamate requests
>>> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
>>> (Though repeatedly amalgamating requests into the active context and
>>> triggering many lite-restore is off question gain, the goal really is to
>>> put a context into ELSP[1] to cover the interrupt.) So if instead of
>>> looking at the head of the queue, we look at the context in ELSP[1] we
>>> can answer both of the questions more accurately -- we don't need to
>>> rerun the submission tasklet unless our new request is important enough
>>> to feed into, at least, ELSP[1].
>>>
>>> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Michał Winiarski <michal.winiarski@intel.com>
>>> Cc: Michel Thierry <michel.thierry@intel.com>
>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>> index 3592288e4696..b47d53d284ca 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>>>                        kmem_cache_free(engine->i915->priorities, p);
>>>        }
>>>    done:
>>> -     execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
>>> +     execlists->queue_priority =
>>> +             port != execlists->port ? rq_prio(last) : INT_MIN;
>>
>> Please bear with my questions since I am not 100% up to date with
>> preemption.
>>
>> Should this be rq_prio("port[1]") for future proofing? Or you really
>> mean last port? In which case it wouldn't be the highest pending
>> priority as kerneldoc for execlists->queue_priority says.
> 
> I mean "secondary" port, so yes using last executing port under the
> assumption that we grow into a ring of many useless submission ports.
> The kerneldoc is no more or no less accurate. :)

"That we _don't_ grow"?

> 
>> So this patch changes the meaning of "pending". From pending == "not
>> submitted to ELSP" to pending == "not submitted to ELSP[0]". Which seems
>> to make sense, although it is not the easiest job to figure out the
>> consequences.
> 
> Yes.
>   
>> It even feels like a bugfix since it prevents tasklet scheduling when
>> all ports are filled with higher priority requests than the new one.
> 
> It won't fix any bugs, since we just reduce the number of kicks. Kicking
> and evaluating that we have nothing to do is just wasted work. So yes I do
> agree that it is a bug fix, just not enough to merit a Fixes.

Yeah that's fine.

>   
>> Although I failed to understand what do we do in both cases if a new
>> request arrives of higher prio than the one in ELSP[1]. Looks like
>> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
>> because we can't safely or I misread something?
> 
> This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> request higher than ELSP[1], we start a preemption as
> 
> 	if (need_preempt(engine, last, execlists->queue_priority)) {
> 
> will evaluate to true. It's either the lowest executing priority (new
> code), or lowest pending priority (old code). In either case, if the new
> request is more important than the queue_priority, we preempt.

How when "last" is request from ELSP[0]? And also 
execlists->queue_priority has not yet been updated to reflect the new 
priority?

Then there is also "if (port_count(port0)) goto unlock;" suggesting that 
if there were any appends to ELSP[0] we will also fail to act in this 
situation?

> We won't evaluate preemption if we are still awaiting the HWACK from the
> last ELSP write, or if we are still preempting. In both of those cases,
> we expect to receive an interrupt promptly, upon which we then redo our
> evaluations.
> 
>> Also, don't you need to manage execlists->queue_priority after CSB
>> processing now? So that it correctly reflects the priority of request in
>> ELSP[1] after ELSP[0] gets completed? It seems that without it would get
>> stuck at the previous value and then submission could decide to skip
>> scheduling the tasklet if new priority is lower than what was in ELSP[1]
>> before, and so would fail to fill ELSP[1].
> 
> Yes, that is also done here. Since we are always looking one request
> ahead, we either update the priority based on the queue following the
> resubmission on interrupt, or it is left as INT_MIN on completion.
> Indeed, making sure we reset back to INT_MIN is essential so that we
> don't miss any future submissions from idle.

Okay I see it - because execlists_dequeue is called and runs to the 
queue_priority update bit even when there is nothing in the queue.

> We can add GEM_BUG_ON(queue_priority != INT_MIN) in engines_park to
> check this condition.

Looks like we don't have these hooks set for execlists so it's probably 
more hassle than it would be worth.

Regards,

Tvrtko
Chris Wilson April 10, 2018, 2:56 p.m. UTC | #4
Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
> 
> On 10/04/2018 15:24, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> >>
> >> On 09/04/2018 12:13, Chris Wilson wrote:
> >>> We can refine our current execlists->queue_priority if we inspect
> >>> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
> >>> the unsubmitted queue and say that if a subsequent request is more than
> >>> important than the current queue, we will rerun the submission tasklet
> >>
> >> s/more than important/more important/
> >>
> >>> to evaluate the need for preemption. However, we only want to preempt if
> >>> we need to jump ahead of a currently executing request in ELSP. The
> >>> second reason for running the submission tasklet is amalgamate requests
> >>> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
> >>> (Though repeatedly amalgamating requests into the active context and
> >>> triggering many lite-restore is off question gain, the goal really is to
> >>> put a context into ELSP[1] to cover the interrupt.) So if instead of
> >>> looking at the head of the queue, we look at the context in ELSP[1] we
> >>> can answer both of the questions more accurately -- we don't need to
> >>> rerun the submission tasklet unless our new request is important enough
> >>> to feed into, at least, ELSP[1].
> >>>
> >>> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Michał Winiarski <michal.winiarski@intel.com>
> >>> Cc: Michel Thierry <michel.thierry@intel.com>
> >>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
> >>>    1 file changed, 2 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>> index 3592288e4696..b47d53d284ca 100644
> >>> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> >>>                        kmem_cache_free(engine->i915->priorities, p);
> >>>        }
> >>>    done:
> >>> -     execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
> >>> +     execlists->queue_priority =
> >>> +             port != execlists->port ? rq_prio(last) : INT_MIN;
> >>
> >> Please bear with my questions since I am not 100% up to date with
> >> preemption.
> >>
> >> Should this be rq_prio("port[1]") for future proofing? Or you really
> >> mean last port? In which case it wouldn't be the highest pending
> >> priority as kerneldoc for execlists->queue_priority says.
> > 
> > I mean "secondary" port, so yes using last executing port under the
> > assumption that we grow into a ring of many useless submission ports.
> > The kerneldoc is no more or no less accurate. :)
> 
> "That we _don't_ grow"?

Hmm, no, when we get "last_port" it slots right into here. I just don't
have the future facing code to prevent Mika from having to think a
little. The intent is that when there is an ELSP slot available,
queue_priority is INT_MIN, when there are none, then rq_prio(last).

My bad for remembering what I want the code to be without remembering
what the code says.
 
> >> Although I failed to understand what do we do in both cases if a new
> >> request arrives of higher prio than the one in ELSP[1]. Looks like
> >> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
> >> because we can't safely or I misread something?
> > 
> > This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> > request higher than ELSP[1], we start a preemption as
> > 
> >       if (need_preempt(engine, last, execlists->queue_priority)) {
> > 
> > will evaluate to true. It's either the lowest executing priority (new
> > code), or lowest pending priority (old code). In either case, if the new
> > request is more important than the queue_priority, we preempt.
> 
> How when "last" is request from ELSP[0]? And also 
> execlists->queue_priority has not yet been updated to reflect the new 
> priority?

When we start executing last on ELSP[0] there will have been another
execlists_dequeue() where we see an empty slot (or fill it) and update
queue_priority. If we are down to the last request, it will be set to
INT_MIN. Upon its completion, it will remain INT_MIN.

> Then there is also "if (port_count(port0)) goto unlock;" suggesting that 
> if there were any appends to ELSP[0] we will also fail to act in this 
> situation?

If we only write into ELSP[0], then ELSP[1] remains empty and the
queue_priority still needs to be INT_MIN so that we service any new
i915_request_add immediately.
 
> > We won't evaluate preemption if we are still awaiting the HWACK from the
> > last ELSP write, or if we are still preempting. In both of those cases,
> > we expect to receive an interrupt promptly, upon which we then redo our
> > evaluations.
> > 
> >> Also, don't you need to manage execlists->queue_priority after CSB
> >> processing now? So that it correctly reflects the priority of request in
> >> ELSP[1] after ELSP[0] gets completed? It seems that without it would get
> >> stuck at the previous value and then submission could decide to skip
> >> scheduling the tasklet if new priority is lower than what was in ELSP[1]
> >> before, and so would fail to fill ELSP[1].
> > 
> > Yes, that is also done here. Since we are always looking one request
> > ahead, we either update the priority based on the queue following the
> > resubmission on interrupt, or it is left as INT_MIN on completion.
> > Indeed, making sure we reset back to INT_MIN is essential so that we
> > don't miss any future submissions from idle.
> 
> Okay I see it - because execlists_dequeue is called and runs to the 
> queue_priority update bit even when there is nothing in the queue.

Phew, I can get away from having to draw ascii diagrams. I'll leave that
to Mika as he figures out how to hook up N ports ;)

> > We can add GEM_BUG_ON(queue_priority != INT_MIN) in engines_park to
> > check this condition.
> 
> Looks like we don't have these hooks set for execlists so it's probably 
> more hassle than it would be worth.

For ringbuffer, it's permanently INT_MIN and guc is hooked up to the
same logic as execlists.
-Chris
Chris Wilson April 10, 2018, 3:08 p.m. UTC | #5
Quoting Chris Wilson (2018-04-10 15:56:03)
> Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
> > 
> > On 10/04/2018 15:24, Chris Wilson wrote:
> > > Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> > >> Although I failed to understand what do we do in both cases if a new
> > >> request arrives of higher prio than the one in ELSP[1]. Looks like
> > >> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
> > >> because we can't safely or I misread something?
> > > 
> > > This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> > > request higher than ELSP[1], we start a preemption as
> > > 
> > >       if (need_preempt(engine, last, execlists->queue_priority)) {
> > > 
> > > will evaluate to true. It's either the lowest executing priority (new
> > > code), or lowest pending priority (old code). In either case, if the new
> > > request is more important than the queue_priority, we preempt.
> > 
> > How when "last" is request from ELSP[0]? And also 
> > execlists->queue_priority has not yet been updated to reflect the new 
> > priority?
> 
> When we start executing last on ELSP[0] there will have been another
> execlists_dequeue() where we see an empty slot (or fill it) and update
> queue_priority. If we are down to the last request, it will be set to
> INT_MIN. Upon its completion, it will remain INT_MIN.
> 
> > Then there is also "if (port_count(port0)) goto unlock;" suggesting that 
> > if there were any appends to ELSP[0] we will also fail to act in this 
> > situation?
> 
> If we only write into ELSP[0], then ELSP[1] remains empty and the
> queue_priority still needs to be INT_MIN so that we service any new
> i915_request_add immediately.
>  
> > > We won't evaluate preemption if we are still awaiting the HWACK from the
> > > last ELSP write, or if we are still preempting. In both of those cases,
> > > we expect to receive an interrupt promptly, upon which we then redo our
> > > evaluations.
> > > 
> > >> Also, don't you need to manage execlists->queue_priority after CSB
> > >> processing now? So that it correctly reflects the priority of request in
> > >> ELSP[1] after ELSP[0] gets completed? It seems that without it would get
> > >> stuck at the previous value and then submission could decide to skip
> > >> scheduling the tasklet if new priority is lower than what was in ELSP[1]
> > >> before, and so would fail to fill ELSP[1].
> > > 
> > > Yes, that is also done here. Since we are always looking one request
> > > ahead, we either update the priority based on the queue following the
> > > resubmission on interrupt, or it is left as INT_MIN on completion.
> > > Indeed, making sure we reset back to INT_MIN is essential so that we
> > > don't miss any future submissions from idle.
> > 
> > Okay I see it - because execlists_dequeue is called and runs to the 
> > queue_priority update bit even when there is nothing in the queue.
> 
> Phew, I can get away from having to draw ascii diagrams. I'll leave that
> to Mika as he figures out how to hook up N ports ;)

        /*
         * Here be a bit of magic! Or sleight-of-hand, whichever you prefer.
         *
         * We choose queue_priority such that if we add a request of greater
         * priority than this, we kick the submission tasklet to decide on
         * the right order of submitting the requests to hardware. We must
         * also be prepared to reorder requests as they are in-flight on the
         * HW. We derive the queue_priority then as the first "hole" in
         * the HW submission ports and if there are no available slots,
         * it is the priority of the lowest executing request, i.e. the last one.
         */
        execlists->queue_priority =
                port != execlists->port ? rq_prio(last) : INT_MIN;

Does that help, or do I need ASCII art?
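
If it helps, the intended states, roughly, in table form:

	ELSP[0]  ELSP[1]  queue_priority
	idle     idle     INT_MIN        (any new request kicks the tasklet)
	busy     idle     INT_MIN        (any new request kicks the tasklet)
	busy     busy     rq_prio(last)  (only a would-be preempter kicks)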
-Chris
Tvrtko Ursulin April 11, 2018, 10:23 a.m. UTC | #6
On 10/04/2018 15:56, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
>>
>> On 10/04/2018 15:24, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
>>>>
>>>> On 09/04/2018 12:13, Chris Wilson wrote:
>>>>> We can refine our current execlists->queue_priority if we inspect
>>>>> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
>>>>> the unsubmitted queue and say that if a subsequent request is more than
>>>>> important than the current queue, we will rerun the submission tasklet
>>>>
>>>> s/more than important/more important/
>>>>
>>>>> to evaluate the need for preemption. However, we only want to preempt if
>>>>> we need to jump ahead of a currently executing request in ELSP. The
>>>>> second reason for running the submission tasklet is amalgamate requests
>>>>> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
>>>>> (Though repeatedly amalgamating requests into the active context and
>>>>> triggering many lite-restore is off question gain, the goal really is to
>>>>> put a context into ELSP[1] to cover the interrupt.) So if instead of
>>>>> looking at the head of the queue, we look at the context in ELSP[1] we
>>>>> can answer both of the questions more accurately -- we don't need to
>>>>> rerun the submission tasklet unless our new request is important enough
>>>>> to feed into, at least, ELSP[1].
>>>>>
>>>>> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Michał Winiarski <michal.winiarski@intel.com>
>>>>> Cc: Michel Thierry <michel.thierry@intel.com>
>>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
>>>>>     1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>>>> index 3592288e4696..b47d53d284ca 100644
>>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>>>> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>>>>>                         kmem_cache_free(engine->i915->priorities, p);
>>>>>         }
>>>>>     done:
>>>>> -     execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
>>>>> +     execlists->queue_priority =
>>>>> +             port != execlists->port ? rq_prio(last) : INT_MIN;
>>>>
>>>> Please bear with my questions since I am not 100% up to date with
>>>> preemption.
>>>>
>>>> Should this be rq_prio("port[1]") for future proofing? Or you really
>>>> mean last port? In which case it wouldn't be the highest pending
>>>> priority as kerneldoc for execlists->queue_priority says.
>>>
>>> I mean "secondary" port, so yes using last executing port under the
>>> assumption that we grow into a ring of many useless submission ports.
>>> The kerneldoc is no more or no less accurate. :)
>>
>> "That we _don't_ grow"?
> 
> Hmm, no, when we get "last_port" it slots right into here. I just don't
> have the future facing code to prevent Mika from having to think a
> little. The intent is that when there is an ELSP slot available,
> queue_priority is INT_MIN, when there are none, then rq_prio(last).
> 
> My bad for remembering what I want the code to be without remembering
> what the code says.
>   
>>>> Although I failed to understand what do we do in both cases if a new
>>>> request arrives of higher prio than the one in ELSP[1]. Looks like
>>>> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
>>>> because we can't safely or I misread something?
>>>
>>> This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
>>> request higher than ELSP[1], we start a preemption as
>>>
>>>        if (need_preempt(engine, last, execlists->queue_priority)) {
>>>
>>> will evaluate to true. It's either the lowest executing priority (new
>>> code), or lowest pending priority (old code). In either case, if the new
>>> request is more important than the queue_priority, we preempt.
>>
>> How when "last" is request from ELSP[0]? And also
>> execlists->queue_priority has not yet been updated to reflect the new
>> priority?
> 
> When we start executing last on ELSP[0] there will have been another
> execlists_dequeue() where we see an empty slot (or fill it) and update
> queue_priority. If we are down to the last request, it will be set to
> INT_MIN. Upon its completion, it will remain INT_MIN.

I don't see it yet, let me walk through it:

0. Initial situation, GPU busy with two requests, no outstanding ones:

ELSP[0] = prio 2
ELSP[1] = prio 0

1. queue_priority = 0
2. New execbuf comes along with prio=1.
3. We choose to schedule the tasklet - good.
4. execlists_dequeue runs

last = prio 2

if (need_preempt(engine, last, execlists->queue_priority))

queue_priority = 0, so will not preempt last which is prio 2 - so no 
preemption - good.

queue_priority remains at zero since we "goto unlock" with both ports 
busy and no preemption is triggered.

5. ELSP[0] completes, new ELSP[0] with prio 0.

(Before we missed the opportunity to replace ELSP[1] with higher prio 1 
waiting request before ELSP[0] completed - perhaps we have no choice? 
But ok.. carrying on..)

6. execlists_dequeue

last = prio 0

if (need_preempt(engine, last, execlists->queue_priority))

queue_priority = 0, so again preemption not triggered.

Perhaps I made a mistake somewhere..

Regards,

Tvrtko
Chris Wilson April 11, 2018, 10:36 a.m. UTC | #7
Quoting Tvrtko Ursulin (2018-04-11 11:23:01)
> 
> On 10/04/2018 15:56, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
> >>
> >> On 10/04/2018 15:24, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> >>>>
> >>>> On 09/04/2018 12:13, Chris Wilson wrote:
> >>>>> We can refine our current execlists->queue_priority if we inspect
> >>>>> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
> >>>>> the unsubmitted queue and say that if a subsequent request is more than
> >>>>> important than the current queue, we will rerun the submission tasklet
> >>>>
> >>>> s/more than important/more important/
> >>>>
> >>>>> to evaluate the need for preemption. However, we only want to preempt if
> >>>>> we need to jump ahead of a currently executing request in ELSP. The
> >>>>> second reason for running the submission tasklet is amalgamate requests
> >>>>> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
> >>>>> (Though repeatedly amalgamating requests into the active context and
> >>>>> triggering many lite-restore is off question gain, the goal really is to
> >>>>> put a context into ELSP[1] to cover the interrupt.) So if instead of
> >>>>> looking at the head of the queue, we look at the context in ELSP[1] we
> >>>>> can answer both of the questions more accurately -- we don't need to
> >>>>> rerun the submission tasklet unless our new request is important enough
> >>>>> to feed into, at least, ELSP[1].
> >>>>>
> >>>>> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> >>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>>>> Cc: Michał Winiarski <michal.winiarski@intel.com>
> >>>>> Cc: Michel Thierry <michel.thierry@intel.com>
> >>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>>> ---
> >>>>>     drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
> >>>>>     1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>>>> index 3592288e4696..b47d53d284ca 100644
> >>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>>>> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> >>>>>                         kmem_cache_free(engine->i915->priorities, p);
> >>>>>         }
> >>>>>     done:
> >>>>> -     execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
> >>>>> +     execlists->queue_priority =
> >>>>> +             port != execlists->port ? rq_prio(last) : INT_MIN;
> >>>>
> >>>> Please bear with my questions since I am not 100% up to date with
> >>>> preemption.
> >>>>
> >>>> Should this be rq_prio("port[1]") for future proofing? Or you really
> >>>> mean last port? In which case it wouldn't be the highest pending
> >>>> priority as kerneldoc for execlists->queue_priority says.
> >>>
> >>> I mean "secondary" port, so yes using last executing port under the
> >>> assumption that we grow into a ring of many useless submission ports.
> >>> The kerneldoc is no more or no less accurate. :)
> >>
> >> "That we _don't_ grow"?
> > 
> > Hmm, no, when we get "last_port" it slots right into here. I just don't
> > have the future facing code to prevent Mika from having to think a
> > little. The intent is that when there is a ELSP slot available,
> > queue_priority is INT_MIN, when there are none, then rq_prio(last).
> > 
> > My bad for remembering what I want the code to be without remembering
> > what the code says.
> >   
> >>>> Although I failed to understand what do we do in both cases if a new
> >>>> request arrives of higher prio than the one in ELSP[1]. Looks like
> >>>> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
> >>>> because we can't safely or I misread something?
> >>>
> >>> This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> >>> request higher than ELSP[1], we start a preemption as
> >>>
> >>>        if (need_preempt(engine, last, execlists->queue_priority)) {
> >>>
> >>> will evaluate to true. It's either the lowest executing priority (new
> >>> code), or lowest pending priority (old code). In either case, if the new
> >>> request is more important than the queue_priority, we preempt.
> >>
> >> How when "last" is request from ELSP[0]? And also
> >> execlists->queue_priority has not yet been updated to reflect the new
> >> priority?
> > 
> > When we start executing last on ELSP[0] there will have been another
> > execlists_dequeue() where we see an empty slot (or fill it) and update
> > queue_priority. If we are down to the last request, it will be set to
> > INT_MIN. Upon its completion, it will remain INT_MIN.
> 
> I don't see it yet, let me walk through it:
> 
> 0. Initial situation, GPU busy with two requests, no outstanding ones:
> 
> ELSP[0] = prio 2
> ELSP[1] = prio 0
> 
> 1. queue_priority = 0
> 2. New execbuf comes along with prio=1.
> 3. We choose to schedule the tasklet - good.
> 4. execlists_dequeue runs
> 
> last = prio 2
> 
> if (need_preempt(engine, last, execlists->queue_priority))
> 
> queue_priority = 0, so will not preempt last which is prio 2 - so no 
> preemption - good.
> 
> queue_priority remains at zero since we "goto unlock" with both ports 
> busy and no preemption is triggered.
> 
> 5. ELSP[0] completes, new ELSP[0] with prio 0.
> 
> (Before we missed the opportunity to replace ELSP[1] with higher prio 1 
> waiting request before ELSP[0] completed - perhaps we have no choice? 
> But ok.. carrying on..)

We don't want to interrupt the higher priority task in ELSP[0] to sort
out ELSP[1].
 
> 6. execlists_dequeue
> 
> last = prio 0
> 
> if (need_preempt(engine, last, execlists->queue_priority))
> 
> queue_priority = 0, so again preemption not triggered.

queue_priority is 1 from queue_request().
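
That is, step 2 already raised it on the submission path before the
tasklet ran. Replaying the sequence against the submit_queue() sketch
above (naming approximate):

	/* step 2: execbuf arrives with prio 1, queue_priority == 0 */
	submit_queue(engine, 1);	/* 1 > 0: queue_priority = 1, kick */

	/* step 6: in the tasklet, last = the prio 0 request on ELSP[0] */
	need_preempt(engine, last, execlists->queue_priority);
		/* queue_priority 1 > rq_prio(last) 0: preemption triggers */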
-Chris
Tvrtko Ursulin April 11, 2018, 10:47 a.m. UTC | #8
On 11/04/2018 11:36, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-04-11 11:23:01)
>>
>> On 10/04/2018 15:56, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
>>>>
>>>> On 10/04/2018 15:24, Chris Wilson wrote:
>>>>> Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
>>>>>>
>>>>>> On 09/04/2018 12:13, Chris Wilson wrote:
>>>>>>> We can refine our current execlists->queue_priority if we inspect
>>>>>>> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
>>>>>>> the unsubmitted queue and say that if a subsequent request is more than
>>>>>>> important than the current queue, we will rerun the submission tasklet
>>>>>>
>>>>>> s/more than important/more important/
>>>>>>
>>>>>>> to evaluate the need for preemption. However, we only want to preempt if
>>>>>>> we need to jump ahead of a currently executing request in ELSP. The
>>>>>>> second reason for running the submission tasklet is amalgamate requests
>>>>>>> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
>>>>>>> (Though repeatedly amalgamating requests into the active context and
>>>>>>> triggering many lite-restore is off question gain, the goal really is to
>>>>>>> put a context into ELSP[1] to cover the interrupt.) So if instead of
>>>>>>> looking at the head of the queue, we look at the context in ELSP[1] we
>>>>>>> can answer both of the questions more accurately -- we don't need to
>>>>>>> rerun the submission tasklet unless our new request is important enough
>>>>>>> to feed into, at least, ELSP[1].
>>>>>>>
>>>>>>> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
>>>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>>>> Cc: Michał Winiarski <michal.winiarski@intel.com>
>>>>>>> Cc: Michel Thierry <michel.thierry@intel.com>
>>>>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>>> ---
>>>>>>>      drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
>>>>>>>      1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>>>>>> index 3592288e4696..b47d53d284ca 100644
>>>>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>>>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>>>>>> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>>>>>>>                          kmem_cache_free(engine->i915->priorities, p);
>>>>>>>          }
>>>>>>>      done:
>>>>>>> -     execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
>>>>>>> +     execlists->queue_priority =
>>>>>>> +             port != execlists->port ? rq_prio(last) : INT_MIN;
>>>>>>
>>>>>> Please bear with my questions since I am not 100% up to date with
>>>>>> preemption.
>>>>>>
>>>>>> Should this be rq_prio("port[1]") for future proofing? Or you really
>>>>>> mean last port? In which case it wouldn't be the highest pending
>>>>>> priority as kerneldoc for execlists->queue_priority says.
>>>>>
>>>>> I mean "secondary" port, so yes using last executing port under the
>>>>> assumption that we grow into a ring of many useless submission ports.
>>>>> The kerneldoc is no more or no less accurate. :)
>>>>
>>>> "That we _don't_ grow"?
>>>
>>> Hmm, no, when we get "last_port" it slots right into here. I just don't
>>> have the future facing code to prevent Mika from having to think a
>>> little. The intent is that when there is an ELSP slot available,
>>> queue_priority is INT_MIN, when there are none, then rq_prio(last).
>>>
>>> My bad for remembering what I want the code to be without remembering
>>> what the code says.
>>>    
>>>>>> Although I failed to understand what do we do in both cases if a new
>>>>>> request arrives of higher prio than the one in ELSP[1]. Looks like
>>>>>> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
>>>>>> because we can't safely or I misread something?
>>>>>
>>>>> This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
>>>>> request higher than ELSP[1], we start a preemption as
>>>>>
>>>>>         if (need_preempt(engine, last, execlists->queue_priority)) {
>>>>>
>>>>> will evaluate to true. It's either the lowest executing priority (new
>>>>> code), or lowest pending priority (old code). In either case, if the new
>>>>> request is more important than the queue_priority, we preempt.
>>>>
>>>> How when "last" is request from ELSP[0]? And also
>>>> execlists->queue_priority has not yet been updated to reflect the new
>>>> priority?
>>>
>>> When we start executing last on ELSP[0] there will have been another
>>> execlists_dequeue() where we see an empty slot (or fill it) and update
>>> queue_priority. If we are down to the last request, it will be set to
>>> INT_MIN. Upon its completion, it will remain INT_MIN.
>>
>> I don't see it yet, let me walk through it:
>>
>> 0. Initial situation, GPU busy with two requests, no outstanding ones:
>>
>> ELSP[0] = prio 2
>> ELSP[1] = prio 0
>>
>> 1. queue_priority = 0
>> 2. New execbuf comes along with prio=1.
>> 3. We choose to schedule the tasklet - good.
>> 4. execlists_dequeue runs
>>
>> last = prio 2
>>
>> if (need_preempt(engine, last, execlists->queue_priority))
>>
>> queue_priority = 0, so will not preempt last which is prio 2 - so no
>> preemption - good.
>>
>> queue_priority remains at zero since we "goto unlock" with both ports
>> busy and no preemption is triggered.
>>
>> 5. ELSP[0] completes, new ELSP[0] with prio 0.
>>
>> (Before we missed the opportunity to replace ELSP[1] with higher prio 1
>> waiting request before ELSP[0] completed - perhaps we have no choice?
>> But ok.. carrying on..)
> 
> We don't want to interrupt the higher priority task in ELSP[0] to sort
> out ELSP[1].

I'll assume that means no safe way to just replace ELSP[1] without 
preempting ELSP[0].

>   
>> 6. execlists_dequeue
>>
>> last = prio 0
>>
>> if (need_preempt(engine, last, execlists->queue_priority))
>>
>> queue_priority = 0, so again preemption not triggered.
> 
> queue_priority is 1 from queue_request().

Tunnel vision. :)

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
Chris Wilson April 11, 2018, 11:33 a.m. UTC | #9
Quoting Tvrtko Ursulin (2018-04-11 11:47:00)
> 
> On 11/04/2018 11:36, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-04-11 11:23:01)
> >>
> >> On 10/04/2018 15:56, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2018-04-10 15:42:07)
> >>>>
> >>>> On 10/04/2018 15:24, Chris Wilson wrote:
> >>>>> Quoting Tvrtko Ursulin (2018-04-10 15:05:33)
> >>>>>>
> >>>>>> On 09/04/2018 12:13, Chris Wilson wrote:
> >>>>>>> We can refine our current execlists->queue_priority if we inspect
> >>>>>>> ELSP[1] rather than the head of the unsubmitted queue. Currently, we use
> >>>>>>> the unsubmitted queue and say that if a subsequent request is more than
> >>>>>>> important than the current queue, we will rerun the submission tasklet
> >>>>>>
> >>>>>> s/more than important/more important/
> >>>>>>
> >>>>>>> to evaluate the need for preemption. However, we only want to preempt if
> >>>>>>> we need to jump ahead of a currently executing request in ELSP. The
> >>>>>>> second reason for running the submission tasklet is amalgamate requests
> >>>>>>> into the active context on ELSP[0] to avoid a stall when ELSP[0] drains.
> >>>>>>> (Though repeatedly amalgamating requests into the active context and
> >>>>>>> triggering many lite-restore is off question gain, the goal really is to
> >>>>>>> put a context into ELSP[1] to cover the interrupt.) So if instead of
> >>>>>>> looking at the head of the queue, we look at the context in ELSP[1] we
> >>>>>>> can answer both of the questions more accurately -- we don't need to
> >>>>>>> rerun the submission tasklet unless our new request is important enough
> >>>>>>> to feed into, at least, ELSP[1].
> >>>>>>>
> >>>>>>> References: f6322eddaff7 ("drm/i915/preemption: Allow preemption between submission ports")
> >>>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>>>>>> Cc: Michał Winiarski <michal.winiarski@intel.com>
> >>>>>>> Cc: Michel Thierry <michel.thierry@intel.com>
> >>>>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >>>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>>>>>> ---
> >>>>>>>      drivers/gpu/drm/i915/intel_lrc.c | 3 ++-
> >>>>>>>      1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>>>>>> index 3592288e4696..b47d53d284ca 100644
> >>>>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >>>>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>>>>>> @@ -713,7 +713,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> >>>>>>>                          kmem_cache_free(engine->i915->priorities, p);
> >>>>>>>          }
> >>>>>>>      done:
> >>>>>>> -     execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
> >>>>>>> +     execlists->queue_priority =
> >>>>>>> +             port != execlists->port ? rq_prio(last) : INT_MIN;
> >>>>>>
> >>>>>> Please bear with my questions since I am not 100% up to date with
> >>>>>> preemption.
> >>>>>>
> >>>>>> Should this be rq_prio("port[1]") for future proofing? Or you really
> >>>>>> mean last port? In which case it wouldn't be the highest pending
> >>>>>> priority as kerneldoc for execlists->queue_priority says.
> >>>>>
> >>>>> I mean "secondary" port, so yes using last executing port under the
> >>>>> assumption that we grow into a ring of many useless submission ports.
> >>>>> The kerneldoc is no more or no less accurate. :)
> >>>>
> >>>> "That we _don't_ grow"?
> >>>
> >>> Hmm, no, when we get "last_port" it slots right into here. I just don't
> >>> have the future facing code to prevent Mika from having to think a
> >>> little. The intent is that when there is an ELSP slot available,
> >>> queue_priority is INT_MIN, when there are none, then rq_prio(last).
> >>>
> >>> My bad for remembering what I want the code to be without remembering
> >>> what the code says.
> >>>    
> >>>>>> Although I failed to understand what do we do in both cases if a new
> >>>>>> request arrives of higher prio than the one in ELSP[1]. Looks like
> >>>>>> nothing? Wait until GPU moves it to ELSP[0] and preempt then? Is this so
> >>>>>> because we can't safely or I misread something?
> >>>>>
> >>>>> This is covered by igt/gem_exec_schedule/preempt-queue*. If we receive a
> >>>>> request higher than ELSP[1], we start a preemption as
> >>>>>
> >>>>>         if (need_preempt(engine, last, execlists->queue_priority)) {
> >>>>>
> >>>>> will evaluate to true. It's either the lowest executing priority (new
> >>>>> code), or lowest pending priority (old code). In either case, if the new
> >>>>> request is more important than the queue_priority, we preempt.
> >>>>
> >>>> How when "last" is request from ELSP[0]? And also
> >>>> execlists->queue_priority has not yet been updated to reflect the new
> >>>> priority?
> >>>
> >>> When we start executing last on ELSP[0] there will have been another
> >>> execlists_dequeue() where we see an empty slot (or fill it) and update
> >>> queue_priority. If we are down to the last request, it will be set to
> >>> INT_MIN. Upon its completion, it will remain INT_MIN.
> >>
> >> I don't see it yet, let me walk through it:
> >>
> >> 0. Initial situation, GPU busy with two requests, no outstanding ones:
> >>
> >> ELSP[0] = prio 2
> >> ELSP[1] = prio 0
> >>
> >> 1. queue_priority = 0
> >> 2. New execbuf comes along with prio=1.
> >> 3. We choose to schedule the tasklet - good.
> >> 4. execlists_dequeue runs
> >>
> >> last = prio 2
> >>
> >> if (need_preempt(engine, last, execlists->queue_priority))
> >>
> >> queue_priority = 0, so will not preempt last which is prio 2 - so no
> >> preemption - good.
> >>
> >> queue_priority remains at zero since we "goto unlock" with both ports
> >> busy and no preemption is triggered.
> >>
> >> 5. ELSP[0] completes, new ELSP[0] with prio 0.
> >>
> >> (Before we missed the opportunity to replace ELSP[1] with higher prio 1
> >> waiting request before ELSP[0] completed - perhaps we have no choice?
> >> But ok.. carrying on..)
> > 
> > We don't want to interrupt the higher priority task in ELSP[0] to sort
> > out ELSP[1].
> 
> I'll assume that means no safe way to just replace ELSP[1] without 
> preempting ELSP[0].

It doesn't easily fall out of the current tracking as we are not expecting
a lite-restore on ELSP[1], and would need to shadow the existing pair of
ELSP as well as the new pair, then figure out just which one we
preempted. It didn't look pretty, but I did try that approach early on
while trying to avoid the preempt-to-idle. (When the dust finally
settles, we should give that another go but it's probably already moot
if gen11 has zero extra preempt latency...)
-Chris

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3592288e4696..b47d53d284ca 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -713,7 +713,8 @@  static void execlists_dequeue(struct intel_engine_cs *engine)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 done:
-	execlists->queue_priority = rb ? to_priolist(rb)->priority : INT_MIN;
+	execlists->queue_priority =
+		port != execlists->port ? rq_prio(last) : INT_MIN;
 	execlists->first = rb;
 	if (submit)
 		port_assign(port, last);