
[12/37] drm/i915: Priority boost for new clients

Message ID 20180629075348.27358-12-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson June 29, 2018, 7:53 a.m. UTC
Taken from an idea used for FQ_CODEL, we give new request flows a
priority boost. These flows are likely to correspond with interactive
tasks and so be more latency sensitive than the long queues. As soon as
the client has more than one request in the queue, further requests are
not boosted and it settles down into ordinary steady state behaviour.
Such small kicks dramatically help combat the starvation issue, by
allowing each client the opportunity to run even when the system is
under heavy throughput load (within the constraints of the user
selected priority).

Testcase: igt/benchmarks/rrul
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_request.c   | 16 ++++++++++++++--
 drivers/gpu/drm/i915/i915_scheduler.h |  4 +++-
 2 files changed, 17 insertions(+), 3 deletions(-)

Comments

Tvrtko Ursulin June 29, 2018, 10:04 a.m. UTC | #1
On 29/06/2018 08:53, Chris Wilson wrote:
> Taken from an idea used for FQ_CODEL, we give new request flows a
> priority boost. These flows are likely to correspond with interactive
> tasks and so be more latency sensitive than the long queues. As soon as
> the client has more than one request in the queue, further requests are
> not boosted and it settles down into ordinary steady state behaviour.
> Such small kicks dramatically help combat the starvation issue, by
> allowing each client the opportunity to run even when the system is
> under heavy throughput load (within the constraints of the user
> selected priority).

Any effect on non-micro benchmarks to mention in the commit message as 
the selling point?

> Testcase: igt/benchmarks/rrul
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_request.c   | 16 ++++++++++++++--
>   drivers/gpu/drm/i915/i915_scheduler.h |  4 +++-
>   2 files changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 14bf0be6f994..2d7a785dd88c 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1052,8 +1052,20 @@ void i915_request_add(struct i915_request *request)
>   	 */
>   	local_bh_disable();
>   	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
> -	if (engine->schedule)
> -		engine->schedule(request, &request->gem_context->sched);
> +	if (engine->schedule) {
> +		struct i915_sched_attr attr = request->gem_context->sched;
> +
> +		/*
> +		 * Boost priorities to new clients (new request flows).
> +		 *
> +		 * Allow interactive/synchronous clients to jump ahead of
> +		 * the bulk clients. (FQ_CODEL)
> +		 */
> +		if (!prev || i915_request_completed(prev))
> +			attr.priority |= I915_PRIORITY_NEWCLIENT;

Now a "lucky" client can always get higher priority and keep preempting
everyone else by just timing its submissions right. So I have big
reservations about this one.

Regards,

Tvrtko

> +
> +		engine->schedule(request, &attr);
> +	}
>   	rcu_read_unlock();
>   	i915_sw_fence_commit(&request->submit);
>   	local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 7edfad0abfd7..e9fb6c1d5e42 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -19,12 +19,14 @@ enum {
>   	I915_PRIORITY_INVALID = INT_MIN
>   };
>   
> -#define I915_USER_PRIORITY_SHIFT 0
> +#define I915_USER_PRIORITY_SHIFT 1
>   #define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
>   
>   #define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
>   #define I915_PRIORITY_MASK (-I915_PRIORITY_COUNT)
>   
> +#define I915_PRIORITY_NEWCLIENT BIT(0)
> +
>   struct i915_sched_attr {
>   	/**
>   	 * @priority: execution and service priority
>
Chris Wilson June 29, 2018, 10:09 a.m. UTC | #2
Quoting Tvrtko Ursulin (2018-06-29 11:04:36)
> 
> On 29/06/2018 08:53, Chris Wilson wrote:
> > Taken from an idea used for FQ_CODEL, we give new request flows a
> > priority boost. These flows are likely to correspond with interactive
> > tasks and so be more latency sensitive than the long queues. As soon as
> > the client has more than one request in the queue, further requests are
> > not boosted and it settles down into ordinary steady state behaviour.
> > Such small kicks dramatically help combat the starvation issue, by
> > allowing each client the opportunity to run even when the system is
> > under heavy throughput load (within the constraints of the user
> > selected priority).
> 
> Any effect on non-micro benchmarks to mention in the commit message as 
> the selling point?

Desktop interactivity, subjective. wsim showed a major impact.
 
> > Testcase: igt/benchmarks/rrul
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_request.c   | 16 ++++++++++++++--
> >   drivers/gpu/drm/i915/i915_scheduler.h |  4 +++-
> >   2 files changed, 17 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 14bf0be6f994..2d7a785dd88c 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1052,8 +1052,20 @@ void i915_request_add(struct i915_request *request)
> >        */
> >       local_bh_disable();
> >       rcu_read_lock(); /* RCU serialisation for set-wedged protection */
> > -     if (engine->schedule)
> > -             engine->schedule(request, &request->gem_context->sched);
> > +     if (engine->schedule) {
> > +             struct i915_sched_attr attr = request->gem_context->sched;
> > +
> > +             /*
> > +              * Boost priorities to new clients (new request flows).
> > +              *
> > +              * Allow interactive/synchronous clients to jump ahead of
> > +              * the bulk clients. (FQ_CODEL)
> > +              */
> > +             if (!prev || i915_request_completed(prev))
> > +                     attr.priority |= I915_PRIORITY_NEWCLIENT;
> 
> Now a "lucky" client can always get higher priority and keep preempting
> everyone else by just timing its submissions right. So I have big
> reservations about this one.

Lucky being someone who is _idle_, everyone else being steady state. You
can't keep getting lucky and stealing the show.
-Chris
Tvrtko Ursulin June 29, 2018, 10:36 a.m. UTC | #3
On 29/06/2018 11:09, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-06-29 11:04:36)
>>
>> On 29/06/2018 08:53, Chris Wilson wrote:
>>> Taken from an idea used for FQ_CODEL, we give new request flows a
>>> priority boost. These flows are likely to correspond with interactive
>>> tasks and so be more latency sensitive than the long queues. As soon as
>>> the client has more than one request in the queue, further requests are
>>> not boosted and it settles down into ordinary steady state behaviour.
>>> Such small kicks dramatically help combat the starvation issue, by
>>> allowing each client the opportunity to run even when the system is
>>> under heavy throughput load (within the constraints of the user
>>> selected priority).
>>
>> Any effect on non-micro benchmarks to mention in the commit message as
>> the selling point?
> 
> Desktop interactivity, subjective.
> wsim showed a major impact
>   
>>> Testcase: igt/benchmarks/rrul
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_request.c   | 16 ++++++++++++++--
>>>    drivers/gpu/drm/i915/i915_scheduler.h |  4 +++-
>>>    2 files changed, 17 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 14bf0be6f994..2d7a785dd88c 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -1052,8 +1052,20 @@ void i915_request_add(struct i915_request *request)
>>>         */
>>>        local_bh_disable();
>>>        rcu_read_lock(); /* RCU serialisation for set-wedged protection */
>>> -     if (engine->schedule)
>>> -             engine->schedule(request, &request->gem_context->sched);
>>> +     if (engine->schedule) {
>>> +             struct i915_sched_attr attr = request->gem_context->sched;
>>> +
>>> +             /*
>>> +              * Boost priorities to new clients (new request flows).
>>> +              *
>>> +              * Allow interactive/synchronous clients to jump ahead of
>>> +              * the bulk clients. (FQ_CODEL)
>>> +              */
>>> +             if (!prev || i915_request_completed(prev))
>>> +                     attr.priority |= I915_PRIORITY_NEWCLIENT;
>>
>> Now a "lucky" client can always get higher priority and keep preempting
>> everyone else by just timing its submissions right. So I have big
>> reservations about this one.
> 
> Lucky being someone who is _idle_, everyone else being steady state. You
> can't keep getting lucky and stealing the show.

Why couldn't it? All that is needed is to send a new execbuf after the
previous one has completed.

1. First ctx A eb -> priority boost
2. Other people get back in and start executing
3. Another ctx A eb -> first has completed -> another priority boost -> 
work from 2) is preempted
4. Goto 2.

So work from 2) gets preempted for as long as this client A keeps 
submitting steadily but not too fast.

We would also need some sort of priority bump for the preempted
requests for this to be safer.

Regards,

Tvrtko
Chris Wilson June 29, 2018, 10:41 a.m. UTC | #4
Quoting Tvrtko Ursulin (2018-06-29 11:36:50)
> 
> On 29/06/2018 11:09, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-06-29 11:04:36)
> >>
> >> On 29/06/2018 08:53, Chris Wilson wrote:
> >>> Taken from an idea used for FQ_CODEL, we give new request flows a
> >>> priority boost. These flows are likely to correspond with interactive
> >>> tasks and so be more latency sensitive than the long queues. As soon as
> >>> the client has more than one request in the queue, further requests are
> >>> not boosted and it settles down into ordinary steady state behaviour.
> >>> Such small kicks dramatically help combat the starvation issue, by
> >>> allowing each client the opportunity to run even when the system is
> >>> under heavy throughput load (within the constraints of the user
> >>> selected priority).
> >>
> >> Any effect on non-micro benchmarks to mention in the commit message as
> >> the selling point?
> > 
> > Desktop interactivity, subjective.
> > wsim showed a major impact
> >   
> >>> Testcase: igt/benchmarks/rrul
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/i915_request.c   | 16 ++++++++++++++--
> >>>    drivers/gpu/drm/i915/i915_scheduler.h |  4 +++-
> >>>    2 files changed, 17 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> >>> index 14bf0be6f994..2d7a785dd88c 100644
> >>> --- a/drivers/gpu/drm/i915/i915_request.c
> >>> +++ b/drivers/gpu/drm/i915/i915_request.c
> >>> @@ -1052,8 +1052,20 @@ void i915_request_add(struct i915_request *request)
> >>>         */
> >>>        local_bh_disable();
> >>>        rcu_read_lock(); /* RCU serialisation for set-wedged protection */
> >>> -     if (engine->schedule)
> >>> -             engine->schedule(request, &request->gem_context->sched);
> >>> +     if (engine->schedule) {
> >>> +             struct i915_sched_attr attr = request->gem_context->sched;
> >>> +
> >>> +             /*
> >>> +              * Boost priorities to new clients (new request flows).
> >>> +              *
> >>> +              * Allow interactive/synchronous clients to jump ahead of
> >>> +              * the bulk clients. (FQ_CODEL)
> >>> +              */
> >>> +             if (!prev || i915_request_completed(prev))
> >>> +                     attr.priority |= I915_PRIORITY_NEWCLIENT;
> >>
> >> Now a "lucky" client can always get higher priority and keep preempting
> >> everyone else by just timing its submissions right. So I have big
> >> reservations about this one.
> > 
> > Lucky being someone who is _idle_, everyone else being steady state. You
> > can't keep getting lucky and stealing the show.
> 
> > Why couldn't it? All that is needed is to send a new execbuf after the
> > previous one has completed.

That being the point.
-Chris
Chris Wilson June 29, 2018, 10:51 a.m. UTC | #5
Quoting Tvrtko Ursulin (2018-06-29 11:36:50)
> 
> On 29/06/2018 11:09, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-06-29 11:04:36)
> >>
> >> On 29/06/2018 08:53, Chris Wilson wrote:
> >>> Taken from an idea used for FQ_CODEL, we give new request flows a
> >>> priority boost. These flows are likely to correspond with interactive
> >>> tasks and so be more latency sensitive than the long queues. As soon as
> >>> the client has more than one request in the queue, further requests are
> >>> not boosted and it settles down into ordinary steady state behaviour.
> >>> Such small kicks dramatically help combat the starvation issue, by
> >>> allowing each client the opportunity to run even when the system is
> >>> under heavy throughput load (within the constraints of the user
> >>> selected priority).
> >>
> >> Any effect on non-micro benchmarks to mention in the commit message as
> >> the selling point?
> > 
> > Desktop interactivity, subjective.
> > wsim showed a major impact
> >   
> >>> Testcase: igt/benchmarks/rrul
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/i915_request.c   | 16 ++++++++++++++--
> >>>    drivers/gpu/drm/i915/i915_scheduler.h |  4 +++-
> >>>    2 files changed, 17 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> >>> index 14bf0be6f994..2d7a785dd88c 100644
> >>> --- a/drivers/gpu/drm/i915/i915_request.c
> >>> +++ b/drivers/gpu/drm/i915/i915_request.c
> >>> @@ -1052,8 +1052,20 @@ void i915_request_add(struct i915_request *request)
> >>>         */
> >>>        local_bh_disable();
> >>>        rcu_read_lock(); /* RCU serialisation for set-wedged protection */
> >>> -     if (engine->schedule)
> >>> -             engine->schedule(request, &request->gem_context->sched);
> >>> +     if (engine->schedule) {
> >>> +             struct i915_sched_attr attr = request->gem_context->sched;
> >>> +
> >>> +             /*
> >>> +              * Boost priorities to new clients (new request flows).
> >>> +              *
> >>> +              * Allow interactive/synchronous clients to jump ahead of
> >>> +              * the bulk clients. (FQ_CODEL)
> >>> +              */
> >>> +             if (!prev || i915_request_completed(prev))
> >>> +                     attr.priority |= I915_PRIORITY_NEWCLIENT;
> >>
> >> Now a "lucky" client can always get higher priority and keep preempting
> >> everyone else by just timing its submissions right. So I have big
> >> reservations about this one.
> > 
> > Lucky being someone who is _idle_, everyone else being steady state. You
> > can't keep getting lucky and stealing the show.
> 
> > Why couldn't it? All that is needed is to send a new execbuf after the
> > previous one has completed.
> 
> 1. First ctx A eb -> priority boost
> 2. Other people get back in and start executing
> 3. Another ctx A eb -> first has completed -> another priority boost -> 
> work from 2) is preempted
> 4. Goto 2.

So you have one client spinning, it's going to win most races and starve
the system, simply by owning struct_mutex. We give the other starving
steady-state clients an opportunity to submit ahead of the spinner when
they come to resubmit.
-Chris
Tvrtko Ursulin June 29, 2018, 11:10 a.m. UTC | #6
On 29/06/2018 11:51, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-06-29 11:36:50)
>>
>> On 29/06/2018 11:09, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-06-29 11:04:36)
>>>>
>>>> On 29/06/2018 08:53, Chris Wilson wrote:
>>>>> Taken from an idea used for FQ_CODEL, we give new request flows a
>>>>> priority boost. These flows are likely to correspond with interactive
>>>>> tasks and so be more latency sensitive than the long queues. As soon as
>>>>> the client has more than one request in the queue, further requests are
>>>>> not boosted and it settles down into ordinary steady state behaviour.
>>>>> Such small kicks dramatically help combat the starvation issue, by
>>>>> allowing each client the opportunity to run even when the system is
>>>>> under heavy throughput load (within the constraints of the user
>>>>> selected priority).
>>>>
>>>> Any effect on non-micro benchmarks to mention in the commit message as
>>>> the selling point?
>>>
>>> Desktop interactivity, subjective.
>>> wsim showed a major impact
>>>    
>>>>> Testcase: igt/benchmarks/rrul
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/i915_request.c   | 16 ++++++++++++++--
>>>>>     drivers/gpu/drm/i915/i915_scheduler.h |  4 +++-
>>>>>     2 files changed, 17 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>>>> index 14bf0be6f994..2d7a785dd88c 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>>>> @@ -1052,8 +1052,20 @@ void i915_request_add(struct i915_request *request)
>>>>>          */
>>>>>         local_bh_disable();
>>>>>         rcu_read_lock(); /* RCU serialisation for set-wedged protection */
>>>>> -     if (engine->schedule)
>>>>> -             engine->schedule(request, &request->gem_context->sched);
>>>>> +     if (engine->schedule) {
>>>>> +             struct i915_sched_attr attr = request->gem_context->sched;
>>>>> +
>>>>> +             /*
>>>>> +              * Boost priorities to new clients (new request flows).
>>>>> +              *
>>>>> +              * Allow interactive/synchronous clients to jump ahead of
>>>>> +              * the bulk clients. (FQ_CODEL)
>>>>> +              */
>>>>> +             if (!prev || i915_request_completed(prev))
>>>>> +                     attr.priority |= I915_PRIORITY_NEWCLIENT;
>>>>
>>>> Now a "lucky" client can always get higher priority and keep preempting
>>>> everyone else by just timing its submissions right. So I have big
>>>> reservations about this one.
>>>
>>> Lucky being someone who is _idle_, everyone else being steady state. You
>>> can't keep getting lucky and stealing the show.
>>
>> Why couldn't it? All that is needed is to send a new execbuf after the
>> previous one has completed.
>>
>> 1. First ctx A eb -> priority boost
>> 2. Other people get back in and start executing
>> 3. Another ctx A eb -> first has completed -> another priority boost ->
>> work from 2) is preempted
>> 4. Goto 2.
> 
> So you have one client spinning, it's going to win most races and starve
> the system, simply by owning struct_mutex. We give the other starving
> steady-state clients an opportunity to submit ahead of the spinner when
> they come to resubmit.

What do you mean by spinning and owning struct_mutex?

I was thinking for example the client A sending a 1ms batch and client B 
executing a 10ms batch.

If client A keeps submitting its batch at 1ms intervals when will client 
B complete?

Regards,

Tvrtko
Tvrtko Ursulin July 2, 2018, 10:19 a.m. UTC | #7
On 29/06/2018 12:10, Tvrtko Ursulin wrote:
> 
> On 29/06/2018 11:51, Chris Wilson wrote:
>> Quoting Tvrtko Ursulin (2018-06-29 11:36:50)
>>>
>>> On 29/06/2018 11:09, Chris Wilson wrote:
>>>> Quoting Tvrtko Ursulin (2018-06-29 11:04:36)
>>>>>
>>>>> On 29/06/2018 08:53, Chris Wilson wrote:
>>>>>> Taken from an idea used for FQ_CODEL, we give new request flows a
>>>>>> priority boost. These flows are likely to correspond with interactive
>>>>>> tasks and so be more latency sensitive than the long queues. As 
>>>>>> soon as
>>>>>> the client has more than one request in the queue, further 
>>>>>> requests are
>>>>>> not boosted and it settles down into ordinary steady state behaviour.
>>>>>> Such small kicks dramatically help combat the starvation issue, by
>>>>>> allowing each client the opportunity to run even when the system is
>>>>>> under heavy throughput load (within the constraints of the user
>>>>>> selected priority).
>>>>>
>>>>> Any effect on non-micro benchmarks to mention in the commit message as
>>>>> the selling point?
>>>>
>>>> Desktop interactivity, subjective.
>>>> wsim showed a major impact
>>>>>> Testcase: igt/benchmarks/rrul
>>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/i915/i915_request.c   | 16 ++++++++++++++--
>>>>>>     drivers/gpu/drm/i915/i915_scheduler.h |  4 +++-
>>>>>>     2 files changed, 17 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_request.c 
>>>>>> b/drivers/gpu/drm/i915/i915_request.c
>>>>>> index 14bf0be6f994..2d7a785dd88c 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>>>>> @@ -1052,8 +1052,20 @@ void i915_request_add(struct i915_request 
>>>>>> *request)
>>>>>>          */
>>>>>>         local_bh_disable();
>>>>>>         rcu_read_lock(); /* RCU serialisation for set-wedged 
>>>>>> protection */
>>>>>> -     if (engine->schedule)
>>>>>> -             engine->schedule(request, 
>>>>>> &request->gem_context->sched);
>>>>>> +     if (engine->schedule) {
>>>>>> +             struct i915_sched_attr attr = 
>>>>>> request->gem_context->sched;
>>>>>> +
>>>>>> +             /*
>>>>>> +              * Boost priorities to new clients (new request flows).
>>>>>> +              *
>>>>>> +              * Allow interactive/synchronous clients to jump 
>>>>>> ahead of
>>>>>> +              * the bulk clients. (FQ_CODEL)
>>>>>> +              */
>>>>>> +             if (!prev || i915_request_completed(prev))
>>>>>> +                     attr.priority |= I915_PRIORITY_NEWCLIENT;
>>>>>
>>>>> Now a "lucky" client can always get higher priority and keep preempting
>>>>> everyone else by just timing its submissions right. So I have big
>>>>> reservations about this one.
>>>>
>>>> Lucky being someone who is _idle_, everyone else being steady state. 
>>>> You
>>>> can't keep getting lucky and stealing the show.
>>>
>>> Why couldn't it? All that is needed is to send a new execbuf after the
>>> previous one has completed.
>>>
>>> 1. First ctx A eb -> priority boost
>>> 2. Other people get back in and start executing
>>> 3. Another ctx A eb -> first has completed -> another priority boost ->
>>> work from 2) is preempted
>>> 4. Goto 2.
>>
>> So you have one client spinning, it's going to win most races and starve
>> the system, simply by owning struct_mutex. We give the other starving
>> steady-state clients an opportunity to submit ahead of the spinner when
>> they come to resubmit.
> 
> What do you mean by spinning and owning struct_mutex?
> 
> I was thinking for example the client A sending a 1ms batch and client B 
> executing a 10ms batch.
> 
> If client A keeps submitting its batch at 1ms intervals when will client 
> B complete?

I was thinking over the weekend that a scheme where we check that at 
least something from a different timeline has completed on the engine 
might be acceptable. Effectively that may be just checking that a 
global seqno greater than that of the previous request on this 
timeline has been completed.

But then there is a problem with the virtual engine, which would make 
the test too strict - it could fail to apply the new client boost 
depending on which physical engine the request got assigned to. Unless 
we take the position that too strict is better than too lax.

A secondary problem is the dependency chains. Just because a 
client/request is classified as new on an engine doesn't mean it is OK 
to boost all of its dependencies.

But applying "something executed since its last submission = new 
client" across the whole dependency chain feels perhaps too 
complicated. It would require a pass over the dep list, checking this 
criterion on all involved engines, before the new client boost could 
be applied.

Thoughts?

Regards,

Tvrtko

Patch

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 14bf0be6f994..2d7a785dd88c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1052,8 +1052,20 @@  void i915_request_add(struct i915_request *request)
 	 */
 	local_bh_disable();
 	rcu_read_lock(); /* RCU serialisation for set-wedged protection */
-	if (engine->schedule)
-		engine->schedule(request, &request->gem_context->sched);
+	if (engine->schedule) {
+		struct i915_sched_attr attr = request->gem_context->sched;
+
+		/*
+		 * Boost priorities to new clients (new request flows).
+		 *
+		 * Allow interactive/synchronous clients to jump ahead of
+		 * the bulk clients. (FQ_CODEL)
+		 */
+		if (!prev || i915_request_completed(prev))
+			attr.priority |= I915_PRIORITY_NEWCLIENT;
+
+		engine->schedule(request, &attr);
+	}
 	rcu_read_unlock();
 	i915_sw_fence_commit(&request->submit);
 	local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 7edfad0abfd7..e9fb6c1d5e42 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -19,12 +19,14 @@  enum {
 	I915_PRIORITY_INVALID = INT_MIN
 };
 
-#define I915_USER_PRIORITY_SHIFT 0
+#define I915_USER_PRIORITY_SHIFT 1
 #define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
 
 #define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
 #define I915_PRIORITY_MASK (-I915_PRIORITY_COUNT)
 
+#define I915_PRIORITY_NEWCLIENT BIT(0)
+
 struct i915_sched_attr {
 	/**
 	 * @priority: execution and service priority