mbox series

[RFC,00/14] Deadline scheduler and other ideas

Message ID 20241230165259.95855-1-tursulin@igalia.com (mailing list archive)
Headers show
Series Deadline scheduler and other ideas | expand

Message

Tvrtko Ursulin Dec. 30, 2024, 4:52 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

<tldr>
Replacing FIFO with a flavour of deadline driven scheduling and removing round-
robin. Connecting the scheduler with dma-fence deadlines. First draft and
testing by different drivers and feedback would be nice. I was only able to test
it with amdgpu. Other drivers may not even compile.
</tldr>

If I remember correctly Christian mentioned recently (give or take) that maybe
round-robin could be removed. That got me thinking how and what could be
improved and simplified. So I played a bit in the scheduler code and came up
with something which appears to not crash at least. Whether or not there are
significant advantages apart from maybe code consolidation and reduction is the
main thing to be determined.

One big question is whether round-robin can really be removed. Does anyone use
it, rely on it, or what are even use cases where it is much better than FIFO.

See "drm/sched: Add deadline policy" commit message for a short description on
what flavour of deadline scheduling it is. But in essence it should a more fair
FIFO where higher priority can not forever starve lower priorities.

"drm/sched: Connect with dma-fence deadlines" wires up dma-fence deadlines to
the scheduler because it is easy and makes logical sense with this. And I
noticed userspace already uses it so why not wire it up fully.

Otherwise the series is a bit of progression from consolidating RR into FIFO
code paths and going from there to deadline and then to a change in how
dependencies are handled. And code simplification to 1:1 run queue to scheduler
relationship, because deadline does not need per priority run queues.

There is quite a bit of code to go throught here so I think it could be even
better if other drivers could give it a spin as is and see if some improvements
can be detected. Or at least no regressions.

Cc: Christian König <christian.koenig@amd.com>
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Philipp Stanner <pstanner@redhat.com>

Tvrtko Ursulin (14):
  drm/sched: Delete unused update_job_credits
  drm/sched: Remove idle entity from tree
  drm/sched: Implement RR via FIFO
  drm/sched: Consolidate entity run queue management
  drm/sched: Move run queue related code into a separate file
  drm/sched: Ignore own fence earlier
  drm/sched: Resolve same scheduler dependencies earlier
  drm/sched: Add deadline policy
  drm/sched: Remove FIFO and RR and simplify to a single run queue
  drm/sched: Queue all free credits in one worker invocation
  drm/sched: Connect with dma-fence deadlines
  drm/sched: Embed run queue singleton into the scheduler
  dma-fence: Add helper for custom fence context when merging fences
  drm/sched: Resolve all job dependencies in one go

 drivers/dma-buf/dma-fence-unwrap.c          |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  27 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h     |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h   |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |   8 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c     |   8 +-
 drivers/gpu/drm/scheduler/Makefile          |   2 +-
 drivers/gpu/drm/scheduler/sched_entity.c    | 316 ++++++-----
 drivers/gpu/drm/scheduler/sched_fence.c     |   5 +-
 drivers/gpu/drm/scheduler/sched_main.c      | 587 +++++---------------
 drivers/gpu/drm/scheduler/sched_rq.c        | 199 +++++++
 include/drm/gpu_scheduler.h                 |  74 ++-
 include/linux/dma-fence-unwrap.h            |  31 +-
 14 files changed, 606 insertions(+), 678 deletions(-)
 create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c

Comments

Philipp Stanner Jan. 2, 2025, 1:09 p.m. UTC | #1
On Mon, 2024-12-30 at 16:52 +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> 
> <tldr>
> Replacing FIFO with a flavour of deadline driven scheduling and
> removing round-
> robin. Connecting the scheduler with dma-fence deadlines. First draft
> and
> testing by different drivers and feedback would be nice. I was only
> able to test
> it with amdgpu. Other drivers may not even compile.

(on my machine I could build them all)

> </tldr>
> 
> If I remember correctly Christian mentioned recently (give or take)
> that maybe
> round-robin could be removed. That got me thinking how and what could
> be
> improved and simplified. So I played a bit in the scheduler code and
> came up
> with something which appears to not crash at least. Whether or not
> there are
> significant advantages apart from maybe code consolidation and
> reduction is the
> main thing to be determined.

Hi,

so first of all: Happy new year and thx for putting in all that effort
:)

I gave the series a first look; Since this is an RFC, let's abstain
from doing deeper reviews of the exact code for now.


> 
> One big question is whether round-robin can really be removed. Does
> anyone use
> it, rely on it, or what are even use cases where it is much better
> than FIFO.

So AFAICS Round Robin is not used anymore by anyone. And my
understanding indeed is, too, that there is not really any use-case
where one would like anything except for FIFO.

Looking at 977d97f18b5b ("drm/scheduler: Set the FIFO scheduling policy
as the default"), it seems to me that RR just was easy to implement and
it had the disadvantage of systems under high load cause the oldest job
to be starved to death, which is why FIFO was introduced.

So my guess would be that RR just is a relict.

If we agree on that, then we could remove RR in any case, and the
subsequent question would be whether FIFO should be replaced with
deadline (or: if there should be FIFO *and* deadline?), wouldn't it?

> 
> See "drm/sched: Add deadline policy" commit message for a short
> description on
> what flavour of deadline scheduling it is. But in essence it should a
> more fair
> FIFO where higher priority can not forever starve lower priorities.

See my answer on that patch.

As you can imagine I'm wondering if that "better FIFO" would be worth
it considering that we are running into a certain risk of regressing
stuff through this rework.


> 
> "drm/sched: Connect with dma-fence deadlines" wires up dma-fence
> deadlines to
> the scheduler because it is easy and makes logical sense with this.
> And I
> noticed userspace already uses it so why not wire it up fully.

Userspace uses the dma-fence deadlines you mean? Do you have a pointer
for us?

> 
> Otherwise the series is a bit of progression from consolidating RR
> into FIFO
> code paths and going from there to deadline and then to a change in
> how
> dependencies are handled. And code simplification to 1:1 run queue to
> scheduler
> relationship, because deadline does not need per priority run queues.
> 
> There is quite a bit of code to go throught here so I think it could
> be even
> better if other drivers could give it a spin as is and see if some
> improvements
> can be detected. Or at least no regressions.

I hope I can dive deeper into the Nouveau side soon.

> 
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Danilo Krummrich <dakr@redhat.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Philipp Stanner <pstanner@redhat.com>
> 
> Tvrtko Ursulin (14):
>   drm/sched: Delete unused update_job_credits
>   drm/sched: Remove idle entity from tree
>   drm/sched: Implement RR via FIFO
>   drm/sched: Consolidate entity run queue management
>   drm/sched: Move run queue related code into a separate file
>   drm/sched: Ignore own fence earlier
>   drm/sched: Resolve same scheduler dependencies earlier
>   drm/sched: Add deadline policy
>   drm/sched: Remove FIFO and RR and simplify to a single run queue
>   drm/sched: Queue all free credits in one worker invocation
>   drm/sched: Connect with dma-fence deadlines
>   drm/sched: Embed run queue singleton into the scheduler
>   dma-fence: Add helper for custom fence context when merging fences
>   drm/sched: Resolve all job dependencies in one go

It seems to me that this series is a "port RR and FIFO to deadline"-
series with some additional patches that could be branched out and be
reviewed through a separate series?

Patch 1 ("Delete unused...") for example. Other candidates are Patch 5
("Move run queue related..."), 13 ("Add helper for...").

A few patches might be mergeable even if the main idea with deadline
doesn't get approved, that's why I'm suggesting that.

Philipp


> 
>  drivers/dma-buf/dma-fence-unwrap.c          |   8 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |   6 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  27 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h     |   5 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h   |   8 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |   8 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c     |   8 +-
>  drivers/gpu/drm/scheduler/Makefile          |   2 +-
>  drivers/gpu/drm/scheduler/sched_entity.c    | 316 ++++++-----
>  drivers/gpu/drm/scheduler/sched_fence.c     |   5 +-
>  drivers/gpu/drm/scheduler/sched_main.c      | 587 +++++-------------
> --
>  drivers/gpu/drm/scheduler/sched_rq.c        | 199 +++++++
>  include/drm/gpu_scheduler.h                 |  74 ++-
>  include/linux/dma-fence-unwrap.h            |  31 +-
>  14 files changed, 606 insertions(+), 678 deletions(-)
>  create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c
>
Tvrtko Ursulin Jan. 3, 2025, 12:02 p.m. UTC | #2
On 02/01/2025 13:09, Philipp Stanner wrote:
> On Mon, 2024-12-30 at 16:52 +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>
>> <tldr>
>> Replacing FIFO with a flavour of deadline driven scheduling and
>> removing round-
>> robin. Connecting the scheduler with dma-fence deadlines. First draft
>> and
>> testing by different drivers and feedback would be nice. I was only
>> able to test
>> it with amdgpu. Other drivers may not even compile.
> 
> (on my machine I could build them all)

Kernel test robot reported that pvr does not build. I have fixed that 
locally. (Matter of adding a proper interface for peeking into 
job->dependencies.)

> 
>> </tldr>
>>
>> If I remember correctly Christian mentioned recently (give or take)
>> that maybe
>> round-robin could be removed. That got me thinking how and what could
>> be
>> improved and simplified. So I played a bit in the scheduler code and
>> came up
>> with something which appears to not crash at least. Whether or not
>> there are
>> significant advantages apart from maybe code consolidation and
>> reduction is the
>> main thing to be determined.
> 
> Hi,
> 
> so first of all: Happy new year and thx for putting in all that effort
> :)
> 
> I gave the series a first look; Since this is an RFC, let's abstain
> from doing deeper reviews of the exact code for now.

Ditto and thank you for taking a look!

>> One big question is whether round-robin can really be removed. Does
>> anyone use
>> it, rely on it, or what are even use cases where it is much better
>> than FIFO.
> 
> So AFAICS Round Robin is not used anymore by anyone. And my
> understanding indeed is, too, that there is not really any use-case
> where one would like anything except for FIFO.
> 
> Looking at 977d97f18b5b ("drm/scheduler: Set the FIFO scheduling policy
> as the default"), it seems to me that RR just was easy to implement and
> it had the disadvantage of systems under high load cause the oldest job
> to be starved to death, which is why FIFO was introduced.
> 
> So my guess would be that RR just is a relict.
> 
> If we agree on that, then we could remove RR in any case, and the
> subsequent question would be whether FIFO should be replaced with
> deadline (or: if there should be FIFO *and* deadline?), wouldn't it?

I am unsure about RR but I agree what is the second part of the question.

There may be nuances with different drivers depending on how much they 
can queue to the hardware/firmware at once. Modern drivers which use 1:1 
sched:entity I don't expect care about DRM scheduler scheduling mode. 
The fewer jobs driver can queue to the backend the more it cares. 
Question is FIFO ever better. Keeping in mind that for same priority 
this deadline and FIFO are actually identical.

>> See "drm/sched: Add deadline policy" commit message for a short
>> description on
>> what flavour of deadline scheduling it is. But in essence it should a
>> more fair
>> FIFO where higher priority can not forever starve lower priorities.
> 
> See my answer on that patch.
> 
> As you can imagine I'm wondering if that "better FIFO" would be worth
> it considering that we are running into a certain risk of regressing
> stuff through this rework.

I will reply to that part there then.
>> "drm/sched: Connect with dma-fence deadlines" wires up dma-fence
>> deadlines to
>> the scheduler because it is easy and makes logical sense with this.
>> And I
>> noticed userspace already uses it so why not wire it up fully.
> 
> Userspace uses the dma-fence deadlines you mean? Do you have a pointer
> for us?

I've noticed it empirically and the one I could fine is this:

https://invent.kde.org/plasma/kwin/-/commit/4ad5670ddfcd7400c8b84c12cbf8bd97a0590f43

>> Otherwise the series is a bit of progression from consolidating RR
>> into FIFO
>> code paths and going from there to deadline and then to a change in
>> how
>> dependencies are handled. And code simplification to 1:1 run queue to
>> scheduler
>> relationship, because deadline does not need per priority run queues.
>>
>> There is quite a bit of code to go throught here so I think it could
>> be even
>> better if other drivers could give it a spin as is and see if some
>> improvements
>> can be detected. Or at least no regressions.
> 
> I hope I can dive deeper into the Nouveau side soon.

Fantastic!
>>
>> Cc: Christian König <christian.koenig@amd.com>
>> Cc: Danilo Krummrich <dakr@redhat.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Cc: Philipp Stanner <pstanner@redhat.com>
>>
>> Tvrtko Ursulin (14):
>>    drm/sched: Delete unused update_job_credits
>>    drm/sched: Remove idle entity from tree
>>    drm/sched: Implement RR via FIFO
>>    drm/sched: Consolidate entity run queue management
>>    drm/sched: Move run queue related code into a separate file
>>    drm/sched: Ignore own fence earlier
>>    drm/sched: Resolve same scheduler dependencies earlier
>>    drm/sched: Add deadline policy
>>    drm/sched: Remove FIFO and RR and simplify to a single run queue
>>    drm/sched: Queue all free credits in one worker invocation
>>    drm/sched: Connect with dma-fence deadlines
>>    drm/sched: Embed run queue singleton into the scheduler
>>    dma-fence: Add helper for custom fence context when merging fences
>>    drm/sched: Resolve all job dependencies in one go
> 
> It seems to me that this series is a "port RR and FIFO to deadline"-
> series with some additional patches that could be branched out and be
> reviewed through a separate series?
> 
> Patch 1 ("Delete unused...") for example. Other candidates are Patch 5
> ("Move run queue related..."), 13 ("Add helper for...").
> 
> A few patches might be mergeable even if the main idea with deadline
> doesn't get approved, that's why I'm suggesting that.

Yes some of those could be possible and I am happy to extract and rebase 
in principle. But not yet I think. If and when something gets a positive 
nod.

Regards,

Tvrtko

>>
>>   drivers/dma-buf/dma-fence-unwrap.c          |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |   6 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  27 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h     |   5 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h   |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |   8 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c     |   8 +-
>>   drivers/gpu/drm/scheduler/Makefile          |   2 +-
>>   drivers/gpu/drm/scheduler/sched_entity.c    | 316 ++++++-----
>>   drivers/gpu/drm/scheduler/sched_fence.c     |   5 +-
>>   drivers/gpu/drm/scheduler/sched_main.c      | 587 +++++-------------
>> --
>>   drivers/gpu/drm/scheduler/sched_rq.c        | 199 +++++++
>>   include/drm/gpu_scheduler.h                 |  74 ++-
>>   include/linux/dma-fence-unwrap.h            |  31 +-
>>   14 files changed, 606 insertions(+), 678 deletions(-)
>>   create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c
>>
>
Koenig, Christian Jan. 3, 2025, 12:31 p.m. UTC | #3
Am 03.01.25 um 13:02 schrieb Tvrtko Ursulin:
>>> </tldr>
>>>
>>> If I remember correctly Christian mentioned recently (give or take)
>>> that maybe
>>> round-robin could be removed. That got me thinking how and what could
>>> be
>>> improved and simplified. So I played a bit in the scheduler code and
>>> came up
>>> with something which appears to not crash at least. Whether or not
>>> there are
>>> significant advantages apart from maybe code consolidation and
>>> reduction is the
>>> main thing to be determined.
>>
>> Hi,
>>
>> so first of all: Happy new year and thx for putting in all that effort
>> :)
>>
>> I gave the series a first look; Since this is an RFC, let's abstain
>> from doing deeper reviews of the exact code for now.
>
> Ditto and thank you for taking a look!

Ah, yes. Happy new year guys :)

>
>>> One big question is whether round-robin can really be removed. Does
>>> anyone use
>>> it, rely on it, or what are even use cases where it is much better
>>> than FIFO.
>>
>> So AFAICS Round Robin is not used anymore by anyone. And my
>> understanding indeed is, too, that there is not really any use-case
>> where one would like anything except for FIFO.
>>
>> Looking at 977d97f18b5b ("drm/scheduler: Set the FIFO scheduling policy
>> as the default"), it seems to me that RR just was easy to implement and
>> it had the disadvantage of systems under high load cause the oldest job
>> to be starved to death, which is why FIFO was introduced.
>>
>> So my guess would be that RR just is a relict.
>>
>> If we agree on that, then we could remove RR in any case, and the
>> subsequent question would be whether FIFO should be replaced with
>> deadline (or: if there should be FIFO *and* deadline?), wouldn't it?
>
> I am unsure about RR but I agree what is the second part of the question.

Well we came up with FIFO because we found that RR performed quite badly 
when you have a huge number of submitting applications.

E.g. one of our cloud test cases ran 100 instances of a single game and 
the worst response time improved massively by switching from RR to FIFO.

Different priorities on the other hand were originally invented to make 
sure the kernel has precedence over userspace. But later we also exposed 
the priorities to userspace which results in the problem that higher 
priority queues can starve low priority ones.

That's the other reason why I said that RR should probably be removed 
and FIFO changed in a way that the priority is basically just a bonus to 
the score used for sorting the FIFO. I haven't taken a deeper look yet, 
but I think that this is more or less what this patch set here does.

What FIFO is still missing compared to RR is some sort of fairness 
between queues. E.g. a queues which hasn't submitted something in a 
while might get a bonus for their submissions compared to a queue which 
submits stuff all the time (or something like that).

Regards,
Christian.


> There may be nuances with different drivers depending on how much they 
> can queue to the hardware/firmware at once. Modern drivers which use 
> 1:1 sched:entity I don't expect care about DRM scheduler scheduling 
> mode. The fewer jobs driver can queue to the backend the more it 
> cares. Question is FIFO ever better. Keeping in mind that for same 
> priority this deadline and FIFO are actually identical.
>
>>> See "drm/sched: Add deadline policy" commit message for a short
>>> description on
>>> what flavour of deadline scheduling it is. But in essence it should a
>>> more fair
>>> FIFO where higher priority can not forever starve lower priorities.
>>
>> See my answer on that patch.
>>
>> As you can imagine I'm wondering if that "better FIFO" would be worth
>> it considering that we are running into a certain risk of regressing
>> stuff through this rework.
>
> I will reply to that part there then.
>>> "drm/sched: Connect with dma-fence deadlines" wires up dma-fence
>>> deadlines to
>>> the scheduler because it is easy and makes logical sense with this.
>>> And I
>>> noticed userspace already uses it so why not wire it up fully.
>>
>> Userspace uses the dma-fence deadlines you mean? Do you have a pointer
>> for us?
>
> I've noticed it empirically and the one I could fine is this:
>
> https://invent.kde.org/plasma/kwin/-/commit/4ad5670ddfcd7400c8b84c12cbf8bd97a0590f43 
>
>
>>> Otherwise the series is a bit of progression from consolidating RR
>>> into FIFO
>>> code paths and going from there to deadline and then to a change in
>>> how
>>> dependencies are handled. And code simplification to 1:1 run queue to
>>> scheduler
>>> relationship, because deadline does not need per priority run queues.
>>>
>>> There is quite a bit of code to go throught here so I think it could
>>> be even
>>> better if other drivers could give it a spin as is and see if some
>>> improvements
>>> can be detected. Or at least no regressions.
>>
>> I hope I can dive deeper into the Nouveau side soon.
>
> Fantastic!
>>>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Cc: Danilo Krummrich <dakr@redhat.com>
>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>> Cc: Philipp Stanner <pstanner@redhat.com>
>>>
>>> Tvrtko Ursulin (14):
>>>    drm/sched: Delete unused update_job_credits
>>>    drm/sched: Remove idle entity from tree
>>>    drm/sched: Implement RR via FIFO
>>>    drm/sched: Consolidate entity run queue management
>>>    drm/sched: Move run queue related code into a separate file
>>>    drm/sched: Ignore own fence earlier
>>>    drm/sched: Resolve same scheduler dependencies earlier
>>>    drm/sched: Add deadline policy
>>>    drm/sched: Remove FIFO and RR and simplify to a single run queue
>>>    drm/sched: Queue all free credits in one worker invocation
>>>    drm/sched: Connect with dma-fence deadlines
>>>    drm/sched: Embed run queue singleton into the scheduler
>>>    dma-fence: Add helper for custom fence context when merging fences
>>>    drm/sched: Resolve all job dependencies in one go
>>
>> It seems to me that this series is a "port RR and FIFO to deadline"-
>> series with some additional patches that could be branched out and be
>> reviewed through a separate series?
>>
>> Patch 1 ("Delete unused...") for example. Other candidates are Patch 5
>> ("Move run queue related..."), 13 ("Add helper for...").
>>
>> A few patches might be mergeable even if the main idea with deadline
>> doesn't get approved, that's why I'm suggesting that.
>
> Yes some of those could be possible and I am happy to extract and 
> rebase in principle. But not yet I think. If and when something gets a 
> positive nod.
>
> Regards,
>
> Tvrtko
>
>>>
>>>   drivers/dma-buf/dma-fence-unwrap.c          |   8 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |   6 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  27 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h     |   5 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h   |   8 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |   8 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c     |   8 +-
>>>   drivers/gpu/drm/scheduler/Makefile          |   2 +-
>>>   drivers/gpu/drm/scheduler/sched_entity.c    | 316 ++++++-----
>>>   drivers/gpu/drm/scheduler/sched_fence.c     |   5 +-
>>>   drivers/gpu/drm/scheduler/sched_main.c      | 587 +++++-------------
>>> -- 
>>>   drivers/gpu/drm/scheduler/sched_rq.c        | 199 +++++++
>>>   include/drm/gpu_scheduler.h                 |  74 ++-
>>>   include/linux/dma-fence-unwrap.h            |  31 +-
>>>   14 files changed, 606 insertions(+), 678 deletions(-)
>>>   create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c
>>>
>>
Philipp Stanner Jan. 3, 2025, 1:45 p.m. UTC | #4
On Fri, 2025-01-03 at 13:31 +0100, Christian König wrote:
> Am 03.01.25 um 13:02 schrieb Tvrtko Ursulin:
> > > > </tldr>
> > > > 
> > > > If I remember correctly Christian mentioned recently (give or
> > > > take)
> > > > that maybe
> > > > round-robin could be removed. That got me thinking how and what
> > > > could
> > > > be
> > > > improved and simplified. So I played a bit in the scheduler
> > > > code and
> > > > came up
> > > > with something which appears to not crash at least. Whether or
> > > > not
> > > > there are
> > > > significant advantages apart from maybe code consolidation and
> > > > reduction is the
> > > > main thing to be determined.
> > > 
> > > Hi,
> > > 
> > > so first of all: Happy new year and thx for putting in all that
> > > effort
> > > :)
> > > 
> > > I gave the series a first look; Since this is an RFC, let's
> > > abstain
> > > from doing deeper reviews of the exact code for now.
> > 
> > Ditto and thank you for taking a look!
> 
> Ah, yes. Happy new year guys :)
> 
> > 
> > > > One big question is whether round-robin can really be removed.
> > > > Does
> > > > anyone use
> > > > it, rely on it, or what are even use cases where it is much
> > > > better
> > > > than FIFO.
> > > 
> > > So AFAICS Round Robin is not used anymore by anyone. And my
> > > understanding indeed is, too, that there is not really any use-
> > > case
> > > where one would like anything except for FIFO.
> > > 
> > > Looking at 977d97f18b5b ("drm/scheduler: Set the FIFO scheduling
> > > policy
> > > as the default"), it seems to me that RR just was easy to
> > > implement and
> > > it had the disadvantage of systems under high load cause the
> > > oldest job
> > > to be starved to death, which is why FIFO was introduced.
> > > 
> > > So my guess would be that RR just is a relict.
> > > 
> > > If we agree on that, then we could remove RR in any case, and the
> > > subsequent question would be whether FIFO should be replaced with
> > > deadline (or: if there should be FIFO *and* deadline?), wouldn't
> > > it?
> > 
> > I am unsure about RR but I agree what is the second part of the
> > question.
> 
> Well we came up with FIFO because we found that RR performed quite
> badly 
> when you have a huge number of submitting applications.
> 
> E.g. one of our cloud test cases ran 100 instances of a single game
> and 
> the worst response time improved massively by switching from RR to
> FIFO.
> 
> Different priorities on the other hand were originally invented to
> make 
> sure the kernel has precedence over userspace.

Should it, in your opinion, be *guaranteed* that the kernel always
takes precedence, or is a "kernel takes precedence almost always"
enough?

Because this patchset does seem to do the latter

>  But later we also exposed 
> the priorities to userspace which results in the problem that higher 
> priority queues can starve low priority ones.
> 
> That's the other reason why I said that RR should probably be removed
> and FIFO changed in a way that the priority is basically just a bonus
> to 
> the score used for sorting the FIFO. I haven't taken a deeper look
> yet, 
> but I think that this is more or less what this patch set here does.
> 
> What FIFO is still missing compared to RR is some sort of fairness 
> between queues.

Before I forget it I note it right away:

If we go for such a thing we should set our terminology straight. A
FIFO that suddenly takes deadlines into account is not a FIFO anymore.
So we shouldn't call it that.

However, a deadline scheduler with priority levels also doesn't seem to
be a deadline scheduler.

I think, no matter what we do, we should aim for strict congruency
between the GPU scheduler and the CPU scheduler in regards with terms
and naming. The CPU scheduler has SCHED_FIFO, which is pure FIFO with
priorities (behaving exactly like the GPU schedulers FIFO, currently)
and SCHED_DEADLINE, which is purely based on deadlines (not like in
this patch set).

So we should find a new suitable term if we go for that to avoid any
misunderstandings.


P.

>  E.g. a queues which hasn't submitted something in a 
> while might get a bonus for their submissions compared to a queue
> which 
> submits stuff all the time (or something like that).
> 
> Regards,
> Christian.
> 
> 
> > There may be nuances with different drivers depending on how much
> > they 
> > can queue to the hardware/firmware at once. Modern drivers which
> > use 
> > 1:1 sched:entity I don't expect care about DRM scheduler scheduling
> > mode. The fewer jobs driver can queue to the backend the more it 
> > cares. Question is FIFO ever better. Keeping in mind that for same 
> > priority this deadline and FIFO are actually identical.
> > 
> > > > See "drm/sched: Add deadline policy" commit message for a short
> > > > description on
> > > > what flavour of deadline scheduling it is. But in essence it
> > > > should a
> > > > more fair
> > > > FIFO where higher priority can not forever starve lower
> > > > priorities.
> > > 
> > > See my answer on that patch.
> > > 
> > > As you can imagine I'm wondering if that "better FIFO" would be
> > > worth
> > > it considering that we are running into a certain risk of
> > > regressing
> > > stuff through this rework.
> > 
> > I will reply to that part there then.
> > > > "drm/sched: Connect with dma-fence deadlines" wires up dma-
> > > > fence
> > > > deadlines to
> > > > the scheduler because it is easy and makes logical sense with
> > > > this.
> > > > And I
> > > > noticed userspace already uses it so why not wire it up fully.
> > > 
> > > Userspace uses the dma-fence deadlines you mean? Do you have a
> > > pointer
> > > for us?
> > 
> > I've noticed it empirically and the one I could fine is this:
> > 
> > https://invent.kde.org/plasma/kwin/-/commit/4ad5670ddfcd7400c8b84c12cbf8bd97a0590f43
> >  
> > 
> > 
> > > > Otherwise the series is a bit of progression from consolidating
> > > > RR
> > > > into FIFO
> > > > code paths and going from there to deadline and then to a
> > > > change in
> > > > how
> > > > dependencies are handled. And code simplification to 1:1 run
> > > > queue to
> > > > scheduler
> > > > relationship, because deadline does not need per priority run
> > > > queues.
> > > > 
> > > > There is quite a bit of code to go throught here so I think it
> > > > could
> > > > be even
> > > > better if other drivers could give it a spin as is and see if
> > > > some
> > > > improvements
> > > > can be detected. Or at least no regressions.
> > > 
> > > I hope I can dive deeper into the Nouveau side soon.
> > 
> > Fantastic!
> > > > 
> > > > Cc: Christian König <christian.koenig@amd.com>
> > > > Cc: Danilo Krummrich <dakr@redhat.com>
> > > > Cc: Matthew Brost <matthew.brost@intel.com>
> > > > Cc: Philipp Stanner <pstanner@redhat.com>
> > > > 
> > > > Tvrtko Ursulin (14):
> > > >    drm/sched: Delete unused update_job_credits
> > > >    drm/sched: Remove idle entity from tree
> > > >    drm/sched: Implement RR via FIFO
> > > >    drm/sched: Consolidate entity run queue management
> > > >    drm/sched: Move run queue related code into a separate file
> > > >    drm/sched: Ignore own fence earlier
> > > >    drm/sched: Resolve same scheduler dependencies earlier
> > > >    drm/sched: Add deadline policy
> > > >    drm/sched: Remove FIFO and RR and simplify to a single run
> > > > queue
> > > >    drm/sched: Queue all free credits in one worker invocation
> > > >    drm/sched: Connect with dma-fence deadlines
> > > >    drm/sched: Embed run queue singleton into the scheduler
> > > >    dma-fence: Add helper for custom fence context when merging
> > > > fences
> > > >    drm/sched: Resolve all job dependencies in one go
> > > 
> > > It seems to me that this series is a "port RR and FIFO to
> > > deadline"-
> > > series with some additional patches that could be branched out
> > > and be
> > > reviewed through a separate series?
> > > 
> > > Patch 1 ("Delete unused...") for example. Other candidates are
> > > Patch 5
> > > ("Move run queue related..."), 13 ("Add helper for...").
> > > 
> > > A few patches might be mergeable even if the main idea with
> > > deadline
> > > doesn't get approved, that's why I'm suggesting that.
> > 
> > Yes some of those could be possible and I am happy to extract and 
> > rebase in principle. But not yet I think. If and when something
> > gets a 
> > positive nod.
> > 
> > Regards,
> > 
> > Tvrtko
> > 
> > > > 
> > > >   drivers/dma-buf/dma-fence-unwrap.c          |   8 +-
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |   6 +-
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  27 +-
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h     |   5 +-
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h   |   8 +-
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |   8 +-
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c     |   8 +-
> > > >   drivers/gpu/drm/scheduler/Makefile          |   2 +-
> > > >   drivers/gpu/drm/scheduler/sched_entity.c    | 316 ++++++-----
> > > >   drivers/gpu/drm/scheduler/sched_fence.c     |   5 +-
> > > >   drivers/gpu/drm/scheduler/sched_main.c      | 587 +++++------
> > > > -------
> > > > -- 
> > > >   drivers/gpu/drm/scheduler/sched_rq.c        | 199 +++++++
> > > >   include/drm/gpu_scheduler.h                 |  74 ++-
> > > >   include/linux/dma-fence-unwrap.h            |  31 +-
> > > >   14 files changed, 606 insertions(+), 678 deletions(-)
> > > >   create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c
> > > > 
> > > 
>
Koenig, Christian Jan. 3, 2025, 3:16 p.m. UTC | #5
[AMD Official Use Only - AMD Internal Distribution Only]

Could you send that whole patch set to me once more?

The AMD mails servers seem to have had a hickup over the holidays and all mails received between ~25.12.2024 and 1.1.2025 are somehow mangled.

Thanks in advance,
Christian.
Tvrtko Ursulin Jan. 3, 2025, 3:17 p.m. UTC | #6
On 03/01/2025 12:31, Christian König wrote:
> Am 03.01.25 um 13:02 schrieb Tvrtko Ursulin:
>>>> </tldr>
>>>>
>>>> If I remember correctly Christian mentioned recently (give or take)
>>>> that maybe
>>>> round-robin could be removed. That got me thinking how and what could
>>>> be
>>>> improved and simplified. So I played a bit in the scheduler code and
>>>> came up
>>>> with something which appears to not crash at least. Whether or not
>>>> there are
>>>> significant advantages apart from maybe code consolidation and
>>>> reduction is the
>>>> main thing to be determined.
>>>
>>> Hi,
>>>
>>> so first of all: Happy new year and thx for putting in all that effort
>>> :)
>>>
>>> I gave the series a first look; Since this is an RFC, let's abstain
>>> from doing deeper reviews of the exact code for now.
>>
>> Ditto and thank you for taking a look!
> 
> Ah, yes. Happy new year guys :)
> 
>>
>>>> One big question is whether round-robin can really be removed. Does
>>>> anyone use
>>>> it, rely on it, or what are even use cases where it is much better
>>>> than FIFO.
>>>
>>> So AFAICS Round Robin is not used anymore by anyone. And my
>>> understanding indeed is, too, that there is not really any use-case
>>> where one would like anything except for FIFO.
>>>
>>> Looking at 977d97f18b5b ("drm/scheduler: Set the FIFO scheduling policy
>>> as the default"), it seems to me that RR just was easy to implement and
>>> it had the disadvantage of systems under high load cause the oldest job
>>> to be starved to death, which is why FIFO was introduced.
>>>
>>> So my guess would be that RR just is a relict.
>>>
>>> If we agree on that, then we could remove RR in any case, and the
>>> subsequent question would be whether FIFO should be replaced with
>>> deadline (or: if there should be FIFO *and* deadline?), wouldn't it?
>>
>> I am unsure about RR but I agree what is the second part of the question.
> 
> Well we came up with FIFO because we found that RR performed quite badly 
> when you have a huge number of submitting applications.
> 
> E.g. one of our cloud test cases ran 100 instances of a single game and 
> the worst response time improved massively by switching from RR to FIFO.
> 
> Different priorities on the other hand were originally invented to make 
> sure the kernel has precedence over userspace. But later we also exposed 
> the priorities to userspace which results in the problem that higher 
> priority queues can starve low priority ones.
> 
> That's the other reason why I said that RR should probably be removed 
> and FIFO changed in a way that the priority is basically just a bonus to 
> the score used for sorting the FIFO. I haven't taken a deeper look yet, 
> but I think that this is more or less what this patch set here does.
> 
> What FIFO is still missing compared to RR is some sort of fairness 
> between queues. E.g. a queues which hasn't submitted something in a 
> while might get a bonus for their submissions compared to a queue which 
> submits stuff all the time (or something like that).

I can try to add something along these lines. Like dynamically scale the 
relative deadlines based on entity queue depth so infrequent short 
bursts compete better with persistent hogs. I'll wait for some more 
feedback and then see where to go.

Regards,

Tvrtko

>> There may be nuances with different drivers depending on how much they 
>> can queue to the hardware/firmware at once. Modern drivers which use 
>> 1:1 sched:entity I don't expect care about DRM scheduler scheduling 
>> mode. The fewer jobs driver can queue to the backend the more it 
>> cares. Question is FIFO ever better. Keeping in mind that for same 
>> priority this deadline and FIFO are actually identical.
>>
>>>> See "drm/sched: Add deadline policy" commit message for a short
>>>> description on
>>>> what flavour of deadline scheduling it is. But in essence it should a
>>>> more fair
>>>> FIFO where higher priority can not forever starve lower priorities.
>>>
>>> See my answer on that patch.
>>>
>>> As you can imagine I'm wondering if that "better FIFO" would be worth
>>> it considering that we are running into a certain risk of regressing
>>> stuff through this rework.
>>
>> I will reply to that part there then.
>>>> "drm/sched: Connect with dma-fence deadlines" wires up dma-fence
>>>> deadlines to
>>>> the scheduler because it is easy and makes logical sense with this.
>>>> And I
>>>> noticed userspace already uses it so why not wire it up fully.
>>>
>>> Userspace uses the dma-fence deadlines you mean? Do you have a pointer
>>> for us?
>>
>> I've noticed it empirically and the one I could fine is this:
>>
>> https://invent.kde.org/plasma/kwin/-/commit/4ad5670ddfcd7400c8b84c12cbf8bd97a0590f43
>>
>>>> Otherwise the series is a bit of progression from consolidating RR
>>>> into FIFO
>>>> code paths and going from there to deadline and then to a change in
>>>> how
>>>> dependencies are handled. And code simplification to 1:1 run queue to
>>>> scheduler
>>>> relationship, because deadline does not need per priority run queues.
>>>>
>>>> There is quite a bit of code to go throught here so I think it could
>>>> be even
>>>> better if other drivers could give it a spin as is and see if some
>>>> improvements
>>>> can be detected. Or at least no regressions.
>>>
>>> I hope I can dive deeper into the Nouveau side soon.
>>
>> Fantastic!
>>>>
>>>> Cc: Christian König <christian.koenig@amd.com>
>>>> Cc: Danilo Krummrich <dakr@redhat.com>
>>>> Cc: Matthew Brost <matthew.brost@intel.com>
>>>> Cc: Philipp Stanner <pstanner@redhat.com>
>>>>
>>>> Tvrtko Ursulin (14):
>>>>    drm/sched: Delete unused update_job_credits
>>>>    drm/sched: Remove idle entity from tree
>>>>    drm/sched: Implement RR via FIFO
>>>>    drm/sched: Consolidate entity run queue management
>>>>    drm/sched: Move run queue related code into a separate file
>>>>    drm/sched: Ignore own fence earlier
>>>>    drm/sched: Resolve same scheduler dependencies earlier
>>>>    drm/sched: Add deadline policy
>>>>    drm/sched: Remove FIFO and RR and simplify to a single run queue
>>>>    drm/sched: Queue all free credits in one worker invocation
>>>>    drm/sched: Connect with dma-fence deadlines
>>>>    drm/sched: Embed run queue singleton into the scheduler
>>>>    dma-fence: Add helper for custom fence context when merging fences
>>>>    drm/sched: Resolve all job dependencies in one go
>>>
>>> It seems to me that this series is a "port RR and FIFO to deadline"-
>>> series with some additional patches that could be branched out and be
>>> reviewed through a separate series?
>>>
>>> Patch 1 ("Delete unused...") for example. Other candidates are Patch 5
>>> ("Move run queue related..."), 13 ("Add helper for...").
>>>
>>> A few patches might be mergeable even if the main idea with deadline
>>> doesn't get approved, that's why I'm suggesting that.
>>
>> Yes some of those could be possible and I am happy to extract and 
>> rebase in principle. But not yet I think. If and when something gets a 
>> positive nod.
>>
>> Regards,
>>
>> Tvrtko
>>
>>>>
>>>>   drivers/dma-buf/dma-fence-unwrap.c          |   8 +-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |   6 +-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  27 +-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h     |   5 +-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h   |   8 +-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |   8 +-
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c     |   8 +-
>>>>   drivers/gpu/drm/scheduler/Makefile          |   2 +-
>>>>   drivers/gpu/drm/scheduler/sched_entity.c    | 316 ++++++-----
>>>>   drivers/gpu/drm/scheduler/sched_fence.c     |   5 +-
>>>>   drivers/gpu/drm/scheduler/sched_main.c      | 587 +++++-------------
>>>> -- 
>>>>   drivers/gpu/drm/scheduler/sched_rq.c        | 199 +++++++
>>>>   include/drm/gpu_scheduler.h                 |  74 ++-
>>>>   include/linux/dma-fence-unwrap.h            |  31 +-
>>>>   14 files changed, 606 insertions(+), 678 deletions(-)
>>>>   create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c
>>>>
>>>
>
Tvrtko Ursulin Jan. 3, 2025, 3:32 p.m. UTC | #7
On 03/01/2025 15:16, Koenig, Christian wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
> 
> Could you send that whole patch set to me once more?

No problem - I've pushed it to 
https://gitlab.freedesktop.org/tursulin/kernel/-/commits/drm-sched-deadline?ref_type=heads, 
that's probably even easier, right?

Regards,

Tvrtko

> The AMD mails servers seem to have had a hickup over the holidays and all mails received between ~25.12.2024 and 1.1.2025 are somehow mangled.
> 
> Thanks in advance,
> Christian.
> 
> ________________________________________
> Von: Tvrtko Ursulin <tursulin@igalia.com>
> Gesendet: Montag, 30. Dezember 2024 17:52
> An: dri-devel@lists.freedesktop.org
> Cc: kernel-dev@igalia.com; Tvrtko Ursulin; Koenig, Christian; Danilo Krummrich; Matthew Brost; Philipp Stanner
> Betreff: [RFC 00/14] Deadline scheduler and other ideas
> 
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> 
> <tldr>
> Replacing FIFO with a flavour of deadline driven scheduling and removing round-
> robin. Connecting the scheduler with dma-fence deadlines. First draft and
> testing by different drivers and feedback would be nice. I was only able to test
> it with amdgpu. Other drivers may not even compile.
> </tldr>
> 
> If I remember correctly Christian mentioned recently (give or take) that maybe
> round-robin could be removed. That got me thinking how and what could be
> improved and simplified. So I played a bit in the scheduler code and came up
> with something which appears to not crash at least. Whether or not there are
> significant advantages apart from maybe code consolidation and reduction is the
> main thing to be determined.
> 
> One big question is whether round-robin can really be removed. Does anyone use
> it, rely on it, or what are even use cases where it is much better than FIFO.
> 
> See "drm/sched: Add deadline policy" commit message for a short description on
> what flavour of deadline scheduling it is. But in essence it should a more fair
> FIFO where higher priority can not forever starve lower priorities.
> 
> "drm/sched: Connect with dma-fence deadlines" wires up dma-fence deadlines to
> the scheduler because it is easy and makes logical sense with this. And I
> noticed userspace already uses it so why not wire it up fully.
> 
> Otherwise the series is a bit of progression from consolidating RR into FIFO
> code paths and going from there to deadline and then to a change in how
> dependencies are handled. And code simplification to 1:1 run queue to scheduler
> relationship, because deadline does not need per priority run queues.
> 
> There is quite a bit of code to go throught here so I think it could be even
> better if other drivers could give it a spin as is and see if some improvements
> can be detected. Or at least no regressions.
> 
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Danilo Krummrich <dakr@redhat.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Philipp Stanner <pstanner@redhat.com>
> 
> Tvrtko Ursulin (14):
>    drm/sched: Delete unused update_job_credits
>    drm/sched: Remove idle entity from tree
>    drm/sched: Implement RR via FIFO
>    drm/sched: Consolidate entity run queue management
>    drm/sched: Move run queue related code into a separate file
>    drm/sched: Ignore own fence earlier
>    drm/sched: Resolve same scheduler dependencies earlier
>    drm/sched: Add deadline policy
>    drm/sched: Remove FIFO and RR and simplify to a single run queue
>    drm/sched: Queue all free credits in one worker invocation
>    drm/sched: Connect with dma-fence deadlines
>    drm/sched: Embed run queue singleton into the scheduler
>    dma-fence: Add helper for custom fence context when merging fences
>    drm/sched: Resolve all job dependencies in one go
> 
>   drivers/dma-buf/dma-fence-unwrap.c          |   8 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |   6 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  27 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.h     |   5 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h   |   8 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |   8 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c     |   8 +-
>   drivers/gpu/drm/scheduler/Makefile          |   2 +-
>   drivers/gpu/drm/scheduler/sched_entity.c    | 316 ++++++-----
>   drivers/gpu/drm/scheduler/sched_fence.c     |   5 +-
>   drivers/gpu/drm/scheduler/sched_main.c      | 587 +++++---------------
>   drivers/gpu/drm/scheduler/sched_rq.c        | 199 +++++++
>   include/drm/gpu_scheduler.h                 |  74 ++-
>   include/linux/dma-fence-unwrap.h            |  31 +-
>   14 files changed, 606 insertions(+), 678 deletions(-)
>   create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c
> 
> --
> 2.47.1
>
Simona Vetter Jan. 6, 2025, 1:47 p.m. UTC | #8
On Fri, Jan 03, 2025 at 03:16:56PM +0000, Koenig, Christian wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
> 
> Could you send that whole patch set to me once more?
> 
> The AMD mails servers seem to have had a hickup over the holidays and
> all mails received between ~25.12.2024 and 1.1.2025 are somehow mangled.

I seem to have the same issue with fetching from lore.o.k, despite that
the archives seem to be complete. No idea what's happened.
-Sima

> 
> Thanks in advance,
> Christian.
> 
> ________________________________________
> Von: Tvrtko Ursulin <tursulin@igalia.com>
> Gesendet: Montag, 30. Dezember 2024 17:52
> An: dri-devel@lists.freedesktop.org
> Cc: kernel-dev@igalia.com; Tvrtko Ursulin; Koenig, Christian; Danilo Krummrich; Matthew Brost; Philipp Stanner
> Betreff: [RFC 00/14] Deadline scheduler and other ideas
> 
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> 
> <tldr>
> Replacing FIFO with a flavour of deadline driven scheduling and removing round-
> robin. Connecting the scheduler with dma-fence deadlines. First draft and
> testing by different drivers and feedback would be nice. I was only able to test
> it with amdgpu. Other drivers may not even compile.
> </tldr>
> 
> If I remember correctly Christian mentioned recently (give or take) that maybe
> round-robin could be removed. That got me thinking how and what could be
> improved and simplified. So I played a bit in the scheduler code and came up
> with something which appears to not crash at least. Whether or not there are
> significant advantages apart from maybe code consolidation and reduction is the
> main thing to be determined.
> 
> One big question is whether round-robin can really be removed. Does anyone use
> it, rely on it, or what are even use cases where it is much better than FIFO.
> 
> See "drm/sched: Add deadline policy" commit message for a short description on
> what flavour of deadline scheduling it is. But in essence it should a more fair
> FIFO where higher priority can not forever starve lower priorities.
> 
> "drm/sched: Connect with dma-fence deadlines" wires up dma-fence deadlines to
> the scheduler because it is easy and makes logical sense with this. And I
> noticed userspace already uses it so why not wire it up fully.
> 
> Otherwise the series is a bit of progression from consolidating RR into FIFO
> code paths and going from there to deadline and then to a change in how
> dependencies are handled. And code simplification to 1:1 run queue to scheduler
> relationship, because deadline does not need per priority run queues.
> 
> There is quite a bit of code to go throught here so I think it could be even
> better if other drivers could give it a spin as is and see if some improvements
> can be detected. Or at least no regressions.
> 
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Danilo Krummrich <dakr@redhat.com>
> Cc: Matthew Brost <matthew.brost@intel.com>
> Cc: Philipp Stanner <pstanner@redhat.com>
> 
> Tvrtko Ursulin (14):
>   drm/sched: Delete unused update_job_credits
>   drm/sched: Remove idle entity from tree
>   drm/sched: Implement RR via FIFO
>   drm/sched: Consolidate entity run queue management
>   drm/sched: Move run queue related code into a separate file
>   drm/sched: Ignore own fence earlier
>   drm/sched: Resolve same scheduler dependencies earlier
>   drm/sched: Add deadline policy
>   drm/sched: Remove FIFO and RR and simplify to a single run queue
>   drm/sched: Queue all free credits in one worker invocation
>   drm/sched: Connect with dma-fence deadlines
>   drm/sched: Embed run queue singleton into the scheduler
>   dma-fence: Add helper for custom fence context when merging fences
>   drm/sched: Resolve all job dependencies in one go
> 
>  drivers/dma-buf/dma-fence-unwrap.c          |   8 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |   6 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c     |  27 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h     |   5 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h   |   8 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c |   8 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c     |   8 +-
>  drivers/gpu/drm/scheduler/Makefile          |   2 +-
>  drivers/gpu/drm/scheduler/sched_entity.c    | 316 ++++++-----
>  drivers/gpu/drm/scheduler/sched_fence.c     |   5 +-
>  drivers/gpu/drm/scheduler/sched_main.c      | 587 +++++---------------
>  drivers/gpu/drm/scheduler/sched_rq.c        | 199 +++++++
>  include/drm/gpu_scheduler.h                 |  74 ++-
>  include/linux/dma-fence-unwrap.h            |  31 +-
>  14 files changed, 606 insertions(+), 678 deletions(-)
>  create mode 100644 drivers/gpu/drm/scheduler/sched_rq.c
> 
> --
> 2.47.1
>