mbox series

[v5,0/7] DRM scheduler changes for Xe

Message ID 20231011235826.585624-1-matthew.brost@intel.com (mailing list archive)
Headers show
Series DRM scheduler changes for Xe | expand

Message

Matthew Brost Oct. 11, 2023, 11:58 p.m. UTC
As a prerequisite to merging the new Intel Xe DRM driver [1] [2], we
have been asked to merge our common DRM scheduler patches first.

This a continuation of a RFC [3] with all comments addressed, ready for
a full review, and hopefully in state which can merged in the near
future. More details of this series can found in the cover letter of the
RFC [3].

These changes have been tested with the Xe driver.

v2:
 - Break run job, free job, and process message in own work items
 - This might break other drivers as run job and free job now can run in
   parallel, can fix up if needed

v3:
 - Include missing patch 'drm/sched: Add drm_sched_submit_* helpers'
 - Fix issue with setting timestamp to early
 - Don't dequeue jobs for single entity after calling entity fini
 - Flush pending jobs on entity fini
 - Add documentation for entity teardown
 - Add Matthew Brost to maintainers of DRM scheduler

v4:
 - Drop message interface
 - Drop 'Flush pending jobs on entity fini'
 - Drop 'Add documentation for entity teardown'
 - Address all feedback

v5:
 - Address Luben's feedback
 - Drop starting TDR after calling run_job()
 - Drop adding Matthew Brost to maintainers of DRM scheduler

Matt

[1] https://gitlab.freedesktop.org/drm/xe/kernel
[2] https://patchwork.freedesktop.org/series/112188/
[3] https://patchwork.freedesktop.org/series/116055/

Matthew Brost (7):
  drm/sched: Add drm_sched_wqueue_* helpers
  drm/sched: Convert drm scheduler to use a work queue rather than
    kthread
  drm/sched: Move schedule policy to scheduler
  drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy
  drm/sched: Split free_job into own work item
  drm/sched: Add drm_sched_start_timeout_unlocked helper
  drm/sched: Add helper to queue TDR immediately for current and future
    jobs

 .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   |  15 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  15 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c       |   5 +-
 drivers/gpu/drm/lima/lima_sched.c             |   5 +-
 drivers/gpu/drm/msm/adreno/adreno_device.c    |   6 +-
 drivers/gpu/drm/msm/msm_ringbuffer.c          |   7 +-
 drivers/gpu/drm/nouveau/nouveau_sched.c       |   5 +-
 drivers/gpu/drm/panfrost/panfrost_job.c       |   5 +-
 drivers/gpu/drm/scheduler/sched_entity.c      |  86 ++-
 drivers/gpu/drm/scheduler/sched_fence.c       |   2 +-
 drivers/gpu/drm/scheduler/sched_main.c        | 506 ++++++++++++------
 drivers/gpu/drm/v3d/v3d_sched.c               |  25 +-
 include/drm/gpu_scheduler.h                   |  48 +-
 14 files changed, 499 insertions(+), 233 deletions(-)

Comments

Danilo Krummrich Oct. 12, 2023, 2:02 a.m. UTC | #1
Hi Matt,

Can you please address my comments from V3 and V4?

https://lore.kernel.org/all/076891e9-b2ce-4c7f-833d-157aca5cd44b@amd.com/T/#m34ccee55e37ca47c87adf01439585d0bd187e3a0

- Danilo

On 10/12/23 01:58, Matthew Brost wrote:
> As a prerequisite to merging the new Intel Xe DRM driver [1] [2], we
> have been asked to merge our common DRM scheduler patches first.
> 
> This a continuation of a RFC [3] with all comments addressed, ready for
> a full review, and hopefully in state which can merged in the near
> future. More details of this series can found in the cover letter of the
> RFC [3].
> 
> These changes have been tested with the Xe driver.
> 
> v2:
>   - Break run job, free job, and process message in own work items
>   - This might break other drivers as run job and free job now can run in
>     parallel, can fix up if needed
> 
> v3:
>   - Include missing patch 'drm/sched: Add drm_sched_submit_* helpers'
>   - Fix issue with setting timestamp to early
>   - Don't dequeue jobs for single entity after calling entity fini
>   - Flush pending jobs on entity fini
>   - Add documentation for entity teardown
>   - Add Matthew Brost to maintainers of DRM scheduler
> 
> v4:
>   - Drop message interface
>   - Drop 'Flush pending jobs on entity fini'
>   - Drop 'Add documentation for entity teardown'
>   - Address all feedback
> 
> v5:
>   - Address Luben's feedback
>   - Drop starting TDR after calling run_job()
>   - Drop adding Matthew Brost to maintainers of DRM scheduler
> 
> Matt
> 
> [1] https://gitlab.freedesktop.org/drm/xe/kernel
> [2] https://patchwork.freedesktop.org/series/112188/
> [3] https://patchwork.freedesktop.org/series/116055/
> 
> Matthew Brost (7):
>    drm/sched: Add drm_sched_wqueue_* helpers
>    drm/sched: Convert drm scheduler to use a work queue rather than
>      kthread
>    drm/sched: Move schedule policy to scheduler
>    drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy
>    drm/sched: Split free_job into own work item
>    drm/sched: Add drm_sched_start_timeout_unlocked helper
>    drm/sched: Add helper to queue TDR immediately for current and future
>      jobs
> 
>   .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |   2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   |  15 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  15 +-
>   drivers/gpu/drm/etnaviv/etnaviv_sched.c       |   5 +-
>   drivers/gpu/drm/lima/lima_sched.c             |   5 +-
>   drivers/gpu/drm/msm/adreno/adreno_device.c    |   6 +-
>   drivers/gpu/drm/msm/msm_ringbuffer.c          |   7 +-
>   drivers/gpu/drm/nouveau/nouveau_sched.c       |   5 +-
>   drivers/gpu/drm/panfrost/panfrost_job.c       |   5 +-
>   drivers/gpu/drm/scheduler/sched_entity.c      |  86 ++-
>   drivers/gpu/drm/scheduler/sched_fence.c       |   2 +-
>   drivers/gpu/drm/scheduler/sched_main.c        | 506 ++++++++++++------
>   drivers/gpu/drm/v3d/v3d_sched.c               |  25 +-
>   include/drm/gpu_scheduler.h                   |  48 +-
>   14 files changed, 499 insertions(+), 233 deletions(-)
>
Matthew Brost Oct. 12, 2023, 4:49 a.m. UTC | #2
On Thu, Oct 12, 2023 at 04:02:13AM +0200, Danilo Krummrich wrote:
> Hi Matt,
> 
> Can you please address my comments from V3 and V4?
> 
> https://lore.kernel.org/all/076891e9-b2ce-4c7f-833d-157aca5cd44b@amd.com/T/#m34ccee55e37ca47c87adf01439585d0bd187e3a0
> 

Not my intent to ignore you. To be clear I need to address this comment:

'There is some feedback from V3 that doesn't seem to be addressed yet.

(1) Document tear down of struct drm_gpu_scheduler. [1]
(2) Implement helpers to tear down struct drm_gpu_scheduler. [2]
(3) Document implications of using a workqueue in terms of free_job() being
     or not being part of the fence signaling path respectively. [3]

I think at least (1) and (3) should be part of this series. I think (2) could
also happen subsequently. Christian's idea [2] how to address this is quite
interesting, but might exceed the scope of this series.

I will try to rebase my Nouveau changes onto your V4 tomorrow for a quick test.

- Danilo

[1] https://lore.kernel.org/all/20230912021615.2086698-1-matthew.brost@intel.com/T/#m2e8c1c1e68e8127d5dd62509b5e424a12217300b
[2] https://lore.kernel.org/all/20230912021615.2086698-1-matthew.brost@intel.com/T/#m16a0d6fa2e617383776515af45d3c6b9db543d47
[3] https://lore.kernel.org/all/20230912021615.2086698-1-matthew.brost@intel.com/T/#m807ff95284089fdb51985f1c187666772314bd8a'

Matt

> - Danilo
> 
> On 10/12/23 01:58, Matthew Brost wrote:
> > As a prerequisite to merging the new Intel Xe DRM driver [1] [2], we
> > have been asked to merge our common DRM scheduler patches first.
> > 
> > This a continuation of a RFC [3] with all comments addressed, ready for
> > a full review, and hopefully in state which can merged in the near
> > future. More details of this series can found in the cover letter of the
> > RFC [3].
> > 
> > These changes have been tested with the Xe driver.
> > 
> > v2:
> >   - Break run job, free job, and process message in own work items
> >   - This might break other drivers as run job and free job now can run in
> >     parallel, can fix up if needed
> > 
> > v3:
> >   - Include missing patch 'drm/sched: Add drm_sched_submit_* helpers'
> >   - Fix issue with setting timestamp to early
> >   - Don't dequeue jobs for single entity after calling entity fini
> >   - Flush pending jobs on entity fini
> >   - Add documentation for entity teardown
> >   - Add Matthew Brost to maintainers of DRM scheduler
> > 
> > v4:
> >   - Drop message interface
> >   - Drop 'Flush pending jobs on entity fini'
> >   - Drop 'Add documentation for entity teardown'
> >   - Address all feedback
> > 
> > v5:
> >   - Address Luben's feedback
> >   - Drop starting TDR after calling run_job()
> >   - Drop adding Matthew Brost to maintainers of DRM scheduler
> > 
> > Matt
> > 
> > [1] https://gitlab.freedesktop.org/drm/xe/kernel
> > [2] https://patchwork.freedesktop.org/series/112188/
> > [3] https://patchwork.freedesktop.org/series/116055/
> > 
> > Matthew Brost (7):
> >    drm/sched: Add drm_sched_wqueue_* helpers
> >    drm/sched: Convert drm scheduler to use a work queue rather than
> >      kthread
> >    drm/sched: Move schedule policy to scheduler
> >    drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy
> >    drm/sched: Split free_job into own work item
> >    drm/sched: Add drm_sched_start_timeout_unlocked helper
> >    drm/sched: Add helper to queue TDR immediately for current and future
> >      jobs
> > 
> >   .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |   2 +-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   |  15 +-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  15 +-
> >   drivers/gpu/drm/etnaviv/etnaviv_sched.c       |   5 +-
> >   drivers/gpu/drm/lima/lima_sched.c             |   5 +-
> >   drivers/gpu/drm/msm/adreno/adreno_device.c    |   6 +-
> >   drivers/gpu/drm/msm/msm_ringbuffer.c          |   7 +-
> >   drivers/gpu/drm/nouveau/nouveau_sched.c       |   5 +-
> >   drivers/gpu/drm/panfrost/panfrost_job.c       |   5 +-
> >   drivers/gpu/drm/scheduler/sched_entity.c      |  86 ++-
> >   drivers/gpu/drm/scheduler/sched_fence.c       |   2 +-
> >   drivers/gpu/drm/scheduler/sched_main.c        | 506 ++++++++++++------
> >   drivers/gpu/drm/v3d/v3d_sched.c               |  25 +-
> >   include/drm/gpu_scheduler.h                   |  48 +-
> >   14 files changed, 499 insertions(+), 233 deletions(-)
> > 
>