Message ID | 20241023235917.1836428-1-matthew.brost@intel.com (mailing list archive) |
---|---|
State | New |
Series | [v2] drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM |
On 10/24/2024 1:59 AM, Matthew Brost wrote:
> drm_gpu_scheduler.submit_wq is used to submit jobs, jobs are in the path
> of dma-fences, and dma-fences are in the path of reclaim. Mark scheduler
> work queue with WQ_MEM_RECLAIM to ensure forward progress during
> reclaim; without WQ_MEM_RECLAIM, work queues cannot make forward
> progress during reclaim.
>
> v2:
> - Fixes tags (Philipp)
> - Reword commit message (Philipp)
>
> Cc: Luben Tuikov <ltuikov89@gmail.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Philipp Stanner <pstanner@redhat.com>
> Cc: stable@vger.kernel.org
> Fixes: 34f50cc6441b ("drm/sched: Use drm sched lockdep map for submit_wq")
> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread")
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>

Acked-by: Nirmoy Das <nirmoy.das@intel.com>

Looks like Xe has a dependency on this now that xe->ordered_wq is allocated with WQ_MEM_RECLAIM flag:

https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-140135v2/bat-lnl-1/igt@xe_exec_fault_mode@twice-invalid-fault.html

> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 540231e6bac6..df0a5abb1400 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1283,10 +1283,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>  		sched->own_submit_wq = false;
>  	} else {
>  #ifdef CONFIG_LOCKDEP
> -		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name, 0,
> +		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name,
> +								       WQ_MEM_RECLAIM,
>  								       &drm_sched_lockdep_map);
>  #else
> -		sched->submit_wq = alloc_ordered_workqueue(name, 0);
> +		sched->submit_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
>  #endif
>  		if (!sched->submit_wq)
>  			return -ENOMEM;

On Thu, Oct 24, 2024 at 01:44:41PM +0200, Nirmoy Das wrote:
>
> On 10/24/2024 1:59 AM, Matthew Brost wrote:
> > drm_gpu_scheduler.submit_wq is used to submit jobs, jobs are in the path
> > of dma-fences, and dma-fences are in the path of reclaim. Mark scheduler
> > work queue with WQ_MEM_RECLAIM to ensure forward progress during
> > reclaim; without WQ_MEM_RECLAIM, work queues cannot make forward
> > progress during reclaim.
> >
> > v2:
> > - Fixes tags (Philipp)
> > - Reword commit message (Philipp)
> >
> > Cc: Luben Tuikov <ltuikov89@gmail.com>
> > Cc: Danilo Krummrich <dakr@kernel.org>
> > Cc: Philipp Stanner <pstanner@redhat.com>
> > Cc: stable@vger.kernel.org
> > Fixes: 34f50cc6441b ("drm/sched: Use drm sched lockdep map for submit_wq")
> > Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread")
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>
> Acked-by: Nirmoy Das <nirmoy.das@intel.com>
>
> Looks like Xe has a dependency on this now that xe->ordered_wq is allocated with WQ_MEM_RECLAIM flag:
>

Thanks for pointing this out. I merged the Xe patches first not
realizing this was going to break CI. Hopefully I can merge this
scheduler patch soon.

Matt

> https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-140135v2/bat-lnl-1/igt@xe_exec_fault_mode@twice-invalid-fault.html
>
> > ---
> >  drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 540231e6bac6..df0a5abb1400 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -1283,10 +1283,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> >  		sched->own_submit_wq = false;
> >  	} else {
> >  #ifdef CONFIG_LOCKDEP
> > -		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name, 0,
> > +		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name,
> > +								       WQ_MEM_RECLAIM,
> >  								       &drm_sched_lockdep_map);
> >  #else
> > -		sched->submit_wq = alloc_ordered_workqueue(name, 0);
> > +		sched->submit_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
> >  #endif
> >  		if (!sched->submit_wq)
> >  			return -ENOMEM;

On Wed, 2024-10-23 at 16:59 -0700, Matthew Brost wrote:
> drm_gpu_scheduler.submit_wq is used to submit jobs, jobs are in the path
> of dma-fences, and dma-fences are in the path of reclaim. Mark scheduler
> work queue with WQ_MEM_RECLAIM to ensure forward progress during
> reclaim; without WQ_MEM_RECLAIM, work queues cannot make forward
> progress during reclaim.
>
> v2:
> - Fixes tags (Philipp)
> - Reword commit message (Philipp)
>
> Cc: Luben Tuikov <ltuikov89@gmail.com>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Philipp Stanner <pstanner@redhat.com>
> Cc: stable@vger.kernel.org
> Fixes: 34f50cc6441b ("drm/sched: Use drm sched lockdep map for submit_wq")
> Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread")
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 540231e6bac6..df0a5abb1400 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1283,10 +1283,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
>  		sched->own_submit_wq = false;
>  	} else {
>  #ifdef CONFIG_LOCKDEP
> -		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name, 0,
> +		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name,
> +								       WQ_MEM_RECLAIM,
>  								       &drm_sched_lockdep_map);
>  #else
> -		sched->submit_wq = alloc_ordered_workqueue(name, 0);
> +		sched->submit_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
>  #endif
>  		if (!sched->submit_wq)
>  			return -ENOMEM;


Cool, thx – looks good from my POV.

Since you now sent this patch as a single one, what would be the
preferred merge plan for this? Your XE-Series doesn't depend on this
IIUC, so should we take this patch here separately into drm-misc-next?

Regards,
P.

On Thu, Oct 24, 2024 at 05:35:47PM +0200, Philipp Stanner wrote:
> On Wed, 2024-10-23 at 16:59 -0700, Matthew Brost wrote:
> > drm_gpu_scheduler.submit_wq is used to submit jobs, jobs are in the path
> > of dma-fences, and dma-fences are in the path of reclaim. Mark scheduler
> > work queue with WQ_MEM_RECLAIM to ensure forward progress during
> > reclaim; without WQ_MEM_RECLAIM, work queues cannot make forward
> > progress during reclaim.
> >
> > v2:
> > - Fixes tags (Philipp)
> > - Reword commit message (Philipp)
> >
> > Cc: Luben Tuikov <ltuikov89@gmail.com>
> > Cc: Danilo Krummrich <dakr@kernel.org>
> > Cc: Philipp Stanner <pstanner@redhat.com>
> > Cc: stable@vger.kernel.org
> > Fixes: 34f50cc6441b ("drm/sched: Use drm sched lockdep map for submit_wq")
> > Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread")
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> >  drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 540231e6bac6..df0a5abb1400 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -1283,10 +1283,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> >  		sched->own_submit_wq = false;
> >  	} else {
> >  #ifdef CONFIG_LOCKDEP
> > -		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name, 0,
> > +		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name,
> > +								       WQ_MEM_RECLAIM,
> >  								       &drm_sched_lockdep_map);
> >  #else
> > -		sched->submit_wq = alloc_ordered_workqueue(name, 0);
> > +		sched->submit_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
> >  #endif
> >  		if (!sched->submit_wq)
> >  			return -ENOMEM;
>
>
> Cool, thx – looks good from my POV.
>

Can I get a RB?

> Since you now sent this patch as a single one, what would be the
> preferred merge plan for this? Your XE-Series doesn't depend on this
> IIUC, so should we take this patch here separately into drm-misc-next?
>

Merge this one to drm-misc and we will backport into drm-xe-next.

Matt

>
> Regards,
> P.
>

On Thu, 2024-10-24 at 15:47 +0000, Matthew Brost wrote:
> On Thu, Oct 24, 2024 at 05:35:47PM +0200, Philipp Stanner wrote:
> > On Wed, 2024-10-23 at 16:59 -0700, Matthew Brost wrote:
> > > drm_gpu_scheduler.submit_wq is used to submit jobs, jobs are in the path
> > > of dma-fences, and dma-fences are in the path of reclaim. Mark scheduler
> > > work queue with WQ_MEM_RECLAIM to ensure forward progress during
> > > reclaim; without WQ_MEM_RECLAIM, work queues cannot make forward
> > > progress during reclaim.
> > >
> > > v2:
> > > - Fixes tags (Philipp)
> > > - Reword commit message (Philipp)
> > >
> > > Cc: Luben Tuikov <ltuikov89@gmail.com>
> > > Cc: Danilo Krummrich <dakr@kernel.org>
> > > Cc: Philipp Stanner <pstanner@redhat.com>
> > > Cc: stable@vger.kernel.org
> > > Fixes: 34f50cc6441b ("drm/sched: Use drm sched lockdep map for submit_wq")
> > > Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread")
> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > ---
> > >  drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
> > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > > index 540231e6bac6..df0a5abb1400 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > @@ -1283,10 +1283,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
> > >  		sched->own_submit_wq = false;
> > >  	} else {
> > >  #ifdef CONFIG_LOCKDEP
> > > -		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name, 0,
> > > +		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name,
> > > +								       WQ_MEM_RECLAIM,
> > >  								       &drm_sched_lockdep_map);
> > >  #else
> > > -		sched->submit_wq = alloc_ordered_workqueue(name, 0);
> > > +		sched->submit_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
> > >  #endif
> > >  		if (!sched->submit_wq)
> > >  			return -ENOMEM;
> >
> >
> > Cool, thx – looks good from my POV.
> >
>
> Can I get a RB?

Oh, sure:

Reviewed-by: Philipp Stanner <pstanner@redhat.com>

>
> > Since you now sent this patch as a single one, what would be the
> > preferred merge plan for this? Your XE-Series doesn't depend on this
> > IIUC, so should we take this patch here separately into drm-misc-next?
> >
>
> Merge this one to drm-misc and we will backport into drm-xe-next.

OK – feel free to apply it yourself if you want, then we wouldn't need
to synchronize

Philipp

>
> Matt
>
> >
> > Regards,
> > P.
> >

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 540231e6bac6..df0a5abb1400 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1283,10 +1283,11 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 		sched->own_submit_wq = false;
 	} else {
 #ifdef CONFIG_LOCKDEP
-		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name, 0,
+		sched->submit_wq = alloc_ordered_workqueue_lockdep_map(name,
+								       WQ_MEM_RECLAIM,
 								       &drm_sched_lockdep_map);
 #else
-		sched->submit_wq = alloc_ordered_workqueue(name, 0);
+		sched->submit_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
 #endif
 		if (!sched->submit_wq)
 			return -ENOMEM;

drm_gpu_scheduler.submit_wq is used to submit jobs, jobs are in the path
of dma-fences, and dma-fences are in the path of reclaim. Mark scheduler
work queue with WQ_MEM_RECLAIM to ensure forward progress during
reclaim; without WQ_MEM_RECLAIM, work queues cannot make forward
progress during reclaim.

v2:
- Fixes tags (Philipp)
- Reword commit message (Philipp)

Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Philipp Stanner <pstanner@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 34f50cc6441b ("drm/sched: Use drm sched lockdep map for submit_wq")
Fixes: a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)