Message ID | 20231011235826.585624-8-matthew.brost@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | DRM scheduler changes for Xe | expand |
On 2023-10-11 19:58, Matthew Brost wrote: > Add helper to queue TDR immediately for current and future jobs. This is > used in Xe, a new Intel GPU driver, to trigger a TDR to cleanup a > drm_scheduler that encounter errors. I think the best (most optimal) thing to do is to remove the last sentence mentioning Xe. It is irrelevant to this patch. This patch is functional as is, and worth having it as is. So it's best to have just: Add a helper whereby a driver can invoke TDR immediately. Also remove "for current and future jobs" from the title, as it is implied by how TDR works. We want to say less. drm/sched: Add a helper to queue TDR immediately These are only GPU scheduler changes, worth having on their own. The fact that a new (future as of this moment) driver (Xe) would use them is irrelevant at the moment. Other drivers (new, current?) would most likely end up using the changes of these patches, and these changes go in on their own merit. Regards, Luben > > v2: > - Drop timeout args, rename function, use mod delayed work (Luben) > v3: > - s/XE/Xe (Luben) > - present tense in commit message (Luben) > - Adjust comment for drm_sched_tdr_queue_imm (Luben) > > Cc: Luben Tuikov <luben.tuikov@amd.com> > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > --- > drivers/gpu/drm/scheduler/sched_main.c | 18 +++++++++++++++++- > include/drm/gpu_scheduler.h | 1 + > 2 files changed, 18 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > index c4d5c3d265a8..f2846745b067 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -431,7 +431,7 @@ static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched) > > if (sched->timeout != MAX_SCHEDULE_TIMEOUT && > !list_empty(&sched->pending_list)) > - queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout); > + mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout); > } > > static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched) > @@ -441,6 +441,22 @@ static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched) > spin_unlock(&sched->job_list_lock); > } > > +/** > + * drm_sched_tdr_queue_imm: - immediately start job timeout handler > + * > + * @sched: scheduler for which the timeout handling should be started. > + * > + * Start timeout handling immediately for the named scheduler. > + */ > +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched) > +{ > + spin_lock(&sched->job_list_lock); > + sched->timeout = 0; > + drm_sched_start_timeout(sched); > + spin_unlock(&sched->job_list_lock); > +} > +EXPORT_SYMBOL(drm_sched_tdr_queue_imm); > + > /** > * drm_sched_fault - immediately start timeout handler > * > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h > index 625ffe040de3..998b32b8d212 100644 > --- a/include/drm/gpu_scheduler.h > +++ b/include/drm/gpu_scheduler.h > @@ -568,6 +568,7 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity, > struct drm_gpu_scheduler **sched_list, > unsigned int num_sched_list); > > +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched); > void drm_sched_job_cleanup(struct drm_sched_job *job); > void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched); > bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched);
On Fri, Oct 13, 2023 at 11:04:47PM -0400, Luben Tuikov wrote: > On 2023-10-11 19:58, Matthew Brost wrote: > > Add helper to queue TDR immediately for current and future jobs. This is > > used in Xe, a new Intel GPU driver, to trigger a TDR to cleanup a > > drm_scheduler that encounter errors. > > I think the best (most optimal) thing to do is to remove the last sentence > mentioning Xe. It is irrelevant to this patch. This patch is functional > as is, and worth having it as is. > > So it's best to have just: > > Add a helper whereby a driver can invoke TDR immediately. > +1. > Also remove "for current and future jobs" from the title, as it is > implied by how TDR works. We want to say less. > > drm/sched: Add a helper to queue TDR immediately > Yep, my bad I forgot to adjust the commit message in this rev. Will fix. Matt > These are only GPU scheduler changes, worth having on their own. The fact > that a new (future as of this moment) driver (Xe) would use them is irrelevant > at the moment. Other drivers (new, current?) would most likely end up using the changes > of these patches, and these changes go in on their own merit. > > Regards, > Luben > > > > > v2: > > - Drop timeout args, rename function, use mod delayed work (Luben) > > v3: > > - s/XE/Xe (Luben) > > - present tense in commit message (Luben) > > - Adjust comment for drm_sched_tdr_queue_imm (Luben) > > > > Cc: Luben Tuikov <luben.tuikov@amd.com> > > Signed-off-by: Matthew Brost <matthew.brost@intel.com> > > --- > > drivers/gpu/drm/scheduler/sched_main.c | 18 +++++++++++++++++- > > include/drm/gpu_scheduler.h | 1 + > > 2 files changed, 18 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > > index c4d5c3d265a8..f2846745b067 100644 > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > @@ -431,7 +431,7 @@ static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched) > > > > if (sched->timeout != MAX_SCHEDULE_TIMEOUT && > > !list_empty(&sched->pending_list)) > > - queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout); > > + mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout); > > } > > > > static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched) > > @@ -441,6 +441,22 @@ static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched) > > spin_unlock(&sched->job_list_lock); > > } > > > > +/** > > + * drm_sched_tdr_queue_imm: - immediately start job timeout handler > > + * > > + * @sched: scheduler for which the timeout handling should be started. > > + * > > + * Start timeout handling immediately for the named scheduler. > > + */ > > +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched) > > +{ > > + spin_lock(&sched->job_list_lock); > > + sched->timeout = 0; > > + drm_sched_start_timeout(sched); > > + spin_unlock(&sched->job_list_lock); > > +} > > +EXPORT_SYMBOL(drm_sched_tdr_queue_imm); > > + > > /** > > * drm_sched_fault - immediately start timeout handler > > * > > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h > > index 625ffe040de3..998b32b8d212 100644 > > --- a/include/drm/gpu_scheduler.h > > +++ b/include/drm/gpu_scheduler.h > > @@ -568,6 +568,7 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity, > > struct drm_gpu_scheduler **sched_list, > > unsigned int num_sched_list); > > > > +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched); > > void drm_sched_job_cleanup(struct drm_sched_job *job); > > void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched); > > bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched); >
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index c4d5c3d265a8..f2846745b067 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -431,7 +431,7 @@ static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched) if (sched->timeout != MAX_SCHEDULE_TIMEOUT && !list_empty(&sched->pending_list)) - queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout); + mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout); } static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched) @@ -441,6 +441,22 @@ static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched) spin_unlock(&sched->job_list_lock); } +/** + * drm_sched_tdr_queue_imm: - immediately start job timeout handler + * + * @sched: scheduler for which the timeout handling should be started. + * + * Start timeout handling immediately for the named scheduler. + */ +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched) +{ + spin_lock(&sched->job_list_lock); + sched->timeout = 0; + drm_sched_start_timeout(sched); + spin_unlock(&sched->job_list_lock); +} +EXPORT_SYMBOL(drm_sched_tdr_queue_imm); + /** * drm_sched_fault - immediately start timeout handler * diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 625ffe040de3..998b32b8d212 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -568,6 +568,7 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity, struct drm_gpu_scheduler **sched_list, unsigned int num_sched_list); +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched); void drm_sched_job_cleanup(struct drm_sched_job *job); void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched); bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched);
Add helper to queue TDR immediately for current and future jobs. This is used in Xe, a new Intel GPU driver, to trigger a TDR to cleanup a drm_scheduler that encounter errors. v2: - Drop timeout args, rename function, use mod delayed work (Luben) v3: - s/XE/Xe (Luben) - present tense in commit message (Luben) - Adjust comment for drm_sched_tdr_queue_imm (Luben) Cc: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> --- drivers/gpu/drm/scheduler/sched_main.c | 18 +++++++++++++++++- include/drm/gpu_scheduler.h | 1 + 2 files changed, 18 insertions(+), 1 deletion(-)