diff mbox series

[v4,09/10] drm/sched: Add helper to queue TDR immediately for current and future jobs

Message ID 20230919050155.2647172-10-matthew.brost@intel.com (mailing list archive)
State New, archived
Headers show
Series DRM scheduler changes for Xe | expand

Commit Message

Matthew Brost Sept. 19, 2023, 5:01 a.m. UTC
Add helper to queue TDR immediately for current and future jobs. This
will be used in XE, new Intel GPU driver, to trigger the TDR to cleanup
a drm_scheduler that encounter errors.

v2:
 - Drop timeout args, rename function, use mod delayed work (Luben)

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 19 ++++++++++++++++++-
 include/drm/gpu_scheduler.h            |  1 +
 2 files changed, 19 insertions(+), 1 deletion(-)

Comments

Luben Tuikov Sept. 29, 2023, 10:44 p.m. UTC | #1
On 2023-09-19 01:01, Matthew Brost wrote:
> Add helper to queue TDR immediately for current and future jobs. This
> will be used in XE, new Intel GPU driver, to trigger the TDR to cleanup

Please use present tense, "is", in code, comments, commits, etc.

Is it "XE" or is it "Xe"? I always thought it was "Xe".

	This is used in Xe, a new Intel GPU driver, to trigger a TDR to clean up

Code, comments, commits, etc., immediately become history, and it's a bit
ambitious to use future tense in something which immediately becomes
history. It's much better to describe what is happening now, including the patch
in question (any patch, ftm) is considered "now"/"current state" as well.

> a drm_scheduler that encounter error[.]> 
> v2:
>  - Drop timeout args, rename function, use mod delayed work (Luben)
> 
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 19 ++++++++++++++++++-
>  include/drm/gpu_scheduler.h            |  1 +
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index e8a3e6033f66..88ef8be2d3c7 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -435,7 +435,7 @@ static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>  
>  	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>  	    !list_empty(&sched->pending_list))
> -		queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
> +		mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
>  }
>  
>  static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
> @@ -445,6 +445,23 @@ static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
>  	spin_unlock(&sched->job_list_lock);
>  }
>  
> +/**
> + * drm_sched_tdr_queue_imm: - immediately start timeout handler including future
> + * jobs

Let's not mention "including future jobs", since we don't know the future.
But we can sneak "jobs" into the description like this:

 * drm_sched_tdr_queue_imm - immediately start job timeout handler

:-)

> + *
> + * @sched: scheduler where the timeout handling should be started.

"where" --> "for which"
The former denotes a location, like in space-time, and the latter
denotes an object, like a scheduler, a spaceship, a bicycle, etc.

> + *
> + * Start timeout handling immediately for current and future jobs

 * Start timeout handling immediately for the named scheduler.

> + */
> +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched)
> +{
> +	spin_lock(&sched->job_list_lock);
> +	sched->timeout = 0;
> +	drm_sched_start_timeout(sched);
> +	spin_unlock(&sched->job_list_lock);
> +}
> +EXPORT_SYMBOL(drm_sched_tdr_queue_imm);
> +
>  /**
>   * drm_sched_fault - immediately start timeout handler
>   *
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 7e6c121003ca..27f5778bbd6d 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -568,6 +568,7 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>  				    struct drm_gpu_scheduler **sched_list,
>                                     unsigned int num_sched_list);
>  
> +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched);
>  void drm_sched_job_cleanup(struct drm_sched_job *job);
>  void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched);
>  bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched);

Looks good!

Fix the above, for an immediate R-B. :-)
Matthew Brost Oct. 5, 2023, 3:22 a.m. UTC | #2
On Fri, Sep 29, 2023 at 06:44:53PM -0400, Luben Tuikov wrote:
> On 2023-09-19 01:01, Matthew Brost wrote:
> > Add helper to queue TDR immediately for current and future jobs. This
> > will be used in XE, new Intel GPU driver, to trigger the TDR to cleanup
> 
> Please use present tense, "is", in code, comments, commits, etc.
> 
> Is it "XE" or is it "Xe"? I always thought it was "Xe".
> 

Yea should be 'Xe'.

> 	This is used in Xe, a new Intel GPU driver, to trigger a TDR to clean up
> 

Will fix.

> Code, comments, commits, etc., immediately become history, and it's a bit
> ambitious to use future tense in something which immediately becomes
> history. It's much better to describe what is happening now, including the patch
> in question (any patch, ftm) is considered "now"/"current state" as well.
>

Got it.

> > a drm_scheduler that encounter error[.]> 
> > v2:
> >  - Drop timeout args, rename function, use mod delayed work (Luben)
> > 
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> >  drivers/gpu/drm/scheduler/sched_main.c | 19 ++++++++++++++++++-
> >  include/drm/gpu_scheduler.h            |  1 +
> >  2 files changed, 19 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index e8a3e6033f66..88ef8be2d3c7 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -435,7 +435,7 @@ static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
> >  
> >  	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> >  	    !list_empty(&sched->pending_list))
> > -		queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
> > +		mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
> >  }
> >  
> >  static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
> > @@ -445,6 +445,23 @@ static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
> >  	spin_unlock(&sched->job_list_lock);
> >  }
> >  
> > +/**
> > + * drm_sched_tdr_queue_imm: - immediately start timeout handler including future
> > + * jobs
> 
> Let's not mention "including future jobs", since we don't know the future.
> But we can sneak "jobs" into the description like this:
> 
>  * drm_sched_tdr_queue_imm - immediately start job timeout handler
> 
> :-)
>

Will change.
 
> > + *
> > + * @sched: scheduler where the timeout handling should be started.
> 
> "where" --> "for which"
> The former denotes a location, like in space-time, and the latter
> denotes an object, like a scheduler, a spaceship, a bicycle, etc.
>

+1
 
> > + *
> > + * Start timeout handling immediately for current and future jobs
> 
>  * Start timeout handling immediately for the named scheduler.
>

+1

> > + */
> > +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched)
> > +{
> > +	spin_lock(&sched->job_list_lock);
> > +	sched->timeout = 0;
> > +	drm_sched_start_timeout(sched);
> > +	spin_unlock(&sched->job_list_lock);
> > +}
> > +EXPORT_SYMBOL(drm_sched_tdr_queue_imm);
> > +
> >  /**
> >   * drm_sched_fault - immediately start timeout handler
> >   *
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 7e6c121003ca..27f5778bbd6d 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -568,6 +568,7 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> >  				    struct drm_gpu_scheduler **sched_list,
> >                                     unsigned int num_sched_list);
> >  
> > +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched);
> >  void drm_sched_job_cleanup(struct drm_sched_job *job);
> >  void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched);
> >  bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched);
> 
> Looks good!
> 
> Fix the above, for an immediate R-B. :-)

Thanks for the review, will fix all of this.

Matt

> -- 
> Regards,
> Luben
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index e8a3e6033f66..88ef8be2d3c7 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -435,7 +435,7 @@  static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
 
 	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
 	    !list_empty(&sched->pending_list))
-		queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
+		mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
 }
 
 static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
@@ -445,6 +445,23 @@  static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
 	spin_unlock(&sched->job_list_lock);
 }
 
+/**
+ * drm_sched_tdr_queue_imm: - immediately start timeout handler including future
+ * jobs
+ *
+ * @sched: scheduler where the timeout handling should be started.
+ *
+ * Start timeout handling immediately for current and future jobs
+ */
+void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched)
+{
+	spin_lock(&sched->job_list_lock);
+	sched->timeout = 0;
+	drm_sched_start_timeout(sched);
+	spin_unlock(&sched->job_list_lock);
+}
+EXPORT_SYMBOL(drm_sched_tdr_queue_imm);
+
 /**
  * drm_sched_fault - immediately start timeout handler
  *
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 7e6c121003ca..27f5778bbd6d 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -568,6 +568,7 @@  void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
 				    struct drm_gpu_scheduler **sched_list,
                                    unsigned int num_sched_list);
 
+void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched);
 void drm_sched_job_cleanup(struct drm_sched_job *job);
 void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched);
 bool drm_sched_submit_ready(struct drm_gpu_scheduler *sched);