[1/2] drm/sched: Fix drm_sched_entity_flush() return val

Message ID	20241119134122.21950-2-pstanner@redhat.com (mailing list archive)
State	New
Headers	show Return-Path: <dri-devel-bounces@lists.freedesktop.org> From: Philipp Stanner <pstanner@redhat.com> To: Luben Tuikov <ltuikov89@gmail.com>, Matthew Brost <matthew.brost@intel.com>, Danilo Krummrich <dakr@kernel.org>, Philipp Stanner <pstanner@redhat.com>, Maarten Lankhorst <maarten.lankhorst@linux.intel.com>, Maxime Ripard <mripard@kernel.org>, Thomas Zimmermann <tzimmermann@suse.de>, David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch> Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, =?utf-8?q?Christian_K=C3=B6nig?= <christian.koenig@amd.com> Subject: [PATCH 1/2] drm/sched: Fix drm_sched_entity_flush() return val Date: Tue, 19 Nov 2024 14:41:22 +0100 Message-ID: <20241119134122.21950-2-pstanner@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: list Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
Series	[1/2] drm/sched: Fix drm_sched_entity_flush() return val \| expand [1/2] drm/sched: Fix drm_sched_entity_flush() return val [2/2] drm/sched: Fix docu of drm_sched_entity_flush()

Message ID

20241119134122.21950-2-pstanner@redhat.com (mailing list archive)

State

New

Headers

From: Philipp Stanner <pstanner@redhat.com>
To: Luben Tuikov <ltuikov89@gmail.com>,
 Matthew Brost <matthew.brost@intel.com>,
 Danilo Krummrich <dakr@kernel.org>, Philipp Stanner <pstanner@redhat.com>,
 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
 Maxime Ripard <mripard@kernel.org>,
 Thomas Zimmermann <tzimmermann@suse.de>, David Airlie <airlied@gmail.com>,
 Simona Vetter <simona@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	=?utf-8?q?Christian_K=C3=B6nig?= <christian.koenig@amd.com>
Subject: [PATCH 1/2] drm/sched: Fix drm_sched_entity_flush() return val
Date: Tue, 19 Nov 2024 14:41:22 +0100
Message-ID: <20241119134122.21950-2-pstanner@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: list
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>

Series

[1/2] drm/sched: Fix drm_sched_entity_flush() return val | expand

Commit Message

Philipp Stanner Nov. 19, 2024, 1:41 p.m. UTC

The documentation of drm_sched_entity_flush() states that the function
shall - always - return the remaining timeout time in jiffies.

However, that is not what the function actually does; in one of its if
branches it simply returns the unchanged timeout value.

Furthermore, the used function wait_event_timeout() doesn't always
return the remaining timeout time.

Adjust the function so that it actually does what the documentation
states it shall do.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
---
@AMD:
You guys are the only ones who use the function's return code, so I
leave it completely up to you to decide what behavior you want.

But we should at least do something, because right now function
documentation and behavior do not match.

P.
---
 drivers/gpu/drm/scheduler/sched_entity.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

Comments

Philipp Stanner Nov. 19, 2024, 1:48 p.m. UTC | #1

On Tue, 2024-11-19 at 14:41 +0100, Philipp Stanner wrote:
> The documentation of drm_sched_entity_flush() states that the
> function
> shall - always - return the remaining timeout time in jiffies.
> 
> However, that is not what the function actually does; in one of its
> if
> branches it simply returns the unchanged timeout value.
> 
> Furthermore, the used function wait_event_timeout() doesn't always
> return the remaining timeout time.
> 
> Adjust the function so that it actually does what the documentation
> states it shall do.
> 
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Philipp Stanner <pstanner@redhat.com>
> ---
> @AMD:
> You guys are the only ones who use the function's return code, so I
> leave it completely up to you to decide what behavior you want.
> 
> But we should at least do something, because right now function
> documentation and behavior do not match.
> 
> P.
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 23 ++++++++++++++++------
> -
>  1 file changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
> b/drivers/gpu/drm/scheduler/sched_entity.c
> index a75eede8bf8d..16b172aee453 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -278,7 +278,7 @@ static void drm_sched_entity_kill(struct
> drm_sched_entity *entity)
>   * waiting, removes the entity from the runqueue and returns an
> error when the
>   * process was killed.
>   *
> - * Returns the remaining time in jiffies left from the input timeout
> + * Returns: 0 if the timeout ellapsed, the remaining time otherwise.
>   */
>  long drm_sched_entity_flush(struct drm_sched_entity *entity, long
> timeout)
>  {
> @@ -294,15 +294,24 @@ long drm_sched_entity_flush(struct
> drm_sched_entity *entity, long timeout)
>  	 * The client will not queue more IBs during this fini,
> consume existing
>  	 * queued IBs or discard them on SIGKILL
>  	 */
> -	if (current->flags & PF_EXITING) {
> -		if (timeout)
> -			ret = wait_event_timeout(
> -					sched->job_scheduled,
> -
> 					drm_sched_entity_is_idle(entity),
> -					timeout);
> +	if (timeout != 0 && (current->flags & PF_EXITING)) {
> +		ret = wait_event_timeout(sched->job_scheduled,
> +				drm_sched_entity_is_idle(entity),
> +				timeout);
> +		/*
> +		 * wait_event_timeout() returns 1 if it timed out
> but the
> +		 * condition became true on timeout. We only care
> about whether
> +		 * it timed out or not.
> +		 */
> +		if (ret == 1)
> +			ret = 0;
>  	} else {
>  		wait_event_killable(sched->job_scheduled,
>  				   
> drm_sched_entity_is_idle(entity));
> +
> +		ret -= (long)get_jiffies_64();

Ah, just recognized that this is probably nonsense – anyways, let's
discuss what we shall do here. I can fix it later.

P.

> +		if (ret < 0)
> +			ret = 0;
>  	}
>  
>  	/* For killed process disable any more IBs enqueue right now
> */

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index a75eede8bf8d..16b172aee453 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -278,7 +278,7 @@  static void drm_sched_entity_kill(struct drm_sched_entity *entity)
  * waiting, removes the entity from the runqueue and returns an error when the
  * process was killed.
  *
- * Returns the remaining time in jiffies left from the input timeout
+ * Returns: 0 if the timeout ellapsed, the remaining time otherwise.
  */
 long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
 {
@@ -294,15 +294,24 @@  long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
 	 * The client will not queue more IBs during this fini, consume existing
 	 * queued IBs or discard them on SIGKILL
 	 */
-	if (current->flags & PF_EXITING) {
-		if (timeout)
-			ret = wait_event_timeout(
-					sched->job_scheduled,
-					drm_sched_entity_is_idle(entity),
-					timeout);
+	if (timeout != 0 && (current->flags & PF_EXITING)) {
+		ret = wait_event_timeout(sched->job_scheduled,
+				drm_sched_entity_is_idle(entity),
+				timeout);
+		/*
+		 * wait_event_timeout() returns 1 if it timed out but the
+		 * condition became true on timeout. We only care about whether
+		 * it timed out or not.
+		 */
+		if (ret == 1)
+			ret = 0;
 	} else {
 		wait_event_killable(sched->job_scheduled,
 				    drm_sched_entity_is_idle(entity));
+
+		ret -= (long)get_jiffies_64();
+		if (ret < 0)
+			ret = 0;
 	}
 
 	/* For killed process disable any more IBs enqueue right now */

[1/2] drm/sched: Fix drm_sched_entity_flush() return val

Commit Message

Comments

Patch