Message ID | 20241119134122.21950-2-pstanner@redhat.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [1/2] drm/sched: Fix drm_sched_entity_flush() return val | expand |
On Tue, 2024-11-19 at 14:41 +0100, Philipp Stanner wrote: > The documentation of drm_sched_entity_flush() states that the > function > shall - always - return the remaining timeout time in jiffies. > > However, that is not what the function actually does; in one of its > if > branches it simply returns the unchanged timeout value. > > Furthermore, the used function wait_event_timeout() doesn't always > return the remaining timeout time. > > Adjust the function so that it actually does what the documentation > states it shall do. > > Cc: Christian König <christian.koenig@amd.com> > Signed-off-by: Philipp Stanner <pstanner@redhat.com> > --- > @AMD: > You guys are the only ones who use the function's return code, so I > leave it completely up to you to decide what behavior you want. > > But we should at least do something, because right now function > documentation and behavior do not match. > > P. > --- > drivers/gpu/drm/scheduler/sched_entity.c | 23 ++++++++++++++++------ > - > 1 file changed, 16 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c > b/drivers/gpu/drm/scheduler/sched_entity.c > index a75eede8bf8d..16b172aee453 100644 > --- a/drivers/gpu/drm/scheduler/sched_entity.c > +++ b/drivers/gpu/drm/scheduler/sched_entity.c > @@ -278,7 +278,7 @@ static void drm_sched_entity_kill(struct > drm_sched_entity *entity) > * waiting, removes the entity from the runqueue and returns an > error when the > * process was killed. > * > - * Returns the remaining time in jiffies left from the input timeout > + * Returns: 0 if the timeout ellapsed, the remaining time otherwise. > */ > long drm_sched_entity_flush(struct drm_sched_entity *entity, long > timeout) > { > @@ -294,15 +294,24 @@ long drm_sched_entity_flush(struct > drm_sched_entity *entity, long timeout) > * The client will not queue more IBs during this fini, > consume existing > * queued IBs or discard them on SIGKILL > */ > - if (current->flags & PF_EXITING) { > - if (timeout) > - ret = wait_event_timeout( > - sched->job_scheduled, > - > drm_sched_entity_is_idle(entity), > - timeout); > + if (timeout != 0 && (current->flags & PF_EXITING)) { > + ret = wait_event_timeout(sched->job_scheduled, > + drm_sched_entity_is_idle(entity), > + timeout); > + /* > + * wait_event_timeout() returns 1 if it timed out > but the > + * condition became true on timeout. We only care > about whether > + * it timed out or not. > + */ > + if (ret == 1) > + ret = 0; > } else { > wait_event_killable(sched->job_scheduled, > > drm_sched_entity_is_idle(entity)); > + > + ret -= (long)get_jiffies_64(); Ah, just recognized that this is probably nonsense – anyways, let's discuss what we shall do here. I can fix it later. P. > + if (ret < 0) > + ret = 0; > } > > /* For killed process disable any more IBs enqueue right now > */
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index a75eede8bf8d..16b172aee453 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -278,7 +278,7 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity) * waiting, removes the entity from the runqueue and returns an error when the * process was killed. * - * Returns the remaining time in jiffies left from the input timeout + * Returns: 0 if the timeout ellapsed, the remaining time otherwise. */ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout) { @@ -294,15 +294,24 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout) * The client will not queue more IBs during this fini, consume existing * queued IBs or discard them on SIGKILL */ - if (current->flags & PF_EXITING) { - if (timeout) - ret = wait_event_timeout( - sched->job_scheduled, - drm_sched_entity_is_idle(entity), - timeout); + if (timeout != 0 && (current->flags & PF_EXITING)) { + ret = wait_event_timeout(sched->job_scheduled, + drm_sched_entity_is_idle(entity), + timeout); + /* + * wait_event_timeout() returns 1 if it timed out but the + * condition became true on timeout. We only care about whether + * it timed out or not. + */ + if (ret == 1) + ret = 0; } else { wait_event_killable(sched->job_scheduled, drm_sched_entity_is_idle(entity)); + + ret -= (long)get_jiffies_64(); + if (ret < 0) + ret = 0; } /* For killed process disable any more IBs enqueue right now */
The documentation of drm_sched_entity_flush() states that the function shall - always - return the remaining timeout time in jiffies. However, that is not what the function actually does; in one of its if branches it simply returns the unchanged timeout value. Furthermore, the used function wait_event_timeout() doesn't always return the remaining timeout time. Adjust the function so that it actually does what the documentation states it shall do. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Philipp Stanner <pstanner@redhat.com> --- @AMD: You guys are the only ones who use the function's return code, so I leave it completely up to you to decide what behavior you want. But we should at least do something, because right now function documentation and behavior do not match. P. --- drivers/gpu/drm/scheduler/sched_entity.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-)