Message ID | 20221116112532.36253-4-janusz.krzysztofik@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915: Fix timeout handling when retiring requests | expand |
On 16.11.2022 12:25, Janusz Krzysztofik wrote: > According to the docs of i915_request_wait_timeout(), its return value > "may be zero if the request is unfinished after the timeout expires." > However, 0 is also returned when the request is found finished right > after the timeout has expired. > > Since the docs also state: "If the timeout is 0, it will return 1 if the > fence is signaled.", return 1 also when the fence is found signaled after > non-zero timeout has expired. As I understand the patch "drm/i915: Fix i915_request fence wait semantics", and the docs "timeout is 0" means the initial value of timeout argument and this is handled already on the beginning of the function. In case initial timeout is greater than zero and then it expires, function should return 0 regardless of fence state. This is what I have understood from reading docs and implementation of dma_fence_default_wait [1], which should be the best source of info about "dma_fence wait semantic". As I said already, mixing remaining time and bool in return value of dma_fence_wait* functions is very confusing, but changing it would require major rework of the framework. [1]: https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/dma-fence.c#L753 Regards Andrzej > > Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics") > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> > Cc: stable@vger.kernel.org # v5.17 > --- > drivers/gpu/drm/i915/i915_request.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c > index f949a9495758a..406ddfafbed4d 100644 > --- a/drivers/gpu/drm/i915/i915_request.c > +++ b/drivers/gpu/drm/i915/i915_request.c > @@ -2079,6 +2079,8 @@ long i915_request_wait_timeout(struct i915_request *rq, > > timeout = io_schedule_timeout(timeout); > } > + if (!timeout) /* expired but signaled, we shouldn't return 0 */ > + timeout = 1; > __set_current_state(TASK_RUNNING); > > if (READ_ONCE(wait.tsk))
On Wednesday, 16 November 2022 15:42:46 CET Andrzej Hajda wrote: > On 16.11.2022 12:25, Janusz Krzysztofik wrote: > > According to the docs of i915_request_wait_timeout(), its return value > > "may be zero if the request is unfinished after the timeout expires." > > However, 0 is also returned when the request is found finished right > > after the timeout has expired. > > > > Since the docs also state: "If the timeout is 0, it will return 1 if the > > fence is signaled.", return 1 also when the fence is found signaled after > > non-zero timeout has expired. > > As I understand the patch "drm/i915: Fix i915_request fence wait > semantics", and the docs "timeout is 0" means the initial value of > timeout argument and this is handled already on the beginning of the > function. > In case initial timeout is greater than zero and then it expires, > function should return 0 regardless of fence state. This is what I have > understood from reading docs and implementation of > dma_fence_default_wait [1], which should be the best source of info > about "dma_fence wait semantic". > > As I said already, mixing remaining time and bool in return value of > dma_fence_wait* functions is very confusing, but changing it would > require major rework of the framework. OK, let's drop this controversial patch. The corner case it touches should already be handled correctly by intel_gt_retire_requests_timeout(), which this series is about, after 1/3 and 2/3 are applied. Thanks, Janusz > > [1]: > https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/dma-fence.c#L753 > > Regards > Andrzej > > > > > Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics") > > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> > > Cc: stable@vger.kernel.org # v5.17 > > --- > > drivers/gpu/drm/i915/i915_request.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c > > index f949a9495758a..406ddfafbed4d 100644 > > --- a/drivers/gpu/drm/i915/i915_request.c > > +++ b/drivers/gpu/drm/i915/i915_request.c > > @@ -2079,6 +2079,8 @@ long i915_request_wait_timeout(struct i915_request *rq, > > > > timeout = io_schedule_timeout(timeout); > > } > > + if (!timeout) /* expired but signaled, we shouldn't return 0 */ > > + timeout = 1; > > __set_current_state(TASK_RUNNING); > > > > if (READ_ONCE(wait.tsk)) > >
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index f949a9495758a..406ddfafbed4d 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -2079,6 +2079,8 @@ long i915_request_wait_timeout(struct i915_request *rq, timeout = io_schedule_timeout(timeout); } + if (!timeout) /* expired but signaled, we shouldn't return 0 */ + timeout = 1; __set_current_state(TASK_RUNNING); if (READ_ONCE(wait.tsk))
According to the docs of i915_request_wait_timeout(), its return value "may be zero if the request is unfinished after the timeout expires." However, 0 is also returned when the request is found finished right after the timeout has expired. Since the docs also state: "If the timeout is 0, it will return 1 if the fence is signaled.", return 1 also when the fence is found signaled after non-zero timeout has expired. Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics") Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: stable@vger.kernel.org # v5.17 --- drivers/gpu/drm/i915/i915_request.c | 2 ++ 1 file changed, 2 insertions(+)