diff mbox series

[3/3] drm/i915: Never return 0 if request wait succeeds

Message ID 20221116112532.36253-4-janusz.krzysztofik@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915: Fix timeout handling when retiring requests | expand

Commit Message

Janusz Krzysztofik Nov. 16, 2022, 11:25 a.m. UTC
According to the docs of i915_request_wait_timeout(), its return value
"may be zero if the request is unfinished after the timeout expires."
However, 0 is also returned when the request is found finished right
after the timeout has expired.

Since the docs also state: "If the timeout is 0, it will return 1 if the
fence is signaled.", return 1 also when the fence is found signaled after
non-zero timeout has expired.

Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.17
---
 drivers/gpu/drm/i915/i915_request.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Andrzej Hajda Nov. 16, 2022, 2:42 p.m. UTC | #1
On 16.11.2022 12:25, Janusz Krzysztofik wrote:
> According to the docs of i915_request_wait_timeout(), its return value
> "may be zero if the request is unfinished after the timeout expires."
> However, 0 is also returned when the request is found finished right
> after the timeout has expired.
> 
> Since the docs also state: "If the timeout is 0, it will return 1 if the
> fence is signaled.", return 1 also when the fence is found signaled after
> non-zero timeout has expired.

As I understand the patch "drm/i915: Fix i915_request fence wait 
semantics", and the docs "timeout is 0" means the initial value of 
timeout argument and this is handled already on the beginning of the 
function.
In case initial timeout is greater than zero and then it expires, 
function should return 0 regardless of fence state. This is what I have 
understood from reading docs and implementation of 
dma_fence_default_wait [1], which should be the best source of info 
about "dma_fence wait semantic".

As I said already, mixing remaining time and bool in return value of 
dma_fence_wait* functions is very confusing, but changing it would 
require major rework of the framework.

[1]: 
https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/dma-fence.c#L753

Regards
Andrzej

> 
> Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.17
> ---
>   drivers/gpu/drm/i915/i915_request.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index f949a9495758a..406ddfafbed4d 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -2079,6 +2079,8 @@ long i915_request_wait_timeout(struct i915_request *rq,
>   
>   		timeout = io_schedule_timeout(timeout);
>   	}
> +	if (!timeout)	/* expired but signaled, we shouldn't return 0 */
> +		timeout = 1;
>   	__set_current_state(TASK_RUNNING);
>   
>   	if (READ_ONCE(wait.tsk))
Janusz Krzysztofik Nov. 16, 2022, 3:48 p.m. UTC | #2
On Wednesday, 16 November 2022 15:42:46 CET Andrzej Hajda wrote:
> On 16.11.2022 12:25, Janusz Krzysztofik wrote:
> > According to the docs of i915_request_wait_timeout(), its return value
> > "may be zero if the request is unfinished after the timeout expires."
> > However, 0 is also returned when the request is found finished right
> > after the timeout has expired.
> > 
> > Since the docs also state: "If the timeout is 0, it will return 1 if the
> > fence is signaled.", return 1 also when the fence is found signaled after
> > non-zero timeout has expired.
> 
> As I understand the patch "drm/i915: Fix i915_request fence wait 
> semantics", and the docs "timeout is 0" means the initial value of 
> timeout argument and this is handled already on the beginning of the 
> function.
> In case initial timeout is greater than zero and then it expires, 
> function should return 0 regardless of fence state. This is what I have 
> understood from reading docs and implementation of 
> dma_fence_default_wait [1], which should be the best source of info 
> about "dma_fence wait semantic".
> 
> As I said already, mixing remaining time and bool in return value of 
> dma_fence_wait* functions is very confusing, but changing it would 
> require major rework of the framework.

OK, let's drop this controversial patch.  The corner case it touches should 
already be handled correctly by intel_gt_retire_requests_timeout(), which this 
series is about, after 1/3 and 2/3 are applied.

Thanks,
Janusz

> 
> [1]: 
> https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/dma-fence.c#L753
> 
> Regards
> Andrzej
> 
> > 
> > Fixes: 7e2e69ed4678 ("drm/i915: Fix i915_request fence wait semantics")
> > Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> > Cc: stable@vger.kernel.org # v5.17
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index f949a9495758a..406ddfafbed4d 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -2079,6 +2079,8 @@ long i915_request_wait_timeout(struct i915_request *rq,
> >   
> >   		timeout = io_schedule_timeout(timeout);
> >   	}
> > +	if (!timeout)	/* expired but signaled, we shouldn't return 0 */
> > +		timeout = 1;
> >   	__set_current_state(TASK_RUNNING);
> >   
> >   	if (READ_ONCE(wait.tsk))
> 
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index f949a9495758a..406ddfafbed4d 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -2079,6 +2079,8 @@  long i915_request_wait_timeout(struct i915_request *rq,
 
 		timeout = io_schedule_timeout(timeout);
 	}
+	if (!timeout)	/* expired but signaled, we shouldn't return 0 */
+		timeout = 1;
 	__set_current_state(TASK_RUNNING);
 
 	if (READ_ONCE(wait.tsk))