diff mbox series

[v2,1/2] drm/i915: Fix negative value passed as remaining time

Message ID 20221118104222.57328-2-janusz.krzysztofik@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915: Fix timeout handling when retiring requests | expand

Commit Message

Janusz Krzysztofik Nov. 18, 2022, 10:42 a.m. UTC
Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
with GuC") extended the API of intel_gt_retire_requests_timeout() with an
extra argument 'remaining_timeout', intended for passing back unconsumed
portion of requested timeout when 0 (success) is returned.  However, when
request retirement happens to succeed despite an error returned by a call
to dma_fence_wait_timeout(), that error code (a negative value) is passed
back instead of remaining time.  If we then pass that negative value
forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG
will be triggered.

If request retirement succeeds but an error code other than -ETIME is
passed back via remaininig_timeout, we have no clue on how much of
the initial timeout might have been left for spending it on waiting for
GuC to become idle.  Then, we have no choice other than fail in that case
-- do it.  However, if -ETIME is returned via remaining_timeout then we
know that no more time has been left.  Then, pass 0 timeout value to
intel_uc_wait_for_idle() to give it a chance to return success if GuC is
already idle.

v2: Fix the issue on the caller side, not the provider.

Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.15+
---
 drivers/gpu/drm/i915/gt/intel_gt.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Comments

Janusz Krzysztofik Nov. 21, 2022, 8:36 a.m. UTC | #1
On Friday, 18 November 2022 11:42:21 CET Janusz Krzysztofik wrote:
> Commit b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work
> with GuC") extended the API of intel_gt_retire_requests_timeout() with an
> extra argument 'remaining_timeout', intended for passing back unconsumed
> portion of requested timeout when 0 (success) is returned.  However, when
> request retirement happens to succeed despite an error returned by a call
> to dma_fence_wait_timeout(), that error code (a negative value) is passed
> back instead of remaining time.  If we then pass that negative value
> forward as requested timeout to intel_uc_wait_for_idle(), an explicit BUG
> will be triggered.
> 
> If request retirement succeeds but an error code other than -ETIME is
> passed back via remaininig_timeout, we have no clue on how much of
> the initial timeout might have been left for spending it on waiting for
> GuC to become idle.  Then, we have no choice other than fail in that case
> -- do it.  

Looking at this again, I think we should ignore those errors, like they have 
been already ignored by intel_gt_retire_requests_timeout() returning 0, and 
call intel_uc_wait_for_idle() with 0 timeout.

I'll submit new version if you agree.

Thanks,
Janusz

> However, if -ETIME is returned via remaining_timeout then we
> know that no more time has been left.  Then, pass 0 timeout value to
> intel_uc_wait_for_idle() to give it a chance to return success if GuC is
> already idle.
> 
> v2: Fix the issue on the caller side, not the provider.
> 
> Fixes: b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work 
with GuC")
> Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: stable@vger.kernel.org # v5.15+
> ---
>  drivers/gpu/drm/i915/gt/intel_gt.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/
intel_gt.c
> index 0325f071046ca..5d612ba547d23 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -677,8 +677,15 @@ int intel_gt_wait_for_idle(struct intel_gt *gt, long 
timeout)
>  			return -EINTR;
>  	}
>  
> -	return timeout ? timeout : intel_uc_wait_for_idle(&gt->uc,
> -							  
remaining_timeout);
> +	if (timeout)
> +		return timeout;
> +
> +	if (remaining_timeout == -ETIME)
> +		remaining_timeout = 0;
> +	else if (remaining_timeout < 0)
> +		return remaining_timeout;
> +
> +	return intel_uc_wait_for_idle(&gt->uc, remaining_timeout);
>  }
>  
>  int intel_gt_init(struct intel_gt *gt)
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 0325f071046ca..5d612ba547d23 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -677,8 +677,15 @@  int intel_gt_wait_for_idle(struct intel_gt *gt, long timeout)
 			return -EINTR;
 	}
 
-	return timeout ? timeout : intel_uc_wait_for_idle(&gt->uc,
-							  remaining_timeout);
+	if (timeout)
+		return timeout;
+
+	if (remaining_timeout == -ETIME)
+		remaining_timeout = 0;
+	else if (remaining_timeout < 0)
+		return remaining_timeout;
+
+	return intel_uc_wait_for_idle(&gt->uc, remaining_timeout);
 }
 
 int intel_gt_init(struct intel_gt *gt)