diff mbox

[RFC,06/11] drm/i915: Disable warnings for TDR interruptions in the display driver.

Message ID 1433783009-17251-7-git-send-email-tomas.elf@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Tomas Elf June 8, 2015, 5:03 p.m. UTC
Now that i915_wait_request takes per-engine hang recovery into account it is
more likely to fail and return -EAGAIN or -EIO due to hung engines (unlike
before when it would only fail if a full GPU reset was imminent). What this
means is that the display driver might see more frequent failures that are only
a consequence of ongoing hang recoveries. Therefore, let's not spew a lot of
warnings in the kernel log every time a flip fails due to an ongoing hang
recovery, since a) This is to be expected during hang recovery and b) It
severely degrades performance and makes the hang recovery take even longer to
complete, which ultimately might cause the userland window compositor to fail
because the flip is taking too long to complete and it simply gives up, leaving
the screen in a frozen state.

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c |   16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

Comments

Chris Wilson June 8, 2015, 5:53 p.m. UTC | #1
On Mon, Jun 08, 2015 at 06:03:24PM +0100, Tomas Elf wrote:
> Now that i915_wait_request takes per-engine hang recovery into account it is
> more likely to fail and return -EAGAIN or -EIO due to hung engines (unlike
> before when it would only fail if a full GPU reset was imminent). What this
> means is that the display driver might see more frequent failures that are only
> a consequence of ongoing hang recoveries. Therefore, let's not spew a lot of
> warnings in the kernel log every time a flip fails due to an ongoing hang
> recovery, since a) This is to be expected during hang recovery and b) It
> severely degrades performance and makes the hang recovery take even longer to
> complete, which ultimately might cause the userland window compositor to fail
> because the flip is taking too long to complete and it simply gives up, leaving
> the screen in a frozen state.
> 
> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c |   16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 97922fb..128c58c 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -10356,9 +10356,21 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
>  
>  	mmio_flip = &crtc->mmio_flip;
>  	if (mmio_flip->req)
> -		WARN_ON(__i915_wait_request(mmio_flip->req,
> +	{
> +		int ret = __i915_wait_request(mmio_flip->req,
>  					    crtc->reset_counter,
> -					    false, NULL, NULL) != 0);
> +					    false, NULL, NULL);
> +
> +		/*
> +		 * If a hang has been detected then we expect
> +		 * __i915_wait_request to fail since it's probably going to be
> +		 * forced to give up the struct_mutex and try to grab it again
> +		 * once the TDR is done. Don't produce a warning in that case!
> +		 */
> +		if (ret)
> +			WARN_ON(!i915_gem_check_wedge(crtc->base.dev->dev_private,
> +					NULL, true));

Now this is plain wrong and should have an alert that your proposed
changes to __i915_wait_request was wrong.
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 97922fb..128c58c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10356,9 +10356,21 @@  static void intel_mmio_flip_work_func(struct work_struct *work)
 
 	mmio_flip = &crtc->mmio_flip;
 	if (mmio_flip->req)
-		WARN_ON(__i915_wait_request(mmio_flip->req,
+	{
+		int ret = __i915_wait_request(mmio_flip->req,
 					    crtc->reset_counter,
-					    false, NULL, NULL) != 0);
+					    false, NULL, NULL);
+
+		/*
+		 * If a hang has been detected then we expect
+		 * __i915_wait_request to fail since it's probably going to be
+		 * forced to give up the struct_mutex and try to grab it again
+		 * once the TDR is done. Don't produce a warning in that case!
+		 */
+		if (ret)
+			WARN_ON(!i915_gem_check_wedge(crtc->base.dev->dev_private,
+					NULL, true));
+	}
 
 	intel_do_mmio_flip(crtc);
 	if (mmio_flip->req) {