diff mbox

[v4] drm/i915: Avoid GPU Hang when comming out of s3 or s4

Message ID 1431342327-8140-1-git-send-email-peter.antoine@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Peter Antoine May 11, 2015, 11:05 a.m. UTC
This patch fixed a timing issue that causes a GPU hang when a the system
comes out of power saving.

During pm_resume, We are submitting batchbuffers before enabling
Interrupts this is causing us to miss the context switch interrupt,
and in consequence intel_execlists_handle_ctx_events is not triggered.

This patch is based on a patch from Deepak S <deepak.s@intel.com>
from another platform.

The patch fixes an issue introduced by:
  commit e7778be1eab918274f79603d7c17b3ec8be77386
  drm/i915: Fix startup failure in LRC mode after recent init changes

The above patch added a call to init_context() to fix an issue introduced
by a previous patch. But, it then opened up a small timing window for the
batches being added by the init_context (basically setting up the context)
to complete before the interrupts have been turned on, thus hanging the
GPU.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89600
Signed-off-by: Peter Antoine <peter.antoine@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Comments

Greg KH May 11, 2015, 12:22 p.m. UTC | #1
On Mon, May 11, 2015 at 12:05:27PM +0100, Peter Antoine wrote:
> This patch fixed a timing issue that causes a GPU hang when a the system
> comes out of power saving.
> 
> During pm_resume, We are submitting batchbuffers before enabling
> Interrupts this is causing us to miss the context switch interrupt,
> and in consequence intel_execlists_handle_ctx_events is not triggered.
> 
> This patch is based on a patch from Deepak S <deepak.s@intel.com>
> from another platform.
> 
> The patch fixes an issue introduced by:
>   commit e7778be1eab918274f79603d7c17b3ec8be77386
>   drm/i915: Fix startup failure in LRC mode after recent init changes
> 
> The above patch added a call to init_context() to fix an issue introduced
> by a previous patch. But, it then opened up a small timing window for the
> batches being added by the init_context (basically setting up the context)
> to complete before the interrupts have been turned on, thus hanging the
> GPU.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89600
> Signed-off-by: Peter Antoine <peter.antoine@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
for how to do this properly.

</formletter>
Shuang He May 15, 2015, 2:33 a.m. UTC | #2
Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6380
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                                  276/276              276/276
ILK                 -1              302/302              301/302
SNB                 -1              314/314              313/314
IVB                                  338/338              338/338
BYT                                  286/286              286/286
BDW                 -1              320/320              319/320
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
*ILK  igt@kms_flip@flip-vs-dpms-interruptible      PASS(2)      DMESG_WARN(1)
(dmesg patch applied)drm:intel_pch_fifo_underrun_irq_handler[i915]]*ERROR*PCH_transcoder_A_FIFO_underrun@PCH transcoder A FIFO underrun
 SNB  igt@pm_rpm@dpms-mode-unset-non-lpsp      DMESG_WARN(13)PASS(1)      DMESG_WARN(1)
(dmesg patch applied)WARNING:at_drivers/gpu/drm/i915/intel_uncore.c:#assert_device_not_suspended[i915]()@WARNING:.* at .* assert_device_not_suspended+0x
*BDW  igt@gem_fence_thrash@bo-write-verify-y      PASS(2)      DMESG_WARN(1)
(dmesg patch applied)WARNING:at_drivers/gpu/drm/i915/intel_display.c:#assert_plane[i915]()@WARNING:.* at .* assert_plane
assertion_failure@assertion failure
WARNING:at_drivers/gpu/drm/drm_irq.c:#drm_wait_one_vblank[drm]()@WARNING:.* at .* drm_wait_one_vblank+0x
Note: You need to pay more attention to line start with '*'
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 6bb6c47..748ab13 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -734,6 +734,13 @@  static int i915_drm_resume(struct drm_device *dev)
 	intel_init_pch_refclk(dev);
 	drm_mode_config_reset(dev);
 
+	/* 
+	 * Interrupts have to be enabled before any batches are run. If not the
+	 * GPU will hang. The init_hw will initiate batches to update/restore
+	 * the context.
+	 */
+	intel_runtime_pm_enable_interrupts(dev_priv);
+
 	mutex_lock(&dev->struct_mutex);
 	if (i915_gem_init_hw(dev)) {
 		DRM_ERROR("failed to re-initialize GPU, declaring wedged!\n");
@@ -741,9 +748,7 @@  static int i915_drm_resume(struct drm_device *dev)
 	}
 	mutex_unlock(&dev->struct_mutex);
 
-	/* We need working interrupts for modeset enabling ... */
-	intel_runtime_pm_enable_interrupts(dev_priv);
-
+	/* This must follow the pm enable interrupts */
 	intel_modeset_init_hw(dev);
 
 	spin_lock_irq(&dev_priv->irq_lock);