From patchwork Tue Oct 21 10:55:20 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dheeraj Jamwal <dheerajx.s.jamwal@intel.com>
X-Patchwork-Id: 5120141
Return-Path: <ltsi-dev-bounces@lists.linuxfoundation.org>
X-Original-To: patchwork-ltsi-dev@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.19.201])
	by patchwork1.web.kernel.org (Postfix) with ESMTP id 148609F349
	for <patchwork-ltsi-dev@patchwork.kernel.org>;
	Tue, 21 Oct 2014 11:53:41 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 32E7720121
	for <patchwork-ltsi-dev@patchwork.kernel.org>;
	Tue, 21 Oct 2014 11:53:40 +0000 (UTC)
Received: from mail.linuxfoundation.org (mail.linuxfoundation.org
	[140.211.169.12])
	(using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 40E1C200F4
	for <patchwork-ltsi-dev@patchwork.kernel.org>;
	Tue, 21 Oct 2014 11:53:39 +0000 (UTC)
Received: from mail.linux-foundation.org (localhost [127.0.0.1])
	by mail.linuxfoundation.org (Postfix) with ESMTP id 61AC71223;
	Tue, 21 Oct 2014 11:10:53 +0000 (UTC)
X-Original-To: ltsi-dev@lists.linuxfoundation.org
Delivered-To: ltsi-dev@mail.linuxfoundation.org
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 25C021242
	for <ltsi-dev@lists.linuxfoundation.org>;
	Tue, 21 Oct 2014 11:10:52 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
	by smtp1.linuxfoundation.org (Postfix) with ESMTP id AA6FB201F0
	for <ltsi-dev@lists.linuxfoundation.org>;
	Tue, 21 Oct 2014 11:10:51 +0000 (UTC)
Received: from fmsmga002.fm.intel.com ([10.253.24.26])
	by fmsmga101.fm.intel.com with ESMTP; 21 Oct 2014 04:10:51 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.04,761,1406617200"; d="scan'208";a="617798947"
Received: from ubuntu-desktop.png.intel.com ([10.221.122.25])
	by fmsmga002.fm.intel.com with ESMTP; 21 Oct 2014 04:10:50 -0700
From: Dheeraj Jamwal <dheerajx.s.jamwal@intel.com>
To: ltsi-dev@lists.linuxfoundation.org
Date: Tue, 21 Oct 2014 18:55:20 +0800
Message-Id: <1413889294-31328-721-git-send-email-dheerajx.s.jamwal@intel.com>
X-Mailer: git-send-email 1.7.9.5
In-Reply-To: <1413889294-31328-1-git-send-email-dheerajx.s.jamwal@intel.com>
References: <dheerajx.s.jamwal@intel.com>
	<1413889294-31328-1-git-send-email-dheerajx.s.jamwal@intel.com>
X-Spam-Status: No, score=-5.6 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
	RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
Subject: [LTSI-dev] [PATCH 0720/1094] drm/i915: get a runtime PM ref for the
	deferred GPU reset work
X-BeenThere: ltsi-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "A list to discuss patches, development,
	and other things related to the LTSI project"
	<ltsi-dev.lists.linuxfoundation.org>
List-Unsubscribe: 
 <https://lists.linuxfoundation.org/mailman/options/ltsi-dev>,
	<mailto:ltsi-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/ltsi-dev/>
List-Post: <mailto:ltsi-dev@lists.linuxfoundation.org>
List-Help: <mailto:ltsi-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/ltsi-dev>,
	<mailto:ltsi-dev-request@lists.linuxfoundation.org?subject=subscribe>
Sender: ltsi-dev-bounces@lists.linuxfoundation.org
Errors-To: ltsi-dev-bounces@lists.linuxfoundation.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Imre Deak <imre.deak@intel.com>

At the moment the deferred GPU reset work can run while the device is in
D3 if the last runtime PM reference is dropped after a hang is detected
and the work is scheduled, but before the work executes. One case I
could trigger is the simulated reset via the i915_wedged debugfs entry.
Fix this by holding an RPM reference while the reset work accesses the
HW.

v2:
- Instead of getting/putting the RPM reference in the reset work itself,
  get it before scheduling the work. This also prevents going to D3
  before the work gets to run, in addition to making sure that the work
  itself runs in D0. (Ville, Daniel)
v3:
- fix inverted logic when putting the RPM ref on behalf of a cancelled
  GPU reset work (Ville)
v4:
- Taking the RPM ref in the interrupt handler isn't really needed
  because it's already guaranteed that we hold an RPM ref until the end
  of the reset work in all cases we care about. So take the ref in the
  reset work instead (for cases like i915_wedged_set). (Daniel)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit f454c6940ed4bfc76670295534270ccebf366898)

Signed-off-by: Dheeraj Jamwal <dheerajx.s.jamwal@intel.com>
---
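Note for reviewers: the get/put pairing described above can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions, not driver code: every toy_* name is hypothetical,
the device is assumed to drop to D3 as soon as the reference count
reaches zero (the real runtime PM core may delay the suspend), and a
"fault" simply counts a HW access attempted while in D3.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the i915 driver. */
enum toy_power_state { TOY_D0, TOY_D3 };

struct toy_device {
	int rpm_refcount;           /* outstanding runtime PM references */
	enum toy_power_state state; /* assumed D3 whenever refcount is 0 */
	int hw_faults;              /* HW accesses attempted while in D3 */
	int resets_done;            /* resets that actually reached the HW */
};

/* The first reference wakes the (modelled) device. */
void toy_rpm_get(struct toy_device *dev)
{
	if (dev->rpm_refcount++ == 0)
		dev->state = TOY_D0;
}

/* Dropping the last reference lets the (modelled) device suspend. */
void toy_rpm_put(struct toy_device *dev)
{
	assert(dev->rpm_refcount > 0);
	if (--dev->rpm_refcount == 0)
		dev->state = TOY_D3;
}

/* Stand-in for the HW access done by the reset path. */
void toy_hw_reset(struct toy_device *dev)
{
	if (dev->state != TOY_D0) {
		dev->hw_faults++;
		return;
	}
	dev->resets_done++;
}

/*
 * Stand-in for the deferred reset work. With take_ref set, the HW
 * access is bracketed by a get/put pair, as this patch does.
 */
void toy_reset_work(struct toy_device *dev, bool take_ref)
{
	if (take_ref)
		toy_rpm_get(dev);

	toy_hw_reset(dev);

	if (take_ref)
		toy_rpm_put(dev);
}
```

In this model, calling toy_reset_work(&dev, false) on a suspended
device records a fault, while toy_reset_work(&dev, true) completes the
reset and leaves the device back in D3 with no reference leaked.
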
 drivers/gpu/drm/i915/i915_irq.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 274c108..2446e61 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2188,6 +2188,14 @@ static void i915_error_work_func(struct work_struct *work)
 				   reset_event);
 
 		/*
+		 * In most cases it's guaranteed that we get here with an RPM
+		 * reference held, for example because there is a pending GPU
+		 * request that won't finish until the reset is done. This
+		 * isn't the case at least when we get here by doing a
+		 * simulated reset via debugfs, so get an RPM reference.
+		 */
+		intel_runtime_pm_get(dev_priv);
+		/*
 		 * All state reset _must_ be completed before we update the
 		 * reset counter, for otherwise waiters might miss the reset
 		 * pending state and not properly drop locks, resulting in
@@ -2197,6 +2205,8 @@ static void i915_error_work_func(struct work_struct *work)
 
 		intel_display_handle_reset(dev);
 
+		intel_runtime_pm_put(dev_priv);
+
 		if (ret == 0) {
 			/*
 			 * After all the gem state is reset, increment the reset