drm/i915: Disable all outputs early, before KMS takeover

Message ID	1301395589-8121-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State	New, archived
Headers	show Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by demeter1.kernel.org (8.14.4/8.14.3) with ESMTP id p2TAl4eh032526 for <patchwork-intel-gfx@patchwork.kernel.org>; Tue, 29 Mar 2011 10:47:25 GMT From: Chris Wilson <chris@chris-wilson.co.uk> To: intel-gfx@lists.freedesktop.org Date: Tue, 29 Mar 2011 11:46:29 +0100 Message-Id: <1301395589-8121-1-git-send-email-chris@chris-wilson.co.uk> In-Reply-To: <AANLkTi=VqkYjdiDLJvM-OfmBSGx-EkRkt=4XCDEnvZsU@mail.gmail.com> References: <AANLkTi=VqkYjdiDLJvM-OfmBSGx-EkRkt=4XCDEnvZsU@mail.gmail.com> Cc: Pekka Enberg <penberg@kernel.org>, linux-kernel@vger.kernel.org Subject: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early, before KMS takeover Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org

Message ID

1301395589-8121-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)

State

New, archived

Headers

From: Chris Wilson <chris@chris-wilson.co.uk>
To: intel-gfx@lists.freedesktop.org
Date: Tue, 29 Mar 2011 11:46:29 +0100
Message-Id: <1301395589-8121-1-git-send-email-chris@chris-wilson.co.uk>
In-Reply-To: <AANLkTi=VqkYjdiDLJvM-OfmBSGx-EkRkt=4XCDEnvZsU@mail.gmail.com>
References: <AANLkTi=VqkYjdiDLJvM-OfmBSGx-EkRkt=4XCDEnvZsU@mail.gmail.com>
Cc: Pekka Enberg <penberg@kernel.org>, linux-kernel@vger.kernel.org
Subject: [Intel-gfx] [PATCH] drm/i915: Disable all outputs early,
	before KMS takeover
Precedence: list
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org

Comments

Chris Wilson April 5, 2011, 10:30 a.m. UTC | #1

On Tue, 5 Apr 2011 13:21:08 +0300, Tomas Winkler <tomasw@gmail.com> wrote:
> On Fri, Apr 1, 2011 at 2:51 PM, Pekka Enberg <penberg@kernel.org> wrote:
> > Unfortunately I get a blank screen with after boot:
> > Nacked-by: Pekka Enberg <penberg@kernel.org>

But until you can tell me where it explodes on your system, we fix
issues on several other machines...
 
> Not sure this is related, but when I enable DRM_I915_KMS=y I'm got
> stuck after boot too. When KMS is disabled I can at least get to the
> console (no graphics)
> This is with kernel 2.6.39-rc1.  It worked fine with 2.6.38. I don't
> have much time bisect and reboot.  Shell I  try to pull drm-fixes for
> rc2 or use try this patch?

Add drm.debug=0xe to your grub kernel parameters and attach the dmesg for
the failing boot. From that I should be able to recommend a course of
action.
-Chris

Chris Wilson April 5, 2011, 2:27 p.m. UTC | #2

On Tue, 5 Apr 2011 17:11:56 +0300, Pekka Enberg <penberg@kernel.org> wrote:
> [    9.310010]  <IRQ>  [<ffffffff8168cd0c>] panic+0x91/0x19e
> [    9.310010]  [<ffffffff816909ea>] oops_end+0xea/0xf0
> [    9.310010]  [<ffffffff8106afbb>] no_context+0xfb/0x260
> [    9.310010]  [<ffffffff8106b245>] __bad_area_nosemaphore+0x125/0x1e0
> [    9.310010]  [<ffffffff8106b313>] bad_area_nosemaphore+0x13/0x20
> [    9.310010]  [<ffffffff816930c0>] do_page_fault+0x310/0x4c0
> [    9.310010]  [<ffffffff810ac06f>] ? up+0x2f/0x50
> [    9.310010]  [<ffffffff8108652f>] ? console_unlock+0x17f/0x1d0
> [    9.310010]  [<ffffffff8168fd25>] page_fault+0x25/0x30
> [    9.310010]  [<ffffffffa0061628>] ? i915_handle_error+0x198/0xed0 [i915]
> [    9.310010]  [<ffffffff8137d04a>] ? scsi_next_command+0x4a/0x60
> [    9.310010]  [<ffffffff8137ddd6>] ? scsi_io_completion+0x2f6/0x630
> [    9.310010]  [<ffffffffa0064c62>] i915_driver_irq_handler+0x472/0x17f0 [i915]
> 
> This is the same pre-2.6.39-rc1 kernel with the two patches applied.
> I'll try latest Linus master next to see if the same problem triggers.

Hmm. Looks like we don't prevent the PGTBL_ER with those patches (or we
provoke another), and trigger the error before we can handle it.

Double ungood. Thanks,
-Chris

Linus Torvalds April 5, 2011, 2:42 p.m. UTC | #3

On Tue, Apr 5, 2011 at 3:30 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Tue, 5 Apr 2011 13:21:08 +0300, Tomas Winkler <tomasw@gmail.com> wrote:
>> On Fri, Apr 1, 2011 at 2:51 PM, Pekka Enberg <penberg@kernel.org> wrote:
>> > Unfortunately I get a blank screen with after boot:
>> > Nacked-by: Pekka Enberg <penberg@kernel.org>
>
> But until you can tell me where it explodes on your system, we fix
> issues on several other machines...

NO.

Chris, you need to understand the issue of "NO REGRESSIONS".

It's a very simple rule: it DOES NOT MATTER ONE WHIT how many machines
you fix. You never ever regress. Patches that cause regressions are
reverted.

There are multiple reasons for that rule, but the basic one ends up
being very simple: you only _think_ you fix more machines than you
break. Why? Because the people who test out your patches are the
"active" people - and often predominantly the active people who have
problems. In contrast, the people for whom things already work aren't
even testing your patches in the first place. Then, six months later,
when they update to a new Fedora version, things suddenly don't work
for them, and it turns out that yes, you fixed ten active testers, but
you broke a thousand random people.

So even _one_ person saying "this is a regression" is a total blocker.
Really. It's that simple.

YOU NEVER EVER BREAK WORKING MACHINES.

Seriously. We had this for years in ACPI-land and with suspend/resume
with "one step forward, two steps back", and nobody ever knew if we
were doing any real progress at all, because machines that had working
suspend/resume one kernel version would be broken again the next.
There was no real pattern of improvement, there was just a random
pattern of "things get fixed on one machine, and break on another".

We introduced the "no regressions" rule, and things got seriously
better. Suddenly things started getting _reliably_ better.

The whole situation with i915 has been pretty damn random lately, and
you really really need to understand that this is simply not how it's
done. Your cavalier attitude ("but it fixes things for others") is
absolutely not acceptable.

Keith Cc'd, because that patch had better not show up in my tree.

                       Linus

Keith Packard April 5, 2011, 3:01 p.m. UTC | #4

On Tue, 5 Apr 2011 07:42:14 -0700, Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Keith Cc'd, because that patch had better not show up in my tree.

Having experienced the delights of ACPI in the past, I have already
subscribed to your newsletter.

Chris Wilson April 5, 2011, 3:12 p.m. UTC | #5

On Tue, 5 Apr 2011 07:42:14 -0700, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> NO.

And you seemed to have missed that patch has sat around waiting for Pekka
to give me some information on the failure on his machine.

I have been poking as many people as I could to get it reviewed and tested
on more machines so that someone else could either spot the problem or
capture the oops.

I was being facetious in order to get a response. Thanks for playing,
-Chris

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 7273037..65d5adf 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1176,11 +1176,11 @@  static bool i915_switcheroo_can_switch(struct pci_dev *pdev)
 	return can_switch;
 }
 
-static int i915_load_modeset_init(struct drm_device *dev)
+static int i915_load_gem_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	unsigned long prealloc_size, gtt_size, mappable_size;
-	int ret = 0;
+	int ret;
 
 	prealloc_size = dev_priv->mm.gtt->stolen_size;
 	gtt_size = dev_priv->mm.gtt->gtt_total_entries << PAGE_SHIFT;
@@ -1204,7 +1204,7 @@  static int i915_load_modeset_init(struct drm_device *dev)
 	ret = i915_gem_init_ringbuffer(dev);
 	mutex_unlock(&dev->struct_mutex);
 	if (ret)
-		goto out;
+		return ret;
 
 	/* Try to set up FBC with a reasonable compressed buffer size */
 	if (I915_HAS_FBC(dev) && i915_powersave) {
@@ -1222,6 +1222,13 @@  static int i915_load_modeset_init(struct drm_device *dev)
 
 	/* Allow hardware batchbuffers unless told otherwise. */
 	dev_priv->allow_batchbuffer = 1;
+	return 0;
+}
+
+static int i915_load_modeset_init(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret;
 
 	ret = intel_parse_bios(dev);
 	if (ret)
@@ -1236,7 +1243,7 @@  static int i915_load_modeset_init(struct drm_device *dev)
 	 */
 	ret = vga_client_register(dev->pdev, dev, NULL, i915_vga_set_decode);
 	if (ret && ret != -ENODEV)
-		goto cleanup_ringbuffer;
+		goto out;
 
 	intel_register_dsm_handler();
 
@@ -1257,13 +1264,19 @@  static int i915_load_modeset_init(struct drm_device *dev)
 	if (ret)
 		goto cleanup_vga_switcheroo;
 
+	ret = i915_load_gem_init(dev);
+	if (ret)
+		goto cleanup_irq;
+
+	intel_modeset_gem_init(dev);
+
 	/* Always safe in the mode setting case. */
 	/* FIXME: do pre/post-mode set stuff in core KMS code */
 	dev->vblank_disable_allowed = 1;
 
 	ret = intel_fbdev_init(dev);
 	if (ret)
-		goto cleanup_irq;
+		goto cleanup_gem;
 
 	drm_kms_helper_poll_init(dev);
 
@@ -1272,16 +1285,16 @@  static int i915_load_modeset_init(struct drm_device *dev)
 
 	return 0;
 
+cleanup_gem:
+	mutex_lock(&dev->struct_mutex);
+	i915_gem_cleanup_ringbuffer(dev);
+	mutex_unlock(&dev->struct_mutex);
 cleanup_irq:
 	drm_irq_uninstall(dev);
 cleanup_vga_switcheroo:
 	vga_switcheroo_unregister_client(dev->pdev);
 cleanup_vga_client:
 	vga_client_register(dev->pdev, NULL, NULL, NULL);
-cleanup_ringbuffer:
-	mutex_lock(&dev->struct_mutex);
-	i915_gem_cleanup_ringbuffer(dev);
-	mutex_unlock(&dev->struct_mutex);
 out:
 	return ret;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 359ddce..60ebd79 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1268,6 +1268,7 @@  static inline void intel_unregister_dsm_handler(void) { return; }
 
 /* modesetting */
 extern void intel_modeset_init(struct drm_device *dev);
+extern void intel_modeset_gem_init(struct drm_device *dev);
 extern void intel_modeset_cleanup(struct drm_device *dev);
 extern int intel_modeset_vga_set_state(struct drm_device *dev, bool state);
 extern void i8xx_disable_fbc(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 432fc04..5c7385b 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -6497,6 +6497,9 @@  static void intel_setup_outputs(struct drm_device *dev)
 	}
 
 	intel_panel_setup_backlight(dev);
+
+	/* disable all the possible outputs/crtcs before entering KMS mode */
+	drm_helper_disable_unused_functions(dev);
 }
 
 static void intel_user_framebuffer_destroy(struct drm_framebuffer *fb)
@@ -7432,13 +7435,12 @@  void intel_modeset_init(struct drm_device *dev)
 		intel_crtc_init(dev, i);
 	}
 
+	/* Just disable it once at startup */
+	i915_disable_vga(dev);
 	intel_setup_outputs(dev);
 
 	intel_enable_clock_gating(dev);
 
-	/* Just disable it once at startup */
-	i915_disable_vga(dev);
-
 	if (IS_IRONLAKE_M(dev)) {
 		ironlake_enable_drps(dev);
 		intel_init_emon(dev);
@@ -7447,12 +7449,15 @@  void intel_modeset_init(struct drm_device *dev)
 	if (IS_GEN6(dev))
 		gen6_enable_rps(dev_priv);
 
-	if (IS_IRONLAKE_M(dev))
-		ironlake_enable_rc6(dev);
-
 	INIT_WORK(&dev_priv->idle_work, intel_idle_update);
 	setup_timer(&dev_priv->idle_timer, intel_gpu_idle_timer,
 		    (unsigned long)dev);
+}
+
+void intel_modeset_gem_init(struct drm_device *dev)
+{
+	if (IS_IRONLAKE_M(dev))
+		ironlake_enable_rc6(dev);
 
 	intel_setup_overlay(dev);
 }

drm/i915: Disable all outputs early, before KMS takeover

Commit Message

Comments

Patch