diff mbox

drm/i915: Disable all outputs early, before KMS takeover

Message ID 1301395589-8121-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson March 29, 2011, 10:46 a.m. UTC
None

Comments

Chris Wilson April 5, 2011, 10:30 a.m. UTC | #1
On Tue, 5 Apr 2011 13:21:08 +0300, Tomas Winkler <tomasw@gmail.com> wrote:
> On Fri, Apr 1, 2011 at 2:51 PM, Pekka Enberg <penberg@kernel.org> wrote:
> > Unfortunately I get a blank screen with after boot:
> > Nacked-by: Pekka Enberg <penberg@kernel.org>

But until you can tell me where it explodes on your system, we fix
issues on several other machines...
 
> Not sure this is related, but when I enable DRM_I915_KMS=y I'm got
> stuck after boot too. When KMS is disabled I can at least get to the
> console (no graphics)
> This is with kernel 2.6.39-rc1.  It worked fine with 2.6.38. I don't
> have much time bisect and reboot.  Shell I  try to pull drm-fixes for
> rc2 or use try this patch?

Add drm.debug=0xe to your grub kernel parameters and attach the dmesg for
the failing boot. From that I should be able to recommend a course of
action.
-Chris
Chris Wilson April 5, 2011, 2:27 p.m. UTC | #2
On Tue, 5 Apr 2011 17:11:56 +0300, Pekka Enberg <penberg@kernel.org> wrote:
> [    9.310010]  <IRQ>  [<ffffffff8168cd0c>] panic+0x91/0x19e
> [    9.310010]  [<ffffffff816909ea>] oops_end+0xea/0xf0
> [    9.310010]  [<ffffffff8106afbb>] no_context+0xfb/0x260
> [    9.310010]  [<ffffffff8106b245>] __bad_area_nosemaphore+0x125/0x1e0
> [    9.310010]  [<ffffffff8106b313>] bad_area_nosemaphore+0x13/0x20
> [    9.310010]  [<ffffffff816930c0>] do_page_fault+0x310/0x4c0
> [    9.310010]  [<ffffffff810ac06f>] ? up+0x2f/0x50
> [    9.310010]  [<ffffffff8108652f>] ? console_unlock+0x17f/0x1d0
> [    9.310010]  [<ffffffff8168fd25>] page_fault+0x25/0x30
> [    9.310010]  [<ffffffffa0061628>] ? i915_handle_error+0x198/0xed0 [i915]
> [    9.310010]  [<ffffffff8137d04a>] ? scsi_next_command+0x4a/0x60
> [    9.310010]  [<ffffffff8137ddd6>] ? scsi_io_completion+0x2f6/0x630
> [    9.310010]  [<ffffffffa0064c62>] i915_driver_irq_handler+0x472/0x17f0 [i915]
> 
> This is the same pre-2.6.39-rc1 kernel with the two patches applied.
> I'll try latest Linus master next to see if the same problem triggers.

Hmm. Looks like we don't prevent the PGTBL_ER with those patches (or we
provoke another), and trigger the error before we can handle it.

Double ungood. Thanks,
-Chris
Linus Torvalds April 5, 2011, 2:42 p.m. UTC | #3
On Tue, Apr 5, 2011 at 3:30 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Tue, 5 Apr 2011 13:21:08 +0300, Tomas Winkler <tomasw@gmail.com> wrote:
>> On Fri, Apr 1, 2011 at 2:51 PM, Pekka Enberg <penberg@kernel.org> wrote:
>> > Unfortunately I get a blank screen with after boot:
>> > Nacked-by: Pekka Enberg <penberg@kernel.org>
>
> But until you can tell me where it explodes on your system, we fix
> issues on several other machines...

NO.

Chris, you need to understand the issue of "NO REGRESSIONS".

It's a very simple rule: it DOES NOT MATTER ONE WHIT how many machines
you fix. You never ever regress. Patches that cause regressions are
reverted.

There are multiple reasons for that rule, but the basic one ends up
being very simple: you only _think_ you fix more machines than you
break. Why? Because the people who test out your patches are the
"active" people - and often predominantly the active people who have
problems. In contrast, the people for whom things already work aren't
even testing your patches in the first place. Then, six months later,
when they update to a new Fedora version, things suddenly don't work
for them, and it turns out that yes, you fixed ten active testers, but
you broke a thousand random people.

So even _one_ person saying "this is a regression" is a total blocker.
Really. It's that simple.

YOU NEVER EVER BREAK WORKING MACHINES.

Seriously. We had this for years in ACPI-land and with suspend/resume
with "one step forward, two steps back", and nobody ever knew if we
were doing any real progress at all, because machines that had working
suspend/resume one kernel version would be broken again the next.
There was no real pattern of improvement, there was just a random
pattern of "things get fixed on one machine, and break on another".

We introduced the "no regressions" rule, and things got seriously
better. Suddenly things started getting _reliably_ better.

The whole situation with i915 has been pretty damn random lately, and
you really really need to understand that this is simply not how it's
done. Your cavalier attitude ("but it fixes things for others") is
absolutely not acceptable.

Keith Cc'd, because that patch had better not show up in my tree.

                       Linus
Keith Packard April 5, 2011, 3:01 p.m. UTC | #4
On Tue, 5 Apr 2011 07:42:14 -0700, Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Keith Cc'd, because that patch had better not show up in my tree.

Having experienced the delights of ACPI in the past, I have already
subscribed to your newsletter.
Chris Wilson April 5, 2011, 3:12 p.m. UTC | #5
On Tue, 5 Apr 2011 07:42:14 -0700, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> NO.

And you seemed to have missed that patch has sat around waiting for Pekka
to give me some information on the failure on his machine.

I have been poking as many people as I could to get it reviewed and tested
on more machines so that someone else could either spot the problem or
capture the oops.

I was being facetious in order to get a response. Thanks for playing,
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 7273037..65d5adf 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1176,11 +1176,11 @@  static bool i915_switcheroo_can_switch(struct pci_dev *pdev)
 	return can_switch;
 }
 
-static int i915_load_modeset_init(struct drm_device *dev)
+static int i915_load_gem_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	unsigned long prealloc_size, gtt_size, mappable_size;
-	int ret = 0;
+	int ret;
 
 	prealloc_size = dev_priv->mm.gtt->stolen_size;
 	gtt_size = dev_priv->mm.gtt->gtt_total_entries << PAGE_SHIFT;
@@ -1204,7 +1204,7 @@  static int i915_load_modeset_init(struct drm_device *dev)
 	ret = i915_gem_init_ringbuffer(dev);
 	mutex_unlock(&dev->struct_mutex);
 	if (ret)
-		goto out;
+		return ret;
 
 	/* Try to set up FBC with a reasonable compressed buffer size */
 	if (I915_HAS_FBC(dev) && i915_powersave) {
@@ -1222,6 +1222,13 @@  static int i915_load_modeset_init(struct drm_device *dev)
 
 	/* Allow hardware batchbuffers unless told otherwise. */
 	dev_priv->allow_batchbuffer = 1;
+	return 0;
+}
+
+static int i915_load_modeset_init(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret;
 
 	ret = intel_parse_bios(dev);
 	if (ret)
@@ -1236,7 +1243,7 @@  static int i915_load_modeset_init(struct drm_device *dev)
 	 */
 	ret = vga_client_register(dev->pdev, dev, NULL, i915_vga_set_decode);
 	if (ret && ret != -ENODEV)
-		goto cleanup_ringbuffer;
+		goto out;
 
 	intel_register_dsm_handler();
 
@@ -1257,13 +1264,19 @@  static int i915_load_modeset_init(struct drm_device *dev)
 	if (ret)
 		goto cleanup_vga_switcheroo;
 
+	ret = i915_load_gem_init(dev);
+	if (ret)
+		goto cleanup_irq;
+
+	intel_modeset_gem_init(dev);
+
 	/* Always safe in the mode setting case. */
 	/* FIXME: do pre/post-mode set stuff in core KMS code */
 	dev->vblank_disable_allowed = 1;
 
 	ret = intel_fbdev_init(dev);
 	if (ret)
-		goto cleanup_irq;
+		goto cleanup_gem;
 
 	drm_kms_helper_poll_init(dev);
 
@@ -1272,16 +1285,16 @@  static int i915_load_modeset_init(struct drm_device *dev)
 
 	return 0;
 
+cleanup_gem:
+	mutex_lock(&dev->struct_mutex);
+	i915_gem_cleanup_ringbuffer(dev);
+	mutex_unlock(&dev->struct_mutex);
 cleanup_irq:
 	drm_irq_uninstall(dev);
 cleanup_vga_switcheroo:
 	vga_switcheroo_unregister_client(dev->pdev);
 cleanup_vga_client:
 	vga_client_register(dev->pdev, NULL, NULL, NULL);
-cleanup_ringbuffer:
-	mutex_lock(&dev->struct_mutex);
-	i915_gem_cleanup_ringbuffer(dev);
-	mutex_unlock(&dev->struct_mutex);
 out:
 	return ret;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 359ddce..60ebd79 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1268,6 +1268,7 @@  static inline void intel_unregister_dsm_handler(void) { return; }
 
 /* modesetting */
 extern void intel_modeset_init(struct drm_device *dev);
+extern void intel_modeset_gem_init(struct drm_device *dev);
 extern void intel_modeset_cleanup(struct drm_device *dev);
 extern int intel_modeset_vga_set_state(struct drm_device *dev, bool state);
 extern void i8xx_disable_fbc(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 432fc04..5c7385b 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -6497,6 +6497,9 @@  static void intel_setup_outputs(struct drm_device *dev)
 	}
 
 	intel_panel_setup_backlight(dev);
+
+	/* disable all the possible outputs/crtcs before entering KMS mode */
+	drm_helper_disable_unused_functions(dev);
 }
 
 static void intel_user_framebuffer_destroy(struct drm_framebuffer *fb)
@@ -7432,13 +7435,12 @@  void intel_modeset_init(struct drm_device *dev)
 		intel_crtc_init(dev, i);
 	}
 
+	/* Just disable it once at startup */
+	i915_disable_vga(dev);
 	intel_setup_outputs(dev);
 
 	intel_enable_clock_gating(dev);
 
-	/* Just disable it once at startup */
-	i915_disable_vga(dev);
-
 	if (IS_IRONLAKE_M(dev)) {
 		ironlake_enable_drps(dev);
 		intel_init_emon(dev);
@@ -7447,12 +7449,15 @@  void intel_modeset_init(struct drm_device *dev)
 	if (IS_GEN6(dev))
 		gen6_enable_rps(dev_priv);
 
-	if (IS_IRONLAKE_M(dev))
-		ironlake_enable_rc6(dev);
-
 	INIT_WORK(&dev_priv->idle_work, intel_idle_update);
 	setup_timer(&dev_priv->idle_timer, intel_gpu_idle_timer,
 		    (unsigned long)dev);
+}
+
+void intel_modeset_gem_init(struct drm_device *dev)
+{
+	if (IS_IRONLAKE_M(dev))
+		ironlake_enable_rc6(dev);
 
 	intel_setup_overlay(dev);
 }