still experiencing oops connecting Laptop to docking Station

Message ID 20150521120656.GX15256@phenom.ffwll.local (mailing list archive)
State New, archived

Commit Message

Daniel Vetter May 21, 2015, 12:06 p.m. UTC
On Thu, May 21, 2015 at 12:32:18PM +0200, Nicolas Kalkhof wrote:
> > This only contains the pile of WARNINGs and not the Oops itself. I guess
> > we still need a picture (and you need to change the system console logging
> > to make sure the debug stuff shows up too). At least at the bottom of your
> > paste there's nothing from drm or i915, indicating that the crucial debug info
> > didn't make it out before the machine died.
> 
> Ok, I've set CONFIG_MESSAGE_LOGLEVEL_DEFAULT=7 to squeeze some more debug info out of the kernel. For the OOPS I still have to take a picture since kdump seems to be broken for me.
> 
> http://pastebin.com/9mxRgNa2
> 
> > I just want the decoded address (which I read from the oops, if you
> > recompile the kernel it might have changed so please double check). I.e.
> > the above two lines is all the howto you need. But please use "list"
> > instead of "break" as Chris suggested.
> 
> Gotcha!
> 
> (gdb) list *drm_atomic_helper_check_modeset+0x2c5
> 0x955 is in drm_atomic_helper_check_modeset (drivers/gpu/drm/drm_atomic_helper.c:235).
> 230             }
> 231
> 232             connector_state->best_encoder = new_encoder;
> 233             idx = drm_crtc_index(connector_state->crtc);
> 234
> 235             crtc_state = state->crtc_states[idx];
> 236             crtc_state->mode_changed = true;
> 237
> 238             DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] using [ENCODER:%d:%s] on [CRTC:%d]\n",
> 239                              connector->base.id,
> (gdb)
> 
> Address drm_atomic_helper_check_modeset+0x2c5/0x990 is the first line in the call trace.
> One more thing: When I fire up X BEFORE I connect to the docking there are no OOPSES and I can switch between DP/eDP. As soon as I kill X however my machine dies.

Ah, Chris' hunch against my claim that this is impossible was spot-on. Can
you please try the hack below and grab dmesg? Hopefully the kernel
won't die any more at least.
-Daniel
---

Comments

Nicolas Kalkhof May 21, 2015, 2:30 p.m. UTC | #1
Hi Daniel,
 
> Ah, Chris' hunch against my claim that this is impossible was spot-on. Can
> you please try the hack below and grab dmesg? Hopefully the kernel
> won't die any more at least.

Unfortunately the patch doesn't keep my machine from crashing.

image of stack trace: http://imgur.com/CV5ho1B
dmesg: http://pastebin.com/T69SPT3g

It seems that the problem now occurs in drm_atomic_get_connector_state.

(gdb) list *drm_atomic_get_connector_state+0x40
0xff0 is in drm_atomic_get_connector_state (drivers/gpu/drm/drm_atomic.c:652).
647              *
648              * Note that we only grab the indexes once we have the right lock to
649              * prevent hotplug/unplugging of connectors. So removal is no problem,
650              * at most the array is a bit too large.
651              */
652             if (index >= state->num_connector) {
653                     DRM_DEBUG_ATOMIC("Hot-added connector would overflow state array, restarting\n");
654                     return ERR_PTR(-EAGAIN);
655             }
656
(gdb)   

HTH
Nic
Daniel Vetter May 21, 2015, 2:56 p.m. UTC | #2
On Thu, May 21, 2015 at 04:30:34PM +0200, Nicolas Kalkhof wrote:
> Hi Daniel,
>  
> > Ah, Chris' hunch against my claim that this is impossible was spot-on. Can
> > you please try the hack below and grab dmesg? Hopefully the kernel
> > won't die any more at least.
> 
> Unfortunately the patch doesn't keep my machine from crashing.
> 
> image of stack trace: http://imgur.com/CV5ho1B
> dmesg: http://pastebin.com/T69SPT3g
> 
> It seems that the problem now occurs in drm_atomic_get_connector_state.

Hm, something seems to be deeply wrong here. And the oops you've captured
still doesn't include the debug messages before things go boom. Can you
please replace the WARN_ON in my previous patch with a BUG_ON? That should
stop the machine at least and hopefully prevent it all from going boom.

And please play around with dmesg -n to get these debug messages to the
console. I really need them to figure out what's going wrong here.
-Daniel

> 
> (gdb) list *drm_atomic_get_connector_state+0x40
> 0xff0 is in drm_atomic_get_connector_state (drivers/gpu/drm/drm_atomic.c:652).
> 647              *
> 648              * Note that we only grab the indexes once we have the right lock to
> 649              * prevent hotplug/unplugging of connectors. So removal is no problem,
> 650              * at most the array is a bit too large.
> 651              */
> 652             if (index >= state->num_connector) {
> 653                     DRM_DEBUG_ATOMIC("Hot-added connector would overflow state array, restarting\n");
> 654                     return ERR_PTR(-EAGAIN);
> 655             }
> 656
> (gdb)   
> 
> HTH
> Nic
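
For readers following the WARN_ON to BUG_ON request above: in the kernel, WARN_ON reports the condition (with a backtrace) and execution continues, while BUG_ON stops the offending context at that point. The following is only a user-space imitation of that difference, not the kernel macros; the names model_warn_on and model_bug_on are made up for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* User-space imitation of WARN_ON (hypothetical helper): report the
 * condition, then keep running. Like the kernel macro, it evaluates to
 * the truth value of the condition so callers can branch on it. */
static int model_warn_on(int cond, const char *expr)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", expr);
	return !!cond;
}

/* User-space imitation of BUG_ON (hypothetical helper): report the
 * condition and stop immediately, so nothing after it can run with
 * the bad state. */
static void model_bug_on(int cond, const char *expr)
{
	if (cond) {
		fprintf(stderr, "BUG: %s\n", expr);
		abort();
	}
}
```

With the WARN_ON variant the code after the check still runs and can still crash later; with the BUG_ON variant the report is the last thing that happens, which is why it helps to confirm whether the check fires before the oops.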
Nicolas Kalkhof May 21, 2015, 7:42 p.m. UTC | #3
> Hm, something seems to be deeply wrong here. And the oops you've captured
> still doesn't include the debug messages before things go boom. Can you
> please replace the WARN_ON in my previous patch with a BUG_ON? That should
> stop the machine at least and hopefully prevent it all from going boom.

I've switched WARN_ON to BUG_ON and indeed it appears right before the oops:

http://imgur.com/eKFSTYB

> And please play around with dmesg -n to get these debug messages to the
> console. I really need them to figure out what's going wrong here.

Please see the attached .bz file. This is the maximum debug output taken from /proc/kmsg right up to the oops, and it is everything I could get out of it. Maybe I've missed something needed to get more debug messages to the console: so far I've set the kernel parameters drm.debug=0xff log_buf_len=10M LOGLEVEL=8 debug and tried dmesg -n 7 and dmesg -n 8. Any hints?

-
Nic

Patch

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index b82ef6262469..15ba72fb98dd 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -157,6 +157,7 @@  update_connector_routing(struct drm_atomic_state *state, int conn_idx)
 	struct drm_connector *connector;
 	struct drm_connector_state *connector_state;
 	struct drm_crtc_state *crtc_state;
+	struct drm_crtc *saved_crtc;
 	int idx, ret;
 
 	connector = state->connectors[conn_idx];
@@ -195,6 +196,8 @@  update_connector_routing(struct drm_atomic_state *state, int conn_idx)
 		return 0;
 	}
 
+	saved_crtc = connector_state->crtc;
+
 	funcs = connector->helper_private;
 	new_encoder = funcs->best_encoder(connector);
 
@@ -230,7 +233,8 @@  update_connector_routing(struct drm_atomic_state *state, int conn_idx)
 	}
 
 	connector_state->best_encoder = new_encoder;
-	idx = drm_crtc_index(connector_state->crtc);
+	WARN_ON(!connector_state->crtc);
+	idx = drm_crtc_index(saved_crtc);
 
 	crtc_state = state->crtc_states[idx];
 	crtc_state->mode_changed = true;
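
The hack above only warns when connector_state->crtc is NULL and then indexes with the pointer saved earlier. As a rough sketch of what a defensive lookup could look like, here is a stand-alone C model. It is not the DRM code: the struct and function names (model_crtc, model_connector_state, model_crtc_index_checked) are invented for this example, and returning -EINVAL is an assumption rather than something proposed in this thread.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DRM structures; only the fields needed
 * to illustrate the lookup are modelled. */
struct model_crtc {
	int index;                 /* position in the crtc_states array */
};

struct model_connector_state {
	struct model_crtc *crtc;   /* may be NULL if the connector has no CRTC */
};

/* Return the CRTC index for a connector state, or -EINVAL when there is
 * no CRTC or the index would fall outside an array of num_crtc entries.
 * The caller can then bail out instead of dereferencing a bad slot. */
static int model_crtc_index_checked(const struct model_connector_state *cs,
				    int num_crtc)
{
	if (cs == NULL || cs->crtc == NULL)
		return -EINVAL;
	if (cs->crtc->index < 0 || cs->crtc->index >= num_crtc)
		return -EINVAL;
	return cs->crtc->index;
}
```

A caller would check for a negative return before using the value as an array index. Whether a missing CRTC at this point is a legal state or a bug elsewhere is exactly what the requested debug output is meant to show.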