[REGRESSION] Re: i915 driver crashes on T540p if docking station attached

Message ID CA+55aFzsVMMpSeF3fG0FJ4Lhj1Tb6MY9LMfnmj7u1JyzM9aQuA@mail.gmail.com
State New, archived

Commit Message

Linus Torvalds July 30, 2015, 5:18 a.m. UTC
On Wed, Jul 29, 2015 at 6:39 PM, Theodore Ts'o <tytso@mit.edu> wrote:
>
> It's here:  https://goo.gl/photos/xHjn2Z97JQEw6k2C9

You didn't catch enough of the code line to decode the code, but it's
early enough in drm_crtc_index() (just five bytes in) that it's almost
certainly the very first dereference, so it's almost guaranteed to be
that

   crtc->dev

access as part of list_for_each_entry(), with crtc being NULL. And
yes, "->dev" is the very first field, so the offset is zero too (while
the "->mode_config" list access would not be at offset zero).
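The offset argument above can be illustrated with a small sketch. The struct layouts below are hypothetical stand-ins (they are not the real struct drm_crtc or struct drm_device definitions); the only property taken from the message is that "dev" is the first field, so a NULL crtc faults at address zero, while a field that is not first would fault at a nonzero address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the position of the first field matters. */
struct mock_list_head {
	struct mock_list_head *next, *prev;
};

struct mock_device {
	int placeholder;                   /* assumed: something precedes the list */
	struct mock_list_head mode_config; /* not at offset zero */
};

struct mock_crtc {
	struct mock_device *dev;           /* first field, so offset zero */
	struct mock_list_head head;
};

/*
 * Address touched when a field at 'field_offset' is read through a
 * NULL struct pointer: base address 0 plus the field's offset.
 */
static size_t fault_address_for_null(size_t field_offset)
{
	return (size_t)0 + field_offset;
}
```

With this layout, offsetof(struct mock_crtc, dev) is 0, which is consistent with a fault address of zero for crtc->dev when crtc is NULL.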

And it looks like it is called from drm_atomic_helper_check_modeset():
the reason it has a question mark in the backtrace is because the
fault happens before the stack frame has even been set up.

There are multiple calls to "drm_crtc_index()" from that function, I
can't tell which one it is. Looking at the code generation I get, I
think it's because update_connector_routing() gets inlined, and that
one does several calls. Most of them look like this:

                if (connector->state->crtc) {
                        idx = drm_crtc_index(connector->state->crtc);

ie they check that the crtc is non-NULL, but that last one does not:

        connector_state->best_encoder = new_encoder;
        idx = drm_crtc_index(connector_state->crtc);

        crtc_state = state->crtc_states[idx];
        crtc_state->mode_changed = true;

and I suspect the fix might be something like the attached. Totally
untested. Ted?
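As a rough, self-contained sketch of the guard being suggested (using made-up mock types, not the real DRM structures, and a stored index in place of drm_crtc_index()), the idea is to touch the crtc state only when the connector actually has a crtc:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_MAX_CRTCS 4

/* Hypothetical, heavily simplified stand-ins for the DRM types. */
struct mock_crtc { int index; };
struct mock_crtc_state { bool mode_changed; };
struct mock_connector_state { struct mock_crtc *crtc; };
struct mock_atomic_state {
	struct mock_crtc_state *crtc_states[MOCK_MAX_CRTCS];
};

/*
 * Flag the crtc state as changed only if the connector has a crtc.
 * Returns true if a crtc state was flagged, false if there was no crtc.
 */
static bool mock_flag_mode_changed(struct mock_atomic_state *state,
				   struct mock_connector_state *connector_state)
{
	int idx;

	if (!connector_state->crtc)
		return false;   /* the unguarded code would dereference NULL here */

	idx = connector_state->crtc->index;
	state->crtc_states[idx]->mode_changed = true;
	return true;
}
```

This only shows the shape of the check; whether a NULL crtc should be possible at this point at all is a separate question.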

This whole "atomic modeset" series has been one royal fuck-up, guys.
We've had too many of these kinds of crap issues.

                           Linus

 drivers/gpu/drm/drm_atomic_helper.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Dave Airlie July 30, 2015, 11:16 a.m. UTC | #1
On 30 July 2015 at 15:18, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Wed, Jul 29, 2015 at 6:39 PM, Theodore Ts'o <tytso@mit.edu> wrote:
>>
>> It's here:  https://goo.gl/photos/xHjn2Z97JQEw6k2C9
>
> You didn't catch enough of the code line to decode the code, but it's
> early enough in drm_crtc_index() (just five bytes in) that it's almost
> certainly the very first dereference, so it's almost guaranteed to be
> that
>
>    crtc->dev
>
> access as part of list_for_each_entry(), with crtc being NULL. And
> yes, "->dev" is the very first field, so the offset is zero too (while
> the "->mode_config" list access would not be at offset zero).
>
> And it looks like it is called from drm_atomic_helper_check_modeset():
> the reason it has a question mark in the backtrace is because the
> fault happens before the stack frame has even been set up.
>
> There are multiple calls to "drm_crtc_index()" from that function, I
> can't tell which one it is. Looking at the code generation I get, I
> think it's because update_connector_routing() gets inlined, and that
> one does several calls. Most of them look like this:
>
>                 if (connector->state->crtc) {
>                         idx = drm_crtc_index(connector->state->crtc);
>
> ie they check that the crtc is non-NULL, but that last one does not:
>
>         connector_state->best_encoder = new_encoder;
>         idx = drm_crtc_index(connector_state->crtc);
>
>         crtc_state = state->crtc_states[idx];
>         crtc_state->mode_changed = true;
>
> and I suspect the fix might be something like the attached. Totally
> untested. Ted?
>
> This whole "atomic modeset" series has been one royal fuck-up, guys.
> We've had too many of these kinds of crap issues.

It hasn't been that bad, on a scale of 1 to MD eats my raid array, I'd
say we are barely at a 5.

There have been a lot of small and seemingly easily fixed teething
problems. Essentially rewriting the DRM API to provide a new userspace
API and internal interface, porting some drivers partly to the new
interface, while trying to maintain the old ABI/API on top seamlessly,
was always going to be an impossible task. It was never going to
magically all just work in -next and land in your tree fully formed
smelling of lavender and elderberries. This is a massive undertaking,
and doing it over a few kernels was the only possible way it could
ever land.

I think the biggest problem we've had is the QA team at Intel got
reorganised or something right when they really needed to be doing
testing on this stuff, so what was sitting in -next never got as much
testing as it had previously, and you can see that in the types of
cases that are getting through. I think the other thing we can learn
is that when Android forks the kernel we should just say this shit is
too hard, let Google go and create a new API and a complete set of
graphics drivers and deal with it in 10 years, because that was
seriously the only other option.

So yes it's a pity other kernel developers are seeing our fallout, but
I've experienced lots of other kernel developers' fallout over the
years, and generally the idea is to get this stuff fixed to a
reasonable state before you release a final kernel.

Note I'm not personally involved in the development for atomic
modesetting at all, I'm running the kernels with it where and when I
can, and I trust the developers who work on it are doing as much as
they can to make it work.

That said hopefully Daniel can find a bag of fucks to debug and write
a proper patch, instead of rage quitting the universe, and just git
reset --hard v4.0 drivers/gpu/drm/i915..

Dave.

Daniel Vetter July 30, 2015, 2:40 p.m. UTC | #2
On Wed, Jul 29, 2015 at 10:18:16PM -0700, Linus Torvalds wrote:
>  drivers/gpu/drm/drm_atomic_helper.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 5b59d5ad7d1c..aac212297b49 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
>  	}
>  
>  	connector_state->best_encoder = new_encoder;
> -	idx = drm_crtc_index(connector_state->crtc);
> +	if (connector_state->crtc) {
> +		idx = drm_crtc_index(connector_state->crtc);
>  
> -	crtc_state = state->crtc_states[idx];
> -	crtc_state->mode_changed = true;
> +		crtc_state = state->crtc_states[idx];
> +		crtc_state->mode_changed = true;
> +	}

This shouldn't happen since if it does we ended up stealing the encoder
from the connector itself (we do check for connector_state->crtc earlier)
and that would be a bug. I haven't figured out a precise theory but my
guess is on the best_encoder selection, and indeed dp mst encoder
selection seems to have gone belly up in 4.2 with the bisected commit.

I have 4 patches in git://people.freedesktop.org/~danvet/drm fixes-stuff
but I couldn't test them yet since no dp mst here and I didn't find
anything that would ship faster than 1-2 weeks yet. I'll try to get some
other people here to test it meanwhile too.
-Daniel

Theodore Y. Ts'o July 30, 2015, 3:32 p.m. UTC | #3
On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> On Wed, Jul 29, 2015 at 10:18:16PM -0700, Linus Torvalds wrote:
> >  drivers/gpu/drm/drm_atomic_helper.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > index 5b59d5ad7d1c..aac212297b49 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
> >  	}
> >  
> >  	connector_state->best_encoder = new_encoder;
> > -	idx = drm_crtc_index(connector_state->crtc);
> > +	if (connector_state->crtc) {
> > +		idx = drm_crtc_index(connector_state->crtc);
> >  
> > -	crtc_state = state->crtc_states[idx];
> > -	crtc_state->mode_changed = true;
> > +		crtc_state = state->crtc_states[idx];
> > +		crtc_state->mode_changed = true;
> > +	}
> 
> This shouldn't happen since if it does we ended up stealing the encoder
> from the connector itself (we do check for connector_state->crtc earlier)
> and that would be a bug. I haven't figured out a precise theory but my
> guess is on the best_encoder selection, and indeed dp mst encoder
> selection seems to have gone belly up in 4.2 with the bisected commit.

Well, I just tested Linus's patch and it works.

BTW, is there any chance that I can suspend my laptop, and then move
it from my docking station at home (where I have a Dell 30" display)
to my docking station at work (where I have a Dell 24" display), and
actually have the new monitor be detected?  For at least the past
year, I have had to reboot in order to be able to use the external
monitor.  This used to work, but it's been a very long-standing
regression.  I understand that Multi-stream DP is an evil horrible hack,
and supporting it is painful, but this used to work, and it hasn't in
a long time.  :-(

					- Ted

Theodore Y. Ts'o July 30, 2015, 3:50 p.m. UTC | #4
On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> I have 4 patches in git://people.freedesktop.org/~danvet/drm fixes-stuff
> but I couldn't test them yet since no dp mst here and I didn't find
> anything that would ship faster than 1-2 weeks yet. I'll try to get some
> other people here to test it meanwhile too.

I've tried pulling in your patches from fixes-stuff, onto Linus's tree
(without Linus's fix), and the good news is that I'm no longer
crashing on boot.

The *bad* news is that (a) it breaks the external monitor attached to
the docking station completely (this was working with Linus's patch),
and (b) it's triggering a LOCKDEP failure.

So even though Linus's patch wasn't supposed to work, I think I'm
going to go back to it....

					- Ted


Jul 30 11:46:49 closure kernel: [    4.221951] 
Jul 30 11:46:49 closure kernel: [    4.221954] ======================================================
Jul 30 11:46:49 closure kernel: [    4.221957] [ INFO: possible circular locking dependency detected ]
Jul 30 11:46:49 closure kernel: [    4.221960] 4.2.0-rc4-13906-g5f1b75cd #16 Not tainted
Jul 30 11:46:49 closure kernel: [    4.221963] -------------------------------------------------------
Jul 30 11:46:49 closure kernel: [    4.221966] modprobe/503 is trying to acquire lock:
Jul 30 11:46:49 closure kernel: [    4.221968]  (init_mutex){+.+.+.}, at: [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.221977] 
Jul 30 11:46:49 closure kernel: [    4.221977] but task is already holding lock:
Jul 30 11:46:49 closure kernel: [    4.221979]  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7c9>] __blocking_notifier_call_chain+0x37/0x69
Jul 30 11:46:49 closure kernel: [    4.221987] 
Jul 30 11:46:49 closure kernel: [    4.221987] which lock already depends on the new lock.
Jul 30 11:46:49 closure kernel: [    4.221987] 
Jul 30 11:46:49 closure kernel: [    4.221990] 
Jul 30 11:46:49 closure kernel: [    4.221990] the existing dependency chain (in reverse order) is:
Jul 30 11:46:49 closure kernel: [    4.221995] 
Jul 30 11:46:49 closure kernel: [    4.221995] -> #1 (&(&backlight_notifier)->rwsem){++++..}:
Jul 30 11:46:49 closure kernel: [    4.222001]        [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
Jul 30 11:46:49 closure kernel: [    4.222007]        [<ffffffff8161f1db>] down_write+0x46/0x8a
Jul 30 11:46:49 closure kernel: [    4.222012]        [<ffffffff8109a6c0>] blocking_notifier_chain_register+0x36/0x57
Jul 30 11:46:49 closure kernel: [    4.222017]        [<ffffffff8134eb4e>] backlight_register_notifier+0x18/0x1a
Jul 30 11:46:49 closure kernel: [    4.222022]        [<ffffffff8138b463>] acpi_video_get_backlight_type+0xfa/0x164
Jul 30 11:46:49 closure kernel: [    4.222028]        [<ffffffffc03a1e45>] 0xffffffffc03a1e45
Jul 30 11:46:49 closure audispd: No plugins found, exiting
Jul 30 11:46:49 closure kernel: [    4.222032]        [<ffffffffc03a28a8>] 0xffffffffc03a28a8
Jul 30 11:46:49 closure kernel: [    4.222036]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
Jul 30 11:46:49 closure kernel: [    4.222042]        [<ffffffff81619985>] do_init_module+0x60/0x1e3
Jul 30 11:46:49 closure kernel: [    4.222047]        [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
Jul 30 11:46:49 closure kernel: [    4.222052]        [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
Jul 30 11:46:49 closure kernel: [    4.222056]        [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73
Jul 30 11:46:49 closure kernel: [    4.222060] 
Jul 30 11:46:49 closure kernel: [    4.222060] -> #0 (init_mutex){+.+.+.}:
Jul 30 11:46:49 closure kernel: [    4.222065]        [<ffffffff810bb77a>] __lock_acquire+0xc55/0xf54
Jul 30 11:46:49 closure kernel: [    4.222070]        [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
Jul 30 11:46:49 closure kernel: [    4.222074]        [<ffffffff8161d83a>] mutex_lock_nested+0x70/0x391
Jul 30 11:46:49 closure kernel: [    4.222078]        [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.222083]        [<ffffffff8138b505>] acpi_video_backlight_notify+0x19/0x2f
Jul 30 11:46:49 closure kernel: [    4.222088]        [<ffffffff8109a442>] notifier_call_chain+0x4c/0x71
Jul 30 11:46:49 closure kernel: [    4.222092]        [<ffffffff8109a7e2>] __blocking_notifier_call_chain+0x50/0x69
Jul 30 11:46:49 closure kernel: [    4.222098]        [<ffffffff8109a80f>] blocking_notifier_call_chain+0x14/0x16
Jul 30 11:46:49 closure kernel: [    4.222103]        [<ffffffff8134f023>] backlight_device_register+0x1df/0x1f1
Jul 30 11:46:49 closure kernel: [    4.222108]        [<ffffffffc07b3061>] intel_backlight_register+0xf0/0x157 [i915]
Jul 30 11:46:49 closure kernel: [    4.222146]        [<ffffffffc078c843>] intel_modeset_gem_init+0x158/0x164 [i915]
Jul 30 11:46:49 closure kernel: [    4.222176]        [<ffffffffc07b997c>] i915_driver_load+0xf1c/0x1139 [i915]
Jul 30 11:46:49 closure kernel: [    4.222205]        [<ffffffffc053af19>] drm_dev_register+0x84/0xfd [drm]
Jul 30 11:46:49 closure kernel: [    4.222217]        [<ffffffffc053d77e>] drm_get_pci_dev+0x102/0x1bc [drm]
Jul 30 11:46:49 closure kernel: [    4.222228]        [<ffffffffc07291e2>] i915_pci_probe+0x4f/0x51 [i915]
Jul 30 11:46:49 closure kernel: [    4.222247]        [<ffffffff81333ad3>] pci_device_probe+0x74/0xd6
Jul 30 11:46:49 closure kernel: [    4.222253]        [<ffffffff813d4806>] driver_probe_device+0x15f/0x387
Jul 30 11:46:49 closure kernel: [    4.222257]        [<ffffffff813d4a81>] __driver_attach+0x53/0x74
Jul 30 11:46:49 closure kernel: [    4.222262]        [<ffffffff813d2aa0>] bus_for_each_dev+0x6f/0x89
Jul 30 11:46:49 closure kernel: [    4.222266]        [<ffffffff813d41f0>] driver_attach+0x1e/0x20
Jul 30 11:46:49 closure kernel: [    4.222269]        [<ffffffff813d3e33>] bus_add_driver+0x140/0x238
Jul 30 11:46:49 closure kernel: [    4.222273]        [<ffffffff813d53d8>] driver_register+0x8f/0xcc
Jul 30 11:46:49 closure kernel: [    4.222278]        [<ffffffff81332be1>] __pci_register_driver+0x5e/0x62
Jul 30 11:46:49 closure kernel: [    4.222282]        [<ffffffffc053d890>] drm_pci_init+0x58/0xda [drm]
Jul 30 11:46:49 closure kernel: [    4.222293]        [<ffffffffc081f0a0>] i915_init+0xa0/0xa8 [i915]
Jul 30 11:46:49 closure kernel: [    4.222312]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
Jul 30 11:46:49 closure kernel: [    4.222317]        [<ffffffff81619985>] do_init_module+0x60/0x1e3
Jul 30 11:46:49 closure kernel: [    4.222321]        [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
Jul 30 11:46:49 closure kernel: [    4.222325]        [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
Jul 30 11:46:49 closure kernel: [    4.222329]        [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73
Jul 30 11:46:49 closure kernel: [    4.222334] 
Jul 30 11:46:49 closure kernel: [    4.222334] other info that might help us debug this:
Jul 30 11:46:49 closure kernel: [    4.222334] 
Jul 30 11:46:49 closure kernel: [    4.222340]  Possible unsafe locking scenario:
Jul 30 11:46:49 closure kernel: [    4.222340] 
Jul 30 11:46:49 closure kernel: [    4.222344]        CPU0                    CPU1
Jul 30 11:46:49 closure kernel: [    4.222347]        ----                    ----
Jul 30 11:46:49 closure kernel: [    4.222350]   lock(&(&backlight_notifier)->rwsem);
Jul 30 11:46:49 closure kernel: [    4.222353]                                lock(init_mutex);
Jul 30 11:46:49 closure kernel: [    4.222357]                                lock(&(&backlight_notifier)->rwsem);
Jul 30 11:46:49 closure kernel: [    4.222363]   lock(init_mutex);
Jul 30 11:46:49 closure kernel: [    4.222366] 
Jul 30 11:46:49 closure kernel: [    4.222366]  *** DEADLOCK ***
Jul 30 11:46:49 closure kernel: [    4.222366] 
Jul 30 11:46:49 closure kernel: [    4.222371] 4 locks held by modprobe/503:
Jul 30 11:46:49 closure kernel: [    4.222374]  #0:  (&dev->mutex){......}, at: [<ffffffff813d3ff1>] device_lock+0xf/0x11
Jul 30 11:46:49 closure kernel: [    4.222381]  #1:  (&dev->mutex){......}, at: [<ffffffff813d3ff1>] device_lock+0xf/0x11
Jul 30 11:46:49 closure kernel: [    4.222388]  #2:  (drm_global_mutex){+.+.+.}, at: [<ffffffffc053aeb9>] drm_dev_register+0x24/0xfd [drm]
Jul 30 11:46:49 closure kernel: [    4.222402]  #3:  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7c9>] __blocking_notifier_call_chain+0x37/0x69
Jul 30 11:46:49 closure kernel: [    4.222410] 
Jul 30 11:46:49 closure kernel: [    4.222410] stack backtrace:
Jul 30 11:46:49 closure kernel: [    4.222416] CPU: 7 PID: 503 Comm: modprobe Not tainted 4.2.0-rc4-13906-g5f1b75cd #16
Jul 30 11:46:49 closure kernel: [    4.222420] Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET59WW (2.07 ) 02/12/2014
Jul 30 11:46:49 closure kernel: [    4.222425]  ffffffff8280a230 ffff8800c992b5d8 ffffffff8161a71e 0000000000000006
Jul 30 11:46:49 closure kernel: [    4.222431]  ffffffff8280a230 ffff8800c992b628 ffffffff810b9adf ffffffff82265780
Jul 30 11:46:49 closure kernel: [    4.222437]  ffff880405588000 0000000000000004 ffff880405588880 0000000000000004
Jul 30 11:46:49 closure kernel: [    4.222443] Call Trace:
Jul 30 11:46:49 closure kernel: [    4.222447]  [<ffffffff8161a71e>] dump_stack+0x4c/0x65
Jul 30 11:46:49 closure kernel: [    4.222451]  [<ffffffff810b9adf>] print_circular_bug+0x1f8/0x209
Jul 30 11:46:49 closure kernel: [    4.222455]  [<ffffffff810bb77a>] __lock_acquire+0xc55/0xf54
Jul 30 11:46:49 closure kernel: [    4.222460]  [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
Jul 30 11:46:49 closure kernel: [    4.222464]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.222469]  [<ffffffff8161d83a>] mutex_lock_nested+0x70/0x391
Jul 30 11:46:49 closure kernel: [    4.222472]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.222476]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.222480]  [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
Jul 30 11:46:49 closure kernel: [    4.222484]  [<ffffffff8138b505>] acpi_video_backlight_notify+0x19/0x2f
Jul 30 11:46:49 closure kernel: [    4.222488]  [<ffffffff8109a442>] notifier_call_chain+0x4c/0x71
Jul 30 11:46:49 closure kernel: [    4.222492]  [<ffffffff8109a7e2>] __blocking_notifier_call_chain+0x50/0x69
Jul 30 11:46:49 closure kernel: [    4.222496]  [<ffffffff8109a80f>] blocking_notifier_call_chain+0x14/0x16
Jul 30 11:46:49 closure kernel: [    4.222500]  [<ffffffff8134f023>] backlight_device_register+0x1df/0x1f1
Jul 30 11:46:49 closure kernel: [    4.222530]  [<ffffffffc07b3061>] intel_backlight_register+0xf0/0x157 [i915]
Jul 30 11:46:49 closure kernel: [    4.222556]  [<ffffffffc078c843>] intel_modeset_gem_init+0x158/0x164 [i915]
Jul 30 11:46:49 closure kernel: [    4.222584]  [<ffffffffc07b997c>] i915_driver_load+0xf1c/0x1139 [i915]
Jul 30 11:46:49 closure kernel: [    4.222589]  [<ffffffff810ba715>] ? mark_held_locks+0x56/0x6c
Jul 30 11:46:49 closure kernel: [    4.222593]  [<ffffffff81620836>] ? _raw_spin_unlock_irqrestore+0x3f/0x4d
Jul 30 11:46:49 closure kernel: [    4.222597]  [<ffffffff810ba89c>] ? trace_hardirqs_on_caller+0x171/0x18d
Jul 30 11:46:49 closure kernel: [    4.222607]  [<ffffffffc053af19>] drm_dev_register+0x84/0xfd [drm]
Jul 30 11:46:49 closure kernel: [    4.222618]  [<ffffffffc053d77e>] drm_get_pci_dev+0x102/0x1bc [drm]
Jul 30 11:46:49 closure kernel: [    4.222636]  [<ffffffffc07291e2>] i915_pci_probe+0x4f/0x51 [i915]
Jul 30 11:46:49 closure kernel: [    4.222640]  [<ffffffff81333ad3>] pci_device_probe+0x74/0xd6
Jul 30 11:46:49 closure kernel: [    4.222644]  [<ffffffff813d4a2e>] ? driver_probe_device+0x387/0x387
Jul 30 11:46:49 closure kernel: [    4.222648]  [<ffffffff813d4806>] driver_probe_device+0x15f/0x387
Jul 30 11:46:49 closure kernel: [    4.222652]  [<ffffffff813d4a2e>] ? driver_probe_device+0x387/0x387
Jul 30 11:46:49 closure kernel: [    4.222655]  [<ffffffff813d4a81>] __driver_attach+0x53/0x74
Jul 30 11:46:49 closure kernel: [    4.222659]  [<ffffffff813d2aa0>] bus_for_each_dev+0x6f/0x89
Jul 30 11:46:49 closure kernel: [    4.222662]  [<ffffffff813d41f0>] driver_attach+0x1e/0x20
Jul 30 11:46:49 closure kernel: [    4.222666]  [<ffffffff813d3e33>] bus_add_driver+0x140/0x238
Jul 30 11:46:49 closure kernel: [    4.222670]  [<ffffffff813d53d8>] driver_register+0x8f/0xcc
Jul 30 11:46:49 closure kernel: [    4.222674]  [<ffffffff81332be1>] __pci_register_driver+0x5e/0x62
Jul 30 11:46:49 closure kernel: [    4.222677]  [<ffffffffc081f000>] ? 0xffffffffc081f000
Jul 30 11:46:49 closure kernel: [    4.222687]  [<ffffffffc053d890>] drm_pci_init+0x58/0xda [drm]
Jul 30 11:46:49 closure kernel: [    4.222690]  [<ffffffffc081f000>] ? 0xffffffffc081f000
Jul 30 11:46:49 closure kernel: [    4.222708]  [<ffffffffc081f0a0>] i915_init+0xa0/0xa8 [i915]
Jul 30 11:46:49 closure kernel: [    4.222712]  [<ffffffffc081f000>] ? 0xffffffffc081f000
Jul 30 11:46:49 closure kernel: [    4.222716]  [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
Jul 30 11:46:49 closure kernel: [    4.222719]  [<ffffffff8161994d>] ? do_init_module+0x28/0x1e3
Jul 30 11:46:49 closure kernel: [    4.222723]  [<ffffffff81199350>] ? kmem_cache_alloc_trace+0xba/0xcc
Jul 30 11:46:49 closure kernel: [    4.222727]  [<ffffffff81619985>] do_init_module+0x60/0x1e3
Jul 30 11:46:49 closure kernel: [    4.222731]  [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
Jul 30 11:46:49 closure kernel: [    4.222736]  [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
Jul 30 11:46:49 closure kernel: [    4.222739]  [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73
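
The report above describes two lock orderings: init_mutex is held while the backlight_notifier rwsem is taken (chain #1), and the rwsem is held while init_mutex is taken (chain #0). A toy model of that kind of check is sketched below. It is not lockdep's actual implementation; the graph type, function names, and class numbering are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* Hypothetical class ids named after the two locks in the report. */
enum { INIT_MUTEX = 0, BACKLIGHT_RWSEM = 1 };

/* dep[a][b] is true once class b has been acquired while a was held. */
struct lock_graph {
	bool dep[MAX_CLASSES][MAX_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquired while held". Returns false, without recording the
 * edge, if the new ordering would close a cycle in the graph.
 */
static bool record_acquire(struct lock_graph *g, int held, int acquired)
{
	bool seen[MAX_CLASSES] = { false };

	if (reachable(g, acquired, held, seen))
		return false;
	g->dep[held][acquired] = true;
	return true;
}
```

Recording INIT_MUTEX then BACKLIGHT_RWSEM succeeds; recording the reverse order afterwards is rejected, which mirrors the "possible circular locking dependency" in the log.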
Takashi Iwai July 30, 2015, 3:57 p.m. UTC | #5
On Thu, 30 Jul 2015 17:32:28 +0200,
Theodore Ts'o wrote:
> 
> On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> > On Wed, Jul 29, 2015 at 10:18:16PM -0700, Linus Torvalds wrote:
> > >  drivers/gpu/drm/drm_atomic_helper.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > index 5b59d5ad7d1c..aac212297b49 100644
> > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > @@ -230,10 +230,12 @@ update_connector_routing(struct drm_atomic_state *state, int conn_idx)
> > >  	}
> > >  
> > >  	connector_state->best_encoder = new_encoder;
> > > -	idx = drm_crtc_index(connector_state->crtc);
> > > +	if (connector_state->crtc) {
> > > +		idx = drm_crtc_index(connector_state->crtc);
> > >  
> > > -	crtc_state = state->crtc_states[idx];
> > > -	crtc_state->mode_changed = true;
> > > +		crtc_state = state->crtc_states[idx];
> > > +		crtc_state->mode_changed = true;
> > > +	}
> > 
> > This shouldn't happen since if it does we ended up stealing the encoder
> > from the connector itself (we do check for connector_state->crtc earlier)
> > and that would be a bug. I haven't figured out a precise theory but my
> > guess is on the best_encoder selection, and indeed dp mst encoder
> > selection seems to have gone belly up in 4.2 with the bisected commit.
> 
> Well, I just tested Linus's patch and it works.
> 
> BTW, is there any chance that I can suspend my laptop, and then move
> it from my docking station at home (where I have a Dell 30" display)
> to my docking station at work (where I have a Dell 24" display), and
> actually have the new monitor be detected?  For at least the past
> year, I have had to reboot in order to be able to use the external
> monitor.  This used to work, but it's been a very long-standing
> regression.  I understand that Multi-stream DP is an evil horrible hack,
> and supporting it is painful, but this used to work, and it hasn't in
> a long time.  :-(

Relevant with this?
   https://bugs.freedesktop.org/show_bug.cgi?id=89589

I wanted to check this by myself, too, as the same bug was reported to
openSUSE bugzilla, but I had no hardware showing it.


Takashi

Theodore Y. Ts'o July 30, 2015, 3:59 p.m. UTC | #6
On Thu, Jul 30, 2015 at 11:50:29AM -0400, Theodore Ts'o wrote:
> I've tried pulling in your patches from fixes-stuff, onto Linus's tree
> (without Linus's fix), and the good news is that I'm no longer
> crashing on boot.
> 
> The *bad* news is that (a) it breaks the external monitor attached to
> the docking station completely (this was working with Linus's patch),
> and (b) it's triggering a LOCKDEP failure.

Well, that's not fair.  Even with Linus's fix, there is still a
LOCKDEP failure.  And a few more i915 WARNINGS.  But at least the
external monitor works, so this is what I'm using.  Enclosed please
find a dmesg with the lockdep and i915 warnings and my .config.  The
kernel that I used can be found at:

https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git/log/?h=i915-test-4.2.0-rc4

						- Ted

Daniel Vetter July 30, 2015, 4 p.m. UTC | #7
On Thu, Jul 30, 2015 at 11:50:29AM -0400, Theodore Ts'o wrote:
> On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> > I have 4 patches in git://people.freedesktop.org/~danvet/drm fixes-stuff
> > but I couldn't test them yet since no dp mst here and I didn't find
> > anything that would ship faster than 1-2 weeks yet. I'll try to get some
> > other people here to test it meanwhile too.
> 
> I've tried pulling in your patches from fixes-stuff, onto Linus's tree
> (without Linus's fix), and the good news is that I'm no longer
> crashing on boot.

Ok so I'm not completely clueless yet, the encoder confusion indeed
resulted in the follow-up crash. But obviously I don't understand yet
exactly what's going on if this breaks the display.

> The *bad* news is that (a) it breaks the external monitor attached to
> the docking station completely (this was working with Linus's patch),
> and (b) it's triggering a LOCKDEP failure.

The lockdep splat is all in the driver load before we do any modeset at
all, so shouldn't have changed between these patches. Are you sure it's a
regression due to mine and wasn't there before?

> So even though Linus's patch wasn't supposed to work, I think I'm
> going to go back to it....

Well I found some dp mst hubs meanwhile so hopefully tomorrow I can test
myself what's going wrong here.
-Daniel

> 
> 					- Ted
> 
> 
> Jul 30 11:46:49 closure kernel: [    4.221951] 
> Jul 30 11:46:49 closure kernel: [    4.221954] ======================================================
> Jul 30 11:46:49 closure kernel: [    4.221957] [ INFO: possible circular locking dependency detected ]
> Jul 30 11:46:49 closure kernel: [    4.221960] 4.2.0-rc4-13906-g5f1b75cd #16 Not tainted
> Jul 30 11:46:49 closure kernel: [    4.221963] -------------------------------------------------------
> Jul 30 11:46:49 closure kernel: [    4.221966] modprobe/503 is trying to acquire lock:
> Jul 30 11:46:49 closure kernel: [    4.221968]  (init_mutex){+.+.+.}, at: [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.221977] 
> Jul 30 11:46:49 closure kernel: [    4.221977] but task is already holding lock:
> Jul 30 11:46:49 closure kernel: [    4.221979]  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7c9>] __blocking_notifier_call_chain+0x37/0x69
> Jul 30 11:46:49 closure kernel: [    4.221987] 
> Jul 30 11:46:49 closure kernel: [    4.221987] which lock already depends on the new lock.
> Jul 30 11:46:49 closure kernel: [    4.221987] 
> Jul 30 11:46:49 closure kernel: [    4.221990] 
> Jul 30 11:46:49 closure kernel: [    4.221990] the existing dependency chain (in reverse order) is:
> Jul 30 11:46:49 closure kernel: [    4.221995] 
> Jul 30 11:46:49 closure kernel: [    4.221995] -> #1 (&(&backlight_notifier)->rwsem){++++..}:
> Jul 30 11:46:49 closure kernel: [    4.222001]        [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
> Jul 30 11:46:49 closure kernel: [    4.222007]        [<ffffffff8161f1db>] down_write+0x46/0x8a
> Jul 30 11:46:49 closure kernel: [    4.222012]        [<ffffffff8109a6c0>] blocking_notifier_chain_register+0x36/0x57
> Jul 30 11:46:49 closure kernel: [    4.222017]        [<ffffffff8134eb4e>] backlight_register_notifier+0x18/0x1a
> Jul 30 11:46:49 closure kernel: [    4.222022]        [<ffffffff8138b463>] acpi_video_get_backlight_type+0xfa/0x164
> Jul 30 11:46:49 closure kernel: [    4.222028]        [<ffffffffc03a1e45>] 0xffffffffc03a1e45
> Jul 30 11:46:49 closure audispd: No plugins found, exiting
> Jul 30 11:46:49 closure kernel: [    4.222032]        [<ffffffffc03a28a8>] 0xffffffffc03a28a8
> Jul 30 11:46:49 closure kernel: [    4.222036]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
> Jul 30 11:46:49 closure kernel: [    4.222042]        [<ffffffff81619985>] do_init_module+0x60/0x1e3
> Jul 30 11:46:49 closure kernel: [    4.222047]        [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
> Jul 30 11:46:49 closure kernel: [    4.222052]        [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
> Jul 30 11:46:49 closure kernel: [    4.222056]        [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73
> Jul 30 11:46:49 closure kernel: [    4.222060] 
> Jul 30 11:46:49 closure kernel: [    4.222060] -> #0 (init_mutex){+.+.+.}:
> Jul 30 11:46:49 closure kernel: [    4.222065]        [<ffffffff810bb77a>] __lock_acquire+0xc55/0xf54
> Jul 30 11:46:49 closure kernel: [    4.222070]        [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
> Jul 30 11:46:49 closure kernel: [    4.222074]        [<ffffffff8161d83a>] mutex_lock_nested+0x70/0x391
> Jul 30 11:46:49 closure kernel: [    4.222078]        [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.222083]        [<ffffffff8138b505>] acpi_video_backlight_notify+0x19/0x2f
> Jul 30 11:46:49 closure kernel: [    4.222088]        [<ffffffff8109a442>] notifier_call_chain+0x4c/0x71
> Jul 30 11:46:49 closure kernel: [    4.222092]        [<ffffffff8109a7e2>] __blocking_notifier_call_chain+0x50/0x69
> Jul 30 11:46:49 closure kernel: [    4.222098]        [<ffffffff8109a80f>] blocking_notifier_call_chain+0x14/0x16
> Jul 30 11:46:49 closure kernel: [    4.222103]        [<ffffffff8134f023>] backlight_device_register+0x1df/0x1f1
> Jul 30 11:46:49 closure kernel: [    4.222108]        [<ffffffffc07b3061>] intel_backlight_register+0xf0/0x157 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222146]        [<ffffffffc078c843>] intel_modeset_gem_init+0x158/0x164 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222176]        [<ffffffffc07b997c>] i915_driver_load+0xf1c/0x1139 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222205]        [<ffffffffc053af19>] drm_dev_register+0x84/0xfd [drm]
> Jul 30 11:46:49 closure kernel: [    4.222217]        [<ffffffffc053d77e>] drm_get_pci_dev+0x102/0x1bc [drm]
> Jul 30 11:46:49 closure kernel: [    4.222228]        [<ffffffffc07291e2>] i915_pci_probe+0x4f/0x51 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222247]        [<ffffffff81333ad3>] pci_device_probe+0x74/0xd6
> Jul 30 11:46:49 closure kernel: [    4.222253]        [<ffffffff813d4806>] driver_probe_device+0x15f/0x387
> Jul 30 11:46:49 closure kernel: [    4.222257]        [<ffffffff813d4a81>] __driver_attach+0x53/0x74
> Jul 30 11:46:49 closure kernel: [    4.222262]        [<ffffffff813d2aa0>] bus_for_each_dev+0x6f/0x89
> Jul 30 11:46:49 closure kernel: [    4.222266]        [<ffffffff813d41f0>] driver_attach+0x1e/0x20
> Jul 30 11:46:49 closure kernel: [    4.222269]        [<ffffffff813d3e33>] bus_add_driver+0x140/0x238
> Jul 30 11:46:49 closure kernel: [    4.222273]        [<ffffffff813d53d8>] driver_register+0x8f/0xcc
> Jul 30 11:46:49 closure kernel: [    4.222278]        [<ffffffff81332be1>] __pci_register_driver+0x5e/0x62
> Jul 30 11:46:49 closure kernel: [    4.222282]        [<ffffffffc053d890>] drm_pci_init+0x58/0xda [drm]
> Jul 30 11:46:49 closure kernel: [    4.222293]        [<ffffffffc081f0a0>] i915_init+0xa0/0xa8 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222312]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
> Jul 30 11:46:49 closure kernel: [    4.222317]        [<ffffffff81619985>] do_init_module+0x60/0x1e3
> Jul 30 11:46:49 closure kernel: [    4.222321]        [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
> Jul 30 11:46:49 closure kernel: [    4.222325]        [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
> Jul 30 11:46:49 closure kernel: [    4.222329]        [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73
> Jul 30 11:46:49 closure kernel: [    4.222334] 
> Jul 30 11:46:49 closure kernel: [    4.222334] other info that might help us debug this:
> Jul 30 11:46:49 closure kernel: [    4.222334] 
> Jul 30 11:46:49 closure kernel: [    4.222340]  Possible unsafe locking scenario:
> Jul 30 11:46:49 closure kernel: [    4.222340] 
> Jul 30 11:46:49 closure kernel: [    4.222344]        CPU0                    CPU1
> Jul 30 11:46:49 closure kernel: [    4.222347]        ----                    ----
> Jul 30 11:46:49 closure kernel: [    4.222350]   lock(&(&backlight_notifier)->rwsem);
> Jul 30 11:46:49 closure kernel: [    4.222353]                                lock(init_mutex);
> Jul 30 11:46:49 closure kernel: [    4.222357]                                lock(&(&backlight_notifier)->rwsem);
> Jul 30 11:46:49 closure kernel: [    4.222363]   lock(init_mutex);
> Jul 30 11:46:49 closure kernel: [    4.222366] 
> Jul 30 11:46:49 closure kernel: [    4.222366]  *** DEADLOCK ***
> Jul 30 11:46:49 closure kernel: [    4.222366] 
> Jul 30 11:46:49 closure kernel: [    4.222371] 4 locks held by modprobe/503:
> Jul 30 11:46:49 closure kernel: [    4.222374]  #0:  (&dev->mutex){......}, at: [<ffffffff813d3ff1>] device_lock+0xf/0x11
> Jul 30 11:46:49 closure kernel: [    4.222381]  #1:  (&dev->mutex){......}, at: [<ffffffff813d3ff1>] device_lock+0xf/0x11
> Jul 30 11:46:49 closure kernel: [    4.222388]  #2:  (drm_global_mutex){+.+.+.}, at: [<ffffffffc053aeb9>] drm_dev_register+0x24/0xfd [drm]
> Jul 30 11:46:49 closure kernel: [    4.222402]  #3:  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7c9>] __blocking_notifier_call_chain+0x37/0x69
> Jul 30 11:46:49 closure kernel: [    4.222410] 
> Jul 30 11:46:49 closure kernel: [    4.222410] stack backtrace:
> Jul 30 11:46:49 closure kernel: [    4.222416] CPU: 7 PID: 503 Comm: modprobe Not tainted 4.2.0-rc4-13906-g5f1b75cd #16
> Jul 30 11:46:49 closure kernel: [    4.222420] Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET59WW (2.07 ) 02/12/2014
> Jul 30 11:46:49 closure kernel: [    4.222425]  ffffffff8280a230 ffff8800c992b5d8 ffffffff8161a71e 0000000000000006
> Jul 30 11:46:49 closure kernel: [    4.222431]  ffffffff8280a230 ffff8800c992b628 ffffffff810b9adf ffffffff82265780
> Jul 30 11:46:49 closure kernel: [    4.222437]  ffff880405588000 0000000000000004 ffff880405588880 0000000000000004
> Jul 30 11:46:49 closure kernel: [    4.222443] Call Trace:
> Jul 30 11:46:49 closure kernel: [    4.222447]  [<ffffffff8161a71e>] dump_stack+0x4c/0x65
> Jul 30 11:46:49 closure kernel: [    4.222451]  [<ffffffff810b9adf>] print_circular_bug+0x1f8/0x209
> Jul 30 11:46:49 closure kernel: [    4.222455]  [<ffffffff810bb77a>] __lock_acquire+0xc55/0xf54
> Jul 30 11:46:49 closure kernel: [    4.222460]  [<ffffffff810bbe08>] lock_acquire+0x104/0x18b
> Jul 30 11:46:49 closure kernel: [    4.222464]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.222469]  [<ffffffff8161d83a>] mutex_lock_nested+0x70/0x391
> Jul 30 11:46:49 closure kernel: [    4.222472]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.222476]  [<ffffffff8138b380>] ? acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.222480]  [<ffffffff8138b380>] acpi_video_get_backlight_type+0x17/0x164
> Jul 30 11:46:49 closure kernel: [    4.222484]  [<ffffffff8138b505>] acpi_video_backlight_notify+0x19/0x2f
> Jul 30 11:46:49 closure kernel: [    4.222488]  [<ffffffff8109a442>] notifier_call_chain+0x4c/0x71
> Jul 30 11:46:49 closure kernel: [    4.222492]  [<ffffffff8109a7e2>] __blocking_notifier_call_chain+0x50/0x69
> Jul 30 11:46:49 closure kernel: [    4.222496]  [<ffffffff8109a80f>] blocking_notifier_call_chain+0x14/0x16
> Jul 30 11:46:49 closure kernel: [    4.222500]  [<ffffffff8134f023>] backlight_device_register+0x1df/0x1f1
> Jul 30 11:46:49 closure kernel: [    4.222530]  [<ffffffffc07b3061>] intel_backlight_register+0xf0/0x157 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222556]  [<ffffffffc078c843>] intel_modeset_gem_init+0x158/0x164 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222584]  [<ffffffffc07b997c>] i915_driver_load+0xf1c/0x1139 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222589]  [<ffffffff810ba715>] ? mark_held_locks+0x56/0x6c
> Jul 30 11:46:49 closure kernel: [    4.222593]  [<ffffffff81620836>] ? _raw_spin_unlock_irqrestore+0x3f/0x4d
> Jul 30 11:46:49 closure kernel: [    4.222597]  [<ffffffff810ba89c>] ? trace_hardirqs_on_caller+0x171/0x18d
> Jul 30 11:46:49 closure kernel: [    4.222607]  [<ffffffffc053af19>] drm_dev_register+0x84/0xfd [drm]
> Jul 30 11:46:49 closure kernel: [    4.222618]  [<ffffffffc053d77e>] drm_get_pci_dev+0x102/0x1bc [drm]
> Jul 30 11:46:49 closure kernel: [    4.222636]  [<ffffffffc07291e2>] i915_pci_probe+0x4f/0x51 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222640]  [<ffffffff81333ad3>] pci_device_probe+0x74/0xd6
> Jul 30 11:46:49 closure kernel: [    4.222644]  [<ffffffff813d4a2e>] ? driver_probe_device+0x387/0x387
> Jul 30 11:46:49 closure kernel: [    4.222648]  [<ffffffff813d4806>] driver_probe_device+0x15f/0x387
> Jul 30 11:46:49 closure kernel: [    4.222652]  [<ffffffff813d4a2e>] ? driver_probe_device+0x387/0x387
> Jul 30 11:46:49 closure kernel: [    4.222655]  [<ffffffff813d4a81>] __driver_attach+0x53/0x74
> Jul 30 11:46:49 closure kernel: [    4.222659]  [<ffffffff813d2aa0>] bus_for_each_dev+0x6f/0x89
> Jul 30 11:46:49 closure kernel: [    4.222662]  [<ffffffff813d41f0>] driver_attach+0x1e/0x20
> Jul 30 11:46:49 closure kernel: [    4.222666]  [<ffffffff813d3e33>] bus_add_driver+0x140/0x238
> Jul 30 11:46:49 closure kernel: [    4.222670]  [<ffffffff813d53d8>] driver_register+0x8f/0xcc
> Jul 30 11:46:49 closure kernel: [    4.222674]  [<ffffffff81332be1>] __pci_register_driver+0x5e/0x62
> Jul 30 11:46:49 closure kernel: [    4.222677]  [<ffffffffc081f000>] ? 0xffffffffc081f000
> Jul 30 11:46:49 closure kernel: [    4.222687]  [<ffffffffc053d890>] drm_pci_init+0x58/0xda [drm]
> Jul 30 11:46:49 closure kernel: [    4.222690]  [<ffffffffc081f000>] ? 0xffffffffc081f000
> Jul 30 11:46:49 closure kernel: [    4.222708]  [<ffffffffc081f0a0>] i915_init+0xa0/0xa8 [i915]
> Jul 30 11:46:49 closure kernel: [    4.222712]  [<ffffffffc081f000>] ? 0xffffffffc081f000
> Jul 30 11:46:49 closure kernel: [    4.222716]  [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
> Jul 30 11:46:49 closure kernel: [    4.222719]  [<ffffffff8161994d>] ? do_init_module+0x28/0x1e3
> Jul 30 11:46:49 closure kernel: [    4.222723]  [<ffffffff81199350>] ? kmem_cache_alloc_trace+0xba/0xcc
> Jul 30 11:46:49 closure kernel: [    4.222727]  [<ffffffff81619985>] do_init_module+0x60/0x1e3
> Jul 30 11:46:49 closure kernel: [    4.222731]  [<ffffffff810f0a5b>] load_module+0x1c42/0x2059
> Jul 30 11:46:49 closure kernel: [    4.222736]  [<ffffffff810f1046>] SyS_finit_module+0x85/0x92
> Jul 30 11:46:49 closure kernel: [    4.222739]  [<ffffffff8162109b>] entry_SYSCALL_64_fastpath+0x16/0x73
Linus Torvalds July 30, 2015, 6:14 p.m. UTC | #8
On Thu, Jul 30, 2015 at 8:57 AM, Takashi Iwai <tiwai@suse.de> wrote:
> On Thu, 30 Jul 2015 17:32:28 +0200,
> Theodore Ts'o wrote:
>>
>> BTW, is there any chance that I can suspend my laptop, and then move
>> it from my docking station at home (where I have a Dell 30" display)
>> to my docking station at work (where I have a Dell 24" display), and
>> actually have the new monitor be detected?  For at least the past
>> year, I have had to reboot in order to be able to use the external
>> monitor.  This used to work, but it's been a very long-standing
>> regression.  I understand that Multi-stream DP is an evil horrible hack,
>> and supporting it is painful, but this used to work, and it hasn't in
>> a long time.  :-(
>
> Is this one relevant?
>    https://bugs.freedesktop.org/show_bug.cgi?id=89589
>
> I wanted to check this myself, too, as the same bug was reported to
> the openSUSE Bugzilla, but I had no hardware showing it.

Hmm. That commit e7d6f7d70829 looks like it should still revert fairly
cleanly (just move the call to intel_dp_mst_resume() to before the
intel_modeset_setup_hw_state() call and locking).

Ted, worth checking out, even if that presumably ends up
re-introducing some WARN_ON's..

                    Linus
Daniel Vetter Aug. 3, 2015, 3:27 p.m. UTC | #9
On Thu, Jul 30, 2015 at 11:50:29AM -0400, Theodore Ts'o wrote:
> On Thu, Jul 30, 2015 at 04:40:02PM +0200, Daniel Vetter wrote:
> > I have 4 patches in git://people.freedesktop.org/~danvet/drm fixes-stuff
> > but I couldn't test them yet since no dp mst here and I didn't find
> > anything that would ship faster than 1-2 weeks yet. I'll try to get some
> > other people here to test it meanwhile too.
> 
> I've tried pulling in your patches from fixes-stuff, onto Linus's tree
> (without Linus's fix), and the good news is that I'm no longer
> crashing on boot.
> 
> The *bad* news is that (a) it breaks the external monitor attached to
> the docking station completely (this was working with Linus's patch),
> and (b) it's triggering a LOCKDEP failure.
> 
> So even though Linus's patch wasn't supposed to work, I think I'm
> going to go back to it....

Ok I updated fixes-stuff with just 2 patches which seem to be enough to
fix it. Plus a patch to convert Linus' hack into something we can keep
plus a drive-by WARNING fix in mst that got in the way for me.

Seems to work here in getting rid of the Oops. If this tests out for you
too I'll send a pull to Linus.

Thanks, Daniel
Theodore Y. Ts'o Aug. 3, 2015, 4:25 p.m. UTC | #10
On Mon, Aug 03, 2015 at 05:27:29PM +0200, Daniel Vetter wrote:
> 
> Ok I updated fixes-stuff with just 2 patches which seem to be enough to
> fix it. Plus a patch to convert Linus' hack into something we can keep
> plus a drive-by WARNING fix in mst that got in the way for me.
> 
> Seems to work here in getting rid of the Oops. If this tests out for you
> too I'll send a pull to Linus.

I've just tried pulling in your updated fixes-stuff, and it avoids the
oops and allows the external monitor to work correctly.  However, I'm
still seeing a large number of drm/i915 related warning messages and
other kernel kvetching.

Thanks!!

						- Ted

[    4.084198] [drm] Initialized drm 1.1.0 20060810
[    4.129576] [drm] Memory usable by graphics device = 2048M
[    4.129616] [drm] Replacing VGA console driver
[    4.130315] Console: switching to colour dummy device 80x25
[    4.145332] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    4.145334] [drm] Driver supports precise vblank timestamp query.
[    4.146184] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    4.163778] usbcore: registered new interface driver btusb
[    4.170719] ------------[ cut here ]------------
[    4.170749] WARNING: CPU: 0 PID: 463 at /usr/projects/linux/linux/drivers/gpu/drm/i915/intel_pm.c:2339 ilk_update_wm+0x71a/0xb27 [i915]()
[    4.170751] WARN_ON(!r->enable)
[    4.170752] Modules linked in:
[    4.170754]  btusb btrtl btbcm btintel iwlmvm(+) bluetooth mac80211 iwlwifi snd_hda_intel i915(+) drm_kms_helper snd_hda_codec cfg80211 drm snd_hwdep lpc_ich snd_hda_core intel_gtt thinkpad_acpi tpm_tis nvram tpm intel_smartconnect uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core sch_fq_codel kvm_intel kvm ecryptfs parport_pc ppdev lp parport autofs4 btrfs xor hid_generic usbhid hid raid6_pq microcode rtsx_pci_sdmmc ehci_pci e1000e rtsx_pci ehci_hcd xhci_pci ptp mfd_core pps_core xhci_hcd
[    4.170805] CPU: 0 PID: 463 Comm: systemd-udevd Not tainted 4.2.0-rc5-14194-g130583b #18
[    4.170807] Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET59WW (2.07 ) 02/12/2014
[    4.170809]  0000000000000009 ffff880403f0f4c8 ffffffff8161aaee 0000000000000006
[    4.170814]  ffff880403f0f518 ffff880403f0f508 ffffffff8107e5f0 0000000000000006
[    4.170818]  ffffffffc05ade43 ffff8800c8b70000 ffff8800c7f16000 ffff880405fb48b8
[    4.170823] Call Trace:
[    4.170829]  [<ffffffff8161aaee>] dump_stack+0x4c/0x65
[    4.170833]  [<ffffffff8107e5f0>] warn_slowpath_common+0xa1/0xbb
[    4.170856]  [<ffffffffc05ade43>] ? ilk_update_wm+0x71a/0xb27 [i915]
[    4.170859]  [<ffffffff8107e650>] warn_slowpath_fmt+0x46/0x48
[    4.170879]  [<ffffffffc05abb1e>] ? ilk_compute_wm_maximums+0x43/0xa2 [i915]
[    4.170899]  [<ffffffffc05ade43>] ilk_update_wm+0x71a/0xb27 [i915]
[    4.170921]  [<ffffffffc05afb2b>] intel_update_watermarks+0x1e/0x20 [i915]
[    4.170957]  [<ffffffffc05ff8d4>] haswell_crtc_disable+0x270/0x2ae [i915]
[    4.170989]  [<ffffffffc060199d>] intel_crtc_control+0xa0/0xe1 [i915]
[    4.171020]  [<ffffffffc0601a2b>] intel_crtc_update_dpms+0x4d/0x5d [i915]
[    4.171052]  [<ffffffffc0607dd9>] intel_modeset_setup_hw_state+0x7b0/0xa90 [i915]
[    4.171081]  [<ffffffffc05ec6de>] ? hsw_write64+0xcd/0xcd [i915]
[    4.171113]  [<ffffffffc060ab44>] ? ilk_fbc_disable+0x29/0x69 [i915]
[    4.171142]  [<ffffffffc0609512>] intel_modeset_init+0x130d/0x14e3 [i915]
[    4.171179]  [<ffffffffc0636962>] i915_driver_load+0xf05/0x1139 [i915]
[    4.171183]  [<ffffffff810ba787>] ? mark_held_locks+0x56/0x6c
[    4.171186]  [<ffffffff81620c06>] ? _raw_spin_unlock_irqrestore+0x3f/0x4d
[    4.171189]  [<ffffffff810ba90e>] ? trace_hardirqs_on_caller+0x171/0x18d
[    4.171204]  [<ffffffffc042cf19>] drm_dev_register+0x84/0xfd [drm]
[    4.171215]  [<ffffffffc042f77e>] drm_get_pci_dev+0x102/0x1bc [drm]
[    4.171237]  [<ffffffffc05a61e2>] i915_pci_probe+0x4f/0x51 [i915]
[    4.171240]  [<ffffffff81333c33>] pci_device_probe+0x74/0xd6
[    4.171245]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.171248]  [<ffffffff813d4966>] driver_probe_device+0x15f/0x387
[    4.171250]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.171252]  [<ffffffff813d4be1>] __driver_attach+0x53/0x74
[    4.171255]  [<ffffffff813d2c00>] bus_for_each_dev+0x6f/0x89
[    4.171257]  [<ffffffff813d4350>] driver_attach+0x1e/0x20
[    4.171260]  [<ffffffff813d3f93>] bus_add_driver+0x140/0x238
[    4.171263]  [<ffffffff813d5538>] driver_register+0x8f/0xcc
[    4.171266]  [<ffffffff81332d41>] __pci_register_driver+0x5e/0x62
[    4.171268]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171278]  [<ffffffffc042f890>] drm_pci_init+0x58/0xda [drm]
[    4.171281]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171301]  [<ffffffffc069c0a0>] i915_init+0xa0/0xa8 [i915]
[    4.171303]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171307]  [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
[    4.171310]  [<ffffffff81619d1d>] ? do_init_module+0x28/0x1e3
[    4.171313]  [<ffffffff81199429>] ? kmem_cache_alloc_trace+0xba/0xcc
[    4.171315]  [<ffffffff81619d55>] do_init_module+0x60/0x1e3
[    4.171319]  [<ffffffff810f0acd>] load_module+0x1c42/0x2059
[    4.171324]  [<ffffffff810f10b8>] SyS_finit_module+0x85/0x92
[    4.171327]  [<ffffffff8162145b>] entry_SYSCALL_64_fastpath+0x16/0x73
[    4.171329] ---[ end trace 7eb514b89de5fc4a ]---
[    4.171331] ------------[ cut here ]------------
[    4.171354] WARNING: CPU: 0 PID: 463 at /usr/projects/linux/linux/drivers/gpu/drm/i915/intel_pm.c:2339 ilk_update_wm+0x71a/0xb27 [i915]()
[    4.171355] WARN_ON(!r->enable)
[    4.171357] Modules linked in:
[    4.171358]  btusb btrtl btbcm btintel iwlmvm(+) bluetooth mac80211 iwlwifi snd_hda_intel i915(+) drm_kms_helper snd_hda_codec cfg80211 drm snd_hwdep lpc_ich snd_hda_core intel_gtt thinkpad_acpi tpm_tis nvram tpm intel_smartconnect uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core sch_fq_codel kvm_intel kvm ecryptfs parport_pc ppdev lp parport autofs4 btrfs xor hid_generic usbhid hid raid6_pq microcode rtsx_pci_sdmmc ehci_pci e1000e rtsx_pci ehci_hcd xhci_pci ptp mfd_core pps_core xhci_hcd
[    4.171404] CPU: 0 PID: 463 Comm: systemd-udevd Tainted: G        W       4.2.0-rc5-14194-g130583b #18
[    4.171406] Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET59WW (2.07 ) 02/12/2014
[    4.171408]  0000000000000009 ffff880403f0f4c8 ffffffff8161aaee 0000000000000006
[    4.171412]  ffff880403f0f518 ffff880403f0f508 ffffffff8107e5f0 0000000000000006
[    4.171417]  ffffffffc05ade43 ffff8800c8b70000 ffff8800c7f15000 ffff880405fb48b8
[    4.171421] Call Trace:
[    4.171424]  [<ffffffff8161aaee>] dump_stack+0x4c/0x65
[    4.171427]  [<ffffffff8107e5f0>] warn_slowpath_common+0xa1/0xbb
[    4.171449]  [<ffffffffc05ade43>] ? ilk_update_wm+0x71a/0xb27 [i915]
[    4.171452]  [<ffffffff8107e650>] warn_slowpath_fmt+0x46/0x48
[    4.171472]  [<ffffffffc05abb1e>] ? ilk_compute_wm_maximums+0x43/0xa2 [i915]
[    4.171491]  [<ffffffffc05ade43>] ilk_update_wm+0x71a/0xb27 [i915]
[    4.171513]  [<ffffffffc05afb2b>] intel_update_watermarks+0x1e/0x20 [i915]
[    4.171546]  [<ffffffffc05ff8d4>] haswell_crtc_disable+0x270/0x2ae [i915]
[    4.171579]  [<ffffffffc060199d>] intel_crtc_control+0xa0/0xe1 [i915]
[    4.171610]  [<ffffffffc0601a2b>] intel_crtc_update_dpms+0x4d/0x5d [i915]
[    4.171641]  [<ffffffffc0607dd9>] intel_modeset_setup_hw_state+0x7b0/0xa90 [i915]
[    4.171671]  [<ffffffffc05ec6de>] ? hsw_write64+0xcd/0xcd [i915]
[    4.171702]  [<ffffffffc060ab44>] ? ilk_fbc_disable+0x29/0x69 [i915]
[    4.171733]  [<ffffffffc0609512>] intel_modeset_init+0x130d/0x14e3 [i915]
[    4.171770]  [<ffffffffc0636962>] i915_driver_load+0xf05/0x1139 [i915]
[    4.171773]  [<ffffffff810ba787>] ? mark_held_locks+0x56/0x6c
[    4.171776]  [<ffffffff81620c06>] ? _raw_spin_unlock_irqrestore+0x3f/0x4d
[    4.171779]  [<ffffffff810ba90e>] ? trace_hardirqs_on_caller+0x171/0x18d
[    4.171791]  [<ffffffffc042cf19>] drm_dev_register+0x84/0xfd [drm]
[    4.171802]  [<ffffffffc042f77e>] drm_get_pci_dev+0x102/0x1bc [drm]
[    4.171825]  [<ffffffffc05a61e2>] i915_pci_probe+0x4f/0x51 [i915]
[    4.171828]  [<ffffffff81333c33>] pci_device_probe+0x74/0xd6
[    4.171831]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.171833]  [<ffffffff813d4966>] driver_probe_device+0x15f/0x387
[    4.171836]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.171838]  [<ffffffff813d4be1>] __driver_attach+0x53/0x74
[    4.171841]  [<ffffffff813d2c00>] bus_for_each_dev+0x6f/0x89
[    4.171844]  [<ffffffff813d4350>] driver_attach+0x1e/0x20
[    4.171846]  [<ffffffff813d3f93>] bus_add_driver+0x140/0x238
[    4.171849]  [<ffffffff813d5538>] driver_register+0x8f/0xcc
[    4.171852]  [<ffffffff81332d41>] __pci_register_driver+0x5e/0x62
[    4.171854]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171866]  [<ffffffffc042f890>] drm_pci_init+0x58/0xda [drm]
[    4.171868]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171890]  [<ffffffffc069c0a0>] i915_init+0xa0/0xa8 [i915]
[    4.171893]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.171896]  [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
[    4.171898]  [<ffffffff81619d1d>] ? do_init_module+0x28/0x1e3
[    4.171901]  [<ffffffff81199429>] ? kmem_cache_alloc_trace+0xba/0xcc
[    4.171904]  [<ffffffff81619d55>] do_init_module+0x60/0x1e3
[    4.171907]  [<ffffffff810f0acd>] load_module+0x1c42/0x2059
[    4.171911]  [<ffffffff810f10b8>] SyS_finit_module+0x85/0x92
[    4.171914]  [<ffffffff8162145b>] entry_SYSCALL_64_fastpath+0x16/0x73
[    4.171916] ---[ end trace 7eb514b89de5fc4b ]---
[    4.176978] Bluetooth: hci0: read Intel version: 370710018002030d48
[    4.176981] Bluetooth: hci0: Intel device is already patched. patch num: 48

[    4.181839] ======================================================
[    4.181844] [ INFO: possible circular locking dependency detected ]
[    4.181849] 4.2.0-rc5-14194-g130583b #18 Tainted: G        W      
[    4.181854] -------------------------------------------------------
[    4.181859] systemd-udevd/463 is trying to acquire lock:
[    4.181864]  (init_mutex){+.+.+.}, at: [<ffffffff8138b4e0>] acpi_video_get_backlight_type+0x17/0x164
[    4.181878] 
               but task is already holding lock:
[    4.181883]  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7cc>] __blocking_notifier_call_chain+0x37/0x69
[    4.181895] 
               which lock already depends on the new lock.

[    4.181902] 
               the existing dependency chain (in reverse order) is:
[    4.181912] 
               -> #1 (&(&backlight_notifier)->rwsem){++++..}:
[    4.181923]        [<ffffffff810bbe7a>] lock_acquire+0x104/0x18b
[    4.181932]        [<ffffffff8161f5ab>] down_write+0x46/0x8a
[    4.181942]        [<ffffffff8109a6c3>] blocking_notifier_chain_register+0x36/0x57
[    4.181953]        [<ffffffff8134ecae>] backlight_register_notifier+0x18/0x1a
[    4.181962]        [<ffffffff8138b5c3>] acpi_video_get_backlight_type+0xfa/0x164
[    4.181973]        [<ffffffffc03d2e45>] 0xffffffffc03d2e45
[    4.181981]        [<ffffffffc03d38a8>] 0xffffffffc03d38a8
[    4.181988]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
[    4.181997]        [<ffffffff81619d55>] do_init_module+0x60/0x1e3
[    4.182006]        [<ffffffff810f0acd>] load_module+0x1c42/0x2059
[    4.182015]        [<ffffffff810f10b8>] SyS_finit_module+0x85/0x92
[    4.182023]        [<ffffffff8162145b>] entry_SYSCALL_64_fastpath+0x16/0x73
[    4.182031] 
               -> #0 (init_mutex){+.+.+.}:
[    4.182042]        [<ffffffff810bb7ec>] __lock_acquire+0xc55/0xf54
[    4.182050]        [<ffffffff810bbe7a>] lock_acquire+0x104/0x18b
[    4.182058]        [<ffffffff8161dc0a>] mutex_lock_nested+0x70/0x391
[    4.182066]        [<ffffffff8138b4e0>] acpi_video_get_backlight_type+0x17/0x164
[    4.182077]        [<ffffffff8138b665>] acpi_video_backlight_notify+0x19/0x2f
[    4.182086]        [<ffffffff8109a445>] notifier_call_chain+0x4c/0x71
[    4.182094]        [<ffffffff8109a7e5>] __blocking_notifier_call_chain+0x50/0x69
[    4.182105]        [<ffffffff8109a812>] blocking_notifier_call_chain+0x14/0x16
[    4.182116]        [<ffffffff8134f183>] backlight_device_register+0x1df/0x1f1
[    4.182125]        [<ffffffffc063005e>] intel_backlight_register+0xf0/0x157 [i915]
[    4.182174]        [<ffffffffc0609840>] intel_modeset_gem_init+0x158/0x164 [i915]
[    4.182214]        [<ffffffffc0636979>] i915_driver_load+0xf1c/0x1139 [i915]
[    4.182253]        [<ffffffffc042cf19>] drm_dev_register+0x84/0xfd [drm]
[    4.182271]        [<ffffffffc042f77e>] drm_get_pci_dev+0x102/0x1bc [drm]
[    4.182287]        [<ffffffffc05a61e2>] i915_pci_probe+0x4f/0x51 [i915]
[    4.182314]        [<ffffffff81333c33>] pci_device_probe+0x74/0xd6
[    4.182322]        [<ffffffff813d4966>] driver_probe_device+0x15f/0x387
[    4.182331]        [<ffffffff813d4be1>] __driver_attach+0x53/0x74
[    4.182339]        [<ffffffff813d2c00>] bus_for_each_dev+0x6f/0x89
[    4.182347]        [<ffffffff813d4350>] driver_attach+0x1e/0x20
[    4.182355]        [<ffffffff813d3f93>] bus_add_driver+0x140/0x238
[    4.182363]        [<ffffffff813d5538>] driver_register+0x8f/0xcc
[    4.182371]        [<ffffffff81332d41>] __pci_register_driver+0x5e/0x62
[    4.182379]        [<ffffffffc042f890>] drm_pci_init+0x58/0xda [drm]
[    4.182396]        [<ffffffffc069c0a0>] i915_init+0xa0/0xa8 [i915]
[    4.182423]        [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
[    4.182432]        [<ffffffff81619d55>] do_init_module+0x60/0x1e3
[    4.182440]        [<ffffffff810f0acd>] load_module+0x1c42/0x2059
[    4.182448]        [<ffffffff810f10b8>] SyS_finit_module+0x85/0x92
[    4.182456]        [<ffffffff8162145b>] entry_SYSCALL_64_fastpath+0x16/0x73
[    4.182465] 
               other info that might help us debug this:

[    4.182477]  Possible unsafe locking scenario:

[    4.182486]        CPU0                    CPU1
[    4.182491]        ----                    ----
[    4.182497]   lock(&(&backlight_notifier)->rwsem);
[    4.182504]                                lock(init_mutex);
[    4.182512]                                lock(&(&backlight_notifier)->rwsem);
[    4.182522]   lock(init_mutex);
[    4.182528] 
                *** DEADLOCK ***

[    4.182540] 4 locks held by systemd-udevd/463:
[    4.182546]  #0:  (&dev->mutex){......}, at: [<ffffffff813d4151>] device_lock+0xf/0x11
[    4.182560]  #1:  (&dev->mutex){......}, at: [<ffffffff813d4151>] device_lock+0xf/0x11
[    4.182574]  #2:  (drm_global_mutex){+.+.+.}, at: [<ffffffffc042ceb9>] drm_dev_register+0x24/0xfd [drm]
[    4.182596]  #3:  (&(&backlight_notifier)->rwsem){++++..}, at: [<ffffffff8109a7cc>] __blocking_notifier_call_chain+0x37/0x69
[    4.182612] 
               stack backtrace:
[    4.182622] CPU: 0 PID: 463 Comm: systemd-udevd Tainted: G        W       4.2.0-rc5-14194-g130583b #18
[    4.182632] Hardware name: LENOVO 20BECTO1WW/20BECTO1WW, BIOS GMET59WW (2.07 ) 02/12/2014
[    4.182642]  ffffffff8280b780 ffff880403f0f5d8 ffffffff8161aaee 0000000000000006
[    4.182654]  ffffffff8280b780 ffff880403f0f628 ffffffff810b9b51 ffffffff82265780
[    4.182667]  ffff880403de0000 0000000000000004 ffff880403de0880 0000000000000004
[    4.182679] Call Trace:
[    4.182685]  [<ffffffff8161aaee>] dump_stack+0x4c/0x65
[    4.182693]  [<ffffffff810b9b51>] print_circular_bug+0x1f8/0x209
[    4.182701]  [<ffffffff810bb7ec>] __lock_acquire+0xc55/0xf54
[    4.182710]  [<ffffffff810bbe7a>] lock_acquire+0x104/0x18b
[    4.182717]  [<ffffffff8138b4e0>] ? acpi_video_get_backlight_type+0x17/0x164
[    4.182726]  [<ffffffff8161dc0a>] mutex_lock_nested+0x70/0x391
[    4.182734]  [<ffffffff8138b4e0>] ? acpi_video_get_backlight_type+0x17/0x164
[    4.182742]  [<ffffffff8138b4e0>] ? acpi_video_get_backlight_type+0x17/0x164
[    4.182750]  [<ffffffff8138b4e0>] acpi_video_get_backlight_type+0x17/0x164
[    4.182759]  [<ffffffff8138b665>] acpi_video_backlight_notify+0x19/0x2f
[    4.182766]  [<ffffffff8109a445>] notifier_call_chain+0x4c/0x71
[    4.182774]  [<ffffffff8109a7e5>] __blocking_notifier_call_chain+0x50/0x69
[    4.182782]  [<ffffffff8109a812>] blocking_notifier_call_chain+0x14/0x16
[    4.182790]  [<ffffffff8134f183>] backlight_device_register+0x1df/0x1f1
[    4.182833]  [<ffffffffc063005e>] intel_backlight_register+0xf0/0x157 [i915]
[    4.182872]  [<ffffffffc0609840>] intel_modeset_gem_init+0x158/0x164 [i915]
[    4.182915]  [<ffffffffc0636979>] i915_driver_load+0xf1c/0x1139 [i915]
[    4.182924]  [<ffffffff810ba787>] ? mark_held_locks+0x56/0x6c
[    4.182932]  [<ffffffff81620c06>] ? _raw_spin_unlock_irqrestore+0x3f/0x4d
[    4.182940]  [<ffffffff810ba90e>] ? trace_hardirqs_on_caller+0x171/0x18d
[    4.182956]  [<ffffffffc042cf19>] drm_dev_register+0x84/0xfd [drm]
[    4.182972]  [<ffffffffc042f77e>] drm_get_pci_dev+0x102/0x1bc [drm]
[    4.182998]  [<ffffffffc05a61e2>] i915_pci_probe+0x4f/0x51 [i915]
[    4.183006]  [<ffffffff81333c33>] pci_device_probe+0x74/0xd6
[    4.183014]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.183021]  [<ffffffff813d4966>] driver_probe_device+0x15f/0x387
[    4.183029]  [<ffffffff813d4b8e>] ? driver_probe_device+0x387/0x387
[    4.183036]  [<ffffffff813d4be1>] __driver_attach+0x53/0x74
[    4.183043]  [<ffffffff813d2c00>] bus_for_each_dev+0x6f/0x89
[    4.183050]  [<ffffffff813d4350>] driver_attach+0x1e/0x20
[    4.183058]  [<ffffffff813d3f93>] bus_add_driver+0x140/0x238
[    4.183065]  [<ffffffff813d5538>] driver_register+0x8f/0xcc
[    4.183073]  [<ffffffff81332d41>] __pci_register_driver+0x5e/0x62
[    4.183080]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.183095]  [<ffffffffc042f890>] drm_pci_init+0x58/0xda [drm]
[    4.183102]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.183126]  [<ffffffffc069c0a0>] i915_init+0xa0/0xa8 [i915]
[    4.183132]  [<ffffffffc069c000>] ? 0xffffffffc069c000
[    4.183139]  [<ffffffff810003c7>] do_one_initcall+0x19a/0x1af
[    4.183146]  [<ffffffff81619d1d>] ? do_init_module+0x28/0x1e3
[    4.183153]  [<ffffffff81199429>] ? kmem_cache_alloc_trace+0xba/0xcc
[    4.183161]  [<ffffffff81619d55>] do_init_module+0x60/0x1e3
[    4.183169]  [<ffffffff810f0acd>] load_module+0x1c42/0x2059
[    4.183178]  [<ffffffff810f10b8>] SyS_finit_module+0x85/0x92
[    4.183185]  [<ffffffff8162145b>] entry_SYSCALL_64_fastpath+0x16/0x73
[    4.186598] ACPI: Video Device [VID] (multi-head: yes  rom: no  post: no)
[    4.191522] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    4.191536] [drm] Initialized i915 1.6.0 20150522 for 0000:00:02.0 on minor 0
[    4.191691] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
[    4.248792] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
[    4.322899] fbcon: inteldrmfb (fb0) is primary device
[    5.440946] Console: switching to colour frame buffer device 360x101
[    5.452747] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[    5.452767] i915 0000:00:02.0: registered panic notifier
Linus Torvalds Aug. 3, 2015, 5:24 p.m. UTC | #11
On Mon, Aug 3, 2015 at 9:25 AM, Theodore Ts'o <tytso@mit.edu> wrote:
>
> I've just tried pulling in your updated fixes-stuff, and it avoids the
> oops and allows the external monitor to work correctly.

Good. Have either of you tested the suspend/resume behavior? Is that fixed too?

>                      However, I'm
> still seeing a large number of drm/i915 related warning messages and
> other kernel kvetching.

I suspect I can live with that for now. The lockdep one looks like
it's mainly an initialization issue, so you'd never get the actual
deadlock in practice, but it's obviously annoying.  The intel_pm.c one
I'll have to defer to the i915 people for..

I'll be travelling much of this week (flying to Finland tomorrow, back
on Sunday - yay, 30h in airplanes for three days on the ground, but
it's my dad's bday), and my internet will be sporadic. But I'll have a
laptop and be able to pull stuff every once in a while.

It would be good to have this one resolved, and I just need to worry
about the remaining VM problem..

                           Linus

> [    4.170749] WARNING: CPU: 0 PID: 463 at drivers/gpu/drm/i915/intel_pm.c:2339 ilk_update_wm+0x71a/0xb27 [i915]()
> [    4.170751] WARN_ON(!r->enable)
Theodore Y. Ts'o Aug. 3, 2015, 6:49 p.m. UTC | #12
On Mon, Aug 03, 2015 at 10:24:53AM -0700, Linus Torvalds wrote:
> On Mon, Aug 3, 2015 at 9:25 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> >
> > I've just tried pulling in your updated fixes-stuff, and it avoids the
> > oops and allows the external monitor to work correctly.
> 
> Good. Have either of you tested the suspend/resume behavior? Is that fixed too?

No, I haven't had a chance to test the suspend/resume behavior,
because that requires suspending at work, going home, and connecting
to a dock which has a different monitor attached to it, and resuming
(or vice versa of suspending at home and then resuming at work).

So it's a bit trickier for me to test.  It's also not a regression:
the workaround of rebooting is annoying, but I've lived with it for
several releases now.  I'll try the two patches/changes that folks
had suggested, hopefully later this week.

						- Ted
Daniel Vetter Aug. 3, 2015, 10:05 p.m. UTC | #13
On Mon, Aug 3, 2015 at 7:24 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>>                      However, I'm
>> still seeing a large number of drm/i915 related warning messages and
>> other kernel kvetching.
>
> I suspect I can live with that for now. The lockdep one looks like
> it's mainly an initialization issue, so you'd never get the actual
> deadlock in practice, but it's obviously annoying.  The intel_pm.c one
> I'll have to defer to the i915 people for..

The lockdep splat is just acpi being inconsistent with init_mutex vs.
backlight notifier_chain (which has its own lock) calls. init_mutex
is new in 4.2 and has been added in

commit 87521e16a7abbf3fa337f56cb4d1e18247f15e8a
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Tue Jun 16 16:27:48 2015 +0200

    acpi-video-detect: Rewrite backlight interface selection logic


Not mine ;-) But adding relevant people.

I'll send you a pull for the mst one tomorrow and look into the
watermark fail in intel_pm.c too.
-Daniel
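
An illustrative aside on the lockdep splat discussed above: the kind of
inconsistency described (one path taking init_mutex and then the notifier
chain's lock, another path taking them in the opposite order) can be sketched
in user space. This is only a minimal model, assuming two hypothetical lock
classes named INIT_MUTEX and NOTIFIER_LOCK. It checks for a direct two-lock
inversion only, which is far simpler than what lockdep actually does, and it
is not the ACPI code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the two locks in the discussion. */
enum lock_class { INIT_MUTEX, NOTIFIER_LOCK, NR_CLASSES };

/* seen[a][b] becomes true once some path acquired b while holding a. */
static bool seen[NR_CLASSES][NR_CLASSES];

/*
 * Record "outer is held while inner is being acquired".
 * Returns true if the opposite order was recorded earlier, meaning the
 * two code paths use inconsistent ordering (the classic AB-BA pattern).
 */
static bool record_order(enum lock_class outer, enum lock_class inner)
{
	/* Recursive locking of one class is not modelled in this sketch. */
	assert(outer != inner);

	seen[outer][inner] = true;
	return seen[inner][outer];
}

/*
 * Replay two paths with opposite ordering and count the reports.
 * No lock is ever really taken, so nothing can deadlock here; like
 * lockdep, the check fires on the ordering alone.
 */
static int lock_order_demo(void)
{
	int reports = 0;

	/* Path 1 (for example, initialization): init_mutex first. */
	if (record_order(INIT_MUTEX, NOTIFIER_LOCK))
		reports++;

	/* Path 2 (for example, a notifier callback): reverse order. */
	if (record_order(NOTIFIER_LOCK, INIT_MUTEX))
		reports++;

	return reports;
}
```

Calling lock_order_demo() from a main() on a fresh table reports one
inversion, on the second path. That matches the observation above that such a
warning can appear even when the actual deadlock would never be hit in
practice: the report depends only on the orders having been seen, not on the
two paths ever running concurrently.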
Rafael J. Wysocki Aug. 4, 2015, 1:17 a.m. UTC | #14
On Tuesday, August 04, 2015 12:05:14 AM Daniel Vetter wrote:
> On Mon, Aug 3, 2015 at 7:24 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >>                      However, I'm
> >> still seeing a large number of drm/i915 related warning messages and
> >> other kernel kvetching.
> >
> > I suspect I can live with that for now. The lockdep one looks like
> > it's mainly an initialization issue, so you'd never get the actual
> > deadlock in practice, but it's obviously annoying.  The intel_pm.c one
> > I'll have to defer to the i915 people for..
> 
> The lockdep splat is just acpi being inconsistent with init_mutex vs.
> backlight notifier_chain (which has its own lock) calls. init_mutex
> is new in 4.2 and has been added in
> 
> commit 87521e16a7abbf3fa337f56cb4d1e18247f15e8a
> Author: Hans de Goede <hdegoede@redhat.com>
> Date:   Tue Jun 16 16:27:48 2015 +0200
> 
>     acpi-video-detect: Rewrite backlight interface selection logic
> 
> 
> Not mine ;-) But adding relevant people.

Hans, can you have a look at this, please?

Rafael
Patch

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 5b59d5ad7d1c..aac212297b49 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -230,10 +230,12 @@  update_connector_routing(struct drm_atomic_state *state, int conn_idx)
 	}
 
 	connector_state->best_encoder = new_encoder;
-	idx = drm_crtc_index(connector_state->crtc);
+	if (connector_state->crtc) {
+		idx = drm_crtc_index(connector_state->crtc);
 
-	crtc_state = state->crtc_states[idx];
-	crtc_state->mode_changed = true;
+		crtc_state = state->crtc_states[idx];
+		crtc_state->mode_changed = true;
+	}
 
 	DRM_DEBUG_ATOMIC("[CONNECTOR:%d:%s] using [ENCODER:%d:%s] on [CRTC:%d]\n",
 			 connector->base.id,