Message ID | 1462284220-14930-1-git-send-email-cpaul@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On Tue, May 03, 2016 at 10:03:40AM -0400, Lyude wrote:
> If an MST device is disconnected while the machine is suspended, the
> number of connectors will change as well after we call
> intel_dp_mst_resume(). This means that any previous atomic state we had
> before suspending is no longer valid, since it'll still be pointing to
> missing connectors. We need to check for this before committing the
> state, otherwise we'll kernel panic on resume whenever if any MST
> display was disconnected before we started resuming:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 [drm_kms_helper]
> Call Trace:
> [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915]
> [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0
> [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0
> [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm]
> [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90
> [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm]
> [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915]
> [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90
> [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915]
> [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915]
> [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0
> [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190
> [<ffffffff814da455>] device_resume+0xd5/0x1f0
> [<ffffffff814da58d>] async_resume+0x1d/0x50
> [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150
> [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0
> [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0
> [<ffffffff810ad038>] worker_thread+0x48/0x4e0
> [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0
> [<ffffffff810b3794>] kthread+0xe4/0x100
> [<ffffffff81742672>] ret_from_fork+0x22/0x50
> [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Lyude <cpaul@redhat.com>

This should be addressed by the connector refcounting fixes Dave Airlie
has for 4.7 (not all merged yet though). Can you please retest with those?
-Daniel

> ---
>  drivers/gpu/drm/i915/intel_display.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 6e0d828..252c06c 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -15945,6 +15945,17 @@ void intel_display_resume(struct drm_device *dev)
>  	dev_priv->modeset_restore_state = NULL;
>
>  	/*
> +	 * With MST, the number of connectors can change between suspend and
> +	 * resume, which means that the state we want to restore might now be
> +	 * impossible to use since it'll be pointing to non-existant
> +	 * connectors.
> +	 */
> +	if (state->num_connector != dev->mode_config.num_connector) {
> +		drm_atomic_state_free(state);
> +		state = NULL;
> +	}
> +
> +	/*
>  	 * This is a cludge because with real atomic modeset mode_config.mutex
>  	 * won't be taken. Unfortunately some probed state like
>  	 * audio_codec_enable is still protected by mode_config.mutex, so lock
> --
> 2.5.5
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Yeah, airlied said the same thing. This patch is intended just for 4.6,
since the refcounting patch isn't very likely to make it into 4.6.

On Tue, 2016-05-03 at 16:29 +0200, Daniel Vetter wrote:
> On Tue, May 03, 2016 at 10:03:40AM -0400, Lyude wrote:
> >
> > If an MST device is disconnected while the machine is suspended, the
> > number of connectors will change as well after we call
> > intel_dp_mst_resume(). This means that any previous atomic state we had
> > before suspending is no longer valid, since it'll still be pointing to
> > missing connectors. We need to check for this before committing the
> > state, otherwise we'll kernel panic on resume whenever if any MST
> > display was disconnected before we started resuming:
> >
> > BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> > IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 [drm_kms_helper]
> > Call Trace:
> > [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915]
> > [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0
> > [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0
> > [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm]
> > [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90
> > [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm]
> > [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915]
> > [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90
> > [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915]
> > [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915]
> > [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0
> > [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190
> > [<ffffffff814da455>] device_resume+0xd5/0x1f0
> > [<ffffffff814da58d>] async_resume+0x1d/0x50
> > [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150
> > [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0
> > [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0
> > [<ffffffff810ad038>] worker_thread+0x48/0x4e0
> > [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0
> > [<ffffffff810b3794>] kthread+0xe4/0x100
> > [<ffffffff81742672>] ret_from_fork+0x22/0x50
> > [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200
> >
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Lyude <cpaul@redhat.com>
> This should be addressed by the connector refcounting fixes Dave Airlie
> has for 4.7 (not all merged yet though). Can you please retest with those?
> -Daniel
>
> >
> > ---
> >  drivers/gpu/drm/i915/intel_display.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 6e0d828..252c06c 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -15945,6 +15945,17 @@ void intel_display_resume(struct drm_device *dev)
> >  	dev_priv->modeset_restore_state = NULL;
> >
> >  	/*
> > +	 * With MST, the number of connectors can change between suspend and
> > +	 * resume, which means that the state we want to restore might now be
> > +	 * impossible to use since it'll be pointing to non-existant
> > +	 * connectors.
> > +	 */
> > +	if (state->num_connector != dev->mode_config.num_connector) {
> > +		drm_atomic_state_free(state);
> > +		state = NULL;
> > +	}
> > +
> > +	/*
> >  	 * This is a cludge because with real atomic modeset mode_config.mutex
> >  	 * won't be taken. Unfortunately some probed state like
> >  	 * audio_codec_enable is still protected by mode_config.mutex, so lock
> > --
> > 2.5.5
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 6e0d828..252c06c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15945,6 +15945,17 @@ void intel_display_resume(struct drm_device *dev)
 	dev_priv->modeset_restore_state = NULL;
 
 	/*
+	 * With MST, the number of connectors can change between suspend and
+	 * resume, which means that the state we want to restore might now be
+	 * impossible to use since it'll be pointing to non-existant
+	 * connectors.
+	 */
+	if (state->num_connector != dev->mode_config.num_connector) {
+		drm_atomic_state_free(state);
+		state = NULL;
+	}
+
+	/*
 	 * This is a cludge because with real atomic modeset mode_config.mutex
 	 * won't be taken. Unfortunately some probed state like
 	 * audio_codec_enable is still protected by mode_config.mutex, so lock
If an MST device is disconnected while the machine is suspended, the
number of connectors will change as well after we call
intel_dp_mst_resume(). This means that any previous atomic state we had
before suspending is no longer valid, since it'll still be pointing to
missing connectors. We need to check for this before committing the
state, otherwise we'll kernel panic on resume if any MST display was
disconnected before we started resuming:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 [drm_kms_helper]
Call Trace:
[<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915]
[<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0
[<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0
[<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm]
[<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90
[<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm]
[<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915]
[<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90
[<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915]
[<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915]
[<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0
[<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190
[<ffffffff814da455>] device_resume+0xd5/0x1f0
[<ffffffff814da58d>] async_resume+0x1d/0x50
[<ffffffff810b6718>] async_run_entry_fn+0x48/0x150
[<ffffffff810acc19>] process_one_work+0x1e9/0x5c0
[<ffffffff810acb96>] ? process_one_work+0x166/0x5c0
[<ffffffff810ad038>] worker_thread+0x48/0x4e0
[<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0
[<ffffffff810b3794>] kthread+0xe4/0x100
[<ffffffff81742672>] ret_from_fork+0x22/0x50
[<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200

Cc: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/i915/intel_display.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
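
For readers who want to see the guard described in the commit message in isolation, here is a minimal userspace sketch. It is not kernel code: `struct saved_state`, `struct mode_config_model` and `validate_saved_state()` are hypothetical stand-ins invented for illustration, and `free()` plays the role that `drm_atomic_state_free()` plays in the patch. It only models the idea of dropping the saved state when the connector count recorded at suspend no longer matches the count seen on resume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the atomic state saved at suspend time. */
struct saved_state {
	int num_connector;	/* connector count recorded at suspend */
};

/* Hypothetical stand-in for the device's current mode configuration. */
struct mode_config_model {
	int num_connector;	/* connector count observed after resume */
};

/*
 * Return the saved state if its connector count still matches the current
 * configuration. Otherwise free it and return NULL, so the caller skips the
 * restore instead of committing a state that refers to missing connectors.
 */
static struct saved_state *
validate_saved_state(struct saved_state *state,
		     const struct mode_config_model *cfg)
{
	assert(cfg != NULL);

	if (state != NULL && state->num_connector != cfg->num_connector) {
		free(state);	/* models drm_atomic_state_free() in the patch */
		return NULL;
	}

	return state;
}
```

A caller would pass the state captured before suspend and only attempt the restore when the return value is non-NULL. Note that comparing counts is a coarse check: if one connector disappears and another appears across suspend, the counts still match, which is presumably part of why the reply points at connector refcounting as the longer-term fix.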