Message ID | 1462570796-22500-1-git-send-email-cpaul@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, May 06, 2016 at 05:39:56PM -0400, Lyude wrote: > If an MST device is disconnected while the machine is suspended, the > number of connectors will change as well after we call > intel_dp_mst_resume(). This means that any previous atomic state we had > before suspending is no longer valid, since it'll still be pointing to > missing connectors. We need to check for this before committing the > state, otherwise we'll kernel panic on resume whenever if any MST > display was disconnected before we started resuming: > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 > IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 [drm_kms_helper] > Call Trace: > [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915] > [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0 > [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0 > [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm] > [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90 > [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm] > [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915] > [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90 > [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915] > [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915] > [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0 > [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190 > [<ffffffff814da455>] device_resume+0xd5/0x1f0 > [<ffffffff814da58d>] async_resume+0x1d/0x50 > [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150 > [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0 > [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0 > [<ffffffff810ad038>] worker_thread+0x48/0x4e0 > [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0 > [<ffffffff810b3794>] kthread+0xe4/0x100 > [<ffffffff81742672>] ret_from_fork+0x22/0x50 > [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200 > > Changes since v1: > - Move drm_atomic_state_free() call down so we're holding the > appropriate locks when destroying the atomic state > > Cc: stable@vger.kernel.org > Signed-off-by: Lyude <cpaul@redhat.com> Is this still an issue on -nightly with the connector refcounting fixed? -Daniel > --- > drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > index 0f29ef6..d68efc7 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -15934,6 +15934,18 @@ void intel_display_resume(struct drm_device *dev) > retry: > ret = drm_modeset_lock_all_ctx(dev, &ctx); > > + /* > + * With MST, the number of connectors can change between suspend and > + * resume, which means that the state we want to restore might now be > + * impossible to use since it'll be pointing to non-existant > + * connectors. > + */ > + if (ret == 0 && > + state->num_connector != dev->mode_config.num_connector) { > + drm_atomic_state_free(state); > + state = NULL; > + } > + > if (ret == 0 && !setup) { > setup = true; > > -- > 2.5.5 > > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
Yep, Dave's patches fix the issue on their own so this is only going to be needed for 4.6. On Mon, 2016-05-09 at 08:42 +0200, Daniel Vetter wrote: > On Fri, May 06, 2016 at 05:39:56PM -0400, Lyude wrote: > > > > If an MST device is disconnected while the machine is suspended, the > > number of connectors will change as well after we call > > intel_dp_mst_resume(). This means that any previous atomic state we had > > before suspending is no longer valid, since it'll still be pointing to > > missing connectors. We need to check for this before committing the > > state, otherwise we'll kernel panic on resume whenever if any MST > > display was disconnected before we started resuming: > > > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 > > IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 > > [drm_kms_helper] > > Call Trace: > > [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915] > > [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0 > > [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0 > > [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm] > > [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90 > > [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm] > > [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915] > > [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90 > > [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915] > > [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915] > > [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0 > > [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190 > > [<ffffffff814da455>] device_resume+0xd5/0x1f0 > > [<ffffffff814da58d>] async_resume+0x1d/0x50 > > [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150 > > [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0 > > [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0 > > [<ffffffff810ad038>] worker_thread+0x48/0x4e0 > > [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0 > > [<ffffffff810b3794>] kthread+0xe4/0x100 > > [<ffffffff81742672>] ret_from_fork+0x22/0x50 > > [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200 > > > > Changes since v1: > > - Move drm_atomic_state_free() call down so we're holding the > > appropriate locks when destroying the atomic state > > > > Cc: stable@vger.kernel.org > > Signed-off-by: Lyude <cpaul@redhat.com> > Is this still an issue on -nightly with the connector refcounting fixed? > -Daniel > > > > > --- > > drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/intel_display.c > > b/drivers/gpu/drm/i915/intel_display.c > > index 0f29ef6..d68efc7 100644 > > --- a/drivers/gpu/drm/i915/intel_display.c > > +++ b/drivers/gpu/drm/i915/intel_display.c > > @@ -15934,6 +15934,18 @@ void intel_display_resume(struct drm_device *dev) > > retry: > > ret = drm_modeset_lock_all_ctx(dev, &ctx); > > > > + /* > > + * With MST, the number of connectors can change between suspend > > and > > + * resume, which means that the state we want to restore might now > > be > > + * impossible to use since it'll be pointing to non-existant > > + * connectors. > > + */ > > + if (ret == 0 && > > + state->num_connector != dev->mode_config.num_connector) { > > + drm_atomic_state_free(state); > > + state = NULL; > > + } > > + > > if (ret == 0 && !setup) { > > setup = true; > > > > -- > > 2.5.5 > > > > _______________________________________________ > > dri-devel mailing list > > dri-devel@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Mon, May 09, 2016 at 10:07:08AM -0400, Lyude Paul wrote: > Yep, Dave's patches fix the issue on their own so this is only going to be > needed for 4.6. Ok, so not the first one that only needs to be applied to stable kernels. For that to work you need to explain that this is the most minimal fix for stabel kernels, and add a reference to the final commit in upstream that fixes the issue for real. Otherwise Greg KH won't take it. It still needs my ack. Probably best to have an entire dp mst series for all of those, but maybe ping me first on irc so I can check them quickly before you hit send. -Daniel > > On Mon, 2016-05-09 at 08:42 +0200, Daniel Vetter wrote: > > On Fri, May 06, 2016 at 05:39:56PM -0400, Lyude wrote: > > > > > > If an MST device is disconnected while the machine is suspended, the > > > number of connectors will change as well after we call > > > intel_dp_mst_resume(). This means that any previous atomic state we had > > > before suspending is no longer valid, since it'll still be pointing to > > > missing connectors. We need to check for this before committing the > > > state, otherwise we'll kernel panic on resume whenever if any MST > > > display was disconnected before we started resuming: > > > > > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 > > > IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 > > > [drm_kms_helper] > > > Call Trace: > > > [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915] > > > [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0 > > > [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0 > > > [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm] > > > [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90 > > > [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm] > > > [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915] > > > [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90 > > > [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915] > > > [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915] > > > [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0 > > > [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190 > > > [<ffffffff814da455>] device_resume+0xd5/0x1f0 > > > [<ffffffff814da58d>] async_resume+0x1d/0x50 > > > [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150 > > > [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0 > > > [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0 > > > [<ffffffff810ad038>] worker_thread+0x48/0x4e0 > > > [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0 > > > [<ffffffff810b3794>] kthread+0xe4/0x100 > > > [<ffffffff81742672>] ret_from_fork+0x22/0x50 > > > [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200 > > > > > > Changes since v1: > > > - Move drm_atomic_state_free() call down so we're holding the > > > appropriate locks when destroying the atomic state > > > > > > Cc: stable@vger.kernel.org > > > Signed-off-by: Lyude <cpaul@redhat.com> > > Is this still an issue on -nightly with the connector refcounting fixed? > > -Daniel > > > > > > > > --- > > > drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++ > > > 1 file changed, 12 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/i915/intel_display.c > > > b/drivers/gpu/drm/i915/intel_display.c > > > index 0f29ef6..d68efc7 100644 > > > --- a/drivers/gpu/drm/i915/intel_display.c > > > +++ b/drivers/gpu/drm/i915/intel_display.c > > > @@ -15934,6 +15934,18 @@ void intel_display_resume(struct drm_device *dev) > > > retry: > > > ret = drm_modeset_lock_all_ctx(dev, &ctx); > > > > > > + /* > > > + * With MST, the number of connectors can change between suspend > > > and > > > + * resume, which means that the state we want to restore might now > > > be > > > + * impossible to use since it'll be pointing to non-existant > > > + * connectors. > > > + */ > > > + if (ret == 0 && > > > + state->num_connector != dev->mode_config.num_connector) { > > > + drm_atomic_state_free(state); > > > + state = NULL; > > > + } > > > + > > > if (ret == 0 && !setup) { > > > setup = true; > > > > > > -- > > > 2.5.5 > > > > > > _______________________________________________ > > > dri-devel mailing list > > > dri-devel@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > -- > Cheers, > Lyude >
no problem, I'll let you know when they're ready On Mon, 2016-05-09 at 16:53 +0200, Daniel Vetter wrote: > On Mon, May 09, 2016 at 10:07:08AM -0400, Lyude Paul wrote: > > > > Yep, Dave's patches fix the issue on their own so this is only going to be > > needed for 4.6. > Ok, so not the first one that only needs to be applied to stable kernels. > For that to work you need to explain that this is the most minimal fix for > stabel kernels, and add a reference to the final commit in upstream that > fixes the issue for real. Otherwise Greg KH won't take it. It still needs > my ack. > > Probably best to have an entire dp mst series for all of those, but maybe > ping me first on irc so I can check them quickly before you hit send. > -Daniel > > > > > > > On Mon, 2016-05-09 at 08:42 +0200, Daniel Vetter wrote: > > > > > > On Fri, May 06, 2016 at 05:39:56PM -0400, Lyude wrote: > > > > > > > > > > > > If an MST device is disconnected while the machine is suspended, the > > > > number of connectors will change as well after we call > > > > intel_dp_mst_resume(). This means that any previous atomic state we had > > > > before suspending is no longer valid, since it'll still be pointing to > > > > missing connectors. We need to check for this before committing the > > > > state, otherwise we'll kernel panic on resume whenever if any MST > > > > display was disconnected before we started resuming: > > > > > > > > BUG: unable to handle kernel NULL pointer dereference at > > > > 0000000000000008 > > > > IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 > > > > [drm_kms_helper] > > > > Call Trace: > > > > [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915] > > > > [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0 > > > > [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0 > > > > [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm] > > > > [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90 > > > > [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm] > > > > [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915] > > > > [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90 > > > > [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915] > > > > [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915] > > > > [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0 > > > > [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190 > > > > [<ffffffff814da455>] device_resume+0xd5/0x1f0 > > > > [<ffffffff814da58d>] async_resume+0x1d/0x50 > > > > [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150 > > > > [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0 > > > > [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0 > > > > [<ffffffff810ad038>] worker_thread+0x48/0x4e0 > > > > [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0 > > > > [<ffffffff810b3794>] kthread+0xe4/0x100 > > > > [<ffffffff81742672>] ret_from_fork+0x22/0x50 > > > > [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200 > > > > > > > > Changes since v1: > > > > - Move drm_atomic_state_free() call down so we're holding the > > > > appropriate locks when destroying the atomic state > > > > > > > > Cc: stable@vger.kernel.org > > > > Signed-off-by: Lyude <cpaul@redhat.com> > > > Is this still an issue on -nightly with the connector refcounting fixed? > > > -Daniel > > > > > > > > > > > > > > > --- > > > > drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++ > > > > 1 file changed, 12 insertions(+) > > > > > > > > diff --git a/drivers/gpu/drm/i915/intel_display.c > > > > b/drivers/gpu/drm/i915/intel_display.c > > > > index 0f29ef6..d68efc7 100644 > > > > --- a/drivers/gpu/drm/i915/intel_display.c > > > > +++ b/drivers/gpu/drm/i915/intel_display.c > > > > @@ -15934,6 +15934,18 @@ void intel_display_resume(struct drm_device > > > > *dev) > > > > retry: > > > > ret = drm_modeset_lock_all_ctx(dev, &ctx); > > > > > > > > + /* > > > > + * With MST, the number of connectors can change between > > > > suspend > > > > and > > > > + * resume, which means that the state we want to restore might > > > > now > > > > be > > > > + * impossible to use since it'll be pointing to non-existant > > > > + * connectors. > > > > + */ > > > > + if (ret == 0 && > > > > + state->num_connector != dev->mode_config.num_connector) { > > > > + drm_atomic_state_free(state); > > > > + state = NULL; > > > > + } > > > > + > > > > if (ret == 0 && !setup) { > > > > setup = true; > > > > > > > > -- > > > > 2.5.5 > > > > > > > > _______________________________________________ > > > > dri-devel mailing list > > > > dri-devel@lists.freedesktop.org > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > > -- > > Cheers, > > Lyude > >
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 0f29ef6..d68efc7 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -15934,6 +15934,18 @@ void intel_display_resume(struct drm_device *dev) retry: ret = drm_modeset_lock_all_ctx(dev, &ctx); + /* + * With MST, the number of connectors can change between suspend and + * resume, which means that the state we want to restore might now be + * impossible to use since it'll be pointing to non-existant + * connectors. + */ + if (ret == 0 && + state->num_connector != dev->mode_config.num_connector) { + drm_atomic_state_free(state); + state = NULL; + } + if (ret == 0 && !setup) { setup = true;
If an MST device is disconnected while the machine is suspended, the number of connectors will change as well after we call intel_dp_mst_resume(). This means that any previous atomic state we had before suspending is no longer valid, since it'll still be pointing to missing connectors. We need to check for this before committing the state, otherwise we'll kernel panic on resume whenever if any MST display was disconnected before we started resuming: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 [drm_kms_helper] Call Trace: [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915] [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0 [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0 [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm] [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90 [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm] [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915] [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90 [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915] [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915] [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0 [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190 [<ffffffff814da455>] device_resume+0xd5/0x1f0 [<ffffffff814da58d>] async_resume+0x1d/0x50 [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150 [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0 [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0 [<ffffffff810ad038>] worker_thread+0x48/0x4e0 [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0 [<ffffffff810b3794>] kthread+0xe4/0x100 [<ffffffff81742672>] ret_from_fork+0x22/0x50 [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200 Changes since v1: - Move drm_atomic_state_free() call down so we're holding the appropriate locks when destroying the atomic state Cc: stable@vger.kernel.org Signed-off-by: Lyude <cpaul@redhat.com> --- drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)