Message ID | 20190126012436.31382-1-lyude@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915: Don't send MST hotplugs until after resume |
On Fri, Jan 25, 2019 at 08:24:35PM -0500, Lyude Paul wrote:
> Turns out we are sending a lot more hotplug events than we need, and
> this is causing some pretty serious issues. Currently, we call
> intel_dp_mst_resume() in i915_drm_resume() well before we have any sort
> of hotplugging set up.

We call hpd_irq_setup() before calling intel_dp_mst_resume(). The only
purpose of that part (lifted out from intel_hpd_init()) is to provide
the short HPD interrupt functionality MST AUX transfers need.

But you are right in that - as a side-effect - we'll also enable generic
hotplug functionality that is independent of the above MST requirement.
Doing that kind of generic hotplug processing before
intel_display_resume() is probably not a good idea; it can interfere at
least with the mode restore in __intel_display_resume().

> This is a pretty big problem, because in practice it will generally
> result in throwing the power domain refcounts out of whack.
>
> For instance: On my T480s, removing a previously connected topology
> before the system finishes resuming causes
> drm_kms_helper_hotplug_event() to be called before HPD is set up again,
> which causes us to do a connector reprobe, which then causes
> intel_dp_detect() to be called on all DP devices -including- the eDP
> display. From there, intel_dp_detect() is run on the eDP display which
> triggers DPCD transactions. Those DPCD transactions then cause us to
> call edp_panel_vdd_on(), which then causes us to grab an additional
> wakeref to the relevant power wells (PORT_DDI_A_IO on this machine).
> From there, this wakeref is never released which then causes the next
> suspend/resume cycle to entirely fail due to the hardware not being
> powered off correctly.
>
> This sucks really badly, and I don't see any decent way to actually fix
> this in intel_dp_detect() easily. Additionally, I don't even think it'd
> be worth the time now since we're not expecting to handle any kind of
> connector reprobing at the point in which we call intel_dp_mst_resume(),
> but we also can't move intel_dp_mst_resume() any higher in the resume
> process since MST topologies need to be resumed before
> intel_display_resume() is called.
>
> However, there's a light at the end of the tunnel! After reading through
> a lot of code dozens of times, it occurred to me that we -never-
> actually need to send hotplug events when calling
> drm_dp_mst_topology_mgr_set_mst() since we send hotplug events in
> drm_dp_destroy_connector_work(). Imagine that!
>
> So, since we only seem to call intel_dp_check_mst_status() to disable
> MST on the encoder in question and then send a hotplug, get rid of this
> and instead just disable MST mode when a hub fails in
> intel_dp_mst_resume(). From there, drm_dp_destroy_connector_work() will
> eventually send the hotplug event.
>
> Signed-off-by: Lyude Paul <lyude@redhat.com>
> Fixes: 0e32b39ceed6 ("drm/i915: add DP 1.2 MST support (v0.7)")
> Cc: Todd Previte <tprevite@gmail.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: intel-gfx@lists.freedesktop.org
> Cc: <stable@vger.kernel.org> # v3.17+

I don't know the MST code well enough to review that part, but we do
need to prevent generic hotplug processing at this point:

Acked-by: Imre Deak <imre.deak@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_dp.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 681e88405ada..c2399acf177b 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -7096,7 +7096,10 @@ void intel_dp_mst_resume(struct drm_i915_private *dev_priv)
>  			continue;
>
>  		ret = drm_dp_mst_topology_mgr_resume(&intel_dp->mst_mgr);
> -		if (ret)
> -			intel_dp_check_mst_status(intel_dp);
> +		if (ret) {
> +			intel_dp->is_mst = false;
> +			drm_dp_mst_topology_mgr_set_mst(&intel_dp->mst_mgr,
> +							false);
> +		}
>  	}
>  }
> --
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: 0e32b39ceed6 drm/i915: add DP 1.2 MST support (v0.7).

The bot has tested the following trees: v4.20.5, v4.19.18, v4.14.96, v4.9.153, v4.4.172, v3.18.133.

v4.20.5: Build OK!
v4.19.18: Build OK!
v4.14.96: Failed to apply! Possible dependencies:
    1a4313d13b69 ("drm/i915: Rewrite mst suspend/resume in terms of encoders")

v4.9.153: Failed to apply! Possible dependencies:
    1a4313d13b69 ("drm/i915: Rewrite mst suspend/resume in terms of encoders")

v4.4.172: Failed to apply! Possible dependencies:
    1a4313d13b69 ("drm/i915: Rewrite mst suspend/resume in terms of encoders")
    61642ff03523 ("drm/i915: Inspect subunit states on hangcheck")
    ca82580c9cea ("drm/i915: Do not call API requiring struct_mutex where it is not available")
    cbdc12a9fc9d ("drm/i915: make A0 wa's applied to A1")
    e28e404c3e93 ("drm/i915: tidy up a few leftovers")
    e2f80391478a ("drm/i915: Rename local struct intel_engine_cs variables")
    e87a005d90c3 ("drm/i915: add helpers for platform specific revision id range checks")
    ed54c1a1d11c ("drm/i915: abolish separate per-ring default_context pointers")
    ef712bb4b700 ("drm/i915: remove parens around revision ids")
    fac5e23e3c38 ("drm/i915: Mass convert dev->dev_private to to_i915(dev)")
    fffda3f4fb49 ("drm/i915/bxt: add revision id for A1 stepping and use it")

v3.18.133: Failed to apply! Possible dependencies:
    08524a9ffa39 ("drm/i915/skl: Restore pipe B/C interrupts")
    1a4313d13b69 ("drm/i915: Rewrite mst suspend/resume in terms of encoders")
    2363d8c97f87 ("drm/i915: Restore resume irq ordering comment")
    2aeb7d3a4d42 ("drm/i915: s/pm._irqs_disabled/pm.irqs_enabled/")
    2eb5252e2fff ("drm/i915: disable rps irqs earlier during suspend/unload")
    8a8b009d1337 ("drm/i915/skl: Skylake shares the interrupt logic with Broadwell")
    970104fac6ca ("drm/i915: Remove intel_modeset_suspend_hw")
    9c065a7d5b67 ("drm/i915: Extract intel_runtime_pm.c")
    b963291cf9af ("drm/i915: Use dev_priv instead of dev in irq setup functions")
    d2dee86cece9 ("drm/i915: extract intel_init_fbc()")
    fac6adb06a53 ("drm/i915: fix RPS on runtime suspend")

How should we proceed with this patch?

--
Thanks,
Sasha
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 681e88405ada..c2399acf177b 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -7096,7 +7096,10 @@ void intel_dp_mst_resume(struct drm_i915_private *dev_priv)
 			continue;
 
 		ret = drm_dp_mst_topology_mgr_resume(&intel_dp->mst_mgr);
-		if (ret)
-			intel_dp_check_mst_status(intel_dp);
+		if (ret) {
+			intel_dp->is_mst = false;
+			drm_dp_mst_topology_mgr_set_mst(&intel_dp->mst_mgr,
+							false);
+		}
 	}
 }
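To make the behavioural difference in this hunk easier to see outside the kernel tree, here is a minimal userspace sketch. Everything prefixed with `toy_` is a hypothetical stand-in invented for illustration, not an i915 or DRM type or function, and the "old" path only follows the commit message's description of what the removed call did (disable MST, then send a hotplug). It is a model under those assumptions, not the driver's implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are real i915 or DRM types. */
struct toy_mst_mgr {
	bool mst_state;   /* whether MST mode is enabled on the manager */
	bool resume_ok;   /* whether the simulated hub answers on resume */
};

struct toy_dp {
	bool is_mst;
	struct toy_mst_mgr mst_mgr;
	int hotplugs_sent; /* hotplug events emitted directly from resume */
};

/* Simulated topology resume: 0 on success, negative on failure. */
static int toy_mgr_resume(struct toy_mst_mgr *mgr)
{
	return mgr->resume_ok ? 0 : -1;
}

static void toy_mgr_set_mst(struct toy_mst_mgr *mgr, bool state)
{
	mgr->mst_state = state;
}

/* Old failure path, as the commit message describes it: disable MST, then hotplug. */
static void toy_resume_old(struct toy_dp *dp)
{
	if (toy_mgr_resume(&dp->mst_mgr)) {
		dp->is_mst = false;
		toy_mgr_set_mst(&dp->mst_mgr, false);
		dp->hotplugs_sent++;
	}
}

/* Patched failure path: disable MST only; no hotplug is sent from resume. */
static void toy_resume_new(struct toy_dp *dp)
{
	if (toy_mgr_resume(&dp->mst_mgr)) {
		dp->is_mst = false;
		toy_mgr_set_mst(&dp->mst_mgr, false);
	}
}
```

In this model both paths leave the port with MST disabled after a failed resume; the only difference is that the patched path does not emit a hotplug event from the resume code itself.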
Turns out we are sending a lot more hotplug events than we need, and
this is causing some pretty serious issues. Currently, we call
intel_dp_mst_resume() in i915_drm_resume() well before we have any sort
of hotplugging set up. This is a pretty big problem, because in practice
it will generally result in throwing the power domain refcounts out of
whack.

For instance: On my T480s, removing a previously connected topology
before the system finishes resuming causes
drm_kms_helper_hotplug_event() to be called before HPD is set up again,
which causes us to do a connector reprobe, which then causes
intel_dp_detect() to be called on all DP devices -including- the eDP
display. From there, intel_dp_detect() is run on the eDP display which
triggers DPCD transactions. Those DPCD transactions then cause us to
call edp_panel_vdd_on(), which then causes us to grab an additional
wakeref to the relevant power wells (PORT_DDI_A_IO on this machine).
From there, this wakeref is never released which then causes the next
suspend/resume cycle to entirely fail due to the hardware not being
powered off correctly.

This sucks really badly, and I don't see any decent way to actually fix
this in intel_dp_detect() easily. Additionally, I don't even think it'd
be worth the time now since we're not expecting to handle any kind of
connector reprobing at the point in which we call intel_dp_mst_resume(),
but we also can't move intel_dp_mst_resume() any higher in the resume
process since MST topologies need to be resumed before
intel_display_resume() is called.

However, there's a light at the end of the tunnel! After reading through
a lot of code dozens of times, it occurred to me that we -never-
actually need to send hotplug events when calling
drm_dp_mst_topology_mgr_set_mst() since we send hotplug events in
drm_dp_destroy_connector_work(). Imagine that!

So, since we only seem to call intel_dp_check_mst_status() to disable
MST on the encoder in question and then send a hotplug, get rid of this
and instead just disable MST mode when a hub fails in
intel_dp_mst_resume(). From there, drm_dp_destroy_connector_work() will
eventually send the hotplug event.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 0e32b39ceed6 ("drm/i915: add DP 1.2 MST support (v0.7)")
Cc: Todd Previte <tprevite@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v3.17+
---
 drivers/gpu/drm/i915/intel_dp.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
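The refcount imbalance the commit message describes can be illustrated with a toy counter. The sketch below is only a model: `toy_power_domain` and the `toy_*` helpers are hypothetical names made up for this example, and the "never released" reference is simply asserted by the model because the message states it, without modelling why the driver fails to drop it.

```c
#include <stdbool.h>

/* Hypothetical model of one power-domain refcount; not the i915 implementation. */
struct toy_power_domain {
	int wakerefs;
};

static void toy_get(struct toy_power_domain *pd)
{
	pd->wakerefs++;
}

static void toy_put(struct toy_power_domain *pd)
{
	pd->wakerefs--;
}

/* A balanced access: the reference is dropped once the transfer is done. */
static void toy_balanced_access(struct toy_power_domain *pd)
{
	toy_get(pd);
	/* ... a DPCD transfer would happen here ... */
	toy_put(pd);
}

/* The early-resume reprobe from the commit message: a reference is taken and never dropped. */
static void toy_early_reprobe(struct toy_power_domain *pd)
{
	toy_get(pd);
}

/* In this model, suspend can only power the domain down when no reference is held. */
static bool toy_can_power_off(const struct toy_power_domain *pd)
{
	return pd->wakerefs == 0;
}
```

After any number of balanced accesses the domain can still be powered off, while a single unbalanced get from an early reprobe leaves the count at one, which is the situation the message reports as breaking the next suspend/resume cycle.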