[00/10] drm/i915: Add support for asynchronous display power disabling

Message ID 20190502232648.4450-1-imre.deak@intel.com (mailing list archive)

Message

Imre Deak May 2, 2019, 11:26 p.m. UTC
This is a preparation for making hotplug usable on ICL TypeC ports. On
ICL we need stricter control over when either kind of AUX power domain
(TBT-alt or DP-alt) is enabled. That control becomes infeasible if the
reference can be held for arbitrary periods due to locking
dependencies. OTOH it makes sense to restrict holding the reference to
the duration when it's actually needed. One result of that would be
unnecessary on-off-on power toggling when the reference is dropped
and reacquired quickly.

This patchset adds support for dropping display power domain references
asynchronously with a delay to avoid the unnecessary power toggling, and
restricts holding the AUX power domain reference to the sequence where
it's required during detection and HPD pulse handling.
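
To illustrate the idea only, here is a minimal userspace sketch of a
delayed reference drop. It is not the code from this series: the
struct, the function names and the explicit "now"/"delay" time
parameters are all hypothetical, and a real driver would use delayed
work and locking instead of a manually advanced clock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a single power domain; not the i915 structures. */
struct power_domain {
	int refcount;      /* references held, including parked ones */
	int pending_puts;  /* references released asynchronously, not yet dropped */
	long deadline;     /* time at which the parked references are dropped */
	bool powered;      /* simulated hardware state */
	int toggles;       /* number of on/off transitions so far */
};

/* Take a reference; power the domain on if it was off. */
static void domain_get(struct power_domain *d)
{
	if (d->refcount++ == 0 && !d->powered) {
		d->powered = true;
		d->toggles++;
	}
}

/* Drop 'count' references; power the domain off when none remain. */
static void domain_drop(struct power_domain *d, int count)
{
	assert(d->refcount >= count);
	d->refcount -= count;
	if (d->refcount == 0 && d->powered) {
		d->powered = false;
		d->toggles++;
	}
}

/* Synchronous put: the domain may power off immediately. */
static void domain_put(struct power_domain *d)
{
	domain_drop(d, 1);
}

/* Asynchronous put: park the reference and (re)arm the delay. */
static void domain_put_async(struct power_domain *d, long now, long delay)
{
	assert(d->refcount > d->pending_puts);
	d->pending_puts++;
	d->deadline = now + delay;
}

/* Stand-in for the delayed work: drop parked references once due. */
static void domain_run_delayed(struct power_domain *d, long now)
{
	if (d->pending_puts > 0 && now >= d->deadline) {
		int count = d->pending_puts;

		d->pending_puts = 0;
		domain_drop(d, count);
	}
}
```

In this model a get, async put, get, async put sequence that stays
within the delay window powers the domain on once and off once, while
the same sequence using the synchronous domain_put() toggles it four
times.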

Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>

Imre Deak (10):
  drm/i915: Add support for tracking wakerefs w/o power-on guarantee
  drm/i915: Verify power domains state during suspend in all cases
  drm/i915: Add support for asynchronous display power disabling
  drm/i915: Disable power asynchronously during DP AUX transfers
  drm/i915: WARN for eDP encoders in intel_dp_detect_dpcd()
  drm/i915: Remove the unneeded AUX power ref from intel_dp_detect()
  drm/i915: Remove the unneeded AUX power ref from intel_dp_hpd_pulse()
  drm/i915: Replace use of PLLS power domain with DISPLAY_CORE domain
  drm/i915: Avoid taking the PPS lock for non-eDP/VLV/CHV
  drm/i915: Assert that TypeC ports are not used for eDP

 drivers/gpu/drm/i915/i915_drv.h         |   6 +
 drivers/gpu/drm/i915/intel_display.c    |   2 +-
 drivers/gpu/drm/i915/intel_display.h    |   2 +-
 drivers/gpu/drm/i915/intel_dp.c         |  76 ++--
 drivers/gpu/drm/i915/intel_dpll_mgr.c   |  36 +-
 drivers/gpu/drm/i915/intel_psr.c        |   6 +
 drivers/gpu/drm/i915/intel_runtime_pm.c | 443 ++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_runtime_pm.h |   4 +
 8 files changed, 491 insertions(+), 84 deletions(-)

Comments

Imre Deak May 3, 2019, 10:07 a.m. UTC | #1
On Fri, May 03, 2019 at 07:50:26AM +0000, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Add support for asynchronous display power disabling
> URL   : https://patchwork.freedesktop.org/series/60242/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_6032_full -> Patchwork_12955_full
> ====================================================
> 
> Summary
> -------
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with Patchwork_12955_full absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_12955_full, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   
> 
> Possible new issues
> -------------------
> 
>   Here are the unknown changes that may have been introduced in Patchwork_12955_full:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>   * igt@gem_persistent_relocs@forked-interruptible-thrashing:
>     - shard-glk:          [PASS][1] -> [TIMEOUT][2]
>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-glk6/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
>    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-glk7/igt@gem_persistent_relocs@forked-interruptible-thrashing.html

Looks like an unrelated issue: on this GLK there are two HDMI displays
connected, so the change shouldn't make any difference on it. The change
only affects the DP detect and hotplug paths, where we'll now do an
async power domain put.

The machine is still up when the problem happens; the test seems to get
stuck and is aborted by the test runner (after ~6mins according to [1]).

[43/82] (762s left) gem_persistent_relocs (forked-interruptible-thrashing)
Starting subtest: forked-interruptible-thrashing
Timeout. Killing the current test with SIGQUIT.
Timeout. Killing the current test with SIGKILL.
Build timed out (after 20 minutes)

Err:	
Starting subtest: forked-interruptible-thrashing
Received signal SIGQUIT.
Stack trace: 
 #0 [fatal_sig_handler+0xd5]
 #1 [killpg+0x40]
 #2 [waitpid+0x12]
 #3 [__waitpid+0x38]
 #4 [igt_wait_helper+0x44]
 #5 [igt_stop_helper+0x52]
 #6 [do_forked_test+0x13e]
 #7 [__real_main317+0x1b4]
 #8 [main+0x44]
 #9 [__libc_start_main+0xe7]
 #10 [_start+0x2a]

[1] https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-glk7/runtimes25.log

> 
>   
> Known issues
> ------------
> 
>   Here are the changes found in Patchwork_12955_full that come from known issues:
> 
> ### IGT changes ###
> 
> #### Issues hit ####
> 
>   * igt@i915_pm_rpm@drm-resources-equal:
>     - shard-skl:          [PASS][3] -> [INCOMPLETE][4] ([fdo#107807] / [fdo#110581])
>    [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-skl10/igt@i915_pm_rpm@drm-resources-equal.html
>    [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-skl8/igt@i915_pm_rpm@drm-resources-equal.html
> 
>   * igt@kms_atomic_transition@plane-all-modeset-transition-fencing-internal-panels:
>     - shard-iclb:         [PASS][5] -> [DMESG-WARN][6] ([fdo#107724])
>    [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-iclb1/igt@kms_atomic_transition@plane-all-modeset-transition-fencing-internal-panels.html
>    [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-iclb2/igt@kms_atomic_transition@plane-all-modeset-transition-fencing-internal-panels.html
> 
>   * igt@kms_draw_crc@draw-method-xrgb8888-render-xtiled:
>     - shard-skl:          [PASS][7] -> [FAIL][8] ([fdo#103184] / [fdo#103232])
>    [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-skl2/igt@kms_draw_crc@draw-method-xrgb8888-render-xtiled.html
>    [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-skl9/igt@kms_draw_crc@draw-method-xrgb8888-render-xtiled.html
> 
>   * igt@kms_frontbuffer_tracking@fbc-rgb101010-draw-mmap-cpu:
>     - shard-skl:          [PASS][9] -> [FAIL][10] ([fdo#103167] / [fdo#110379])
>    [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-skl2/igt@kms_frontbuffer_tracking@fbc-rgb101010-draw-mmap-cpu.html
>    [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-skl9/igt@kms_frontbuffer_tracking@fbc-rgb101010-draw-mmap-cpu.html
> 
>   * igt@kms_frontbuffer_tracking@fbc-tilingchange:
>     - shard-iclb:         [PASS][11] -> [FAIL][12] ([fdo#103167]) +4 similar issues
>    [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-iclb8/igt@kms_frontbuffer_tracking@fbc-tilingchange.html
>    [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-iclb6/igt@kms_frontbuffer_tracking@fbc-tilingchange.html
> 
>   * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-gtt:
>     - shard-skl:          [PASS][13] -> [FAIL][14] ([fdo#108040])
>    [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-skl2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-gtt.html
>    [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-skl9/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-gtt.html
> 
>   * igt@kms_plane@pixel-format-pipe-a-planes-source-clamping:
>     - shard-glk:          [PASS][15] -> [SKIP][16] ([fdo#109271])
>    [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-glk9/igt@kms_plane@pixel-format-pipe-a-planes-source-clamping.html
>    [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-glk4/igt@kms_plane@pixel-format-pipe-a-planes-source-clamping.html
> 
>   * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-c-planes:
>     - shard-apl:          [PASS][17] -> [DMESG-WARN][18] ([fdo#108566]) +3 similar issues
>    [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-apl3/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-c-planes.html
>    [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-apl3/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-c-planes.html
> 
>   * igt@kms_plane@plane-position-hole-dpms-pipe-b-planes:
>     - shard-snb:          [PASS][19] -> [SKIP][20] ([fdo#109271]) +1 similar issue
>    [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-snb2/igt@kms_plane@plane-position-hole-dpms-pipe-b-planes.html
>    [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-snb2/igt@kms_plane@plane-position-hole-dpms-pipe-b-planes.html
> 
>   * igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min:
>     - shard-skl:          [PASS][21] -> [FAIL][22] ([fdo#108145])
>    [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-skl1/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html
>    [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-skl9/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html
> 
>   * igt@kms_psr@psr2_primary_mmap_cpu:
>     - shard-iclb:         [PASS][23] -> [SKIP][24] ([fdo#109441]) +2 similar issues
>    [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-iclb2/igt@kms_psr@psr2_primary_mmap_cpu.html
>    [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-iclb3/igt@kms_psr@psr2_primary_mmap_cpu.html
> 
>   
> #### Possible fixes ####
> 
>   * igt@gem_workarounds@suspend-resume:
>     - shard-apl:          [DMESG-WARN][25] ([fdo#108566]) -> [PASS][26] +6 similar issues
>    [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-apl7/igt@gem_workarounds@suspend-resume.html
>    [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-apl4/igt@gem_workarounds@suspend-resume.html
> 
>   * igt@i915_pm_rpm@basic-pci-d3-state:
>     - shard-iclb:         [INCOMPLETE][27] ([fdo#107713] / [fdo#108840] / [fdo#110581]) -> [PASS][28]
>    [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-iclb4/igt@i915_pm_rpm@basic-pci-d3-state.html
>    [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-iclb7/igt@i915_pm_rpm@basic-pci-d3-state.html
> 
>   * igt@i915_pm_rpm@i2c:
>     - shard-iclb:         [DMESG-WARN][29] ([fdo#109982]) -> [PASS][30]
>    [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-iclb2/igt@i915_pm_rpm@i2c.html
>    [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-iclb3/igt@i915_pm_rpm@i2c.html
> 
>   * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-mmap-gtt:
>     - shard-iclb:         [FAIL][31] ([fdo#103167]) -> [PASS][32] +7 similar issues
>    [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-mmap-gtt.html
>    [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-mmap-gtt.html
> 
>   * igt@kms_plane@pixel-format-pipe-b-planes-source-clamping:
>     - shard-glk:          [SKIP][33] ([fdo#109271]) -> [PASS][34]
>    [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-glk4/igt@kms_plane@pixel-format-pipe-b-planes-source-clamping.html
>    [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-glk9/igt@kms_plane@pixel-format-pipe-b-planes-source-clamping.html
> 
>   * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
>     - shard-skl:          [FAIL][35] ([fdo#108145] / [fdo#110403]) -> [PASS][36]
>    [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-skl3/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
>    [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-skl1/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
> 
>   * igt@kms_plane_scaling@pipe-a-scaler-with-pixel-format:
>     - shard-glk:          [SKIP][37] ([fdo#109271] / [fdo#109278]) -> [PASS][38] +1 similar issue
>    [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-glk3/igt@kms_plane_scaling@pipe-a-scaler-with-pixel-format.html
>    [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-glk9/igt@kms_plane_scaling@pipe-a-scaler-with-pixel-format.html
> 
>   * igt@kms_psr@psr2_cursor_mmap_cpu:
>     - shard-iclb:         [SKIP][39] ([fdo#109441]) -> [PASS][40]
>    [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-iclb1/igt@kms_psr@psr2_cursor_mmap_cpu.html
>    [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-iclb2/igt@kms_psr@psr2_cursor_mmap_cpu.html
> 
>   * igt@kms_setmode@basic:
>     - shard-apl:          [FAIL][41] ([fdo#99912]) -> [PASS][42]
>    [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-apl8/igt@kms_setmode@basic.html
>    [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-apl1/igt@kms_setmode@basic.html
>     - shard-kbl:          [FAIL][43] ([fdo#99912]) -> [PASS][44]
>    [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-kbl1/igt@kms_setmode@basic.html
>    [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-kbl6/igt@kms_setmode@basic.html
> 
>   * igt@kms_sysfs_edid_timing:
>     - shard-iclb:         [FAIL][45] ([fdo#100047]) -> [PASS][46]
>    [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-iclb3/igt@kms_sysfs_edid_timing.html
>    [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-iclb5/igt@kms_sysfs_edid_timing.html
> 
>   * igt@tools_test@tools_test:
>     - shard-iclb:         [SKIP][47] ([fdo#109352]) -> [PASS][48]
>    [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-iclb6/igt@tools_test@tools_test.html
>    [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-iclb1/igt@tools_test@tools_test.html
> 
>   
> #### Warnings ####
> 
>   * igt@kms_atomic_transition@3x-modeset-transitions:
>     - shard-snb:          [SKIP][49] ([fdo#109271] / [fdo#109278]) -> [SKIP][50] ([fdo#109271])
>    [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-snb2/igt@kms_atomic_transition@3x-modeset-transitions.html
>    [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-snb2/igt@kms_atomic_transition@3x-modeset-transitions.html
> 
>   
>   [fdo#100047]: https://bugs.freedesktop.org/show_bug.cgi?id=100047
>   [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
>   [fdo#103184]: https://bugs.freedesktop.org/show_bug.cgi?id=103184
>   [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
>   [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
>   [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
>   [fdo#107807]: https://bugs.freedesktop.org/show_bug.cgi?id=107807
>   [fdo#108040]: https://bugs.freedesktop.org/show_bug.cgi?id=108040
>   [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
>   [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
>   [fdo#108840]: https://bugs.freedesktop.org/show_bug.cgi?id=108840
>   [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
>   [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278
>   [fdo#109352]: https://bugs.freedesktop.org/show_bug.cgi?id=109352
>   [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
>   [fdo#109982]: https://bugs.freedesktop.org/show_bug.cgi?id=109982
>   [fdo#110379]: https://bugs.freedesktop.org/show_bug.cgi?id=110379
>   [fdo#110403]: https://bugs.freedesktop.org/show_bug.cgi?id=110403
>   [fdo#110581]: https://bugs.freedesktop.org/show_bug.cgi?id=110581
>   [fdo#99912]: https://bugs.freedesktop.org/show_bug.cgi?id=99912
> 
> 
> Participating hosts (10 -> 10)
> ------------------------------
> 
>   No changes in participating hosts
> 
> 
> Build changes
> -------------
> 
>   * Linux: CI_DRM_6032 -> Patchwork_12955
> 
>   CI_DRM_6032: 6ad93073ac75c314e859cfe1020b569d0c63ccf5 @ git://anongit.freedesktop.org/gfx-ci/linux
>   IGT_4972: f052e49a43cc9704ea5f240df15dd9d3dfed68ab @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
>   Patchwork_12955: ba990253d5d8234db983bd54d29ed627c9a613c8 @ git://anongit.freedesktop.org/gfx-ci/linux
>   piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit
> 
> == Logs ==
> 
> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/
Chris Wilson May 3, 2019, 12:37 p.m. UTC | #2
Quoting Imre Deak (2019-05-03 11:07:55)
> On Fri, May 03, 2019 at 07:50:26AM +0000, Patchwork wrote:
> > == Series Details ==
> > 
> > Series: drm/i915: Add support for asynchronous display power disabling
> > URL   : https://patchwork.freedesktop.org/series/60242/
> > State : failure
> > 
> > == Summary ==
> > 
> > CI Bug Log - changes from CI_DRM_6032_full -> Patchwork_12955_full
> > ====================================================
> > 
> > Summary
> > -------
> > 
> >   **FAILURE**
> > 
> >   Serious unknown changes coming with Patchwork_12955_full absolutely need to be
> >   verified manually.
> >   
> >   If you think the reported changes have nothing to do with the changes
> >   introduced in Patchwork_12955_full, please notify your bug team to allow them
> >   to document this new failure mode, which will reduce false positives in CI.
> > 
> >   
> > 
> > Possible new issues
> > -------------------
> > 
> >   Here are the unknown changes that may have been introduced in Patchwork_12955_full:
> > 
> > ### IGT changes ###
> > 
> > #### Possible regressions ####
> > 
> >   * igt@gem_persistent_relocs@forked-interruptible-thrashing:
> >     - shard-glk:          [PASS][1] -> [TIMEOUT][2]
> >    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-glk6/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
> >    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-glk7/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
> 
> Looks like an unrelated issue: on this GLK there are two HDMI displays
> connected, so the change shouldn't make any difference on it. The change
> only affects the DP detect and hotplug paths, where we'll do now an
> async power domain put.

There's no history of glk locking up there, 

> The machine is still up when the problem happens, the test seems to get
> stuck and aborted by the test runner (after ~6mins according to [1]).
> 
> [43/82] (762s left) gem_persistent_relocs (forked-interruptible-thrashing)
> Starting subtest: forked-interruptible-thrashing
> Timeout. Killing the current test with SIGQUIT.
> Timeout. Killing the current test with SIGKILL.

and yet it locked up sufficiently to not respond to a signal, suggesting
an oops (the test takes 3s normally on glk).
-Chris
Imre Deak May 3, 2019, 1:52 p.m. UTC | #3
On Fri, May 03, 2019 at 01:37:53PM +0100, Chris Wilson wrote:
> Quoting Imre Deak (2019-05-03 11:07:55)
> > On Fri, May 03, 2019 at 07:50:26AM +0000, Patchwork wrote:
> > > == Series Details ==
> > > 
> > > Series: drm/i915: Add support for asynchronous display power disabling
> > > URL   : https://patchwork.freedesktop.org/series/60242/
> > > State : failure
> > > 
> > > == Summary ==
> > > 
> > > CI Bug Log - changes from CI_DRM_6032_full -> Patchwork_12955_full
> > > ====================================================
> > > 
> > > Summary
> > > -------
> > > 
> > >   **FAILURE**
> > > 
> > >   Serious unknown changes coming with Patchwork_12955_full absolutely need to be
> > >   verified manually.
> > >   
> > >   If you think the reported changes have nothing to do with the changes
> > >   introduced in Patchwork_12955_full, please notify your bug team to allow them
> > >   to document this new failure mode, which will reduce false positives in CI.
> > > 
> > >   
> > > 
> > > Possible new issues
> > > -------------------
> > > 
> > >   Here are the unknown changes that may have been introduced in Patchwork_12955_full:
> > > 
> > > ### IGT changes ###
> > > 
> > > #### Possible regressions ####
> > > 
> > >   * igt@gem_persistent_relocs@forked-interruptible-thrashing:
> > >     - shard-glk:          [PASS][1] -> [TIMEOUT][2]
> > >    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-glk6/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
> > >    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-glk7/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
> > 
> > Looks like an unrelated issue: on this GLK there are two HDMI displays
> > connected, so the change shouldn't make any difference on it. The change
> > only affects the DP detect and hotplug paths, where we'll do now an
> > async power domain put.
> 
> There's no history of glk locking up there, 
> 
> > The machine is still up when the problem happens, the test seems to get
> > stuck and aborted by the test runner (after ~6mins according to [1]).
> > 
> > [43/82] (762s left) gem_persistent_relocs (forked-interruptible-thrashing)
> > Starting subtest: forked-interruptible-thrashing
> > Timeout. Killing the current test with SIGQUIT.
> > Timeout. Killing the current test with SIGKILL.
> 
> and yet it locked up sufficiently to not respond to a signal, suggesting
> an oops (the test takes 3s normally on glk).

No pstore logs either. I also noticed that the run [1] above resulted in
an incomplete, if that's indicative of anything. The same goes for the
previous Patchwork_12954 run.

> -Chris
Imre Deak May 3, 2019, 2:21 p.m. UTC | #4
On Fri, May 03, 2019 at 04:52:58PM +0300, Imre Deak wrote:
> > > > [...]
> > > >   * igt@gem_persistent_relocs@forked-interruptible-thrashing:
> > > >     - shard-glk:          [PASS][1] -> [TIMEOUT][2]
> > > >    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-glk6/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
> > > >    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-glk7/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
> > > 
> > > Looks like an unrelated issue: on this GLK there are two HDMI displays
> > > connected, so the change shouldn't make any difference on it. The change
> > > only affects the DP detect and hotplug paths, where we'll do now an
> > > async power domain put.
> > 
> > There's no history of glk locking up there, 
> > 
> > > The machine is still up when the problem happens, the test seems to get
> > > stuck and aborted by the test runner (after ~6mins according to [1]).
> > > 
> > > [43/82] (762s left) gem_persistent_relocs (forked-interruptible-thrashing)
> > > Starting subtest: forked-interruptible-thrashing
> > > Timeout. Killing the current test with SIGQUIT.
> > > Timeout. Killing the current test with SIGKILL.
> > 
> > and yet it locked up sufficiently to not respond to a signal, suggesting
> > an oops (the test takes 3s normally on glk).
> 
> No pstore logs either. I also noticed that the run [1] above resulted in
> an incomplete, if that's indicative of anything. The same goes for the
> previous Patchwork_12954 run.

Ah, there is actually a pstore log, it's just not linked from the HTML
page for some reason. Tomi?

Here it is:
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-glk7/pstore25-1556858613_Panic_1.log

--Imre
Imre Deak May 6, 2019, 9:44 a.m. UTC | #5
On Fri, May 03, 2019 at 04:52:58PM +0300, Imre Deak wrote:
> > > > [...]
> > > >   * igt@gem_persistent_relocs@forked-interruptible-thrashing:
> > > >     - shard-glk:          [PASS][1] -> [TIMEOUT][2]
> > > >    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6032/shard-glk6/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
> > > >    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12955/shard-glk7/igt@gem_persistent_relocs@forked-interruptible-thrashing.html
> > > 
> > > Looks like an unrelated issue: on this GLK there are two HDMI displays
> > > connected, so the change shouldn't make any difference on it. The change
> > > only affects the DP detect and hotplug paths, where we'll do now an
> > > async power domain put.
> > 
> > There's no history of glk locking up there, 
> > 
> > > The machine is still up when the problem happens, the test seems to get
> > > stuck and aborted by the test runner (after ~6mins according to [1]).
> > > 
> > > [43/82] (762s left) gem_persistent_relocs (forked-interruptible-thrashing)
> > > Starting subtest: forked-interruptible-thrashing
> > > Timeout. Killing the current test with SIGQUIT.
> > > Timeout. Killing the current test with SIGKILL.
> > 
> > and yet it locked up sufficiently to not respond to a signal, suggesting
> > an oops (the test takes 3s normally on glk).
> 
> No pstore logs either. I also noticed that the run [1] above resulted in
> an incomplete, if that's indicative of anything. The same goes for the
> previous Patchwork_12954 run.

For reference, here is what we discussed on IRC:

There is also a previous Trybot run on SKL that hung in the same test in
a similar way:
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_4242/shard-skl9/igt%40gem_persistent_relocs%40forked-interruptible-thrashing.html

We're missing stack dumps to better isolate the problem, but based on
the above it's unlikely to be caused by the changes in this patchset.
Chris suggested running cat /proc/*/stack on timeout and ptracing the
child from the IGT runner; adding Petri for that.
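
As a rough sketch of the first suggestion, a runner could copy a stuck
child's /proc/<pid>/stack into its log before killing it. The helper
names below are hypothetical and this is not IGT runner code; reading
/proc/<pid>/stack typically needs root and a kernel built with stack
trace support, so the helpers just report failure otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Build "/proc/<pid>/stack" in buf; returns 0, or -1 if buf is too small. */
static int proc_stack_path(char *buf, size_t len, pid_t pid)
{
	int n;

	assert(buf && len > 0);
	n = snprintf(buf, len, "/proc/%ld/stack", (long)pid);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Copy the file at path to out; returns 0, or -1 if it cannot be opened. */
static int dump_file(const char *path, FILE *out)
{
	char line[256];
	FILE *in = fopen(path, "r");

	if (!in) {
		fprintf(out, "cannot open %s: %s\n", path, strerror(errno));
		return -1;
	}

	fprintf(out, "== %s ==\n", path);
	while (fgets(line, sizeof(line), in))
		fputs(line, out);
	fclose(in);
	return 0;
}

/* Hypothetical hook a runner could call on timeout, before SIGQUIT/SIGKILL. */
static int dump_kernel_stack(pid_t pid, FILE *out)
{
	char path[64];

	if (proc_stack_path(path, sizeof(path), pid) < 0)
		return -1;
	return dump_file(path, out);
}
```

Calling dump_kernel_stack(child_pid, stderr) right before the first
signal is sent would show where in the kernel the child is blocked, if
the file is readable on that machine.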

I'm trying to repro the problem on SKL/GLK with and without this
patchset; so far I haven't hit the issue.

I opened the following bug to capture the findings:
https://bugs.freedesktop.org/show_bug.cgi?id=110618

and will check out Chris' patchset that may be related to, or fix, the problem:
https://patchwork.freedesktop.org/series/60257/

--Imre