[v2,0/2] Improvements to GuC load failure handling

Message ID 20230316220632.3312218-1-John.C.Harrison@Intel.com (mailing list archive)

Message

John Harrison March 16, 2023, 10:06 p.m. UTC
From: John Harrison <John.C.Harrison@Intel.com>

Add more decoding of the GuC load failures. Also include information
about GT frequency to see if timeouts are due to a failure to boost
the clocks. Finally, increase the timeout to accommodate situations
where the clock boost does fail.

v2: Reduce timeout in release builds, add bug references, make usage
of 'success' variable a little clearer (review feedback from Daniele).

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>


John Harrison (2):
  drm/i915/guc: Improve GuC load error reporting
  drm/i915/guc: Allow for very slow GuC loading

 .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h   |  17 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c     | 141 +++++++++++++++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_reg.h    |   4 +-
 3 files changed, 140 insertions(+), 22 deletions(-)

Comments

John Harrison March 23, 2023, 10:29 p.m. UTC | #1
On 3/22/2023 19:52, Patchwork wrote:
> Project List - Patchwork *Patch Details*
> *Series:* 	Improvements to GuC load failure handling (rev3)
> *URL:* 	https://patchwork.freedesktop.org/series/114168/
> *State:* 	failure
> *Details:* 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/index.html
>
>
>   CI Bug Log - changes from CI_DRM_12902 -> Patchwork_114168v3
>
>
>     Summary
>
> *FAILURE*
>
> Serious unknown changes coming with Patchwork_114168v3 absolutely need 
> to be
> verified manually.
>
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_114168v3, please notify your bug team to allow 
> them
> to document this new failure mode, which will reduce false positives 
> in CI.
>
> External URL: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/index.html
>
>
>     Participating hosts (37 -> 35)
>
> Missing (2): fi-tgl-1115g4 fi-snb-2520m
>
>
>     Possible new issues
>
> Here are the unknown changes that may have been introduced in 
> Patchwork_114168v3:
>
>
>       IGT changes
>
>
>         Possible regressions
>
>   * igt@kms_pipe_crc_basic@suspend-read-crc@pipe-c-vga-1:
>       o fi-hsw-4770: PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12902/fi-hsw-4770/igt@kms_pipe_crc_basic@suspend-read-crc@pipe-c-vga-1.html>
>         -> ABORT
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/fi-hsw-4770/igt@kms_pipe_crc_basic@suspend-read-crc@pipe-c-vga-1.html>
>
>
>     Known issues
>
> Here are the changes found in Patchwork_114168v3 that come from known 
> issues:
>
>
>       IGT changes
>
>
>         Issues hit
>
>  *
>
>     igt@i915_selftest@live@gt_heartbeat:
>
>       o fi-cfl-8109u: PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12902/fi-cfl-8109u/igt@i915_selftest@live@gt_heartbeat.html>
>         -> DMESG-FAIL
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/fi-cfl-8109u/igt@i915_selftest@live@gt_heartbeat.html>
>         (i915#5334 <https://gitlab.freedesktop.org/drm/intel/issues/5334>)
>  *
>
>     igt@i915_selftest@live@hangcheck:
>
>       o fi-skl-guc: PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12902/fi-skl-guc/igt@i915_selftest@live@hangcheck.html>
>         -> DMESG-WARN
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/fi-skl-guc/igt@i915_selftest@live@hangcheck.html>
>         (i915#8073 <https://gitlab.freedesktop.org/drm/intel/issues/8073>)
>  *
>
>     igt@i915_selftest@live@migrate:
>
>       o bat-atsm-1: PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12902/bat-atsm-1/igt@i915_selftest@live@migrate.html>
>         -> DMESG-FAIL
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-atsm-1/igt@i915_selftest@live@migrate.html>
>         (i915#7699 <https://gitlab.freedesktop.org/drm/intel/issues/7699>)
>  *
>
>     igt@i915_selftest@live@slpc:
>
>       o bat-adln-1: NOTRUN -> DMESG-FAIL
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-adln-1/igt@i915_selftest@live@slpc.html>
>         (i915#6997 <https://gitlab.freedesktop.org/drm/intel/issues/6997>)
>  *
>
>     igt@kms_chamelium_hpd@common-hpd-after-suspend:
>
>      o
>
>         bat-rpls-2: NOTRUN -> SKIP
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-rpls-2/igt@kms_chamelium_hpd@common-hpd-after-suspend.html>
>         (i915#7828 <https://gitlab.freedesktop.org/drm/intel/issues/7828>)
>
>      o
>
>         bat-adln-1: NOTRUN -> SKIP
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-adln-1/igt@kms_chamelium_hpd@common-hpd-after-suspend.html>
>         (i915#7828 <https://gitlab.freedesktop.org/drm/intel/issues/7828>)
>
>      o
>
>         bat-rpls-1: NOTRUN -> SKIP
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-rpls-1/igt@kms_chamelium_hpd@common-hpd-after-suspend.html>
>         (i915#7828 <https://gitlab.freedesktop.org/drm/intel/issues/7828>)
>
>  *
>
>     igt@kms_pipe_crc_basic@suspend-read-crc:
>
>      o
>
>         bat-rpls-1: NOTRUN -> SKIP
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-rpls-1/igt@kms_pipe_crc_basic@suspend-read-crc.html>
>         (i915#1845 <https://gitlab.freedesktop.org/drm/intel/issues/1845>)
>
>      o
>
>         bat-rpls-2: NOTRUN -> SKIP
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-rpls-2/igt@kms_pipe_crc_basic@suspend-read-crc.html>
>         (i915#1845 <https://gitlab.freedesktop.org/drm/intel/issues/1845>)
>
>
>         Possible fixes
>
>  *
>
>     igt@gem_exec_suspend@basic-s0@smem:
>
>       o bat-rpls-2: ABORT
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12902/bat-rpls-2/igt@gem_exec_suspend@basic-s0@smem.html>
>         -> PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-rpls-2/igt@gem_exec_suspend@basic-s0@smem.html>
>  *
>
>     igt@gem_exec_suspend@basic-s3@lmem0:
>
>       o bat-dg2-9: FAIL
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12902/bat-dg2-9/igt@gem_exec_suspend@basic-s3@lmem0.html>
>         (fdo#103375
>         <https://bugs.freedesktop.org/show_bug.cgi?id=103375>) -> PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-dg2-9/igt@gem_exec_suspend@basic-s3@lmem0.html>
>         +3 similar issues
>  *
>
>     igt@i915_pm_rps@basic-api:
>
>       o bat-dg2-11: FAIL
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12902/bat-dg2-11/igt@i915_pm_rps@basic-api.html>
>         -> PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-dg2-11/igt@i915_pm_rps@basic-api.html>
>  *
>
>     igt@i915_selftest@live@reset:
>
>       o bat-rpls-1: ABORT
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12902/bat-rpls-1/igt@i915_selftest@live@reset.html>
>         (i915#4983
>         <https://gitlab.freedesktop.org/drm/intel/issues/4983> /
>         i915#7981
>         <https://gitlab.freedesktop.org/drm/intel/issues/7981>) ->
>         PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-rpls-1/igt@i915_selftest@live@reset.html>
>  *
>
>     igt@i915_selftest@live@workarounds:
>
>       o bat-adln-1: INCOMPLETE
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12902/bat-adln-1/igt@i915_selftest@live@workarounds.html>
>         (i915#4983
>         <https://gitlab.freedesktop.org/drm/intel/issues/4983> /
>         i915#7467
>         <https://gitlab.freedesktop.org/drm/intel/issues/7467> /
>         i915#7981
>         <https://gitlab.freedesktop.org/drm/intel/issues/7981>) ->
>         PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-adln-1/igt@i915_selftest@live@workarounds.html>
>  *
>
>     igt@kms_pipe_crc_basic@suspend-read-crc@pipe-c-dp-3:
>
>       o bat-dg2-9: FAIL
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12902/bat-dg2-9/igt@kms_pipe_crc_basic@suspend-read-crc@pipe-c-dp-3.html>
>         (fdo#103375
>         <https://bugs.freedesktop.org/show_bug.cgi?id=103375> /
>         i915#7932
>         <https://gitlab.freedesktop.org/drm/intel/issues/7932>) ->
>         PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-dg2-9/igt@kms_pipe_crc_basic@suspend-read-crc@pipe-c-dp-3.html>
>
>
>         Warnings
>
>   * igt@i915_selftest@live@slpc:
>       o bat-rpls-2: DMESG-FAIL
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12902/bat-rpls-2/igt@i915_selftest@live@slpc.html>
>         (i915#6997
>         <https://gitlab.freedesktop.org/drm/intel/issues/6997> /
>         i915#7913
>         <https://gitlab.freedesktop.org/drm/intel/issues/7913>) ->
>         DMESG-FAIL
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_114168v3/bat-rpls-2/igt@i915_selftest@live@slpc.html>
>         (i915#6367
>         <https://gitlab.freedesktop.org/drm/intel/issues/6367> /
>         i915#7913
>         <https://gitlab.freedesktop.org/drm/intel/issues/7913> /
>         i915#7996 <https://gitlab.freedesktop.org/drm/intel/issues/7996>)
>
These patches only change GuC firmware loading (reporting of errors and 
longer timeouts). None of the above issues are related to GuC firmware 
loading. Therefore, they are not caused by these changes.

John.

>  *
>
>
>     Build changes
>
>   * Linux: CI_DRM_12902 -> Patchwork_114168v3
>
> CI-20190529: 20190529
> CI_DRM_12902: c8333f1c10ebbdaad7a642cc66041b4f90bc81be @ 
> git://anongit.freedesktop.org/gfx-ci/linux
> IGT_7211: c0cc1de7b2f4041ca68960362aa55f881d416bac @ 
> https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
> Patchwork_114168v3: c8333f1c10ebbdaad7a642cc66041b4f90bc81be @ 
> git://anongit.freedesktop.org/gfx-ci/linux
>
>
>       Linux commits
>
> 247aa4568644 drm/i915/guc: Allow for very slow GuC loading
> 0c9bcdd1e7e7 drm/i915/guc: Improve GuC load error reporting
>
John Harrison March 23, 2023, 10:30 p.m. UTC | #2
On 3/22/2023 19:40, Patchwork wrote:
> == Series Details ==
>
> Series: Improvements to GuC load failure handling (rev3)
> URL   : https://patchwork.freedesktop.org/series/114168/
> State : warning
>
> == Summary ==
>
> Error: dim checkpatch failed
> b4df7f16c846 drm/i915/guc: Improve GuC load error reporting
> 2be0fcf3087c drm/i915/guc: Allow for very slow GuC loading
> -:21: WARNING:COMMIT_LOG_USE_LINK: Unknown link reference 'References:', use 'Link:' instead
> #21:
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/7931
>
> -:22: WARNING:COMMIT_LOG_USE_LINK: Unknown link reference 'References:', use 'Link:' instead
> #22:
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/8060
>
> -:23: WARNING:COMMIT_LOG_USE_LINK: Unknown link reference 'References:', use 'Link:' instead
> #23:
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/8083
>
> -:24: WARNING:COMMIT_LOG_USE_LINK: Unknown link reference 'References:', use 'Link:' instead
> #24:
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/8136
>
> -:25: WARNING:COMMIT_LOG_USE_LINK: Unknown link reference 'References:', use 'Link:' instead
These warnings appear to be the tool confusing bug references with
patchwork links. Other patches in the tree use the 'References:' tag
for bug links.

John.

> #25:
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/8137
>
> total: 0 errors, 5 warnings, 0 checks, 85 lines checked
>
>