diff mbox series

[v7,1/8] drm/i915/gem: Break out some shmem backend utils

Message ID 20211006091614.970596-1-matthew.auld@intel.com (mailing list archive)
State New, archived
Series: [v7,1/8] drm/i915/gem: Break out some shmem backend utils

Commit Message

Matthew Auld Oct. 6, 2021, 9:16 a.m. UTC
From: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Break out some shmem backend utils for future reuse by the TTM backend:
shmem_alloc_st(), shmem_free_st() and __shmem_writeback(), which we can
use to provide a shmem-backed TTM page pool for cached-only TTM
buffer objects.

The main functional change here is that we now compute the page sizes
using the dma segments rather than the physical page address segments.
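
As a rough, hypothetical sketch only (this helper is not part of the patch
and its name is made up), computing the page-size mask from the dma-mapped
segment lengths instead of the physical page addresses could look like:

    #include <linux/scatterlist.h>

    /* Illustrative only: OR together the dma segment lengths. */
    static unsigned int sgt_dma_page_sizes(struct sg_table *st)
    {
            struct scatterlist *sg;
            unsigned int page_sizes = 0;
            unsigned int i;

            for_each_sgtable_dma_sg(st, sg, i)
                    page_sizes |= sg_dma_len(sg);

            return page_sizes;
    }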

v2(Reported-by: kernel test robot <lkp@intel.com>)
    - Make sure we initialise the mapping on the error path in
      shmem_get_pages()

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 181 +++++++++++++---------
 1 file changed, 106 insertions(+), 75 deletions(-)

Comments

Matthew Auld Oct. 7, 2021, 9:04 a.m. UTC | #1
On Wed, 6 Oct 2021 at 16:26, Patchwork <patchwork@emeril.freedesktop.org>
wrote:

> *Patch Details*
> *Series:* series starting with [v7,1/8] drm/i915/gem: Break out some
> shmem backend utils
> *URL:* https://patchwork.freedesktop.org/series/95501/
> *State:* failure
> *Details:*
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.html CI
> Bug Log - changes from CI_DRM_10688_full -> Patchwork_21264_full Summary
>
> *FAILURE*
>
> Serious unknown changes coming with Patchwork_21264_full absolutely need
> to be
> verified manually.
>
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_21264_full, please notify your bug team to allow
> them
> to document this new failure mode, which will reduce false positives in CI.
> Possible new issues
>
> Here are the unknown changes that may have been introduced in
> Patchwork_21264_full:
> IGT changes Possible regressions
>
>    -
>
>    igt@gem_sync@basic-many-each:
>    - shard-apl: NOTRUN -> INCOMPLETE
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl7/igt@gem_sync@basic-many-each.html>
>
>
Looks unrelated to this series. There were some recent changes merged in
this area in the last day or so.


>    -
>       -
>
>    igt@i915_pm_dc@dc9-dpms:
>    - shard-iclb: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-iclb5/igt@i915_pm_dc@dc9-dpms.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb2/igt@i915_pm_dc@dc9-dpms.html>
>
>
Also unrelated.


>    -
>
> Suppressed
>
> The following results come from untrusted machines, tests, or statuses.
> They do not affect the overall result.
>
>    -
>
>    {igt@gem_pxp@dmabuf-shared-protected-dst-is-context-refcounted}:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb1/igt@gem_pxp@dmabuf-shared-protected-dst-is-context-refcounted.html>
>    -
>
>    {igt@gem_pxp@verify-pxp-execution-after-suspend-resume}:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@gem_pxp@verify-pxp-execution-after-suspend-resume.html>
>       +1 similar issue
>
> Known issues
>
> Here are the changes found in Patchwork_21264_full that come from known
> issues:
> IGT changes Issues hit
>
>    -
>
>    igt@feature_discovery@psr2:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb1/igt@feature_discovery@psr2.html>
>       ([i915#658]) +1 similar issue
>    -
>
>    igt@gem_ctx_isolation@preservation-s3@vecs0:
>    - shard-skl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-skl10/igt@gem_ctx_isolation@preservation-s3@vecs0.html>
>       -> INCOMPLETE
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl7/igt@gem_ctx_isolation@preservation-s3@vecs0.html>
>       ([i915#146] / [i915#198])
>    -
>
>    igt@gem_ctx_param@set-priority-not-supported:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb5/igt@gem_ctx_param@set-priority-not-supported.html>
>       ([fdo#109314])
>    -
>
>    igt@gem_ctx_persistence@smoketest:
>    - shard-snb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-snb2/igt@gem_ctx_persistence@smoketest.html>
>       ([fdo#109271] / [i915#1099]) +4 similar issues
>    -
>
>    igt@gem_eio@in-flight-contexts-immediate:
>    - shard-skl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-skl5/igt@gem_eio@in-flight-contexts-immediate.html>
>       -> TIMEOUT
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl3/igt@gem_eio@in-flight-contexts-immediate.html>
>       ([i915#3063])
>    -
>
>    igt@gem_eio@unwedge-stress:
>    - shard-snb: NOTRUN -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-snb2/igt@gem_eio@unwedge-stress.html>
>       ([i915#3354])
>    -
>
>    igt@gem_exec_fair@basic-none@vcs1:
>    - shard-kbl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-kbl6/igt@gem_exec_fair@basic-none@vcs1.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl1/igt@gem_exec_fair@basic-none@vcs1.html>
>       ([i915#2842])
>    -
>
>    igt@gem_exec_fair@basic-pace-share@rcs0:
>    - shard-tglb: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-tglb3/igt@gem_exec_fair@basic-pace-share@rcs0.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb5/igt@gem_exec_fair@basic-pace-share@rcs0.html>
>       ([i915#2842])
>    -
>
>    igt@gem_exec_fair@basic-pace@vcs1:
>    - shard-iclb: NOTRUN -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb2/igt@gem_exec_fair@basic-pace@vcs1.html>
>       ([i915#2842])
>    -
>
>    igt@gem_exec_fair@basic-sync@rcs0:
>    - shard-kbl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-kbl3/igt@gem_exec_fair@basic-sync@rcs0.html>
>       -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl6/igt@gem_exec_fair@basic-sync@rcs0.html>
>       ([fdo#109271]) +1 similar issue
>    -
>
>    igt@gem_exec_fair@basic-throttle@rcs0:
>    -
>
>       shard-glk: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-glk5/igt@gem_exec_fair@basic-throttle@rcs0.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-glk7/igt@gem_exec_fair@basic-throttle@rcs0.html>
>       ([i915#2842])
>       -
>
>       shard-iclb: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-iclb1/igt@gem_exec_fair@basic-throttle@rcs0.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb7/igt@gem_exec_fair@basic-throttle@rcs0.html>
>       ([i915#2849])
>       -
>
>    igt@gem_exec_flush@basic-batch-kernel-default-cmd:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@gem_exec_flush@basic-batch-kernel-default-cmd.html>
>       ([fdo#109313])
>    -
>
>    igt@gem_exec_params@no-blt:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb1/igt@gem_exec_params@no-blt.html>
>       ([fdo#109283])
>    -
>
>    igt@gem_exec_params@no-vebox:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@gem_exec_params@no-vebox.html>
>       ([fdo#109283])
>    -
>
>    igt@gem_mmap_gtt@cpuset-medium-copy-xy:
>    - shard-apl: NOTRUN -> DMESG-WARN
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl8/igt@gem_mmap_gtt@cpuset-medium-copy-xy.html>
>       ([i915#180] / [i915#203] / [i915#62]) +4 similar issues
>    -
>
>    igt@gem_pread@exhaustion:
>    - shard-kbl: NOTRUN -> WARN
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl4/igt@gem_pread@exhaustion.html>
>       ([i915#2658])
>    -
>
>    igt@gem_pwrite@basic-exhaustion:
>    - shard-apl: NOTRUN -> WARN
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl2/igt@gem_pwrite@basic-exhaustion.html>
>       ([i915#2658])
>    -
>
>    igt@gem_userptr_blits@dmabuf-sync:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb5/igt@gem_userptr_blits@dmabuf-sync.html>
>       ([i915#3323])
>    -
>
>    igt@gem_userptr_blits@dmabuf-unsync:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb5/igt@gem_userptr_blits@dmabuf-unsync.html>
>       ([i915#3297])
>    -
>
>    igt@gem_userptr_blits@input-checking:
>    - shard-apl: NOTRUN -> DMESG-WARN
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl2/igt@gem_userptr_blits@input-checking.html>
>       ([i915#3002]) +1 similar issue
>    -
>
>    igt@gem_userptr_blits@vma-merge:
>    - shard-snb: NOTRUN -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-snb2/igt@gem_userptr_blits@vma-merge.html>
>       ([i915#2724])
>    -
>
>    igt@gen3_render_tiledy_blits:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb5/igt@gen3_render_tiledy_blits.html>
>       ([fdo#109289]) +1 similar issue
>    -
>
>    igt@gen9_exec_parse@batch-without-end:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb3/igt@gen9_exec_parse@batch-without-end.html>
>       ([i915#2856])
>    -
>
>    igt@gen9_exec_parse@bb-start-cmd:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@gen9_exec_parse@bb-start-cmd.html>
>       ([i915#2856]) +2 similar issues
>    -
>
>    igt@i915_module_load@reload-with-fault-injection:
>    - shard-skl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-skl6/igt@i915_module_load@reload-with-fault-injection.html>
>       -> DMESG-WARN
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl4/igt@i915_module_load@reload-with-fault-injection.html>
>       ([i915#1982]) +2 similar issues
>    -
>
>    igt@i915_pm_dc@dc6-dpms:
>    - shard-iclb: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-iclb7/igt@i915_pm_dc@dc6-dpms.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb3/igt@i915_pm_dc@dc6-dpms.html>
>       ([i915#454])
>    -
>
>    igt@i915_selftest@live@gt_lrc:
>    - shard-tglb: NOTRUN -> DMESG-FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@i915_selftest@live@gt_lrc.html>
>       ([i915#2373])
>    -
>
>    igt@i915_selftest@live@gt_pm:
>    - shard-tglb: NOTRUN -> DMESG-FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@i915_selftest@live@gt_pm.html>
>       ([i915#1759] / [i915#2291])
>    -
>
>    igt@i915_selftest@live@mman:
>    - shard-apl: NOTRUN -> DMESG-WARN
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl8/igt@i915_selftest@live@mman.html>
>       ([i915#203]) +33 similar issues
>    -
>
>    igt@i915_suspend@sysfs-reader:
>    - shard-apl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-apl2/igt@i915_suspend@sysfs-reader.html>
>       -> DMESG-WARN
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl1/igt@i915_suspend@sysfs-reader.html>
>       ([i915#180]) +3 similar issues
>    -
>
>    igt@kms_big_fb@linear-64bpp-rotate-270:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb1/igt@kms_big_fb@linear-64bpp-rotate-270.html>
>       ([fdo#110725] / [fdo#111614])
>    -
>
>    igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-0-hflip:
>    - shard-kbl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl4/igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-0-hflip.html>
>       ([fdo#109271] / [i915#3777])
>    -
>
>    igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-180:
>    - shard-apl: NOTRUN -> DMESG-WARN
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl8/igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-180.html>
>       ([i915#180] / [i915#1982] / [i915#203] / [i915#62])
>    -
>
>    igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-0-hflip:
>    - shard-apl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl6/igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-0-hflip.html>
>       ([fdo#109271] / [i915#3777])
>    -
>
>    igt@kms_big_fb@yf-tiled-32bpp-rotate-270:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb5/igt@kms_big_fb@yf-tiled-32bpp-rotate-270.html>
>       ([fdo#111615]) +3 similar issues
>    -
>
>    igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-0-async-flip:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb3/igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-0-async-flip.html>
>       ([fdo#110723])
>    -
>
>    igt@kms_ccs@pipe-b-bad-rotation-90-y_tiled_gen12_mc_ccs:
>    -
>
>       shard-kbl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl2/igt@kms_ccs@pipe-b-bad-rotation-90-y_tiled_gen12_mc_ccs.html>
>       ([fdo#109271] / [i915#3886]) +1 similar issue
>       -
>
>       shard-skl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl6/igt@kms_ccs@pipe-b-bad-rotation-90-y_tiled_gen12_mc_ccs.html>
>       ([fdo#109271] / [i915#3886])
>       -
>
>    igt@kms_ccs@pipe-b-crc-primary-rotation-180-y_tiled_gen12_mc_ccs:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb1/igt@kms_ccs@pipe-b-crc-primary-rotation-180-y_tiled_gen12_mc_ccs.html>
>       ([fdo#109278] / [i915#3886])
>    -
>
>    igt@kms_ccs@pipe-c-bad-rotation-90-y_tiled_gen12_mc_ccs:
>    - shard-apl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl6/igt@kms_ccs@pipe-c-bad-rotation-90-y_tiled_gen12_mc_ccs.html>
>       ([fdo#109271] / [i915#3886]) +11 similar issues
>    -
>
>    igt@kms_ccs@pipe-c-crc-primary-rotation-180-y_tiled_gen12_mc_ccs:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb5/igt@kms_ccs@pipe-c-crc-primary-rotation-180-y_tiled_gen12_mc_ccs.html>
>       ([i915#3689] / [i915#3886]) +1 similar issue
>    -
>
>    igt@kms_ccs@pipe-c-missing-ccs-buffer-yf_tiled_ccs:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@kms_ccs@pipe-c-missing-ccs-buffer-yf_tiled_ccs.html>
>       ([i915#3689]) +3 similar issues
>    -
>
>    igt@kms_chamelium@dp-hpd:
>    - shard-skl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl6/igt@kms_chamelium@dp-hpd.html>
>       ([fdo#109271] / [fdo#111827])
>    -
>
>    igt@kms_chamelium@hdmi-mode-timings:
>    - shard-snb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-snb7/igt@kms_chamelium@hdmi-mode-timings.html>
>       ([fdo#109271] / [fdo#111827]) +14 similar issues
>    -
>
>    igt@kms_color@pipe-d-ctm-0-75:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb3/igt@kms_color@pipe-d-ctm-0-75.html>
>       ([fdo#109278] / [i915#1149])
>    -
>
>    igt@kms_color_chamelium@pipe-a-ctm-blue-to-red:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb1/igt@kms_color_chamelium@pipe-a-ctm-blue-to-red.html>
>       ([fdo#109284] / [fdo#111827]) +1 similar issue
>    -
>
>    igt@kms_color_chamelium@pipe-a-ctm-limited-range:
>    - shard-apl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl6/igt@kms_color_chamelium@pipe-a-ctm-limited-range.html>
>       ([fdo#109271] / [fdo#111827]) +23 similar issues
>    -
>
>    igt@kms_color_chamelium@pipe-c-ctm-limited-range:
>    - shard-kbl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl4/igt@kms_color_chamelium@pipe-c-ctm-limited-range.html>
>       ([fdo#109271] / [fdo#111827]) +1 similar issue
>    -
>
>    igt@kms_color_chamelium@pipe-d-ctm-green-to-red:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@kms_color_chamelium@pipe-d-ctm-green-to-red.html>
>       ([fdo#109284] / [fdo#111827]) +6 similar issues
>    -
>
>    igt@kms_content_protection@atomic-dpms:
>    - shard-apl: NOTRUN -> TIMEOUT
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl6/igt@kms_content_protection@atomic-dpms.html>
>       ([i915#1319])
>    -
>
>    igt@kms_content_protection@legacy:
>    - shard-kbl: NOTRUN -> TIMEOUT
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl4/igt@kms_content_protection@legacy.html>
>       ([i915#1319])
>    -
>
>    igt@kms_cursor_crc@pipe-a-cursor-512x512-offscreen:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb1/igt@kms_cursor_crc@pipe-a-cursor-512x512-offscreen.html>
>       ([fdo#109278] / [fdo#109279])
>    -
>
>    igt@kms_cursor_crc@pipe-a-cursor-suspend:
>    - shard-kbl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-kbl3/igt@kms_cursor_crc@pipe-a-cursor-suspend.html>
>       -> DMESG-WARN
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl7/igt@kms_cursor_crc@pipe-a-cursor-suspend.html>
>       ([i915#180]) +3 similar issues
>    -
>
>    igt@kms_cursor_crc@pipe-b-cursor-32x10-random:
>    - shard-kbl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl4/igt@kms_cursor_crc@pipe-b-cursor-32x10-random.html>
>       ([fdo#109271]) +73 similar issues
>    -
>
>    igt@kms_cursor_crc@pipe-b-cursor-512x170-rapid-movement:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb5/igt@kms_cursor_crc@pipe-b-cursor-512x170-rapid-movement.html>
>       ([i915#3359])
>    -
>
>    igt@kms_cursor_crc@pipe-b-cursor-512x512-random:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@kms_cursor_crc@pipe-b-cursor-512x512-random.html>
>       ([fdo#109279] / [i915#3359])
>    -
>
>    igt@kms_cursor_crc@pipe-c-cursor-32x32-onscreen:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@kms_cursor_crc@pipe-c-cursor-32x32-onscreen.html>
>       ([i915#3319])
>    -
>
>    igt@kms_cursor_crc@pipe-d-cursor-64x64-rapid-movement:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb1/igt@kms_cursor_crc@pipe-d-cursor-64x64-rapid-movement.html>
>       ([fdo#109278]) +7 similar issues
>    -
>
>    igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
>    - shard-skl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-skl3/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl6/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html>
>       ([i915#2346])
>    -
>
>    igt@kms_dp_tiled_display@basic-test-pattern-with-chamelium:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@kms_dp_tiled_display@basic-test-pattern-with-chamelium.html>
>       ([i915#3528])
>    -
>
>    igt@kms_fbcon_fbt@fbc-suspend:
>    -
>
>       shard-apl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-apl6/igt@kms_fbcon_fbt@fbc-suspend.html>
>       -> INCOMPLETE
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl8/igt@kms_fbcon_fbt@fbc-suspend.html>
>       ([i915#180] / [i915#1982])
>       -
>
>       shard-tglb: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-tglb5/igt@kms_fbcon_fbt@fbc-suspend.html>
>       -> INCOMPLETE
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb7/igt@kms_fbcon_fbt@fbc-suspend.html>
>       ([i915#456])
>       -
>
>    igt@kms_flip@2x-absolute-wf_vblank:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@kms_flip@2x-absolute-wf_vblank.html>
>       ([fdo#111825] / [i915#3966])
>    -
>
>    igt@kms_flip@2x-plain-flip-ts-check@ab-hdmi-a1-hdmi-a2:
>    - shard-glk: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-glk4/igt@kms_flip@2x-plain-flip-ts-check@ab-hdmi-a1-hdmi-a2.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-glk6/igt@kms_flip@2x-plain-flip-ts-check@ab-hdmi-a1-hdmi-a2.html>
>       ([i915#2122])
>    -
>
>    igt@kms_flip@flip-vs-expired-vblank-interruptible@b-edp1:
>    - shard-skl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-skl9/igt@kms_flip@flip-vs-expired-vblank-interruptible@b-edp1.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl2/igt@kms_flip@flip-vs-expired-vblank-interruptible@b-edp1.html>
>       ([i915#79])
>    -
>
>    igt@kms_flip@flip-vs-expired-vblank@a-edp1:
>    - shard-skl: NOTRUN -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl6/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html>
>       ([i915#79])
>    -
>
>    igt@kms_flip@flip-vs-suspend@a-edp1:
>    - shard-tglb: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-tglb5/igt@kms_flip@flip-vs-suspend@a-edp1.html>
>       -> INCOMPLETE
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb7/igt@kms_flip@flip-vs-suspend@a-edp1.html>
>       ([i915#2411] / [i915#456])
>    -
>
>    igt@kms_frontbuffer_tracking@fbcpsr-rgb101010-draw-blt:
>    - shard-snb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-snb2/igt@kms_frontbuffer_tracking@fbcpsr-rgb101010-draw-blt.html>
>       ([fdo#109271]) +309 similar issues
>    -
>
>    igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-indfb-draw-mmap-wc:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb1/igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-indfb-draw-mmap-wc.html>
>       ([fdo#109280])
>    -
>
>    igt@kms_frontbuffer_tracking@psr-2p-scndscrn-spr-indfb-draw-pwrite:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb5/igt@kms_frontbuffer_tracking@psr-2p-scndscrn-spr-indfb-draw-pwrite.html>
>       ([fdo#111825]) +15 similar issues
>    -
>
>    igt@kms_hdr@bpc-switch-dpms:
>    - shard-skl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-skl5/igt@kms_hdr@bpc-switch-dpms.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl10/igt@kms_hdr@bpc-switch-dpms.html>
>       ([i915#1188])
>    -
>
>    igt@kms_plane_alpha_blend@pipe-a-alpha-basic:
>    - shard-apl: NOTRUN -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl3/igt@kms_plane_alpha_blend@pipe-a-alpha-basic.html>
>       ([fdo#108145] / [i915#265]) +1 similar issue
>    -
>
>    igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb:
>    - shard-kbl: NOTRUN -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl2/igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb.html>
>       ([i915#265])
>    -
>
>    igt@kms_plane_alpha_blend@pipe-b-alpha-transparent-fb:
>    - shard-apl: NOTRUN -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl2/igt@kms_plane_alpha_blend@pipe-b-alpha-transparent-fb.html>
>       ([i915#265])
>    -
>
>    igt@kms_plane_alpha_blend@pipe-c-alpha-7efc:
>    - shard-kbl: NOTRUN -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl2/igt@kms_plane_alpha_blend@pipe-c-alpha-7efc.html>
>       ([fdo#108145] / [i915#265]) +1 similar issue
>    -
>
>    igt@kms_plane_lowres@pipe-c-tiling-x:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb5/igt@kms_plane_lowres@pipe-c-tiling-x.html>
>       ([i915#3536])
>    -
>
>    igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping:
>    - shard-apl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl7/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping.html>
>       ([fdo#109271] / [i915#2733])
>    -
>
>    igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-3:
>    - shard-kbl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl4/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-3.html>
>       ([fdo#109271] / [i915#658])
>    -
>
>    igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-4:
>    - shard-apl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl7/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-4.html>
>       ([fdo#109271] / [i915#658]) +4 similar issues
>    -
>
>    igt@kms_psr@psr2_cursor_plane_onoff:
>    - shard-iclb: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-iclb2/igt@kms_psr@psr2_cursor_plane_onoff.html>
>       -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb1/igt@kms_psr@psr2_cursor_plane_onoff.html>
>       ([fdo#109441]) +1 similar issue
>    -
>
>    igt@kms_psr@psr2_primary_blt:
>    - shard-tglb: NOTRUN -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@kms_psr@psr2_primary_blt.html>
>       ([i915#132] / [i915#3467])
>    -
>
>    igt@kms_psr@psr2_primary_mmap_cpu:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb3/igt@kms_psr@psr2_primary_mmap_cpu.html>
>       ([fdo#109441])
>    -
>
>    igt@kms_setmode@basic:
>    - shard-snb: NOTRUN -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-snb2/igt@kms_setmode@basic.html>
>       ([i915#31])
>    -
>
>    igt@kms_vblank@pipe-d-wait-idle:
>    - shard-apl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl7/igt@kms_vblank@pipe-d-wait-idle.html>
>       ([fdo#109271] / [i915#533]) +2 similar issues
>    -
>
>    igt@kms_writeback@writeback-fb-id:
>    - shard-apl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl3/igt@kms_writeback@writeback-fb-id.html>
>       ([fdo#109271] / [i915#2437])
>    -
>
>    igt@nouveau_crc@pipe-b-ctx-flip-detection:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb6/igt@nouveau_crc@pipe-b-ctx-flip-detection.html>
>       ([i915#2530]) +1 similar issue
>    -
>
>    igt@nouveau_crc@pipe-b-ctx-flip-skip-current-frame:
>    - shard-apl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl2/igt@nouveau_crc@pipe-b-ctx-flip-skip-current-frame.html>
>       ([fdo#109271]) +249 similar issues
>    -
>
>    igt@nouveau_crc@pipe-d-source-outp-complete:
>    -
>
>       shard-skl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl6/igt@nouveau_crc@pipe-d-source-outp-complete.html>
>       ([fdo#109271]) +17 similar issues
>       -
>
>       shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb3/igt@nouveau_crc@pipe-d-source-outp-complete.html>
>       ([fdo#109278] / [i915#2530])
>       -
>
>    igt@perf@polling:
>    - shard-skl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-skl4/igt@perf@polling.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl5/igt@perf@polling.html>
>       ([i915#1542])
>    -
>
>    igt@prime_nv_api@i915_nv_import_twice:
>    - shard-iclb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb3/igt@prime_nv_api@i915_nv_import_twice.html>
>       ([fdo#109291]) +2 similar issues
>    -
>
>    igt@prime_nv_api@i915_self_import_to_different_fd:
>    - shard-tglb: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-tglb5/igt@prime_nv_api@i915_self_import_to_different_fd.html>
>       ([fdo#109291]) +2 similar issues
>    -
>
>    igt@sysfs_clients@recycle-many:
>    - shard-apl: NOTRUN -> SKIP
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl3/igt@sysfs_clients@recycle-many.html>
>       ([fdo#109271] / [i915#2994]) +1 similar issue
>    -
>
>    igt@sysfs_heartbeat_interval@mixed@bcs0:
>    - shard-skl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-skl7/igt@sysfs_heartbeat_interval@mixed@bcs0.html>
>       -> WARN
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl3/igt@sysfs_heartbeat_interval@mixed@bcs0.html>
>       ([i915#4055])
>    -
>
>    igt@sysfs_heartbeat_interval@mixed@vcs0:
>    - shard-skl: PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-skl7/igt@sysfs_heartbeat_interval@mixed@vcs0.html>
>       -> FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-skl3/igt@sysfs_heartbeat_interval@mixed@vcs0.html>
>       ([i915#1731])
>
> Possible fixes
>
>    -
>
>    igt@gem_eio@unwedge-stress:
>    - shard-iclb: TIMEOUT
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-iclb6/igt@gem_eio@unwedge-stress.html>
>       ([i915#2369] / [i915#2481] / [i915#3070]) -> PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb6/igt@gem_eio@unwedge-stress.html>
>    -
>
>    igt@gem_exec_fair@basic-none@vecs0:
>    - shard-kbl: FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-kbl6/igt@gem_exec_fair@basic-none@vecs0.html>
>       ([i915#2842]) -> PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl1/igt@gem_exec_fair@basic-none@vecs0.html>
>    -
>
>    igt@gem_exec_fair@basic-pace@bcs0:
>    - shard-iclb: FAIL
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-iclb5/igt@gem_exec_fair@basic-pace@bcs0.html>
>       ([i915#2842]) -> PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-iclb2/igt@gem_exec_fair@basic-pace@bcs0.html>
>    -
>
>    igt@gem_sync@basic-many-each:
>    - shard-kbl: INCOMPLETE
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-kbl4/igt@gem_sync@basic-many-each.html>
>       -> PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl4/igt@gem_sync@basic-many-each.html>
>    -
>
>    igt@gem_workarounds@suspend-resume:
>    - shard-kbl: DMESG-WARN
>       <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10688/shard-kbl6/igt@gem_workarounds@suspend-resume.html>
>       ([i915#180]) -> PASS
>       <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-kbl7/igt@gem_workarounds@suspend-resume.html>
>       +1 similar issue
>    -
>
>    igt
>
>
Tvrtko Ursulin Oct. 7, 2021, 9:15 a.m. UTC | #2
Hi,

On 06/10/2021 16:26, Patchwork wrote:
> *Patch Details*
> *Series:*	series starting with [v7,1/8] drm/i915/gem: Break out some 
> shmem backend utils
> *URL:*	https://patchwork.freedesktop.org/series/95501/ 
> <https://patchwork.freedesktop.org/series/95501/>
> *State:*	failure
> *Details:* 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.html 
> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.html>
> 
> 
>   CI Bug Log - changes from CI_DRM_10688_full -> Patchwork_21264_full
> 
> 
>     Summary
> 
> *FAILURE*
> 
> Serious unknown changes coming with Patchwork_21264_full absolutely need 
> to be
> verified manually.
> 
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_21264_full, please notify your bug team to allow 
> them
> to document this new failure mode, which will reduce false positives in CI.
> 
> 
>     Possible new issues
> 
> Here are the unknown changes that may have been introduced in 
> Patchwork_21264_full:
> 
> 
>       IGT changes
> 
> 
>         Possible regressions
> 
>   *
> 
>     igt@gem_sync@basic-many-each:
> 
>       o shard-apl: NOTRUN -> INCOMPLETE
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl7/igt@gem_sync@basic-many-each.html>
Something still fishy in the unlocked iterator? Or dma_resv_get_fences using it?

<6> [187.551235] [IGT] gem_sync: starting subtest basic-many-each
<1> [188.935462] BUG: kernel NULL pointer dereference, address: 0000000000000010
<1> [188.935485] #PF: supervisor write access in kernel mode
<1> [188.935495] #PF: error_code(0x0002) - not-present page
<6> [188.935504] PGD 0 P4D 0
<4> [188.935512] Oops: 0002 [#1] PREEMPT SMP NOPTI
<4> [188.935521] CPU: 2 PID: 1467 Comm: gem_sync Not tainted 5.15.0-rc4-CI-Patchwork_21264+ #1
<4> [188.935535] Hardware name:  /NUC6CAYB, BIOS AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018
<4> [188.935546] RIP: 0010:dma_resv_get_fences+0x116/0x2d0
<4> [188.935560] Code: 10 85 c0 7f c9 be 03 00 00 00 e8 15 8b df ff eb bd e8 8e c6 ff ff eb b6 41 8b 04 24 49 8b 55 00 48 89 e7 8d 48 01 41 89 0c 24 <4c> 89 34 c2 e8 41 f2 ff ff 49 89 c6 48 85 c0 75 8c 48 8b 44 24 10
<4> [188.935583] RSP: 0018:ffffc900011dbcc8 EFLAGS: 00010202
<4> [188.935593] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 0000000000000001
<4> [188.935603] RDX: 0000000000000010 RSI: ffffffff822e343c RDI: ffffc900011dbcc8
<4> [188.935613] RBP: ffffc900011dbd48 R08: ffff88812d255bb8 R09: 00000000fffffffe
<4> [188.935623] R10: 0000000000000001 R11: 0000000000000000 R12: ffffc900011dbd44
<4> [188.935633] R13: ffffc900011dbd50 R14: ffff888113d29cc0 R15: 0000000000000000
<4> [188.935643] FS:  00007f68d17e9700(0000) GS:ffff888277900000(0000) knlGS:0000000000000000
<4> [188.935655] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [188.935665] CR2: 0000000000000010 CR3: 000000012d0a4000 CR4: 00000000003506e0
<4> [188.935676] Call Trace:
<4> [188.935685]  i915_gem_object_wait+0x1ff/0x410 [i915]
<4> [188.935988]  i915_gem_wait_ioctl+0xf2/0x2a0 [i915]
<4> [188.936272]  ? i915_gem_object_wait+0x410/0x410 [i915]
<4> [188.936533]  drm_ioctl_kernel+0xae/0x140
<4> [188.936546]  drm_ioctl+0x201/0x3d0
<4> [188.936555]  ? i915_gem_object_wait+0x410/0x410 [i915]
<4> [188.936820]  ? __fget_files+0xc2/0x1c0
<4> [188.936830]  ? __fget_files+0xda/0x1c0
<4> [188.936839]  __x64_sys_ioctl+0x6d/0xa0
<4> [188.936848]  do_syscall_64+0x3a/0xb0
<4> [188.936859]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Regards,

Tvrtko
Christian König Oct. 7, 2021, 9:19 a.m. UTC | #3
Am 07.10.21 um 11:15 schrieb Tvrtko Ursulin:
> Hi,
>
> On 06/10/2021 16:26, Patchwork wrote:
>> *Patch Details*
>> *Series:*    series starting with [v7,1/8] drm/i915/gem: Break out 
>> some shmem backend utils
>> *URL:*    https://patchwork.freedesktop.org/series/95501/ 
>> <https://patchwork.freedesktop.org/series/95501/>
>> *State:*    failure
>> *Details:* 
>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.html 
>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.html>
>>
>>
>>   CI Bug Log - changes from CI_DRM_10688_full -> Patchwork_21264_full
>>
>>
>>     Summary
>>
>> *FAILURE*
>>
>> Serious unknown changes coming with Patchwork_21264_full absolutely 
>> need to be
>> verified manually.
>>
>> If you think the reported changes have nothing to do with the changes
>> introduced in Patchwork_21264_full, please notify your bug team to 
>> allow them
>> to document this new failure mode, which will reduce false positives 
>> in CI.
>>
>>
>>     Possible new issues
>>
>> Here are the unknown changes that may have been introduced in 
>> Patchwork_21264_full:
>>
>>
>>       IGT changes
>>
>>
>>         Possible regressions
>>
>>   *
>>
>>     igt@gem_sync@basic-many-each:
>>
>>       o shard-apl: NOTRUN -> INCOMPLETE
>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl7/igt@gem_sync@basic-many-each.html>
> Something still fishy in the unlocked iterator? Or dma_resv_get_fences 
> using it?

Probably the later. I'm going to take a look.

Thanks for the notice,
Christian.

>
> <6> [187.551235] [IGT] gem_sync: starting subtest basic-many-each
> <1> [188.935462] BUG: kernel NULL pointer dereference, address: 
> 0000000000000010
> <1> [188.935485] #PF: supervisor write access in kernel mode
> <1> [188.935495] #PF: error_code(0x0002) - not-present page
> <6> [188.935504] PGD 0 P4D 0
> <4> [188.935512] Oops: 0002 [#1] PREEMPT SMP NOPTI
> <4> [188.935521] CPU: 2 PID: 1467 Comm: gem_sync Not tainted 
> 5.15.0-rc4-CI-Patchwork_21264+ #1
> <4> [188.935535] Hardware name:  /NUC6CAYB, BIOS 
> AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018
> <4> [188.935546] RIP: 0010:dma_resv_get_fences+0x116/0x2d0
> <4> [188.935560] Code: 10 85 c0 7f c9 be 03 00 00 00 e8 15 8b df ff eb 
> bd e8 8e c6 ff ff eb b6 41 8b 04 24 49 8b 55 00 48 89 e7 8d 48 01 41 
> 89 0c 24 <4c> 89 34 c2 e8 41 f2 ff ff 49 89 c6 48 85 c0 75 8c 48 8b 44 
> 24 10
> <4> [188.935583] RSP: 0018:ffffc900011dbcc8 EFLAGS: 00010202
> <4> [188.935593] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 
> 0000000000000001
> <4> [188.935603] RDX: 0000000000000010 RSI: ffffffff822e343c RDI: 
> ffffc900011dbcc8
> <4> [188.935613] RBP: ffffc900011dbd48 R08: ffff88812d255bb8 R09: 
> 00000000fffffffe
> <4> [188.935623] R10: 0000000000000001 R11: 0000000000000000 R12: 
> ffffc900011dbd44
> <4> [188.935633] R13: ffffc900011dbd50 R14: ffff888113d29cc0 R15: 
> 0000000000000000
> <4> [188.935643] FS:  00007f68d17e9700(0000) GS:ffff888277900000(0000) 
> knlGS:0000000000000000
> <4> [188.935655] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> <4> [188.935665] CR2: 0000000000000010 CR3: 000000012d0a4000 CR4: 
> 00000000003506e0
> <4> [188.935676] Call Trace:
> <4> [188.935685]  i915_gem_object_wait+0x1ff/0x410 [i915]
> <4> [188.935988]  i915_gem_wait_ioctl+0xf2/0x2a0 [i915]
> <4> [188.936272]  ? i915_gem_object_wait+0x410/0x410 [i915]
> <4> [188.936533]  drm_ioctl_kernel+0xae/0x140
> <4> [188.936546]  drm_ioctl+0x201/0x3d0
> <4> [188.936555]  ? i915_gem_object_wait+0x410/0x410 [i915]
> <4> [188.936820]  ? __fget_files+0xc2/0x1c0
> <4> [188.936830]  ? __fget_files+0xda/0x1c0
> <4> [188.936839]  __x64_sys_ioctl+0x6d/0xa0
> <4> [188.936848]  do_syscall_64+0x3a/0xb0
> <4> [188.936859]  entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Regards,
>
> Tvrtko
Tvrtko Ursulin Oct. 7, 2021, 10:51 a.m. UTC | #4
On 07/10/2021 10:19, Christian König wrote:
> Am 07.10.21 um 11:15 schrieb Tvrtko Ursulin:
>> Hi,
>>
>> On 06/10/2021 16:26, Patchwork wrote:
>>> *Patch Details*
>>> *Series:*    series starting with [v7,1/8] drm/i915/gem: Break out 
>>> some shmem backend utils
>>> *URL:*    https://patchwork.freedesktop.org/series/95501/ 
>>> <https://patchwork.freedesktop.org/series/95501/>
>>> *State:*    failure
>>> *Details:* 
>>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.html 
>>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.html>
>>>
>>>
>>>   CI Bug Log - changes from CI_DRM_10688_full -> Patchwork_21264_full
>>>
>>>
>>>     Summary
>>>
>>> *FAILURE*
>>>
>>> Serious unknown changes coming with Patchwork_21264_full absolutely 
>>> need to be
>>> verified manually.
>>>
>>> If you think the reported changes have nothing to do with the changes
>>> introduced in Patchwork_21264_full, please notify your bug team to 
>>> allow them
>>> to document this new failure mode, which will reduce false positives 
>>> in CI.
>>>
>>>
>>>     Possible new issues
>>>
>>> Here are the unknown changes that may have been introduced in 
>>> Patchwork_21264_full:
>>>
>>>
>>>       IGT changes
>>>
>>>
>>>         Possible regressions
>>>
>>>   *
>>>
>>>     igt@gem_sync@basic-many-each:
>>>
>>>       o shard-apl: NOTRUN -> INCOMPLETE
>>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl7/igt@gem_sync@basic-many-each.html> 
>>>
>> Something still fishy in the unlocked iterator? Or dma_resv_get_fences 
>> using it?
> 
> Probably the later. I'm going to take a look.
> 
> Thanks for the notice,
> Christian.
> 
>>
>> <6> [187.551235] [IGT] gem_sync: starting subtest basic-many-each
>> <1> [188.935462] BUG: kernel NULL pointer dereference, address: 
>> 0000000000000010
>> <1> [188.935485] #PF: supervisor write access in kernel mode
>> <1> [188.935495] #PF: error_code(0x0002) - not-present page
>> <6> [188.935504] PGD 0 P4D 0
>> <4> [188.935512] Oops: 0002 [#1] PREEMPT SMP NOPTI
>> <4> [188.935521] CPU: 2 PID: 1467 Comm: gem_sync Not tainted 
>> 5.15.0-rc4-CI-Patchwork_21264+ #1
>> <4> [188.935535] Hardware name:  /NUC6CAYB, BIOS 
>> AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018
>> <4> [188.935546] RIP: 0010:dma_resv_get_fences+0x116/0x2d0
>> <4> [188.935560] Code: 10 85 c0 7f c9 be 03 00 00 00 e8 15 8b df ff eb 
>> bd e8 8e c6 ff ff eb b6 41 8b 04 24 49 8b 55 00 48 89 e7 8d 48 01 41 
>> 89 0c 24 <4c> 89 34 c2 e8 41 f2 ff ff 49 89 c6 48 85 c0 75 8c 48 8b 44 
>> 24 10
>> <4> [188.935583] RSP: 0018:ffffc900011dbcc8 EFLAGS: 00010202
>> <4> [188.935593] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 
>> 0000000000000001
>> <4> [188.935603] RDX: 0000000000000010 RSI: ffffffff822e343c RDI: 
>> ffffc900011dbcc8
>> <4> [188.935613] RBP: ffffc900011dbd48 R08: ffff88812d255bb8 R09: 
>> 00000000fffffffe
>> <4> [188.935623] R10: 0000000000000001 R11: 0000000000000000 R12: 
>> ffffc900011dbd44
>> <4> [188.935633] R13: ffffc900011dbd50 R14: ffff888113d29cc0 R15: 
>> 0000000000000000
>> <4> [188.935643] FS:  00007f68d17e9700(0000) GS:ffff888277900000(0000) 
>> knlGS:0000000000000000
>> <4> [188.935655] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> <4> [188.935665] CR2: 0000000000000010 CR3: 000000012d0a4000 CR4: 
>> 00000000003506e0
>> <4> [188.935676] Call Trace:
>> <4> [188.935685]  i915_gem_object_wait+0x1ff/0x410 [i915]
>> <4> [188.935988]  i915_gem_wait_ioctl+0xf2/0x2a0 [i915]
>> <4> [188.936272]  ? i915_gem_object_wait+0x410/0x410 [i915]
>> <4> [188.936533]  drm_ioctl_kernel+0xae/0x140
>> <4> [188.936546]  drm_ioctl+0x201/0x3d0
>> <4> [188.936555]  ? i915_gem_object_wait+0x410/0x410 [i915]
>> <4> [188.936820]  ? __fget_files+0xc2/0x1c0
>> <4> [188.936830]  ? __fget_files+0xda/0x1c0
>> <4> [188.936839]  __x64_sys_ioctl+0x6d/0xa0
>> <4> [188.936848]  do_syscall_64+0x3a/0xb0
>> <4> [188.936859]  entry_SYSCALL_64_after_hwframe+0x44/0xae

FWIW if you disassemble the code it seems to be crashing in:

   (*shared)[(*shared_count)++] = fence; // mov %r14, (%rdx, %rax, 8)

RDX is *shared, RAX is *shared_count, RCX is *shared_count++ (for the
next iteration). R13 is shared and R12 is shared_count.

That *shared can contain 0000000000000010 makes no sense to me. At least
not yet. :)
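
As a purely hypothetical illustration (this is not the kernel source, just
the addressing that the faulting mov encodes): R14 holds fence, RDX holds
*shared and RAX holds *shared_count, so the store writes to
*shared + 8 * *shared_count; with *shared == 0x10 and *shared_count == 0
that is exactly the CR2 value of 0000000000000010 in the oops above.

    struct dma_fence;

    /* Illustrative only: the indexing behind mov %r14,(%rdx,%rax,8). */
    static void append_fence(struct dma_fence ***shared,
                             unsigned int *shared_count,
                             struct dma_fence *fence)
    {
            (*shared)[(*shared_count)++] = fence;
    }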

Regards,

Tvrtko
Christian König Oct. 7, 2021, 12:57 p.m. UTC | #5
Am 07.10.21 um 12:51 schrieb Tvrtko Ursulin:
>
> On 07/10/2021 10:19, Christian König wrote:
>> Am 07.10.21 um 11:15 schrieb Tvrtko Ursulin:
>>> Hi,
>>>
>>> On 06/10/2021 16:26, Patchwork wrote:
>>>> *Patch Details*
>>>> *Series:*    series starting with [v7,1/8] drm/i915/gem: Break out 
>>>> some shmem backend utils
>>>> *URL:*    https://patchwork.freedesktop.org/series/95501/ 
>>>> <https://patchwork.freedesktop.org/series/95501/>
>>>> *State:*    failure
>>>> *Details:* 
>>>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.html 
>>>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.html>
>>>>
>>>>
>>>>   CI Bug Log - changes from CI_DRM_10688_full -> Patchwork_21264_full
>>>>
>>>>
>>>>     Summary
>>>>
>>>> *FAILURE*
>>>>
>>>> Serious unknown changes coming with Patchwork_21264_full absolutely 
>>>> need to be
>>>> verified manually.
>>>>
>>>> If you think the reported changes have nothing to do with the changes
>>>> introduced in Patchwork_21264_full, please notify your bug team to 
>>>> allow them
>>>> to document this new failure mode, which will reduce false 
>>>> positives in CI.
>>>>
>>>>
>>>>     Possible new issues
>>>>
>>>> Here are the unknown changes that may have been introduced in 
>>>> Patchwork_21264_full:
>>>>
>>>>
>>>>       IGT changes
>>>>
>>>>
>>>>         Possible regressions
>>>>
>>>>   *
>>>>
>>>>     igt@gem_sync@basic-many-each:
>>>>
>>>>       o shard-apl: NOTRUN -> INCOMPLETE
>>>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl7/igt@gem_sync@basic-many-each.html> 
>>>>
>>> Something still fishy in the unlocked iterator? Or 
>>> dma_resv_get_fences using it?
>>
>> Probably the later. I'm going to take a look.
>>
>> Thanks for the notice,
>> Christian.
>>
>>>
>>> <6> [187.551235] [IGT] gem_sync: starting subtest basic-many-each
>>> <1> [188.935462] BUG: kernel NULL pointer dereference, address: 
>>> 0000000000000010
>>> <1> [188.935485] #PF: supervisor write access in kernel mode
>>> <1> [188.935495] #PF: error_code(0x0002) - not-present page
>>> <6> [188.935504] PGD 0 P4D 0
>>> <4> [188.935512] Oops: 0002 [#1] PREEMPT SMP NOPTI
>>> <4> [188.935521] CPU: 2 PID: 1467 Comm: gem_sync Not tainted 
>>> 5.15.0-rc4-CI-Patchwork_21264+ #1
>>> <4> [188.935535] Hardware name:  /NUC6CAYB, BIOS 
>>> AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018
>>> <4> [188.935546] RIP: 0010:dma_resv_get_fences+0x116/0x2d0
>>> <4> [188.935560] Code: 10 85 c0 7f c9 be 03 00 00 00 e8 15 8b df ff 
>>> eb bd e8 8e c6 ff ff eb b6 41 8b 04 24 49 8b 55 00 48 89 e7 8d 48 01 
>>> 41 89 0c 24 <4c> 89 34 c2 e8 41 f2 ff ff 49 89 c6 48 85 c0 75 8c 48 
>>> 8b 44 24 10
>>> <4> [188.935583] RSP: 0018:ffffc900011dbcc8 EFLAGS: 00010202
>>> <4> [188.935593] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 
>>> 0000000000000001
>>> <4> [188.935603] RDX: 0000000000000010 RSI: ffffffff822e343c RDI: 
>>> ffffc900011dbcc8
>>> <4> [188.935613] RBP: ffffc900011dbd48 R08: ffff88812d255bb8 R09: 
>>> 00000000fffffffe
>>> <4> [188.935623] R10: 0000000000000001 R11: 0000000000000000 R12: 
>>> ffffc900011dbd44
>>> <4> [188.935633] R13: ffffc900011dbd50 R14: ffff888113d29cc0 R15: 
>>> 0000000000000000
>>> <4> [188.935643] FS:  00007f68d17e9700(0000) 
>>> GS:ffff888277900000(0000) knlGS:0000000000000000
>>> <4> [188.935655] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> <4> [188.935665] CR2: 0000000000000010 CR3: 000000012d0a4000 CR4: 
>>> 00000000003506e0
>>> <4> [188.935676] Call Trace:
>>> <4> [188.935685]  i915_gem_object_wait+0x1ff/0x410 [i915]
>>> <4> [188.935988]  i915_gem_wait_ioctl+0xf2/0x2a0 [i915]
>>> <4> [188.936272]  ? i915_gem_object_wait+0x410/0x410 [i915]
>>> <4> [188.936533]  drm_ioctl_kernel+0xae/0x140
>>> <4> [188.936546]  drm_ioctl+0x201/0x3d0
>>> <4> [188.936555]  ? i915_gem_object_wait+0x410/0x410 [i915]
>>> <4> [188.936820]  ? __fget_files+0xc2/0x1c0
>>> <4> [188.936830]  ? __fget_files+0xda/0x1c0
>>> <4> [188.936839]  __x64_sys_ioctl+0x6d/0xa0
>>> <4> [188.936848]  do_syscall_64+0x3a/0xb0
>>> <4> [188.936859] entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> FWIW if you disassemble the code it seems to be crashing in:
>
>   (*shared)[(*shared_count)++] = fence; // mov %r14, (%rdx, %rax, 8)
>
> RDX is *shared, RAX is *shared_count, RCX is *shared_count++ (for the
> next iteration). R13 is shared and R12 is shared_count.
>
> That *shared can contain 0000000000000010 makes no sense to me. At
> least not yet. :)

Yeah, me neither. I've gone over the whole code multiple times now and
absolutely don't get what's happening here.

Adding some more selftests didn't help either. As far as I can see the
code works as intended.

Do we have any other reports of crashes?

Thanks,
Christian.

>
> Regards,
>
> Tvrtko
Tvrtko Ursulin Oct. 7, 2021, 1:40 p.m. UTC | #6
On 07/10/2021 13:57, Christian König wrote:
> Am 07.10.21 um 12:51 schrieb Tvrtko Ursulin:
>>
>> On 07/10/2021 10:19, Christian König wrote:
>>> Am 07.10.21 um 11:15 schrieb Tvrtko Ursulin:
>>>> Hi,
>>>>
>>>> On 06/10/2021 16:26, Patchwork wrote:
>>>>> *Patch Details*
>>>>> *Series:*    series starting with [v7,1/8] drm/i915/gem: Break out 
>>>>> some shmem backend utils
>>>>> *URL:*    https://patchwork.freedesktop.org/series/95501/ 
>>>>> <https://patchwork.freedesktop.org/series/95501/>
>>>>> *State:*    failure
>>>>> *Details:* 
>>>>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.html 
>>>>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.html>
>>>>>
>>>>>
>>>>>   CI Bug Log - changes from CI_DRM_10688_full -> Patchwork_21264_full
>>>>>
>>>>>
>>>>>     Summary
>>>>>
>>>>> *FAILURE*
>>>>>
>>>>> Serious unknown changes coming with Patchwork_21264_full absolutely 
>>>>> need to be
>>>>> verified manually.
>>>>>
>>>>> If you think the reported changes have nothing to do with the changes
>>>>> introduced in Patchwork_21264_full, please notify your bug team to 
>>>>> allow them
>>>>> to document this new failure mode, which will reduce false 
>>>>> positives in CI.
>>>>>
>>>>>
>>>>>     Possible new issues
>>>>>
>>>>> Here are the unknown changes that may have been introduced in 
>>>>> Patchwork_21264_full:
>>>>>
>>>>>
>>>>>       IGT changes
>>>>>
>>>>>
>>>>>         Possible regressions
>>>>>
>>>>>   *
>>>>>
>>>>>     igt@gem_sync@basic-many-each:
>>>>>
>>>>>       o shard-apl: NOTRUN -> INCOMPLETE
>>>>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-apl7/igt@gem_sync@basic-many-each.html> 
>>>>>
>>>> Something still fishy in the unlocked iterator? Or 
>>>> dma_resv_get_fences using it?
>>>
>>> Probably the later. I'm going to take a look.
>>>
>>> Thanks for the notice,
>>> Christian.
>>>
>>>>
>>>> <6> [187.551235] [IGT] gem_sync: starting subtest basic-many-each
>>>> <1> [188.935462] BUG: kernel NULL pointer dereference, address: 
>>>> 0000000000000010
>>>> <1> [188.935485] #PF: supervisor write access in kernel mode
>>>> <1> [188.935495] #PF: error_code(0x0002) - not-present page
>>>> <6> [188.935504] PGD 0 P4D 0
>>>> <4> [188.935512] Oops: 0002 [#1] PREEMPT SMP NOPTI
>>>> <4> [188.935521] CPU: 2 PID: 1467 Comm: gem_sync Not tainted 
>>>> 5.15.0-rc4-CI-Patchwork_21264+ #1
>>>> <4> [188.935535] Hardware name:  /NUC6CAYB, BIOS 
>>>> AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018
>>>> <4> [188.935546] RIP: 0010:dma_resv_get_fences+0x116/0x2d0
>>>> <4> [188.935560] Code: 10 85 c0 7f c9 be 03 00 00 00 e8 15 8b df ff 
>>>> eb bd e8 8e c6 ff ff eb b6 41 8b 04 24 49 8b 55 00 48 89 e7 8d 48 01 
>>>> 41 89 0c 24 <4c> 89 34 c2 e8 41 f2 ff ff 49 89 c6 48 85 c0 75 8c 48 
>>>> 8b 44 24 10
>>>> <4> [188.935583] RSP: 0018:ffffc900011dbcc8 EFLAGS: 00010202
>>>> <4> [188.935593] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 
>>>> 0000000000000001
>>>> <4> [188.935603] RDX: 0000000000000010 RSI: ffffffff822e343c RDI: 
>>>> ffffc900011dbcc8
>>>> <4> [188.935613] RBP: ffffc900011dbd48 R08: ffff88812d255bb8 R09: 
>>>> 00000000fffffffe
>>>> <4> [188.935623] R10: 0000000000000001 R11: 0000000000000000 R12: 
>>>> ffffc900011dbd44
>>>> <4> [188.935633] R13: ffffc900011dbd50 R14: ffff888113d29cc0 R15: 
>>>> 0000000000000000
>>>> <4> [188.935643] FS:  00007f68d17e9700(0000) 
>>>> GS:ffff888277900000(0000) knlGS:0000000000000000
>>>> <4> [188.935655] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> <4> [188.935665] CR2: 0000000000000010 CR3: 000000012d0a4000 CR4: 
>>>> 00000000003506e0
>>>> <4> [188.935676] Call Trace:
>>>> <4> [188.935685]  i915_gem_object_wait+0x1ff/0x410 [i915]
>>>> <4> [188.935988]  i915_gem_wait_ioctl+0xf2/0x2a0 [i915]
>>>> <4> [188.936272]  ? i915_gem_object_wait+0x410/0x410 [i915]
>>>> <4> [188.936533]  drm_ioctl_kernel+0xae/0x140
>>>> <4> [188.936546]  drm_ioctl+0x201/0x3d0
>>>> <4> [188.936555]  ? i915_gem_object_wait+0x410/0x410 [i915]
>>>> <4> [188.936820]  ? __fget_files+0xc2/0x1c0
>>>> <4> [188.936830]  ? __fget_files+0xda/0x1c0
>>>> <4> [188.936839]  __x64_sys_ioctl+0x6d/0xa0
>>>> <4> [188.936848]  do_syscall_64+0x3a/0xb0
>>>> <4> [188.936859] entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> FWIW if you disassemble the code it seems to be crashing in:
>>
>>   (*shared)[(*shared_count)++] = fence; // mov %r14, (%rdx, %rax, 8)
>>
>> RDX is *shared, RAX is *shared_count, RCX is *shared_count++ (for the
>> next iteration). R13 is shared and R12 is shared_count.
>>
>> That *shared can contain 0000000000000010 makes no sense to me. At
>> least not yet. :)
> 
> Yeah, me neither. I've gone over the whole code multiple times now and
> absolutely don't get what's happening here.
>
> Adding some more selftests didn't help either. As far as I can see the
> code works as intended.
> 
> Do we have any other reports of crashes?

Yes, sporadic but present across different platforms since the change
went in:
https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_sync@basic-many-each.html.
So the issue is probably real.

Did not find any other tests failing with the same signature. Lakshmi,
are you perhaps able to search for the same or similar signature across
the whole set of recent results?

Regards,

Tvrtko
Vudum, Lakshminarayana Oct. 7, 2021, 3:18 p.m. UTC | #7
-----Original Message-----
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> 
Sent: Thursday, October 7, 2021 6:41 AM
To: Christian König <ckoenig.leichtzumerken@gmail.com>; intel-gfx@lists.freedesktop.org
Cc: Vudum, Lakshminarayana <lakshminarayana.vudum@intel.com>
Subject: Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [v7,1/8] drm/i915/gem: Break out some shmem backend utils


On 07/10/2021 13:57, Christian König wrote:
> Am 07.10.21 um 12:51 schrieb Tvrtko Ursulin:
>>
>> On 07/10/2021 10:19, Christian König wrote:
>>> Am 07.10.21 um 11:15 schrieb Tvrtko Ursulin:
>>>> Hi,
>>>>
>>>> On 06/10/2021 16:26, Patchwork wrote:
>>>>> *Patch Details*
>>>>> *Series:*    series starting with [v7,1/8] drm/i915/gem: Break out 
>>>>> some shmem backend utils
>>>>> *URL:*    https://patchwork.freedesktop.org/series/95501/
>>>>> <https://patchwork.freedesktop.org/series/95501/>
>>>>> *State:*    failure
>>>>> *Details:*
>>>>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.htm
>>>>> l 
>>>>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/index.ht
>>>>> ml>
>>>>>
>>>>>
>>>>>   CI Bug Log - changes from CI_DRM_10688_full -> 
>>>>> Patchwork_21264_full
>>>>>
>>>>>
>>>>>     Summary
>>>>>
>>>>> *FAILURE*
>>>>>
>>>>> Serious unknown changes coming with Patchwork_21264_full 
>>>>> absolutely need to be verified manually.
>>>>>
>>>>> If you think the reported changes have nothing to do with the 
>>>>> changes introduced in Patchwork_21264_full, please notify your bug 
>>>>> team to allow them to document this new failure mode, which will 
>>>>> reduce false positives in CI.
>>>>>
>>>>>
>>>>>     Possible new issues
>>>>>
>>>>> Here are the unknown changes that may have been introduced in
>>>>> Patchwork_21264_full:
>>>>>
>>>>>
>>>>>       IGT changes
>>>>>
>>>>>
>>>>>         Possible regressions
>>>>>
>>>>>   *
>>>>>
>>>>>     igt@gem_sync@basic-many-each:
>>>>>
>>>>>       o shard-apl: NOTRUN -> INCOMPLETE 
>>>>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21264/shard-ap
>>>>> l7/igt@gem_sync@basic-many-each.html>
>>>>>
>>>> Something still fishy in the unlocked iterator? Or 
>>>> dma_resv_get_fences using it?
>>>
>>> Probably the later. I'm going to take a look.
>>>
>>> Thanks for the notice,
>>> Christian.
>>>
>>>>
>>>> <6> [187.551235] [IGT] gem_sync: starting subtest basic-many-each 
>>>> <1> [188.935462] BUG: kernel NULL pointer dereference, address:
>>>> 0000000000000010
>>>> <1> [188.935485] #PF: supervisor write access in kernel mode <1> 
>>>> [188.935495] #PF: error_code(0x0002) - not-present page <6> 
>>>> [188.935504] PGD 0 P4D 0 <4> [188.935512] Oops: 0002 [#1] PREEMPT 
>>>> SMP NOPTI <4> [188.935521] CPU: 2 PID: 1467 Comm: gem_sync Not 
>>>> tainted 5.15.0-rc4-CI-Patchwork_21264+ #1 <4> [188.935535] Hardware 
>>>> name:  /NUC6CAYB, BIOS
>>>> AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018 <4> [188.935546] RIP: 
>>>> 0010:dma_resv_get_fences+0x116/0x2d0
>>>> <4> [188.935560] Code: 10 85 c0 7f c9 be 03 00 00 00 e8 15 8b df ff 
>>>> eb bd e8 8e c6 ff ff eb b6 41 8b 04 24 49 8b 55 00 48 89 e7 8d 48 
>>>> 01
>>>> 41 89 0c 24 <4c> 89 34 c2 e8 41 f2 ff ff 49 89 c6 48 85 c0 75 8c 48 
>>>> 8b 44 24 10 <4> [188.935583] RSP: 0018:ffffc900011dbcc8 EFLAGS: 
>>>> 00010202 <4> [188.935593] RAX: 0000000000000000 RBX: 
>>>> 00000000ffffffff RCX:
>>>> 0000000000000001
>>>> <4> [188.935603] RDX: 0000000000000010 RSI: ffffffff822e343c RDI: 
>>>> ffffc900011dbcc8
>>>> <4> [188.935613] RBP: ffffc900011dbd48 R08: ffff88812d255bb8 R09: 
>>>> 00000000fffffffe
>>>> <4> [188.935623] R10: 0000000000000001 R11: 0000000000000000 R12: 
>>>> ffffc900011dbd44
>>>> <4> [188.935633] R13: ffffc900011dbd50 R14: ffff888113d29cc0 R15: 
>>>> 0000000000000000
>>>> <4> [188.935643] FS:  00007f68d17e9700(0000)
>>>> GS:ffff888277900000(0000) knlGS:0000000000000000 <4> [188.935655] 
>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [188.935665] 
>>>> CR2: 0000000000000010 CR3: 000000012d0a4000 CR4:
>>>> 00000000003506e0
>>>> <4> [188.935676] Call Trace:
>>>> <4> [188.935685]  i915_gem_object_wait+0x1ff/0x410 [i915] <4> 
>>>> [188.935988]  i915_gem_wait_ioctl+0xf2/0x2a0 [i915] <4> 
>>>> [188.936272]  ? i915_gem_object_wait+0x410/0x410 [i915] <4> 
>>>> [188.936533]  drm_ioctl_kernel+0xae/0x140 <4> [188.936546]  
>>>> drm_ioctl+0x201/0x3d0 <4> [188.936555]  ? 
>>>> i915_gem_object_wait+0x410/0x410 [i915] <4> [188.936820]  ? 
>>>> __fget_files+0xc2/0x1c0 <4> [188.936830]  ? __fget_files+0xda/0x1c0 
>>>> <4> [188.936839]  __x64_sys_ioctl+0x6d/0xa0 <4> [188.936848]  
>>>> do_syscall_64+0x3a/0xb0 <4> [188.936859] 
>>>> entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> FWIW if you disassemble the code it seems to be crashing in:
>>
>>   (*shared)[(*shared_count)++] = fence; // mov %r14, (%rdx, %rax, 8)
>>
>> RDX is *shared, RAX is *shared_count, RCX is the incremented 
>> *shared_count (for the next iteration). R13 is shared and R12 is 
>> shared_count.
>>
>> That *shared can contain 0000000000000010 makes no sense to me. At 
>> least not yet. :)
> 
> Yeah, me neither. I've gone over the whole code multiple times now and 
> absolutely don't get what's happening here.
> 
> Adding some more selftests didn't help either. As far as I can see 
> the code works as intended.
> 
> Do we have any other reports of crashes?

Yes, sporadic but present across different platforms since the change went in: 
https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_sync@basic-many-each.html. 
So the issue is probably real.

I did not find any other tests failing with the same signature. Lakshmi, are you perhaps able to search for the same or a similar signature across the whole set of recent results?

[Lakshmi] Both regression failures are new. I filed the issues below and reported them.
https://gitlab.freedesktop.org/drm/intel/-/issues/4275
igt@i915_pm_dc@dc9-dpms - fail - Failed assertion: dc9_wait_entry(data->debugfs_fd, dc_target, prev_dc, 3000), DC9 state is not achieved

https://gitlab.freedesktop.org/drm/intel/-/issues/4274
igt@gem_sync@basic-many-each - incomplete - RIP: 0010:dma_resv_get_fences

Regards,

Tvrtko
Tvrtko Ursulin Oct. 7, 2021, 3:53 p.m. UTC | #8
On 07/10/2021 16:18, Vudum, Lakshminarayana wrote:
> -----Original Message-----
> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Sent: Thursday, October 7, 2021 6:41 AM
> To: Christian König <ckoenig.leichtzumerken@gmail.com>; intel-gfx@lists.freedesktop.org
> Cc: Vudum, Lakshminarayana <lakshminarayana.vudum@intel.com>
> Subject: Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [v7,1/8] drm/i915/gem: Break out some shmem backend utils
> 
> 
> On 07/10/2021 13:57, Christian König wrote:
>> Am 07.10.21 um 12:51 schrieb Tvrtko Ursulin:
>>>
>>> FWIW if you disassemble the code it seems to be crashing in:
>>>
>>>    (*shared)[(*shared_count)++] = fence; // mov %r14, (%rdx, %rax, 8)
>>>
>>> RDX is *shared, RAX is *shared_count, RCX is *shared_count++ (for the
>>> next iteration. R13 is share and R12 shared_count.
>>>
>>> That *shared can contain 0000000000000010 makes no sense to me. At
>>> least yet. :)
>>
>> Yeah, me neither. I've gone over the whole code multiple time now and
>> absolutely don't get what's happening here.
>>
>> Adding some more selftests didn't helped either. As far as I can see
>> the code works as intended.
>>
>> Do we have any other reports of crashes?
> 
> Yes, sporadic but present across different platforms since the change went it:
> https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_sync@basic-many-each.html.
> So issue is probably real.
> 
> Did not find any other tests failing with the same signature. Lakshmi are you perhaps able to search for the same or similar signature across the whole set of recent results?
> 
> [Lakshmi] Both the regressions failures are new. I filed below issues and reported.


Thanks Lakshmi!

Christian, maybe revert for now since it looks tricky to figure out? I 
at least couldn't spend much time looking at it today. Or try to find a 
third set of eyes to look at it quickly in case we are both missing something.

Looks like a good selftest will be needed here for robustness, including 
threads to trigger restarts and external manipulation to hit the 
refcount-zero path.
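
To illustrate the shape such a test could take, here is a rough userspace 
model -- not the kernel selftest itself; fence_list, mutator and collector 
are made-up names. Mutator threads keep resizing a shared list, regularly 
dropping it to zero, while a collector sizes a buffer from one read of the 
count and fills it from a later one, counting how often the fill would have 
outgrown the allocation:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FENCES 8
#define ITERATIONS 200000

struct fence_list {
	pthread_mutex_t lock;
	unsigned int count;
	void *fences[MAX_FENCES];
};

static struct fence_list list = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* External manipulation: keep resizing the list, regularly down to zero. */
static void *mutator(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&list.lock);
		list.count = i % (MAX_FENCES + 1);
		for (unsigned int j = 0; j < list.count; j++)
			list.fences[j] = &list;		/* dummy payload */
		pthread_mutex_unlock(&list.lock);
	}
	return NULL;
}

/* Collector: size from one read of the count, fill from a later one. */
static void *collector(void *arg)
{
	unsigned long mismatches = 0;

	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&list.lock);
		unsigned int allocated = list.count;
		pthread_mutex_unlock(&list.lock);

		void **buf = malloc((allocated + 1) * sizeof(*buf));

		pthread_mutex_lock(&list.lock);
		unsigned int filled = list.count;
		for (unsigned int j = 0; j < filled && j < allocated; j++)
			buf[j] = list.fences[j];
		pthread_mutex_unlock(&list.lock);

		if (filled > allocated)	/* what the selftest would assert on */
			mismatches++;
		free(buf);
	}
	printf("size/fill mismatches seen: %lu\n", mismatches);
	return NULL;
}

int main(void)
{
	pthread_t t[3];

	pthread_create(&t[0], NULL, mutator, NULL);
	pthread_create(&t[1], NULL, mutator, NULL);
	pthread_create(&t[2], NULL, collector, NULL);
	for (int i = 0; i < 3; i++)
		pthread_join(t[i], NULL);
	return 0;
}

The real selftest would of course drive dma_resv and its iterator restart 
path directly, but the invariant to assert on is the same: the fill must 
never produce more entries than the sizing pass made room for.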

Regards,

Tvrtko

> https://gitlab.freedesktop.org/drm/intel/-/issues/4275
> igt@i915_pm_dc@dc9-dpms - fail - Failed assertion: dc9_wait_entry(data->debugfs_fd, dc_target, prev_dc, 3000), DC9 state is not achieved
> 
> https://gitlab.freedesktop.org/drm/intel/-/issues/4274
> igt@gem_sync@basic-many-each - incomplete - RIP: 0010:dma_resv_get_fences
> 
> Regards,
> 
> Tvrtko
>
Christian König Oct. 7, 2021, 6:18 p.m. UTC | #9
Am 07.10.21 um 17:53 schrieb Tvrtko Ursulin:
>
> On 07/10/2021 16:18, Vudum, Lakshminarayana wrote:
>> -----Original Message-----
>> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> Sent: Thursday, October 7, 2021 6:41 AM
>> To: Christian König <ckoenig.leichtzumerken@gmail.com>; 
>> intel-gfx@lists.freedesktop.org
>> Cc: Vudum, Lakshminarayana <lakshminarayana.vudum@intel.com>
>> Subject: Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting 
>> with [v7,1/8] drm/i915/gem: Break out some shmem backend utils
>>
>>
>> On 07/10/2021 13:57, Christian König wrote:
>>> Am 07.10.21 um 12:51 schrieb Tvrtko Ursulin:
>>>>
>>>> FWIW if you disassemble the code it seems to be crashing in:
>>>>
>>>>    (*shared)[(*shared_count)++] = fence; // mov %r14, (%rdx, %rax, 8)
>>>>
>>>> RDX is *shared, RAX is *shared_count, RCX is *shared_count++ (for the
>>>> next iteration. R13 is share and R12 shared_count.
>>>>
>>>> That *shared can contain 0000000000000010 makes no sense to me. At
>>>> least yet. :)
>>>
>>> Yeah, me neither. I've gone over the whole code multiple time now and
>>> absolutely don't get what's happening here.
>>>
>>> Adding some more selftests didn't helped either. As far as I can see
>>> the code works as intended.
>>>
>>> Do we have any other reports of crashes?
>>
>> Yes, sporadic but present across different platforms since the change 
>> went it:
>> https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_sync@basic-many-each.html. 
>>
>> So issue is probably real.
>>
>> Did not find any other tests failing with the same signature. Lakshmi 
>> are you perhaps able to search for the same or similar signature 
>> across the whole set of recent results?
>>
>> [Lakshmi] Both the regressions failures are new. I filed below issues 
>> and reported.
>
>
> Thanks Lakshmi!
>
> Christian, maybe revert for now since it looks tricky to figure out? I 
> at least couldn't spent much time looking at it today. Or try to find 
> a third set of eyes to look at it quickly in case we are not seeing 
> something.
>
> Looks like a good selftest will be needed here for robustness. 
> Including threads to trigger restarts and external manipulation to hit 
> the refcount zero.

Yeah, agree. Already working on that.

Going to send out the revert for dma_resv_get_fences() tomorrow.

Christian.

>
> Regards,
>
> Tvrtko
>
>> https://gitlab.freedesktop.org/drm/intel/-/issues/4275
>> igt@i915_pm_dc@dc9-dpms - fail - Failed assertion: 
>> dc9_wait_entry(data->debugfs_fd, dc_target, prev_dc, 3000), DC9 state 
>> is not achieved
>>
>> https://gitlab.freedesktop.org/drm/intel/-/issues/4274
>> igt@gem_sync@basic-many-each - incomplete - RIP: 
>> 0010:dma_resv_get_fences
>>
>> Regards,
>>
>> Tvrtko
>>
Tvrtko Ursulin Oct. 8, 2021, 9:17 a.m. UTC | #10
On 07/10/2021 19:18, Christian König wrote:
> Am 07.10.21 um 17:53 schrieb Tvrtko Ursulin:
>>
>> On 07/10/2021 16:18, Vudum, Lakshminarayana wrote:
>>> -----Original Message-----
>>> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>> Sent: Thursday, October 7, 2021 6:41 AM
>>> To: Christian König <ckoenig.leichtzumerken@gmail.com>; 
>>> intel-gfx@lists.freedesktop.org
>>> Cc: Vudum, Lakshminarayana <lakshminarayana.vudum@intel.com>
>>> Subject: Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting 
>>> with [v7,1/8] drm/i915/gem: Break out some shmem backend utils
>>>
>>>
>>> On 07/10/2021 13:57, Christian König wrote:
>>>> Am 07.10.21 um 12:51 schrieb Tvrtko Ursulin:
>>>>>
>>>>> FWIW if you disassemble the code it seems to be crashing in:
>>>>>
>>>>>    (*shared)[(*shared_count)++] = fence; // mov %r14, (%rdx, %rax, 8)
>>>>>
>>>>> RDX is *shared, RAX is *shared_count, RCX is *shared_count++ (for the
>>>>> next iteration. R13 is share and R12 shared_count.
>>>>>
>>>>> That *shared can contain 0000000000000010 makes no sense to me. At
>>>>> least yet. :)
>>>>
>>>> Yeah, me neither. I've gone over the whole code multiple time now and
>>>> absolutely don't get what's happening here.
>>>>
>>>> Adding some more selftests didn't helped either. As far as I can see
>>>> the code works as intended.
>>>>
>>>> Do we have any other reports of crashes?
>>>
>>> Yes, sporadic but present across different platforms since the change 
>>> went it:
>>> https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_sync@basic-many-each.html. 
>>>
>>> So issue is probably real.
>>>
>>> Did not find any other tests failing with the same signature. Lakshmi 
>>> are you perhaps able to search for the same or similar signature 
>>> across the whole set of recent results?
>>>
>>> [Lakshmi] Both the regressions failures are new. I filed below issues 
>>> and reported.
>>
>>
>> Thanks Lakshmi!
>>
>> Christian, maybe revert for now since it looks tricky to figure out? I 
>> at least couldn't spent much time looking at it today. Or try to find 
>> a third set of eyes to look at it quickly in case we are not seeing 
>> something.
>>
>> Looks like a good selftest will be needed here for robustness. 
>> Including threads to trigger restarts and external manipulation to hit 
>> the refcount zero.
> 
> Yeah, agree. Already working on that.
> 
> Going to send out the revert for dma_resv_get_fences() tomorrow.

Looks like the issue is actually in the unlocked iterator.

What happens in practice when it crashes is that the fence count in the 
shared fences object is zero, which means no space gets allocated in 
dma_resv_get_fences. But clearly shared_count was not zero in 
dma_resv_iter_walk_unlocked, otherwise the loop in dma_resv_get_fences 
wouldn't run.

I suspect it is not safe to drop the RCU lock after peeking at the 
dma_resv_shared_list.
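
As a stand-alone illustration of that shape (this is not the dma-resv code; 
the struct and field names here are invented), sizing the array from one 
read of the count and filling it from a later walk goes wrong as soon as 
the list changes in between:

#include <stdio.h>
#include <stdlib.h>

struct obj {
	unsigned int shared_count;
	void *shared[4];
};

int main(void)
{
	struct obj o = { .shared_count = 0 };

	/* Sizing pass: the count is read once and the array sized from it. */
	unsigned int allocated = o.shared_count;
	void **buf = allocated ? malloc(allocated * sizeof(*buf)) : NULL;

	/* Meanwhile another thread adds a fence (simulated inline here). */
	o.shared[o.shared_count++] = &o;

	/* Filling pass: the walk now yields more entries than were allocated. */
	unsigned int seen = o.shared_count;
	if (seen > allocated)
		printf("walk yields %u fence(s) but only %u slot(s) were allocated\n",
		       seen, allocated);

	free(buf);	/* free(NULL) is a no-op */
	return 0;
}

That matches what the oops suggests: the sizing step saw zero shared 
fences, so no room was set aside, yet the later walk still produced a 
fence to store.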

Regards,

Tvrtko
Tvrtko Ursulin Oct. 8, 2021, 9:22 a.m. UTC | #11
On 08/10/2021 10:17, Tvrtko Ursulin wrote:
> 
> On 07/10/2021 19:18, Christian König wrote:
>> Am 07.10.21 um 17:53 schrieb Tvrtko Ursulin:
>>>
>>> On 07/10/2021 16:18, Vudum, Lakshminarayana wrote:
>>>> -----Original Message-----
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>>> Sent: Thursday, October 7, 2021 6:41 AM
>>>> To: Christian König <ckoenig.leichtzumerken@gmail.com>; 
>>>> intel-gfx@lists.freedesktop.org
>>>> Cc: Vudum, Lakshminarayana <lakshminarayana.vudum@intel.com>
>>>> Subject: Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting 
>>>> with [v7,1/8] drm/i915/gem: Break out some shmem backend utils
>>>>
>>>>
>>>> On 07/10/2021 13:57, Christian König wrote:
>>>>> Am 07.10.21 um 12:51 schrieb Tvrtko Ursulin:
>>>>>>
>>>>>> FWIW if you disassemble the code it seems to be crashing in:
>>>>>>
>>>>>>    (*shared)[(*shared_count)++] = fence; // mov %r14, (%rdx, %rax, 8)
>>>>>>
>>>>>> RDX is *shared, RAX is *shared_count, RCX is *shared_count++ (for the
>>>>>> next iteration. R13 is share and R12 shared_count.
>>>>>>
>>>>>> That *shared can contain 0000000000000010 makes no sense to me. At
>>>>>> least yet. :)
>>>>>
>>>>> Yeah, me neither. I've gone over the whole code multiple time now and
>>>>> absolutely don't get what's happening here.
>>>>>
>>>>> Adding some more selftests didn't helped either. As far as I can see
>>>>> the code works as intended.
>>>>>
>>>>> Do we have any other reports of crashes?
>>>>
>>>> Yes, sporadic but present across different platforms since the 
>>>> change went it:
>>>> https://intel-gfx-ci.01.org/tree/drm-tip/igt@gem_sync@basic-many-each.html. 
>>>>
>>>> So issue is probably real.
>>>>
>>>> Did not find any other tests failing with the same signature. 
>>>> Lakshmi are you perhaps able to search for the same or similar 
>>>> signature across the whole set of recent results?
>>>>
>>>> [Lakshmi] Both the regressions failures are new. I filed below 
>>>> issues and reported.
>>>
>>>
>>> Thanks Lakshmi!
>>>
>>> Christian, maybe revert for now since it looks tricky to figure out? 
>>> I at least couldn't spent much time looking at it today. Or try to 
>>> find a third set of eyes to look at it quickly in case we are not 
>>> seeing something.
>>>
>>> Looks like a good selftest will be needed here for robustness. 
>>> Including threads to trigger restarts and external manipulation to 
>>> hit the refcount zero.
>>
>> Yeah, agree. Already working on that.
>>
>> Going to send out the revert for dma_resv_get_fences() tomorrow.
> 
> Looks like the issue is actually in the unlocked iterator.
> 
> What happens in practice when it crashes is that the fence count in the 
> shared fences object is zero, which means no space gets allocated in 
> dma_resv_get_fences. But clearly shared_count was not zero in 
> dma_resv_iter_walk_unlocked, otherwise the loop in dma_resv_get_fences 
> wouldn't run.
> 
> I suspect it is not safe to drop the RCU lock having peeking at the 
> dma_resv_shared_list.

It may work to cache cursor.fences->shared_count into 
cursor.shared_count at restart time, so dma_resv_get_fences could use it 
to guarantee a consistent view and allocate the space correctly. Then 
dma_resv_iter_next_unlocked would notice the restart and cause an unwind.
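
Something along these lines, only as a userspace model of the pattern and 
not a proposed patch -- the generation counter stands in for however the 
iterator detects a restart, and all names are invented:

#include <stdio.h>
#include <stdlib.h>

struct obj {
	unsigned int generation;	/* bumped on every change to the list */
	unsigned int shared_count;
	void *shared[8];
};

/* Collect the fences using one consistent snapshot of the count. */
static void **collect(struct obj *o, unsigned int *out_count)
{
	void **buf;
	unsigned int snap_gen, snap_count, i;

retry:
	snap_gen = o->generation;
	snap_count = o->shared_count;	/* cached, like cursor.shared_count */

	buf = calloc(snap_count ? snap_count : 1, sizeof(*buf));
	if (!buf)
		return NULL;

	for (i = 0; i < snap_count; i++) {
		if (o->generation != snap_gen) {
			/* A concurrent change was noticed: unwind and redo. */
			free(buf);
			goto retry;
		}
		buf[i] = o->shared[i];	/* never writes past the snapshot size */
	}

	*out_count = snap_count;
	return buf;
}

int main(void)
{
	struct obj o = { .shared_count = 2, .shared = { &o, &o } };
	unsigned int n = 0;
	void **fences = collect(&o, &n);

	printf("collected %u fence(s)\n", n);
	free(fences);
	return 0;
}

The key point is that the count used to size the allocation and the bound 
on the fill loop come from the same snapshot, and any change noticed during 
the walk unwinds the partial result instead of writing past the end.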

Regards,

Tvrtko
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 11f072193f3b..36b711ae9e28 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -25,46 +25,61 @@  static void check_release_pagevec(struct pagevec *pvec)
 	cond_resched();
 }
 
-static int shmem_get_pages(struct drm_i915_gem_object *obj)
+static void shmem_free_st(struct sg_table *st, struct address_space *mapping,
+			  bool dirty, bool backup)
 {
-	struct drm_i915_private *i915 = to_i915(obj->base.dev);
-	struct intel_memory_region *mem = obj->mm.region;
-	const unsigned long page_count = obj->base.size / PAGE_SIZE;
+	struct sgt_iter sgt_iter;
+	struct pagevec pvec;
+	struct page *page;
+
+	mapping_clear_unevictable(mapping);
+
+	pagevec_init(&pvec);
+	for_each_sgt_page(page, sgt_iter, st) {
+		if (dirty)
+			set_page_dirty(page);
+
+		if (backup)
+			mark_page_accessed(page);
+
+		if (!pagevec_add(&pvec, page))
+			check_release_pagevec(&pvec);
+	}
+	if (pagevec_count(&pvec))
+		check_release_pagevec(&pvec);
+
+	sg_free_table(st);
+	kfree(st);
+}
+
+static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915,
+				       size_t size, struct intel_memory_region *mr,
+				       struct address_space *mapping,
+				       unsigned int max_segment)
+{
+	const unsigned long page_count = size / PAGE_SIZE;
 	unsigned long i;
-	struct address_space *mapping;
 	struct sg_table *st;
 	struct scatterlist *sg;
-	struct sgt_iter sgt_iter;
 	struct page *page;
 	unsigned long last_pfn = 0;	/* suppress gcc warning */
-	unsigned int max_segment = i915_sg_segment_size();
-	unsigned int sg_page_sizes;
 	gfp_t noreclaim;
 	int ret;
 
-	/*
-	 * Assert that the object is not currently in any GPU domain. As it
-	 * wasn't in the GTT, there shouldn't be any way it could have been in
-	 * a GPU cache
-	 */
-	GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS);
-	GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS);
-
 	/*
 	 * If there's no chance of allocating enough pages for the whole
 	 * object, bail early.
 	 */
-	if (obj->base.size > resource_size(&mem->region))
-		return -ENOMEM;
+	if (size > resource_size(&mr->region))
+		return ERR_PTR(-ENOMEM);
 
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (!st)
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 
-rebuild_st:
 	if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
 		kfree(st);
-		return -ENOMEM;
+		return ERR_PTR(-ENOMEM);
 	}
 
 	/*
@@ -73,14 +88,12 @@  static int shmem_get_pages(struct drm_i915_gem_object *obj)
 	 *
 	 * Fail silently without starting the shrinker
 	 */
-	mapping = obj->base.filp->f_mapping;
 	mapping_set_unevictable(mapping);
 	noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM);
 	noreclaim |= __GFP_NORETRY | __GFP_NOWARN;
 
 	sg = st->sgl;
 	st->nents = 0;
-	sg_page_sizes = 0;
 	for (i = 0; i < page_count; i++) {
 		const unsigned int shrink[] = {
 			I915_SHRINK_BOUND | I915_SHRINK_UNBOUND,
@@ -135,10 +148,9 @@  static int shmem_get_pages(struct drm_i915_gem_object *obj)
 		if (!i ||
 		    sg->length >= max_segment ||
 		    page_to_pfn(page) != last_pfn + 1) {
-			if (i) {
-				sg_page_sizes |= sg->length;
+			if (i)
 				sg = sg_next(sg);
-			}
+
 			st->nents++;
 			sg_set_page(sg, page, PAGE_SIZE, 0);
 		} else {
@@ -149,14 +161,65 @@  static int shmem_get_pages(struct drm_i915_gem_object *obj)
 		/* Check that the i965g/gm workaround works. */
 		GEM_BUG_ON(gfp & __GFP_DMA32 && last_pfn >= 0x00100000UL);
 	}
-	if (sg) { /* loop terminated early; short sg table */
-		sg_page_sizes |= sg->length;
+	if (sg) /* loop terminated early; short sg table */
 		sg_mark_end(sg);
-	}
 
 	/* Trim unused sg entries to avoid wasting memory. */
 	i915_sg_trim(st);
 
+	return st;
+err_sg:
+	sg_mark_end(sg);
+	if (sg != st->sgl) {
+		shmem_free_st(st, mapping, false, false);
+	} else {
+		mapping_clear_unevictable(mapping);
+		sg_free_table(st);
+		kfree(st);
+	}
+
+	/*
+	 * shmemfs first checks if there is enough memory to allocate the page
+	 * and reports ENOSPC should there be insufficient, along with the usual
+	 * ENOMEM for a genuine allocation failure.
+	 *
+	 * We use ENOSPC in our driver to mean that we have run out of aperture
+	 * space and so want to translate the error from shmemfs back to our
+	 * usual understanding of ENOMEM.
+	 */
+	if (ret == -ENOSPC)
+		ret = -ENOMEM;
+
+	return ERR_PTR(ret);
+}
+
+static int shmem_get_pages(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct intel_memory_region *mem = obj->mm.region;
+	struct address_space *mapping = obj->base.filp->f_mapping;
+	const unsigned long page_count = obj->base.size / PAGE_SIZE;
+	unsigned int max_segment = i915_sg_segment_size();
+	struct sg_table *st;
+	struct sgt_iter sgt_iter;
+	struct page *page;
+	int ret;
+
+	/*
+	 * Assert that the object is not currently in any GPU domain. As it
+	 * wasn't in the GTT, there shouldn't be any way it could have been in
+	 * a GPU cache
+	 */
+	GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS);
+	GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS);
+
+rebuild_st:
+	st = shmem_alloc_st(i915, obj->base.size, mem, mapping, max_segment);
+	if (IS_ERR(st)) {
+		ret = PTR_ERR(st);
+		goto err_st;
+	}
+
 	ret = i915_gem_gtt_prepare_pages(obj, st);
 	if (ret) {
 		/*
@@ -168,6 +231,7 @@  static int shmem_get_pages(struct drm_i915_gem_object *obj)
 			for_each_sgt_page(page, sgt_iter, st)
 				put_page(page);
 			sg_free_table(st);
+			kfree(st);
 
 			max_segment = PAGE_SIZE;
 			goto rebuild_st;
@@ -200,28 +264,12 @@  static int shmem_get_pages(struct drm_i915_gem_object *obj)
 	if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER)
 		obj->cache_dirty = true;
 
-	__i915_gem_object_set_pages(obj, st, sg_page_sizes);
+	__i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl));
 
 	return 0;
 
-err_sg:
-	sg_mark_end(sg);
 err_pages:
-	mapping_clear_unevictable(mapping);
-	if (sg != st->sgl) {
-		struct pagevec pvec;
-
-		pagevec_init(&pvec);
-		for_each_sgt_page(page, sgt_iter, st) {
-			if (!pagevec_add(&pvec, page))
-				check_release_pagevec(&pvec);
-		}
-		if (pagevec_count(&pvec))
-			check_release_pagevec(&pvec);
-	}
-	sg_free_table(st);
-	kfree(st);
-
+	shmem_free_st(st, mapping, false, false);
 	/*
 	 * shmemfs first checks if there is enough memory to allocate the page
 	 * and reports ENOSPC should there be insufficient, along with the usual
@@ -231,6 +279,7 @@  static int shmem_get_pages(struct drm_i915_gem_object *obj)
 	 * space and so want to translate the error from shmemfs back to our
 	 * usual understanding of ENOMEM.
 	 */
+err_st:
 	if (ret == -ENOSPC)
 		ret = -ENOMEM;
 
@@ -251,10 +300,8 @@  shmem_truncate(struct drm_i915_gem_object *obj)
 	obj->mm.pages = ERR_PTR(-EFAULT);
 }
 
-static void
-shmem_writeback(struct drm_i915_gem_object *obj)
+static void __shmem_writeback(size_t size, struct address_space *mapping)
 {
-	struct address_space *mapping;
 	struct writeback_control wbc = {
 		.sync_mode = WB_SYNC_NONE,
 		.nr_to_write = SWAP_CLUSTER_MAX,
@@ -270,10 +317,9 @@  shmem_writeback(struct drm_i915_gem_object *obj)
 	 * instead of invoking writeback so they are aged and paged out
 	 * as normal.
 	 */
-	mapping = obj->base.filp->f_mapping;
 
 	/* Begin writeback on each dirty page */
-	for (i = 0; i < obj->base.size >> PAGE_SHIFT; i++) {
+	for (i = 0; i < size >> PAGE_SHIFT; i++) {
 		struct page *page;
 
 		page = find_lock_page(mapping, i);
@@ -296,6 +342,12 @@  shmem_writeback(struct drm_i915_gem_object *obj)
 	}
 }
 
+static void
+shmem_writeback(struct drm_i915_gem_object *obj)
+{
+	__shmem_writeback(obj->base.size, obj->base.filp->f_mapping);
+}
+
 void
 __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
 				struct sg_table *pages,
@@ -316,11 +368,6 @@  __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
 
 void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages)
 {
-	struct sgt_iter sgt_iter;
-	struct pagevec pvec;
-	struct page *page;
-
-	GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev)));
 	__i915_gem_object_release_shmem(obj, pages, true);
 
 	i915_gem_gtt_finish_pages(obj, pages);
@@ -328,25 +375,9 @@  void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_save_bit_17_swizzle(obj, pages);
 
-	mapping_clear_unevictable(file_inode(obj->base.filp)->i_mapping);
-
-	pagevec_init(&pvec);
-	for_each_sgt_page(page, sgt_iter, pages) {
-		if (obj->mm.dirty)
-			set_page_dirty(page);
-
-		if (obj->mm.madv == I915_MADV_WILLNEED)
-			mark_page_accessed(page);
-
-		if (!pagevec_add(&pvec, page))
-			check_release_pagevec(&pvec);
-	}
-	if (pagevec_count(&pvec))
-		check_release_pagevec(&pvec);
+	shmem_free_st(pages, file_inode(obj->base.filp)->i_mapping,
+		      obj->mm.dirty, obj->mm.madv == I915_MADV_WILLNEED);
 	obj->mm.dirty = false;
-
-	sg_free_table(pages);
-	kfree(pages);
 }
 
 static void