✗ Fi.CI.IGT: failure for series starting with [v8,01/12] drm/i915: Park before resetting the submission backend

Message ID 7b39371f-fe0f-ae76-d8f5-812931ff47ae@intel.com
State New

Commit Message

sagar.a.kamble@intel.com April 10, 2018, 5:32 a.m. UTC
On 4/9/2018 9:02 PM, Michal Wajdeczko wrote:
> On Mon, 09 Apr 2018 17:09:18 +0200, Patchwork 
> <patchwork@emeril.freedesktop.org> wrote:
>
>> == Series Details ==
>>
>> Series: series starting with [v8,01/12] drm/i915: Park before 
>> resetting the submission backend
>> URL   : https://patchwork.freedesktop.org/series/41365/
>> State : failure
>>
>> == Summary ==
>>
>> ---- Possible new issues:
>
> two variants:
>
>>
>> Test drm_mm:
>>         Subgroup sanitycheck:
>>                 pass       -> INCOMPLETE (shard-apl)
>
> #1
>
> <0>[  400.245461] drv_self-5775    1.... 400208508us : 
> intel_guc_submission_disable: intel_guc_submission_disable:1255 
> GEM_BUG_ON(dev_priv->gt.awake)
>
> <4>[  400.245871] Call Trace:
> <4>[  400.245959]  intel_uc_fini_hw+0x4b/0xe0 [i915]
> <4>[  400.246047]  i915_gem_fini_hw+0x16/0x30 [i915]
> <4>[  400.246129]  i915_reset+0x1e8/0x2b0 [i915]
> <4>[  400.246222]  igt_global_reset+0x38/0xe0 [i915]
>
If the i915_reset path is invoked without gem_set_wedged, we can hit this issue.
igt_global_reset and the gem_eio resets invoke i915_handle_error/i915_reset
directly, so I think we should fix the IGTs.
>> Test drv_hangman:
>>         Subgroup error-state-capture-blt:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup error-state-capture-bsd:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup error-state-capture-render:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup error-state-capture-vebox:
>>                 pass       -> INCOMPLETE (shard-apl)
>> Test drv_selftest:
>>         Subgroup live_guc:
>>                 pass       -> SKIP       (shard-apl)
>>         Subgroup live_hangcheck:
>>                 pass       -> DMESG-FAIL (shard-apl)
>> Test gem_eio:
>>         Subgroup execbuf:
>>                 pass       -> INCOMPLETE (shard-apl)
>
> #2:
>
> <3>[  227.833798] intel_engine_unpin_breadcrumbs_irq:219 
> GEM_BUG_ON(!b->irq_enabled)
>
> <4>[  227.834607] Call Trace:
> <4>[  227.834691]  intel_engines_park+0xef/0x180 [i915]
> <4>[  227.834709]  ? synchronize_irq+0x3e/0xb0
> <4>[  227.834781]  __i915_gem_park+0x3e/0x160 [i915]
> <4>[  227.834850]  i915_gem_idle_work_handler+0x1cd/0x220 [i915]
> <4>[  227.834868]  process_one_work+0x21a/0x640
>
>
IRQ disabling with GuC submission does not take into account whether the irq
was enabled by a waiter.
Maybe we should skip disarming the interrupts while parking if there was no
waiter, since we will disarm them during engine->park. Something like below?

          * so if the bottom-half remains asleep, it missed the request
          * completion.
>>         Subgroup in-flight-external:
>>                 pass       -> INCOMPLETE (shard-apl)
>> Test gem_mocs_settings:
>>         Subgroup mocs-reset-dirty-render:
>>                 pass       -> INCOMPLETE (shard-apl)
>> Test gem_request_retire:
>>         Subgroup retire-vma-not-inactive:
>>                 pass       -> INCOMPLETE (shard-apl)
>> Test gem_workarounds:
>>         Subgroup reset-context:
>>                 pass       -> INCOMPLETE (shard-apl)
>> Test kms_vblank:
>>         Subgroup pipe-a-query-idle-hang:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup pipe-a-ts-continuation-idle-hang:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup pipe-a-wait-busy-hang:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup pipe-a-wait-forked-busy-hang:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup pipe-a-wait-idle-hang:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup pipe-b-query-forked-hang:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup pipe-c-query-busy-hang:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup pipe-c-query-forked-busy-hang:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup pipe-c-query-forked-hang:
>>                 pass       -> INCOMPLETE (shard-apl)
>>         Subgroup pipe-c-ts-continuation-idle-hang:
>>                 pass       -> INCOMPLETE (shard-apl)
>> Test perf:
>>         Subgroup gen8-unprivileged-single-ctx-counters:
>>                 pass       -> FAIL       (shard-apl)
>>
>> ---- Known issues:
>>
>> Test drv_missed_irq:
>>                 pass       -> SKIP       (shard-apl) fdo#103199
>> Test gem_eio:
>>         Subgroup in-flight-suspend:
>>                 pass       -> INCOMPLETE (shard-apl) fdo#103375
>> Test kms_flip:
>>         Subgroup flip-vs-expired-vblank:
>>                 fail       -> PASS       (shard-hsw) fdo#102887
>>         Subgroup modeset-vs-vblank-race-interruptible:
>>                 pass       -> FAIL       (shard-hsw) fdo#103060
>> Test kms_plane_multiple:
>>         Subgroup atomic-pipe-c-tiling-x:
>>                 pass       -> FAIL       (shard-apl) fdo#103166
>> Test kms_rotation_crc:
>>         Subgroup sprite-rotation-90:
>>                 fail       -> PASS       (shard-apl) fdo#103925
>>
>> fdo#103199 https://bugs.freedesktop.org/show_bug.cgi?id=103199
>> fdo#103375 https://bugs.freedesktop.org/show_bug.cgi?id=103375
>> fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
>> fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
>> fdo#103166 https://bugs.freedesktop.org/show_bug.cgi?id=103166
>> fdo#103925 https://bugs.freedesktop.org/show_bug.cgi?id=103925
>>
>> shard-apl        total:1541 pass:1003 dwarn:1   dfail:1 fail:9   
>> skip:497 time:2569s
>> shard-hsw        total:2680 pass:1784 dwarn:1   dfail:0 fail:3   
>> skip:891 time:11411s
>> Blacklisted hosts:
>> shard-kbl        total:1439 pass:1014 dwarn:1   dfail:1 fail:6   
>> skip:386 time:1390s
>> shard-snb        total:2680 pass:1378 dwarn:1   dfail:0 fail:3   
>> skip:1298 time:6927s
>>
>> == Logs ==
>>
>> For more details see: 
>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8640/shards.html
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Comments

Chris Wilson April 10, 2018, 9:21 a.m. UTC | #1
Quoting Sagar Arun Kamble (2018-04-10 06:32:29)
> 
> 
> On 4/9/2018 9:02 PM, Michal Wajdeczko wrote:
> > On Mon, 09 Apr 2018 17:09:18 +0200, Patchwork 
> > <patchwork@emeril.freedesktop.org> wrote:
> >
> >> == Series Details ==
> >>
> >> Series: series starting with [v8,01/12] drm/i915: Park before 
> >> resetting the submission backend
> >> URL   : https://patchwork.freedesktop.org/series/41365/
> >> State : failure
> >>
> >> == Summary ==
> >>
> >> ---- Possible new issues:
> >
> > two variants:
> >
> >>
> >> Test drm_mm:
> >>         Subgroup sanitycheck:
> >>                 pass       -> INCOMPLETE (shard-apl)
> >
> > #1
> >
> > <0>[  400.245461] drv_self-5775    1.... 400208508us : 
> > intel_guc_submission_disable: intel_guc_submission_disable:1255 
> > GEM_BUG_ON(dev_priv->gt.awake)
> >
> > <4>[  400.245871] Call Trace:
> > <4>[  400.245959]  intel_uc_fini_hw+0x4b/0xe0 [i915]
> > <4>[  400.246047]  i915_gem_fini_hw+0x16/0x30 [i915]
> > <4>[  400.246129]  i915_reset+0x1e8/0x2b0 [i915]
> > <4>[  400.246222]  igt_global_reset+0x38/0xe0 [i915]
> >
> Without gem_set_wedged if i915_reset path is invoked we can face this issue.
> igt_global_reset and gem_eio resets are directly invoking 
> i915_handle_error/i915_reset so I think we should fix the IGTs.

No, wrong answer.

> >> Test drv_hangman:
> >>         Subgroup error-state-capture-blt:
> >>                 pass       -> INCOMPLETE (shard-apl)
> >>         Subgroup error-state-capture-bsd:
> >>                 pass       -> INCOMPLETE (shard-apl)
> >>         Subgroup error-state-capture-render:
> >>                 pass       -> INCOMPLETE (shard-apl)
> >>         Subgroup error-state-capture-vebox:
> >>                 pass       -> INCOMPLETE (shard-apl)
> >> Test drv_selftest:
> >>         Subgroup live_guc:
> >>                 pass       -> SKIP       (shard-apl)
> >>         Subgroup live_hangcheck:
> >>                 pass       -> DMESG-FAIL (shard-apl)
> >> Test gem_eio:
> >>         Subgroup execbuf:
> >>                 pass       -> INCOMPLETE (shard-apl)
> >
> > #2:
> >
> > <3>[  227.833798] intel_engine_unpin_breadcrumbs_irq:219 
> > GEM_BUG_ON(!b->irq_enabled)
> >
> > <4>[  227.834607] Call Trace:
> > <4>[  227.834691]  intel_engines_park+0xef/0x180 [i915]
> > <4>[  227.834709]  ? synchronize_irq+0x3e/0xb0
> > <4>[  227.834781]  __i915_gem_park+0x3e/0x160 [i915]
> > <4>[  227.834850]  i915_gem_idle_work_handler+0x1cd/0x220 [i915]
> > <4>[  227.834868]  process_one_work+0x21a/0x640
> >
> >
> irq disabling with GuC submission is not taking into consideration if it 
> was enabled by waiter.

IRQs cannot be disabled while in GuC mode. It is still the same problem
of being unbalanced across enabling (i.e. we switch to another mode to
submit the request and then enable the GuC, ergo the GuC never pins the
irq for itself).
-Chris

Patch

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 671a6d6..f8c0c4d 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -231,6 +231,13 @@ void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine)
                 return;

         /*
+        * In case of reset with GuC submission we disarm the interrupts
+        * while parking if there are no waiters.
+        */
+       if (USES_GUC_SUBMISSION(engine->i915) && !b->irq_wait)
+               return;
+
+       /*
          * We only disarm the irq when we are idle (all requests completed),