Message ID | 20180620135929.23956-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 6/20/2018 6:59 AM, Chris Wilson wrote: > The guc submission backends expects to only be run from (at least) > softirq context, but during our intel_engine_is_idle() check we would > call into the tasklet to make sure it was flushed. As this could occur > from process context, occasionally we would be caught out using a > wait_for_atomic() not from an atomic context: > > [ 59.939091] WARN_ON_ONCE((1) && !(preempt_count() != 0)) > [ 59.939142] WARNING: CPU: 1 PID: 2901 at drivers/gpu/drm/i915/intel_guc_submission.c:615 guc_submission_tasklet+0x784/0xa90 [i915] > [ 59.939143] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers > [ 59.939164] CPU: 1 PID: 2901 Comm: gem_exec_schedu Tainted: G U W 4.18.0-rc1-g93475d62c730-drmtip_67+ #1 > [ 59.939165] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018 > [ 59.939188] RIP: 0010:guc_submission_tasklet+0x784/0xa90 [i915] > [ 59.939189] Code: fc ff ff 80 3d 2f 87 11 00 00 0f 85 80 fb ff ff 48 c7 c6 f8 49 40 c0 48 c7 c7 80 41 3e c0 c6 05 14 87 11 00 01 e8 2c ea d6 d3 <0f> 0b e9 5f fb ff ff 8b 46 38 89 cf 31 c7 83 e7 c0 75 08 39 c1 0f > [ 59.939253] RSP: 0018:ffffaafe08a03c10 EFLAGS: 00010286 > [ 59.939255] RAX: 0000000000000000 RBX: ffff8f9112c246f0 RCX: 0000000000000001 > [ 59.939256] RDX: 0000000080000001 RSI: ffffffff95086d8e RDI: 00000000ffffffff > [ 59.939257] RBP: ffff8f9112c24680 R08: 000000009517be77 R09: 0000000000000000 > [ 59.939258] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f9112c24700 > [ 59.939259] R13: ffff8f9112c24700 R14: 0000000000000000 R15: ffff8f9112c242a8 > [ 59.939260] FS: 00007fc2cc7e5980(0000) GS:ffff8f9136c40000(0000) knlGS:0000000000000000 > [ 59.939261] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 59.939262] CR2: 00007fc2cc815040 CR3: 000000021f10e003 CR4: 00000000003606e0 > [ 59.939263] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 59.939264] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 59.939265] Call Trace: > [ 59.939288] ? intel_engine_is_idle+0x64/0x160 [i915] > [ 59.939323] ? intel_engine_dump+0x638/0x890 [i915] > [ 59.939327] ? seq_printf+0x49/0x70 > [ 59.939353] ? i915_engine_info+0xc8/0x100 [i915] > [ 59.939356] ? drm_get_color_range_name+0x20/0x20 > [ 59.939361] ? seq_read+0xf1/0x470 > [ 59.939365] ? trace_hardirqs_on_caller+0xe0/0x1b0 > [ 59.939370] ? full_proxy_read+0x51/0x80 > [ 59.939389] ? __vfs_read+0x31/0x170 > [ 59.939395] ? do_sys_open+0x13b/0x240 > [ 59.939398] ? rcu_read_lock_sched_held+0x6f/0x80 > [ 59.939401] ? vfs_read+0x9e/0x140 > [ 59.939404] ? ksys_read+0x50/0xc0 > [ 59.939409] ? do_syscall_64+0x55/0x190 > [ 59.939412] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe > [ 59.939420] irq event stamp: 552834 > [ 59.939422] hardirqs last enabled at (552833): [<ffffffff940fc74c>] console_unlock+0x3fc/0x600 > [ 59.939425] hardirqs last disabled at (552834): [<ffffffff94a0111c>] error_entry+0x7c/0x100 > [ 59.939451] softirqs last enabled at (552614): [<ffffffffc02e0f53>] i915_request_add+0x2e3/0x7b0 [i915] > [ 59.939470] softirqs last disabled at (552604): [<ffffffffc02e0ecb>] i915_request_add+0x25b/0x7b0 [i915] > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106977 > Fixes: dd0cf235d81f ("drm/i915: Speed up idle detection by kicking the tasklets") > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > Cc: Michał Winiarski <michal.winiarski@intel.com> > Cc: Michel Thierry <michel.thierry@intel.com> > --- > drivers/gpu/drm/i915/intel_engine_cs.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c > index 32bf3a408d46..d3264bd6e9dc 100644 > --- a/drivers/gpu/drm/i915/intel_engine_cs.c > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c > @@ -1000,10 +1000,12 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine) > if (READ_ONCE(engine->execlists.active)) { > struct intel_engine_execlists *execlists = &engine->execlists; > > + local_bh_disable(); > if (tasklet_trylock(&execlists->tasklet)) { > execlists->tasklet.func(execlists->tasklet.data); > tasklet_unlock(&execlists->tasklet); > } > + local_bh_enable(); It could be just when USES_GUC_SUBMISSION is true, but nothing wrong with this. Reviewed-by: Michel Thierry <michel.thierry@intel.com> > > if (READ_ONCE(execlists->active)) > return false; >
Quoting Michel Thierry (2018-06-20 21:24:04) > On 6/20/2018 6:59 AM, Chris Wilson wrote: > > The guc submission backends expects to only be run from (at least) > > softirq context, but during our intel_engine_is_idle() check we would > > call into the tasklet to make sure it was flushed. As this could occur > > from process context, occasionally we would be caught out using a > > wait_for_atomic() not from an atomic context: > > > > [ 59.939091] WARN_ON_ONCE((1) && !(preempt_count() != 0)) > > [ 59.939142] WARNING: CPU: 1 PID: 2901 at drivers/gpu/drm/i915/intel_guc_submission.c:615 guc_submission_tasklet+0x784/0xa90 [i915] > > [ 59.939143] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers > > [ 59.939164] CPU: 1 PID: 2901 Comm: gem_exec_schedu Tainted: G U W 4.18.0-rc1-g93475d62c730-drmtip_67+ #1 > > [ 59.939165] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018 > > [ 59.939188] RIP: 0010:guc_submission_tasklet+0x784/0xa90 [i915] > > [ 59.939189] Code: fc ff ff 80 3d 2f 87 11 00 00 0f 85 80 fb ff ff 48 c7 c6 f8 49 40 c0 48 c7 c7 80 41 3e c0 c6 05 14 87 11 00 01 e8 2c ea d6 d3 <0f> 0b e9 5f fb ff ff 8b 46 38 89 cf 31 c7 83 e7 c0 75 08 39 c1 0f > > [ 59.939253] RSP: 0018:ffffaafe08a03c10 EFLAGS: 00010286 > > [ 59.939255] RAX: 0000000000000000 RBX: ffff8f9112c246f0 RCX: 0000000000000001 > > [ 59.939256] RDX: 0000000080000001 RSI: ffffffff95086d8e RDI: 00000000ffffffff > > [ 59.939257] RBP: ffff8f9112c24680 R08: 000000009517be77 R09: 0000000000000000 > > [ 59.939258] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f9112c24700 > > [ 59.939259] R13: ffff8f9112c24700 R14: 0000000000000000 R15: ffff8f9112c242a8 > > [ 59.939260] FS: 00007fc2cc7e5980(0000) GS:ffff8f9136c40000(0000) knlGS:0000000000000000 > > [ 59.939261] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 59.939262] CR2: 00007fc2cc815040 CR3: 000000021f10e003 CR4: 00000000003606e0 > > [ 59.939263] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > [ 59.939264] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > [ 59.939265] Call Trace: > > [ 59.939288] ? intel_engine_is_idle+0x64/0x160 [i915] > > [ 59.939323] ? intel_engine_dump+0x638/0x890 [i915] > > [ 59.939327] ? seq_printf+0x49/0x70 > > [ 59.939353] ? i915_engine_info+0xc8/0x100 [i915] > > [ 59.939356] ? drm_get_color_range_name+0x20/0x20 > > [ 59.939361] ? seq_read+0xf1/0x470 > > [ 59.939365] ? trace_hardirqs_on_caller+0xe0/0x1b0 > > [ 59.939370] ? full_proxy_read+0x51/0x80 > > [ 59.939389] ? __vfs_read+0x31/0x170 > > [ 59.939395] ? do_sys_open+0x13b/0x240 > > [ 59.939398] ? rcu_read_lock_sched_held+0x6f/0x80 > > [ 59.939401] ? vfs_read+0x9e/0x140 > > [ 59.939404] ? ksys_read+0x50/0xc0 > > [ 59.939409] ? do_syscall_64+0x55/0x190 > > [ 59.939412] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe > > [ 59.939420] irq event stamp: 552834 > > [ 59.939422] hardirqs last enabled at (552833): [<ffffffff940fc74c>] console_unlock+0x3fc/0x600 > > [ 59.939425] hardirqs last disabled at (552834): [<ffffffff94a0111c>] error_entry+0x7c/0x100 > > [ 59.939451] softirqs last enabled at (552614): [<ffffffffc02e0f53>] i915_request_add+0x2e3/0x7b0 [i915] > > [ 59.939470] softirqs last disabled at (552604): [<ffffffffc02e0ecb>] i915_request_add+0x25b/0x7b0 [i915] > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106977 > > Fixes: dd0cf235d81f ("drm/i915: Speed up idle detection by kicking the tasklets") > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> > > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > > Cc: Michał Winiarski <michal.winiarski@intel.com> > > Cc: Michel Thierry <michel.thierry@intel.com> > > --- > > drivers/gpu/drm/i915/intel_engine_cs.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c > > index 32bf3a408d46..d3264bd6e9dc 100644 > > --- a/drivers/gpu/drm/i915/intel_engine_cs.c > > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c > > @@ -1000,10 +1000,12 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine) > > if (READ_ONCE(engine->execlists.active)) { > > struct intel_engine_execlists *execlists = &engine->execlists; > > > > + local_bh_disable(); > > if (tasklet_trylock(&execlists->tasklet)) { > > execlists->tasklet.func(execlists->tasklet.data); > > tasklet_unlock(&execlists->tasklet); > > } > > + local_bh_enable(); > > It could be just when USES_GUC_SUBMISSION is true, but nothing wrong > with this. Could do, but I think it's simpler overall if we say that tasklets are always run in bh/irq context. Such direct invocations are rare... Thanks for the quick review, -Chris
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 32bf3a408d46..d3264bd6e9dc 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -1000,10 +1000,12 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine) if (READ_ONCE(engine->execlists.active)) { struct intel_engine_execlists *execlists = &engine->execlists; + local_bh_disable(); if (tasklet_trylock(&execlists->tasklet)) { execlists->tasklet.func(execlists->tasklet.data); tasklet_unlock(&execlists->tasklet); } + local_bh_enable(); if (READ_ONCE(execlists->active)) return false;
The guc submission backends expects to only be run from (at least) softirq context, but during our intel_engine_is_idle() check we would call into the tasklet to make sure it was flushed. As this could occur from process context, occasionally we would be caught out using a wait_for_atomic() not from an atomic context: [ 59.939091] WARN_ON_ONCE((1) && !(preempt_count() != 0)) [ 59.939142] WARNING: CPU: 1 PID: 2901 at drivers/gpu/drm/i915/intel_guc_submission.c:615 guc_submission_tasklet+0x784/0xa90 [i915] [ 59.939143] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers [ 59.939164] CPU: 1 PID: 2901 Comm: gem_exec_schedu Tainted: G U W 4.18.0-rc1-g93475d62c730-drmtip_67+ #1 [ 59.939165] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018 [ 59.939188] RIP: 0010:guc_submission_tasklet+0x784/0xa90 [i915] [ 59.939189] Code: fc ff ff 80 3d 2f 87 11 00 00 0f 85 80 fb ff ff 48 c7 c6 f8 49 40 c0 48 c7 c7 80 41 3e c0 c6 05 14 87 11 00 01 e8 2c ea d6 d3 <0f> 0b e9 5f fb ff ff 8b 46 38 89 cf 31 c7 83 e7 c0 75 08 39 c1 0f [ 59.939253] RSP: 0018:ffffaafe08a03c10 EFLAGS: 00010286 [ 59.939255] RAX: 0000000000000000 RBX: ffff8f9112c246f0 RCX: 0000000000000001 [ 59.939256] RDX: 0000000080000001 RSI: ffffffff95086d8e RDI: 00000000ffffffff [ 59.939257] RBP: ffff8f9112c24680 R08: 000000009517be77 R09: 0000000000000000 [ 59.939258] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f9112c24700 [ 59.939259] R13: ffff8f9112c24700 R14: 0000000000000000 R15: ffff8f9112c242a8 [ 59.939260] FS: 00007fc2cc7e5980(0000) GS:ffff8f9136c40000(0000) knlGS:0000000000000000 [ 59.939261] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 59.939262] CR2: 00007fc2cc815040 CR3: 000000021f10e003 CR4: 00000000003606e0 [ 59.939263] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 59.939264] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 59.939265] Call Trace: [ 59.939288] ? intel_engine_is_idle+0x64/0x160 [i915] [ 59.939323] ? intel_engine_dump+0x638/0x890 [i915] [ 59.939327] ? seq_printf+0x49/0x70 [ 59.939353] ? i915_engine_info+0xc8/0x100 [i915] [ 59.939356] ? drm_get_color_range_name+0x20/0x20 [ 59.939361] ? seq_read+0xf1/0x470 [ 59.939365] ? trace_hardirqs_on_caller+0xe0/0x1b0 [ 59.939370] ? full_proxy_read+0x51/0x80 [ 59.939389] ? __vfs_read+0x31/0x170 [ 59.939395] ? do_sys_open+0x13b/0x240 [ 59.939398] ? rcu_read_lock_sched_held+0x6f/0x80 [ 59.939401] ? vfs_read+0x9e/0x140 [ 59.939404] ? ksys_read+0x50/0xc0 [ 59.939409] ? do_syscall_64+0x55/0x190 [ 59.939412] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 59.939420] irq event stamp: 552834 [ 59.939422] hardirqs last enabled at (552833): [<ffffffff940fc74c>] console_unlock+0x3fc/0x600 [ 59.939425] hardirqs last disabled at (552834): [<ffffffff94a0111c>] error_entry+0x7c/0x100 [ 59.939451] softirqs last enabled at (552614): [<ffffffffc02e0f53>] i915_request_add+0x2e3/0x7b0 [i915] [ 59.939470] softirqs last disabled at (552604): [<ffffffffc02e0ecb>] i915_request_add+0x25b/0x7b0 [i915] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106977 Fixes: dd0cf235d81f ("drm/i915: Speed up idle detection by kicking the tasklets") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> --- drivers/gpu/drm/i915/intel_engine_cs.c | 2 ++ 1 file changed, 2 insertions(+)