diff mbox

drm/i915: Disable bh around call to tasklet

Message ID 20180620135929.23956-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson June 20, 2018, 1:59 p.m. UTC
The guc submission backends expects to only be run from (at least)
softirq context, but during our intel_engine_is_idle() check we would
call into the tasklet to make sure it was flushed. As this could occur
from process context, occasionally we would be caught out using a
wait_for_atomic() not from an atomic context:

[   59.939091] WARN_ON_ONCE((1) && !(preempt_count() != 0))
[   59.939142] WARNING: CPU: 1 PID: 2901 at drivers/gpu/drm/i915/intel_guc_submission.c:615 guc_submission_tasklet+0x784/0xa90 [i915]
[   59.939143] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers
[   59.939164] CPU: 1 PID: 2901 Comm: gem_exec_schedu Tainted: G     U  W         4.18.0-rc1-g93475d62c730-drmtip_67+ #1
[   59.939165] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018
[   59.939188] RIP: 0010:guc_submission_tasklet+0x784/0xa90 [i915]
[   59.939189] Code: fc ff ff 80 3d 2f 87 11 00 00 0f 85 80 fb ff ff 48 c7 c6 f8 49 40 c0 48 c7 c7 80 41 3e c0 c6 05 14 87 11 00 01 e8 2c ea d6 d3 <0f> 0b e9 5f fb ff ff 8b 46 38 89 cf 31 c7 83 e7 c0 75 08 39 c1 0f
[   59.939253] RSP: 0018:ffffaafe08a03c10 EFLAGS: 00010286
[   59.939255] RAX: 0000000000000000 RBX: ffff8f9112c246f0 RCX: 0000000000000001
[   59.939256] RDX: 0000000080000001 RSI: ffffffff95086d8e RDI: 00000000ffffffff
[   59.939257] RBP: ffff8f9112c24680 R08: 000000009517be77 R09: 0000000000000000
[   59.939258] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f9112c24700
[   59.939259] R13: ffff8f9112c24700 R14: 0000000000000000 R15: ffff8f9112c242a8
[   59.939260] FS:  00007fc2cc7e5980(0000) GS:ffff8f9136c40000(0000) knlGS:0000000000000000
[   59.939261] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.939262] CR2: 00007fc2cc815040 CR3: 000000021f10e003 CR4: 00000000003606e0
[   59.939263] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   59.939264] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   59.939265] Call Trace:
[   59.939288]  ? intel_engine_is_idle+0x64/0x160 [i915]
[   59.939323]  ? intel_engine_dump+0x638/0x890 [i915]
[   59.939327]  ? seq_printf+0x49/0x70
[   59.939353]  ? i915_engine_info+0xc8/0x100 [i915]
[   59.939356]  ? drm_get_color_range_name+0x20/0x20
[   59.939361]  ? seq_read+0xf1/0x470
[   59.939365]  ? trace_hardirqs_on_caller+0xe0/0x1b0
[   59.939370]  ? full_proxy_read+0x51/0x80
[   59.939389]  ? __vfs_read+0x31/0x170
[   59.939395]  ? do_sys_open+0x13b/0x240
[   59.939398]  ? rcu_read_lock_sched_held+0x6f/0x80
[   59.939401]  ? vfs_read+0x9e/0x140
[   59.939404]  ? ksys_read+0x50/0xc0
[   59.939409]  ? do_syscall_64+0x55/0x190
[   59.939412]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   59.939420] irq event stamp: 552834
[   59.939422] hardirqs last  enabled at (552833): [<ffffffff940fc74c>] console_unlock+0x3fc/0x600
[   59.939425] hardirqs last disabled at (552834): [<ffffffff94a0111c>] error_entry+0x7c/0x100
[   59.939451] softirqs last  enabled at (552614): [<ffffffffc02e0f53>] i915_request_add+0x2e3/0x7b0 [i915]
[   59.939470] softirqs last disabled at (552604): [<ffffffffc02e0ecb>] i915_request_add+0x25b/0x7b0 [i915]

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106977
Fixes: dd0cf235d81f ("drm/i915: Speed up idle detection by kicking the tasklets")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Michel Thierry June 20, 2018, 8:24 p.m. UTC | #1
On 6/20/2018 6:59 AM, Chris Wilson wrote:
> The guc submission backends expects to only be run from (at least)
> softirq context, but during our intel_engine_is_idle() check we would
> call into the tasklet to make sure it was flushed. As this could occur
> from process context, occasionally we would be caught out using a
> wait_for_atomic() not from an atomic context:
> 
> [   59.939091] WARN_ON_ONCE((1) && !(preempt_count() != 0))
> [   59.939142] WARNING: CPU: 1 PID: 2901 at drivers/gpu/drm/i915/intel_guc_submission.c:615 guc_submission_tasklet+0x784/0xa90 [i915]
> [   59.939143] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers
> [   59.939164] CPU: 1 PID: 2901 Comm: gem_exec_schedu Tainted: G     U  W         4.18.0-rc1-g93475d62c730-drmtip_67+ #1
> [   59.939165] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018
> [   59.939188] RIP: 0010:guc_submission_tasklet+0x784/0xa90 [i915]
> [   59.939189] Code: fc ff ff 80 3d 2f 87 11 00 00 0f 85 80 fb ff ff 48 c7 c6 f8 49 40 c0 48 c7 c7 80 41 3e c0 c6 05 14 87 11 00 01 e8 2c ea d6 d3 <0f> 0b e9 5f fb ff ff 8b 46 38 89 cf 31 c7 83 e7 c0 75 08 39 c1 0f
> [   59.939253] RSP: 0018:ffffaafe08a03c10 EFLAGS: 00010286
> [   59.939255] RAX: 0000000000000000 RBX: ffff8f9112c246f0 RCX: 0000000000000001
> [   59.939256] RDX: 0000000080000001 RSI: ffffffff95086d8e RDI: 00000000ffffffff
> [   59.939257] RBP: ffff8f9112c24680 R08: 000000009517be77 R09: 0000000000000000
> [   59.939258] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f9112c24700
> [   59.939259] R13: ffff8f9112c24700 R14: 0000000000000000 R15: ffff8f9112c242a8
> [   59.939260] FS:  00007fc2cc7e5980(0000) GS:ffff8f9136c40000(0000) knlGS:0000000000000000
> [   59.939261] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   59.939262] CR2: 00007fc2cc815040 CR3: 000000021f10e003 CR4: 00000000003606e0
> [   59.939263] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   59.939264] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   59.939265] Call Trace:
> [   59.939288]  ? intel_engine_is_idle+0x64/0x160 [i915]
> [   59.939323]  ? intel_engine_dump+0x638/0x890 [i915]
> [   59.939327]  ? seq_printf+0x49/0x70
> [   59.939353]  ? i915_engine_info+0xc8/0x100 [i915]
> [   59.939356]  ? drm_get_color_range_name+0x20/0x20
> [   59.939361]  ? seq_read+0xf1/0x470
> [   59.939365]  ? trace_hardirqs_on_caller+0xe0/0x1b0
> [   59.939370]  ? full_proxy_read+0x51/0x80
> [   59.939389]  ? __vfs_read+0x31/0x170
> [   59.939395]  ? do_sys_open+0x13b/0x240
> [   59.939398]  ? rcu_read_lock_sched_held+0x6f/0x80
> [   59.939401]  ? vfs_read+0x9e/0x140
> [   59.939404]  ? ksys_read+0x50/0xc0
> [   59.939409]  ? do_syscall_64+0x55/0x190
> [   59.939412]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [   59.939420] irq event stamp: 552834
> [   59.939422] hardirqs last  enabled at (552833): [<ffffffff940fc74c>] console_unlock+0x3fc/0x600
> [   59.939425] hardirqs last disabled at (552834): [<ffffffff94a0111c>] error_entry+0x7c/0x100
> [   59.939451] softirqs last  enabled at (552614): [<ffffffffc02e0f53>] i915_request_add+0x2e3/0x7b0 [i915]
> [   59.939470] softirqs last disabled at (552604): [<ffffffffc02e0ecb>] i915_request_add+0x25b/0x7b0 [i915]
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106977
> Fixes: dd0cf235d81f ("drm/i915: Speed up idle detection by kicking the tasklets")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_engine_cs.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 32bf3a408d46..d3264bd6e9dc 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1000,10 +1000,12 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
>   	if (READ_ONCE(engine->execlists.active)) {
>   		struct intel_engine_execlists *execlists = &engine->execlists;
>   
> +		local_bh_disable();
>   		if (tasklet_trylock(&execlists->tasklet)) {
>   			execlists->tasklet.func(execlists->tasklet.data);
>   			tasklet_unlock(&execlists->tasklet);
>   		}
> +		local_bh_enable();

It could be just when USES_GUC_SUBMISSION is true, but nothing wrong 
with this.

Reviewed-by: Michel Thierry <michel.thierry@intel.com>

>   
>   		if (READ_ONCE(execlists->active))
>   			return false;
>
Chris Wilson June 20, 2018, 9:28 p.m. UTC | #2
Quoting Michel Thierry (2018-06-20 21:24:04)
> On 6/20/2018 6:59 AM, Chris Wilson wrote:
> > The guc submission backends expects to only be run from (at least)
> > softirq context, but during our intel_engine_is_idle() check we would
> > call into the tasklet to make sure it was flushed. As this could occur
> > from process context, occasionally we would be caught out using a
> > wait_for_atomic() not from an atomic context:
> > 
> > [   59.939091] WARN_ON_ONCE((1) && !(preempt_count() != 0))
> > [   59.939142] WARNING: CPU: 1 PID: 2901 at drivers/gpu/drm/i915/intel_guc_submission.c:615 guc_submission_tasklet+0x784/0xa90 [i915]
> > [   59.939143] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers
> > [   59.939164] CPU: 1 PID: 2901 Comm: gem_exec_schedu Tainted: G     U  W         4.18.0-rc1-g93475d62c730-drmtip_67+ #1
> > [   59.939165] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018
> > [   59.939188] RIP: 0010:guc_submission_tasklet+0x784/0xa90 [i915]
> > [   59.939189] Code: fc ff ff 80 3d 2f 87 11 00 00 0f 85 80 fb ff ff 48 c7 c6 f8 49 40 c0 48 c7 c7 80 41 3e c0 c6 05 14 87 11 00 01 e8 2c ea d6 d3 <0f> 0b e9 5f fb ff ff 8b 46 38 89 cf 31 c7 83 e7 c0 75 08 39 c1 0f
> > [   59.939253] RSP: 0018:ffffaafe08a03c10 EFLAGS: 00010286
> > [   59.939255] RAX: 0000000000000000 RBX: ffff8f9112c246f0 RCX: 0000000000000001
> > [   59.939256] RDX: 0000000080000001 RSI: ffffffff95086d8e RDI: 00000000ffffffff
> > [   59.939257] RBP: ffff8f9112c24680 R08: 000000009517be77 R09: 0000000000000000
> > [   59.939258] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f9112c24700
> > [   59.939259] R13: ffff8f9112c24700 R14: 0000000000000000 R15: ffff8f9112c242a8
> > [   59.939260] FS:  00007fc2cc7e5980(0000) GS:ffff8f9136c40000(0000) knlGS:0000000000000000
> > [   59.939261] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   59.939262] CR2: 00007fc2cc815040 CR3: 000000021f10e003 CR4: 00000000003606e0
> > [   59.939263] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [   59.939264] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [   59.939265] Call Trace:
> > [   59.939288]  ? intel_engine_is_idle+0x64/0x160 [i915]
> > [   59.939323]  ? intel_engine_dump+0x638/0x890 [i915]
> > [   59.939327]  ? seq_printf+0x49/0x70
> > [   59.939353]  ? i915_engine_info+0xc8/0x100 [i915]
> > [   59.939356]  ? drm_get_color_range_name+0x20/0x20
> > [   59.939361]  ? seq_read+0xf1/0x470
> > [   59.939365]  ? trace_hardirqs_on_caller+0xe0/0x1b0
> > [   59.939370]  ? full_proxy_read+0x51/0x80
> > [   59.939389]  ? __vfs_read+0x31/0x170
> > [   59.939395]  ? do_sys_open+0x13b/0x240
> > [   59.939398]  ? rcu_read_lock_sched_held+0x6f/0x80
> > [   59.939401]  ? vfs_read+0x9e/0x140
> > [   59.939404]  ? ksys_read+0x50/0xc0
> > [   59.939409]  ? do_syscall_64+0x55/0x190
> > [   59.939412]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > [   59.939420] irq event stamp: 552834
> > [   59.939422] hardirqs last  enabled at (552833): [<ffffffff940fc74c>] console_unlock+0x3fc/0x600
> > [   59.939425] hardirqs last disabled at (552834): [<ffffffff94a0111c>] error_entry+0x7c/0x100
> > [   59.939451] softirqs last  enabled at (552614): [<ffffffffc02e0f53>] i915_request_add+0x2e3/0x7b0 [i915]
> > [   59.939470] softirqs last disabled at (552604): [<ffffffffc02e0ecb>] i915_request_add+0x25b/0x7b0 [i915]
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106977
> > Fixes: dd0cf235d81f ("drm/i915: Speed up idle detection by kicking the tasklets")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> > Cc: Michel Thierry <michel.thierry@intel.com>
> > ---
> >   drivers/gpu/drm/i915/intel_engine_cs.c | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> > index 32bf3a408d46..d3264bd6e9dc 100644
> > --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> > @@ -1000,10 +1000,12 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
> >       if (READ_ONCE(engine->execlists.active)) {
> >               struct intel_engine_execlists *execlists = &engine->execlists;
> >   
> > +             local_bh_disable();
> >               if (tasklet_trylock(&execlists->tasklet)) {
> >                       execlists->tasklet.func(execlists->tasklet.data);
> >                       tasklet_unlock(&execlists->tasklet);
> >               }
> > +             local_bh_enable();
> 
> It could be just when USES_GUC_SUBMISSION is true, but nothing wrong 
> with this.

Could do, but I think it's simpler overall if we say that tasklets are
always run in bh/irq context. Such direct invocations are rare...

Thanks for the quick review,
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 32bf3a408d46..d3264bd6e9dc 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1000,10 +1000,12 @@  bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	if (READ_ONCE(engine->execlists.active)) {
 		struct intel_engine_execlists *execlists = &engine->execlists;
 
+		local_bh_disable();
 		if (tasklet_trylock(&execlists->tasklet)) {
 			execlists->tasklet.func(execlists->tasklet.data);
 			tasklet_unlock(&execlists->tasklet);
 		}
+		local_bh_enable();
 
 		if (READ_ONCE(execlists->active))
 			return false;