drm/i915: Verify the engine after acquiring the active.lock

Message ID 20190916113808.30386-1-chris@chris-wilson.co.uk
State New

Commit Message

Chris Wilson Sept. 16, 2019, 11:38 a.m. UTC
When using virtual engines, the rq->engine is not stable until we hold
the engine->active.lock (as the virtual engine may be exchanged with the
sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
we may retire a request concurrently with resubmitting it to HW, so we need
to be extra careful to verify we are holding the correct lock for the
request's active list. This is similar to the issue we saw with
rescheduling the virtual requests, see sched_lock_engine().

Or else:

<4> [876.736126] list_add corruption. prev->next should be next (ffff8883f931a1f8), but was dead000000000100. (prev=ffff888361ffa610).
<4> [876.736136] WARNING: CPU: 2 PID: 21 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70
<4> [876.736137] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel e1000e cdc_ether usbnet mii snd_pcm ptp pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915]
<4> [876.736154] CPU: 2 PID: 21 Comm: ksoftirqd/2 Tainted: G     U            5.3.0-CI-CI_DRM_6898+ #1
<4> [876.736156] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
<4> [876.736157] RIP: 0010:__list_add_valid+0x4d/0x70
<4> [876.736159] Code: c3 48 89 d1 48 c7 c7 20 33 0e 82 48 89 c2 e8 4a 4a bc ff 0f 0b 31 c0 c3 48 89 c1 4c 89 c6 48 c7 c7 70 33 0e 82 e8 33 4a bc ff <0f> 0b 31 c0 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 c0 33 0e 82 e8
<4> [876.736160] RSP: 0018:ffffc9000018bd30 EFLAGS: 00010082
<4> [876.736162] RAX: 0000000000000000 RBX: ffff888361ffc840 RCX: 0000000000000104
<4> [876.736163] RDX: 0000000080000104 RSI: 0000000000000000 RDI: 00000000ffffffff
<4> [876.736164] RBP: ffffc9000018bd68 R08: 0000000000000000 R09: 0000000000000001
<4> [876.736165] R10: 00000000aed95de3 R11: 000000007fe927eb R12: ffff888361ffca10
<4> [876.736166] R13: ffff888361ffa610 R14: ffff888361ffc880 R15: ffff8883f931a1f8
<4> [876.736168] FS:  0000000000000000(0000) GS:ffff88849fd00000(0000) knlGS:0000000000000000
<4> [876.736169] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [876.736170] CR2: 00007f093a9173c0 CR3: 00000003bba08005 CR4: 0000000000760ee0
<4> [876.736171] PKRU: 55555554
<4> [876.736172] Call Trace:
<4> [876.736226]  __i915_request_submit+0x152/0x370 [i915]
<4> [876.736263]  __execlists_submission_tasklet+0x6da/0x1f50 [i915]
<4> [876.736293]  ? execlists_submission_tasklet+0x29/0x50 [i915]
<4> [876.736321]  execlists_submission_tasklet+0x34/0x50 [i915]
<4> [876.736325]  tasklet_action_common.isra.5+0x47/0xb0
<4> [876.736328]  __do_softirq+0xd8/0x4ae
<4> [876.736332]  ? smpboot_thread_fn+0x23/0x280
<4> [876.736334]  ? smpboot_thread_fn+0x6b/0x280
<4> [876.736336]  run_ksoftirqd+0x2b/0x50
<4> [876.736338]  smpboot_thread_fn+0x1d3/0x280
<4> [876.736341]  ? sort_range+0x20/0x20
<4> [876.736343]  kthread+0x119/0x130
<4> [876.736345]  ? kthread_park+0xa0/0xa0
<4> [876.736347]  ret_from_fork+0x24/0x50
<4> [876.736353] irq event stamp: 2290145
<4> [876.736356] hardirqs last  enabled at (2290144): [<ffffffff8123cde8>] __slab_free+0x3e8/0x500
<4> [876.736358] hardirqs last disabled at (2290145): [<ffffffff819cfb4d>] _raw_spin_lock_irqsave+0xd/0x50
<4> [876.736360] softirqs last  enabled at (2290114): [<ffffffff81c0033e>] __do_softirq+0x33e/0x4ae
<4> [876.736361] softirqs last disabled at (2290119): [<ffffffff810b815b>] run_ksoftirqd+0x2b/0x50
<4> [876.736363] WARNING: CPU: 2 PID: 21 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70
<4> [876.736364] ---[ end trace 3e58d6c7356c65bf ]---
<4> [876.736406] ------------[ cut here ]------------
<4> [876.736415] list_del corruption. prev->next should be ffff888361ffca10, but was ffff88840ac2c730
<4> [876.736421] WARNING: CPU: 2 PID: 5490 at lib/list_debug.c:53 __list_del_entry_valid+0x79/0x90
<4> [876.736422] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel e1000e cdc_ether usbnet mii snd_pcm ptp pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915]
<4> [876.736433] CPU: 2 PID: 5490 Comm: i915_selftest Tainted: G     U  W         5.3.0-CI-CI_DRM_6898+ #1
<4> [876.736435] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
<4> [876.736436] RIP: 0010:__list_del_entry_valid+0x79/0x90
<4> [876.736438] Code: 0b 31 c0 c3 48 89 fe 48 c7 c7 30 34 0e 82 e8 ae 49 bc ff 0f 0b 31 c0 c3 48 89 f2 48 89 fe 48 c7 c7 68 34 0e 82 e8 97 49 bc ff <0f> 0b 31 c0 c3 48 c7 c7 a8 34 0e 82 e8 86 49 bc ff 0f 0b 31 c0 c3
<4> [876.736439] RSP: 0018:ffffc900003ef758 EFLAGS: 00010086
<4> [876.736440] RAX: 0000000000000000 RBX: ffff888361ffc840 RCX: 0000000000000002
<4> [876.736442] RDX: 0000000080000002 RSI: 0000000000000000 RDI: 00000000ffffffff
<4> [876.736443] RBP: ffffc900003ef780 R08: 0000000000000000 R09: 0000000000000001
<4> [876.736444] R10: 000000001418e4b7 R11: 000000007f0ea93b R12: ffff888361ffcab8
<4> [876.736445] R13: ffff88843b6d0000 R14: 000000000000217c R15: 0000000000000001
<4> [876.736447] FS:  00007f4e6f255240(0000) GS:ffff88849fd00000(0000) knlGS:0000000000000000
<4> [876.736448] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [876.736449] CR2: 00007f093a9173c0 CR3: 00000003bba08005 CR4: 0000000000760ee0
<4> [876.736450] PKRU: 55555554
<4> [876.736451] Call Trace:
<4> [876.736488]  i915_request_retire+0x224/0x8e0 [i915]
<4> [876.736521]  i915_request_create+0x4b/0x1b0 [i915]
<4> [876.736550]  nop_virtual_engine+0x230/0x4d0 [i915]

Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111695
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

Comments

Tvrtko Ursulin Sept. 17, 2019, 2:59 p.m. UTC | #1
On 16/09/2019 12:38, Chris Wilson wrote:
> When using virtual engines, the rq->engine is not stable until we hold
> the engine->active.lock (as the virtual engine may be exchanged with the
> sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
> we may retire a request concurrently with resubmitting it to HW, we need
> to be extra careful to verify we are holding the correct lock for the
> request's active list. This is similar to the issue we saw with
> rescheduling the virtual requests, see sched_lock_engine().
> 
> Or else:
> 
> [snip dmesg trace]
> 
> Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111695
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Matthew Auld <matthew.auld@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_request.c | 25 ++++++++++++++++++++++---
>   1 file changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 754a78364a63..f12358150097 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -195,6 +195,27 @@ static void free_capture_list(struct i915_request *request)
>   	}
>   }
>   
> +static void remove_from_engine(struct i915_request *rq)
> +{
> +	struct intel_engine_cs *engine, *locked;
> +
> +	/*
> +	 * Virtual engines complicate acquiring the engine timeline lock,
> +	 * as their rq->engine pointer is not stable until under that
> +	 * engine lock. The simple ploy we use is to take the lock then
> +	 * check that the rq still belongs to the newly locked engine.
> +	 */
> +	locked = READ_ONCE(rq->engine);

I thought we had rq->owner and so could bail early if that one was 
non-virtual, but that field seems like a distant memory. Still, it could 
be worth storing the original/fixed engine in there somewhere. Maybe it 
would enable some interesting asserts. Or maybe not...

> +	spin_lock(&locked->active.lock);
> +	while (unlikely(locked != (engine = READ_ONCE(rq->engine)))) {
> +		spin_unlock(&locked->active.lock);
> +		spin_lock(&engine->active.lock);
> +		locked = engine;
> +	}
> +	list_del(&rq->sched.link);
> +	spin_unlock(&locked->active.lock);
> +}
> +
>   static bool i915_request_retire(struct i915_request *rq)
>   {
>   	struct i915_active_request *active, *next;
> @@ -260,9 +281,7 @@ static bool i915_request_retire(struct i915_request *rq)
>   	 * request that we have removed from the HW and put back on a run
>   	 * queue.
>   	 */
> -	spin_lock(&rq->engine->active.lock);
> -	list_del(&rq->sched.link);
> -	spin_unlock(&rq->engine->active.lock);
> +	remove_from_engine(rq);
>   
>   	spin_lock(&rq->lock);
>   	i915_request_mark_complete(rq);
> 

How exactly can we retire and resubmit at the same time? Isn't 
retirement a one-off event once the seqno is complete and the rq is 
signaled?

Regards,

Tvrtko
Chris Wilson Sept. 17, 2019, 3:17 p.m. UTC | #2
Quoting Tvrtko Ursulin (2019-09-17 15:59:25)
> 
> On 16/09/2019 12:38, Chris Wilson wrote:
> > When using virtual engines, the rq->engine is not stable until we hold
> > the engine->active.lock (as the virtual engine may be exchanged with the
> > sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
> > we may retire a request concurrently with resubmitting it to HW, we need
> > to be extra careful to verify we are holding the correct lock for the
> > request's active list. This is similar to the issue we saw with
> > rescheduling the virtual requests, see sched_lock_engine().
> > 
> > Or else:
> > 
> > [snip dmesg trace]
> > 
> > Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111695
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 25 ++++++++++++++++++++++---
> >   1 file changed, 22 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 754a78364a63..f12358150097 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -195,6 +195,27 @@ static void free_capture_list(struct i915_request *request)
> >       }
> >   }
> >   
> > +static void remove_from_engine(struct i915_request *rq)
> > +{
> > +     struct intel_engine_cs *engine, *locked;
> > +
> > +     /*
> > +      * Virtual engines complicate acquiring the engine timeline lock,
> > +      * as their rq->engine pointer is not stable until under that
> > +      * engine lock. The simple ploy we use is to take the lock then
> > +      * check that the rq still belongs to the newly locked engine.
> > +      */
> > +     locked = READ_ONCE(rq->engine);
> 
> I thought we had rq->owner and so could bail early if that one was 
> non-virtual, but that field seems like a distant memory. Still, could be 
> a thought to store the original/fixed engine in there somewhere. Maybe 
> it would enable some interesting asserts. Or maybe not..

You are thinking of ce->engine (which used to be ce->owner, to
distinguish it from the ce->inflight engine). So you could do something
like

	if (intel_engine_is_virtual(rq->hw_context->engine))
	    ... and then have the same trial and error spinlock ...

by which time the code is equally branchy, and our assumption here is
the spinlock is the correct one for the vast majority of cases, and so
we very rarely have to dance.

> > +     spin_lock(&locked->active.lock);
> > +     while (unlikely(locked != (engine = READ_ONCE(rq->engine)))) {
> > +             spin_unlock(&locked->active.lock);
> > +             spin_lock(&engine->active.lock);
> > +             locked = engine;
> > +     }
> > +     list_del(&rq->sched.link);
> > +     spin_unlock(&locked->active.lock);
> > +}
> > +
> >   static bool i915_request_retire(struct i915_request *rq)
> >   {
> >       struct i915_active_request *active, *next;
> > @@ -260,9 +281,7 @@ static bool i915_request_retire(struct i915_request *rq)
> >        * request that we have removed from the HW and put back on a run
> >        * queue.
> >        */
> > -     spin_lock(&rq->engine->active.lock);
> > -     list_del(&rq->sched.link);
> > -     spin_unlock(&rq->engine->active.lock);
> > +     remove_from_engine(rq);
> >   
> >       spin_lock(&rq->lock);
> >       i915_request_mark_complete(rq);
> > 
> 
> How exactly we can retire and resubmit at the same time? Isn't 
> retirement an one off event when seqno is complete, rq signaled?

Yes. So preempt-to-busy introduces a window where the request is still
on HW but we have returned it back to the submission queue. We catch up
with the HW on the next process_csb, but it may have completed the
request in the meantime (it is just not allowed to advance beyond the
subsequent breadcrumb, and so is prevented from overtaking our knowledge
of RING_TAIL; hence we avoid telling the HW to go "backwards").
-Chris
Tvrtko Ursulin Sept. 18, 2019, 3:54 p.m. UTC | #3
On 17/09/2019 16:17, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-09-17 15:59:25)
>>
>> On 16/09/2019 12:38, Chris Wilson wrote:
>>> When using virtual engines, the rq->engine is not stable until we hold
>>> the engine->active.lock (as the virtual engine may be exchanged with the
>>> sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
>>> we may retire a request concurrently with resubmitting it to HW, we need
>>> to be extra careful to verify we are holding the correct lock for the
>>> request's active list. This is similar to the issue we saw with
>>> rescheduling the virtual requests, see sched_lock_engine().
>>>
>>> Or else:
>>>
>>> [snip dmesg trace]
>>>
>>> Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111695
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: Matthew Auld <matthew.auld@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_request.c | 25 ++++++++++++++++++++++---
>>>    1 file changed, 22 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 754a78364a63..f12358150097 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -195,6 +195,27 @@ static void free_capture_list(struct i915_request *request)
>>>        }
>>>    }
>>>    
>>> +static void remove_from_engine(struct i915_request *rq)
>>> +{
>>> +     struct intel_engine_cs *engine, *locked;
>>> +
>>> +     /*
>>> +      * Virtual engines complicate acquiring the engine timeline lock,
>>> +      * as their rq->engine pointer is not stable until under that
>>> +      * engine lock. The simple ploy we use is to take the lock then
>>> +      * check that the rq still belongs to the newly locked engine.
>>> +      */
>>> +     locked = READ_ONCE(rq->engine);
>>
>> I thought we had rq->owner and so could bail early if that one was
>> non-virtual, but that field seems like a distant memory. Still, could be
>> a thought to store the original/fixed engine in there somewhere. Maybe
>> it would enable some interesting asserts. Or maybe not..
> 
> You are thinking of ce->engine (which used to be ce->owner, to
> distinguish it from the ce->inflight engine). So you could do something
> like
> 
> 	if (intel_engine_is_virtual(rq->hw_context->engine))
> 	    ... and then have the same trial-and-error spinlock ...
> 
> by which time the code is equally branchy, and our assumption here is
> the spinlock is the correct one for the vast majority of cases, and so
> we very rarely have to dance.
> 
>>> +     spin_lock(&locked->active.lock);
>>> +     while (unlikely(locked != (engine = READ_ONCE(rq->engine)))) {
>>> +             spin_unlock(&locked->active.lock);
>>> +             spin_lock(&engine->active.lock);
>>> +             locked = engine;
>>> +     }
>>> +     list_del(&rq->sched.link);
>>> +     spin_unlock(&locked->active.lock);
>>> +}
>>> +
>>>    static bool i915_request_retire(struct i915_request *rq)
>>>    {
>>>        struct i915_active_request *active, *next;
>>> @@ -260,9 +281,7 @@ static bool i915_request_retire(struct i915_request *rq)
>>>         * request that we have removed from the HW and put back on a run
>>>         * queue.
>>>         */
>>> -     spin_lock(&rq->engine->active.lock);
>>> -     list_del(&rq->sched.link);
>>> -     spin_unlock(&rq->engine->active.lock);
>>> +     remove_from_engine(rq);
>>>    
>>>        spin_lock(&rq->lock);
>>>        i915_request_mark_complete(rq);
>>>
>>
>> How exactly can we retire and resubmit at the same time? Isn't
>> retirement a one-off event once the seqno is complete and the rq signaled?
> 
> Yes. So preempt-to-busy introduces a window where the request is still
> on HW but we have returned it back to the submission queue. We catch up
> with the HW on the next process_csb, but it may have completed the
> request in the meantime (it is just not allowed to advance beyond the
> subsequent breadcrumb and so is prevented from overtaking our knowledge of
> RING_TAIL, and so we avoid telling the HW to go "backwards").

Would it be sufficient to do:

   engine = READ_ONCE(rq->engine);
   spin_lock(...);
   list_del(...);
   spin_unlock(engine->active.lock);

To ensure the same engine is used? Although the oops is not about the
spinlock but list corruption. How does the list get corrupted, though?
list_del does not care which list the request is on. If it is really
key to have the correct lock, then why is it enough to re-check the
engine after taking the lock? Why couldn't rq->engine change under the
lock again? rq->engine does get updated under the very lock, no?

Regards,

Tvrtko
Chris Wilson Sept. 18, 2019, 4:10 p.m. UTC | #4
Quoting Tvrtko Ursulin (2019-09-18 16:54:36)
> 
> On 17/09/2019 16:17, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-09-17 15:59:25)
> >>
> >> On 16/09/2019 12:38, Chris Wilson wrote:
> >>> When using virtual engines, the rq->engine is not stable until we hold
> >>> the engine->active.lock (as the virtual engine may be exchanged with the
> >>> sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
> >>> we may retire a request concurrently with resubmitting it to HW, we need
> >>> to be extra careful to verify we are holding the correct lock for the
> >>> request's active list. This is similar to the issue we saw with
> >>> rescheduling the virtual requests, see sched_lock_engine().
> >>>
> >>> Or else:
> >>>
> >>> <4> [876.736126] list_add corruption. prev->next should be next (ffff8883f931a1f8), but was dead000000000100. (prev=ffff888361ffa610).
...
> >>> <4> [876.736415] list_del corruption. prev->next should be ffff888361ffca10, but was ffff88840ac2c730

> > Yes. So preempt-to-busy introduces a window where the request is still
> > on HW but we have returned it back to the submission queue. We catch up
> > with the HW on the next process_csb, but it may have completed the
> > request in the meantime (it is just not allowed to advance beyond the
> > subsequent breadcrumb and so is prevented from overtaking our knowledge of
> > RING_TAIL, and so we avoid telling the HW to go "backwards").
> 
> Would it be sufficient to do:
> 
>    engine = READ_ONCE(rq->engine);
>    spin_lock(...);
>    list_del(...);
>    spin_unlock(engine->active.lock);
> 
> To ensure the same engine is used? Although the oops is not about the
> spinlock but list corruption. How does the list get corrupted, though?
> list_del does not care which list the request is on. If it is really
> key to have the correct lock, then why is it enough to re-check the
> engine after taking the lock? Why couldn't rq->engine change under the
> lock again? rq->engine does get updated under the very lock, no?

Don't forget that list_del changes the list around it:
list_del() {
	list->prev->next = list->next;
	list->next->prev = list->prev;
}

rq->engine can't change under the real->active.lock, as the assignment
to rq->engine = (virtual, real) is made under the real->active.lock.

execlists_dequeue:
	real->active.lock
	ve->active.lock

__unwind_incomplete_requests:
	real->active.lock

Hmm. I trust the trick employed in the patch is well proven by this
point, but if we took the nested ve lock inside __unwind, do we need to
worry. Hmm.
-Chris
Chris Wilson Sept. 18, 2019, 4:21 p.m. UTC | #5
Quoting Chris Wilson (2019-09-18 17:10:26)
> Quoting Tvrtko Ursulin (2019-09-18 16:54:36)
> > 
> > On 17/09/2019 16:17, Chris Wilson wrote:
> > > Quoting Tvrtko Ursulin (2019-09-17 15:59:25)
> > >>
> > >> On 16/09/2019 12:38, Chris Wilson wrote:
> > >>> When using virtual engines, the rq->engine is not stable until we hold
> > >>> the engine->active.lock (as the virtual engine may be exchanged with the
> > >>> sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
> > >>> we may retire a request concurrently with resubmitting it to HW, we need
> > >>> to be extra careful to verify we are holding the correct lock for the
> > >>> request's active list. This is similar to the issue we saw with
> > >>> rescheduling the virtual requests, see sched_lock_engine().
> > >>>
> > >>> Or else:
> > >>>
> > >>> <4> [876.736126] list_add corruption. prev->next should be next (ffff8883f931a1f8), but was dead000000000100. (prev=ffff888361ffa610).
> ...
> > >>> <4> [876.736415] list_del corruption. prev->next should be ffff888361ffca10, but was ffff88840ac2c730
> 
> > > Yes. So preempt-to-busy introduces a window where the request is still
> > > on HW but we have returned it back to the submission queue. We catch up
> > > with the HW on the next process_csb, but it may have completed the
> > > request in the meantime (it is just not allowed to advance beyond the
> > > subsequent breadcrumb and so is prevented from overtaking our knowledge of
> > > RING_TAIL, and so we avoid telling the HW to go "backwards").
> > 
> > Would it be sufficient to do:
> > 
> >    engine = READ_ONCE(rq->engine);
> >    spin_lock(...);
> >    list_del(...);
> >    spin_unlock(engine->active.lock);
> > 
> > To ensure the same engine is used? Although the oops is not about the
> > spinlock but list corruption. How does the list get corrupted, though?
> > list_del does not care which list the request is on. If it is really
> > key to have the correct lock, then why is it enough to re-check the
> > engine after taking the lock? Why couldn't rq->engine change under the
> > lock again? rq->engine does get updated under the very lock, no?
> 
> Don't forget that list_del changes the list around it:
> list_del() {
>         list->prev->next = list->next;
>         list->next->prev = list->prev;
> }
> 
> rq->engine can't change under the real->active.lock, as the assignment
> to rq->engine = (virtual, real) is made under the real->active.lock.
> 
> execlists_dequeue:
>         real->active.lock
>         ve->active.lock
> 
> __unwind_incomplete_requests:
>         real->active.lock
> 
> Hmm. I trust the trick employed in the patch is well proven by this
> point, but if we took the nested ve lock inside __unwind, do we need to
> worry. Hmm.

No. Take an extreme example where the request is taken from the virtual
engine submitted to sibling0, then transferred to sibling1 all between
the other thread doing engine = READ_ONCE(rq->engine); lock(engine);

The engine that used to be on the request no longer correlates with the
engine locking the sched.requests.
-Chris
Tvrtko Ursulin Sept. 18, 2019, 4:28 p.m. UTC | #6
On 18/09/2019 17:10, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-09-18 16:54:36)
>>
>> On 17/09/2019 16:17, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-09-17 15:59:25)
>>>>
>>>> On 16/09/2019 12:38, Chris Wilson wrote:
>>>>> When using virtual engines, the rq->engine is not stable until we hold
>>>>> the engine->active.lock (as the virtual engine may be exchanged with the
>>>>> sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
>>>>> we may retire a request concurrently with resubmitting it to HW, we need
>>>>> to be extra careful to verify we are holding the correct lock for the
>>>>> request's active list. This is similar to the issue we saw with
>>>>> rescheduling the virtual requests, see sched_lock_engine().
>>>>>
>>>>> Or else:
>>>>>
>>>>> <4> [876.736126] list_add corruption. prev->next should be next (ffff8883f931a1f8), but was dead000000000100. (prev=ffff888361ffa610).
> ...
>>>>> <4> [876.736415] list_del corruption. prev->next should be ffff888361ffca10, but was ffff88840ac2c730
> 
>>> Yes. So preempt-to-busy introduces a window where the request is still
>>> on HW but we have returned it back to the submission queue. We catch up
>>> with the HW on the next process_csb, but it may have completed the
>> request in the meantime (it is just not allowed to advance beyond the
>> subsequent breadcrumb and so is prevented from overtaking our knowledge of
>> RING_TAIL, and so we avoid telling the HW to go "backwards").
>>
>> Would it be sufficient to do:
>>
>>     engine = READ_ONCE(rq->engine);
>>     spin_lock(...);
>>     list_del(...);
>>     spin_unlock(engine->active.lock);
>>
>> To ensure the same engine is used? Although the oops is not about the
>> spinlock but list corruption. How does the list get corrupted, though?
>> list_del does not care which list the request is on. If it is really
>> key to have the correct lock, then why is it enough to re-check the
>> engine after taking the lock? Why couldn't rq->engine change under the
>> lock again? rq->engine does get updated under the very lock, no?
> 
> Don't forget that list_del changes the list around it:
> list_del() {
> 	list->prev->next = list->next;
> 	list->next->prev = list->prev;
> }
> 
> rq->engine can't change under the real->active.lock, as the assignment
> to rq->engine = (virtual, real) is made under the real->active.lock.

Sheehs.. I am re-inventing paradigms here.. :(

> execlists_dequeue:
> 	real->active.lock
> 	ve->active.lock
> 
> __unwind_incomplete_requests:
> 	real->active.lock
> 
> Hmm. I trust the trick employed in the patch is well proven by this
> point, but if we took the nested ve lock inside __unwind, do we need to
> worry. Hmm.

Might be nicer indeed. We would only have ve->real nesting, never the 
opposite, right?

Regards,

Tvrtko

Patch
diff mbox series

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 754a78364a63..f12358150097 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -195,6 +195,27 @@  static void free_capture_list(struct i915_request *request)
 	}
 }
 
+static void remove_from_engine(struct i915_request *rq)
+{
+	struct intel_engine_cs *engine, *locked;
+
+	/*
+	 * Virtual engines complicate acquiring the engine timeline lock,
+	 * as their rq->engine pointer is not stable until under that
+	 * engine lock. The simple ploy we use is to take the lock then
+	 * check that the rq still belongs to the newly locked engine.
+	 */
+	locked = READ_ONCE(rq->engine);
+	spin_lock(&locked->active.lock);
+	while (unlikely(locked != (engine = READ_ONCE(rq->engine)))) {
+		spin_unlock(&locked->active.lock);
+		spin_lock(&engine->active.lock);
+		locked = engine;
+	}
+	list_del(&rq->sched.link);
+	spin_unlock(&locked->active.lock);
+}
+
 static bool i915_request_retire(struct i915_request *rq)
 {
 	struct i915_active_request *active, *next;
@@ -260,9 +281,7 @@  static bool i915_request_retire(struct i915_request *rq)
 	 * request that we have removed from the HW and put back on a run
 	 * queue.
 	 */
-	spin_lock(&rq->engine->active.lock);
-	list_del(&rq->sched.link);
-	spin_unlock(&rq->engine->active.lock);
+	remove_from_engine(rq);
 
 	spin_lock(&rq->lock);
 	i915_request_mark_complete(rq);