[v2] drm/i915: Unwind HW init after GVT setup failure

Message ID 20180710143821.1889-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson July 10, 2018, 2:38 p.m. UTC
If intel_gvt_init() fails, we skip unwinding the setup done before it,
leaving pointers dangling past module unload. In the example below, the
stale pointer is our pm_qos request:

[  441.057615] top: 000000006b3baf1c, n: 0000000054d8ef33, p: 0000000097cdf1a2
               prev: 0000000054d8ef33, n: 0000000097cdf1a2, p: 000000006b3baf1c
               next: 0000000097cdf1a2, n: 000000006de8fc8b, p: 0000000081087253
[  441.057627] WARNING: CPU: 4 PID: 9277 at lib/plist.c:42 plist_check_prev_next+0x2d/0x40
[  441.057628] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers [last unloaded: i915]
[  441.057652] CPU: 4 PID: 9277 Comm: drv_selftest Tainted: G     U            4.18.0-rc4-CI-CI_DRM_4464+ #1
[  441.057653] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017
[  441.057656] RIP: 0010:plist_check_prev_next+0x2d/0x40
[  441.057657] Code: 08 48 39 f0 74 2b 49 89 f0 48 8b 4f 08 50 ff 32 52 48 89 fe 41 ff 70 08 48 8b 17 48 c7 c7 d8 ae 14 82 4d 8b 08 e8 63 0e 76 ff <0f> 0b 48 83 c4 20 c3 48 39 10 75 d0 f3 c3 0f 1f 44 00 00 41 54 55
[  441.057717] RSP: 0018:ffffc900003a3a68 EFLAGS: 00010082
[  441.057720] RAX: 0000000000000000 RBX: ffff8802193978c0 RCX: 0000000000000002
[  441.057721] RDX: 0000000080000002 RSI: ffffffff820c65a4 RDI: 00000000ffffffff
[  441.057722] RBP: ffff8802193978c0 R08: 0000000000000000 R09: 0000000000000001
[  441.057724] R10: ffffc900003a3a70 R11: 0000000000000000 R12: ffffffff82243de0
[  441.057725] R13: ffffffff82243de0 R14: ffff88021a6c78c0 R15: 0000000077359400
[  441.057726] FS:  00007fc23a4a9980(0000) GS:ffff880236d00000(0000) knlGS:0000000000000000
[  441.057728] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  441.057729] CR2: 0000563e4503d038 CR3: 0000000138f86005 CR4: 00000000003606e0
[  441.057730] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  441.057731] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  441.057732] Call Trace:
[  441.057736]  plist_check_list+0x2e/0x40
[  441.057738]  plist_add+0x23/0x130
[  441.057743]  pm_qos_update_target+0x1bd/0x2f0
[  441.057771]  i915_driver_load+0xec4/0x1060 [i915]
[  441.057775]  ? trace_hardirqs_on_caller+0xe0/0x1b0
[  441.057800]  i915_pci_probe+0x29/0x90 [i915]
[  441.057804]  pci_device_probe+0xa1/0x130
[  441.057807]  driver_probe_device+0x306/0x480
[  441.057810]  __driver_attach+0xdb/0x100
[  441.057812]  ? driver_probe_device+0x480/0x480
[  441.057813]  ? driver_probe_device+0x480/0x480
[  441.057816]  bus_for_each_dev+0x74/0xc0
[  441.057819]  bus_add_driver+0x15f/0x250
[  441.057821]  ? 0xffffffffa0696000
[  441.057823]  driver_register+0x56/0xe0
[  441.057825]  ? 0xffffffffa0696000
[  441.057827]  do_one_initcall+0x58/0x370
[  441.057830]  ? do_init_module+0x1d/0x1ea
[  441.057832]  ? rcu_read_lock_sched_held+0x6f/0x80
[  441.057834]  ? kmem_cache_alloc_trace+0x282/0x2e0
[  441.057838]  do_init_module+0x56/0x1ea
[  441.057841]  load_module+0x2435/0x2b20
[  441.057852]  ? __se_sys_finit_module+0xd3/0xf0
[  441.057854]  __se_sys_finit_module+0xd3/0xf0
[  441.057861]  do_syscall_64+0x55/0x190
[  441.057863]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  441.057865] RIP: 0033:0x7fc239d75839
[  441.057866] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
[  441.057927] RSP: 002b:00007fffb7825d38 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  441.057930] RAX: ffffffffffffffda RBX: 0000563e45035dd0 RCX: 00007fc239d75839
[  441.057931] RDX: 0000000000000000 RSI: 0000563e4502f8a0 RDI: 0000000000000004
[  441.057932] RBP: 0000563e4502f8a0 R08: 0000000000000004 R09: 0000000000000000
[  441.057933] R10: 00007fffb7825ea0 R11: 0000000000000246 R12: 0000000000000000
[  441.057934] R13: 0000563e4502f690 R14: 0000000000000000 R15: 000000000000003f
[  441.057940] irq event stamp: 231338
[  441.057943] hardirqs last  enabled at (231337): [<ffffffff8193e3fc>] _raw_spin_unlock_irqrestore+0x4c/0x60
[  441.057944] hardirqs last disabled at (231338): [<ffffffff8193e26d>] _raw_spin_lock_irqsave+0xd/0x50
[  441.057947] softirqs last  enabled at (231024): [<ffffffff81c0034f>] __do_softirq+0x34f/0x505
[  441.057949] softirqs last disabled at (231005): [<ffffffff8108c7b9>] irq_exit+0xa9/0xc0
[  441.057951] WARNING: CPU: 4 PID: 9277 at lib/plist.c:42 plist_check_prev_next+0x2d/0x40

v2: Add a load failure point to intel_gvt_init() so that we always
exercise this path in future.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c  | 10 +++++++---
 drivers/gpu/drm/i915/intel_gvt.c |  3 +++
 2 files changed, 10 insertions(+), 3 deletions(-)
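
The unwind and the v2 failure point can be sketched outside the kernel.
This is only an illustration under stated assumptions: struct fake_dev,
inject_load_failure(), init_ggtt(), init_gvt(), remove_qos() and
count_bad_checkpoints() are hypothetical stand-ins, not the i915 or
pm_qos API. The sketch shows a late init step failing after earlier
steps have registered state, the error label releasing that state in
reverse order, and a sweep that fails each checkpoint in turn so the
error path is always exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real i915 structures. */
struct fake_dev {
	bool ggtt_ready;
	bool qos_registered;
	bool msi_enabled;
	int fail_at;	/* checkpoint that should fail, 0 = never fail */
	int checkpoint;	/* checkpoints reached so far */
};

/* Returns true once the configured checkpoint is reached. */
static bool inject_load_failure(struct fake_dev *dev)
{
	return dev->fail_at && ++dev->checkpoint == dev->fail_at;
}

static int init_ggtt(struct fake_dev *dev)
{
	if (inject_load_failure(dev))
		return -ENODEV;
	dev->ggtt_ready = true;
	return 0;
}

static int init_gvt(struct fake_dev *dev)
{
	if (inject_load_failure(dev))
		return -ENODEV;
	return 0;
}

static void remove_qos(struct fake_dev *dev)
{
	/* Removing a request that was never added would be a bug too. */
	assert(dev->qos_registered);
	dev->qos_registered = false;
}

static int init_hw(struct fake_dev *dev)
{
	int ret;

	ret = init_ggtt(dev);
	if (ret)
		return ret;

	dev->qos_registered = true;	/* imagine a global list now points at dev */
	dev->msi_enabled = true;

	ret = init_gvt(dev);
	if (ret)
		goto err_msi;	/* skipping this label would leak qos and msi */

	return 0;

err_msi:
	if (dev->msi_enabled)
		dev->msi_enabled = false;
	remove_qos(dev);
	dev->ggtt_ready = false;
	return ret;
}

/* True if nothing set up by init_hw() is still live. */
static bool fully_unwound(const struct fake_dev *dev)
{
	return !dev->ggtt_ready && !dev->qos_registered && !dev->msi_enabled;
}

/* Fail each checkpoint in turn; count those that misbehave. */
static int count_bad_checkpoints(int nr_checkpoints)
{
	int bad = 0;

	for (int i = 1; i <= nr_checkpoints; i++) {
		struct fake_dev dev = { .fail_at = i };

		if (init_hw(&dev) != -ENODEV || !fully_unwound(&dev))
			bad++;
	}
	return bad;
}
```

Calling count_bad_checkpoints(2) walks both failure points of the sketch.
If the late failure jumped past the qos and msi teardown, the second
checkpoint would be counted as bad, which is the kind of leak the
selftest tripped over here.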

Comments

Chris Wilson July 10, 2018, 2:39 p.m. UTC | #1
Quoting Chris Wilson (2018-07-10 15:38:21)
> Following intel_gvt_init() failure, we missed unwinding our setup
> leaving pointers dangling past the module unload. For our example, the
> pm_qos:
> 
> v2: Add a load failure point to intel_gvt_init() so that we always
> exercise this path in future.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107129
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
-Chris
Michał Winiarski July 10, 2018, 3:44 p.m. UTC | #2
On Tue, Jul 10, 2018 at 03:38:21PM +0100, Chris Wilson wrote:
> Following intel_gvt_init() failure, we missed unwinding our setup
> leaving pointers dangling past the module unload. For our example, the
> pm_qos:
> 
> v2: Add a load failure point to intel_gvt_init() so that we always
> exercise this path in future.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>

Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

-Michał

> ---
>  drivers/gpu/drm/i915/i915_drv.c  | 10 +++++++---
>  drivers/gpu/drm/i915/intel_gvt.c |  3 +++
>  2 files changed, 10 insertions(+), 3 deletions(-)
Chris Wilson July 10, 2018, 3:49 p.m. UTC | #3
Quoting Michał Winiarski (2018-07-10 16:44:05)
> On Tue, Jul 10, 2018 at 03:38:21PM +0100, Chris Wilson wrote:
> > Following intel_gvt_init() failure, we missed unwinding our setup
> > leaving pointers dangling past the module unload. For our example, the
> > pm_qos:
> > 
> > v2: Add a load failure point to intel_gvt_init() so that we always
> > exercise this path in future.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> 
> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

Thanks, hopefully this is the last of the strange failures, but I do
wonder if intel_gvt_init() failing will leave a trail of red behind.

I still have drv_selftests/live_guc failing in BAT that I believe you
have patches to clean up :)
-Chris
Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index e3dfa74d45ad..3eba3d1ab5b8 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1154,8 +1154,6 @@  static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
 
 	intel_uncore_sanitize(dev_priv);
 
-	intel_opregion_setup(dev_priv);
-
 	i915_gem_load_init_fences(dev_priv);
 
 	/* On the 945G/GM, the chipset reports the MSI capability on the
@@ -1184,10 +1182,16 @@  static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
 
 	ret = intel_gvt_init(dev_priv);
 	if (ret)
-		goto err_ggtt;
+		goto err_msi;
+
+	intel_opregion_setup(dev_priv);
 
 	return 0;
 
+err_msi:
+	if (pdev->msi_enabled)
+		pci_disable_msi(pdev);
+	pm_qos_remove_request(&dev_priv->pm_qos);
 err_ggtt:
 	i915_ggtt_cleanup_hw(dev_priv);
 err_perf:
diff --git a/drivers/gpu/drm/i915/intel_gvt.c b/drivers/gpu/drm/i915/intel_gvt.c
index a6291f60545b..c22b3e18a0f5 100644
--- a/drivers/gpu/drm/i915/intel_gvt.c
+++ b/drivers/gpu/drm/i915/intel_gvt.c
@@ -92,6 +92,9 @@  int intel_gvt_init(struct drm_i915_private *dev_priv)
 {
 	int ret;
 
+	if (i915_inject_load_failure())
+		return -ENODEV;
+
 	if (!i915_modparams.enable_gvt) {
 		DRM_DEBUG_DRIVER("GVT-g is disabled by kernel params\n");
 		return 0;