Message ID | 1508980804-17197-1-git-send-email-manasi.d.navare@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Quoting Manasi Navare (2017-10-26 02:20:04) > During modeset cleanup on driver unload we may have a pending > hotplug work. This needs to be cancel early during the teardown > so that it does not fire after we have freed the connector. > We do this after drm_kms_helper_poll_fini(dev) since this might > cause link retrain and before intel_fbdev_fini() since this tries to > free the connector. > > If this is not done we may see something like: > DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock)) > ------------[ cut here ]------------ > WARNING: CPU: 4 PID: 5010 at kernel/locking/mutex-debug.c:103 mutex_destroy+0x4e/0x60 > Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178 > +a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid > +[last unloaded: snd_hda_intel] > CPU: 4 PID: 5010 Comm: drv_module_relo Tainted: G U 4.14.0-rc3-CI-CI_DRM_3186+ #1 > Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017 > task: ffff8803c827aa40 task.stack: ffffc90000520000 > RIP: 0010:mutex_destroy+0x4e/0x60 > RSP: 0018:ffffc90000523d58 EFLAGS: 00010292 > RAX: 000000000000002a RBX: ffff88044fbef648 RCX: 0000000000000000 > RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffffff810f0cf0 > RBP: ffffc90000523d60 R08: 0000000000000001 R09: 0000000000000001 > R10: 000000000f21cb81 R11: 0000000000000000 R12: ffff88044f71efc8 > R13: ffffffffa02b3d20 R14: ffffffffa02b3d90 R15: ffff880459b29308 > FS: 00007f5df4d6e8c0(0000) GS:ffff88045d300000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 000055ec51f00a18 CR3: 0000000451782006 CR4: 00000000003606e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > drm_fb_helper_fini+0xd9/0x130 > intel_fbdev_destroy+0x12/0x60 [i915] > intel_fbdev_fini+0x28/0x30 [i915] > intel_modeset_cleanup+0x45/0xa0 [i915] > i915_driver_unload+0x92/0x180 [i915] > i915_pci_remove+0x19/0x30 [i915] > i915_driver_unload+0x92/0x180 [i915] > i915_pci_remove+0x19/0x30 [i915] > pci_device_remove+0x39/0xb0 > device_release_driver_internal+0x15d/0x220 > driver_detach+0x40/0x80 > bus_remove_driver+0x58/0xd0 > driver_unregister+0x2c/0x40 > pci_unregister_driver+0x36/0xb0 > i915_exit+0x1a/0x8b [i915] > SyS_delete_module+0x18c/0x1e0 > entry_SYSCALL_64_fastpath+0x1c/0xb1 > RIP: 0033:0x7f5df3286287 > RSP: 002b:00007fff8e107cc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0 > RAX: ffffffffffffffda RBX: ffffffff81493a03 RCX: 00007f5df3286287 > RDX: 0000000000000001 RSI: 0000000000000800 RDI: 0000564c7be02e48 > RBP: ffffc90000523f88 R08: 0000000000000000 R09: 0000000000000080 > R10: 00007f5df4d6e8c0 R11: 0000000000000246 R12: 0000000000000000 > R13: 00007fff8e107eb0 R14: 0000000000000000 R15: 0000000000000000 > > Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> > Fixes: 9301397a63b3 ("drm/i915: Implement Link Rate fallback on Link training failure") > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Tony Cheng <tony.cheng@amd.com> > Cc: Harry Wentland <Harry.wentland@amd.com> > Cc: Jani Nikula <jani.nikula@linux.intel.com> > Cc: Daniel Vetter <daniel.vetter@intel.com> > Cc: Ville Syrjala <ville.syrjala@linux.intel.com> > Cc: Manasi Navare <manasi.d.navare@intel.com> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> > Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> > --- > drivers/gpu/drm/i915/i915_drv.h | 1 + > drivers/gpu/drm/i915/intel_display.c | 16 ++++++++++++++++ > 2 files changed, 17 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 366ba74..f81b073 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -4114,6 +4114,7 @@ extern void intel_modeset_gem_init(struct drm_device *dev); > extern void intel_modeset_cleanup(struct drm_device *dev); > extern int intel_connector_register(struct drm_connector *); > extern void intel_connector_unregister(struct drm_connector *); > +extern void intel_connector_work_fn_cleanup(struct drm_device *dev); > extern int intel_modeset_vga_set_state(struct drm_i915_private *dev_priv, > bool state); > extern void intel_display_resume(struct drm_device *dev); > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > index 0e493a1..44158ff 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -15200,6 +15200,19 @@ void intel_connector_unregister(struct drm_connector *connector) > intel_panel_destroy_backlight(connector); > } > > +void intel_connector_work_fn_cleanup(struct drm_device *dev) ^static > +{ > + struct intel_connector *connector; > + struct drm_connector_list_iter conn_iter; > + > + drm_connector_list_iter_begin(dev, &conn_iter); > + for_each_intel_connector_iter(connector, &conn_iter) { > + if (connector->modeset_retry_work.func) > + cancel_work_sync(&connector->modeset_retry_work); > + } > + drm_connector_list_iter_end(&conn_iter); > +} > + > void intel_modeset_cleanup(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = to_i915(dev); > @@ -15222,6 +15235,9 @@ void intel_modeset_cleanup(struct drm_device *dev) > */ > drm_kms_helper_poll_fini(dev); > > + /* Cancel pending modeset retry work in intel_connector */ > + intel_connector_work_fn_cleanup(dev); If you call this intel_hpd_poll_fini() and do both the helper_poll_fini and the cancel_work_sync() afterwards, the coupling between the two is more obvious. -Chris
Op 26-10-17 om 03:20 schreef Manasi Navare: > During modeset cleanup on driver unload we may have a pending > hotplug work. This needs to be cancel early during the teardown > so that it does not fire after we have freed the connector. > We do this after drm_kms_helper_poll_fini(dev) since this might > cause link retrain and before intel_fbdev_fini() since this tries to > free the connector. > > If this is not done we may see something like: > DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock)) > ------------[ cut here ]------------ > WARNING: CPU: 4 PID: 5010 at kernel/locking/mutex-debug.c:103 mutex_destroy+0x4e/0x60 > Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178 > +a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid > +[last unloaded: snd_hda_intel] > CPU: 4 PID: 5010 Comm: drv_module_relo Tainted: G U 4.14.0-rc3-CI-CI_DRM_3186+ #1 > Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017 > task: ffff8803c827aa40 task.stack: ffffc90000520000 > RIP: 0010:mutex_destroy+0x4e/0x60 > RSP: 0018:ffffc90000523d58 EFLAGS: 00010292 > RAX: 000000000000002a RBX: ffff88044fbef648 RCX: 0000000000000000 > RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffffff810f0cf0 > RBP: ffffc90000523d60 R08: 0000000000000001 R09: 0000000000000001 > R10: 000000000f21cb81 R11: 0000000000000000 R12: ffff88044f71efc8 > R13: ffffffffa02b3d20 R14: ffffffffa02b3d90 R15: ffff880459b29308 > FS: 00007f5df4d6e8c0(0000) GS:ffff88045d300000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 000055ec51f00a18 CR3: 0000000451782006 CR4: 00000000003606e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > drm_fb_helper_fini+0xd9/0x130 > intel_fbdev_destroy+0x12/0x60 [i915] > intel_fbdev_fini+0x28/0x30 [i915] > intel_modeset_cleanup+0x45/0xa0 [i915] > i915_driver_unload+0x92/0x180 [i915] > i915_pci_remove+0x19/0x30 [i915] > i915_driver_unload+0x92/0x180 [i915] > i915_pci_remove+0x19/0x30 [i915] > pci_device_remove+0x39/0xb0 > device_release_driver_internal+0x15d/0x220 > driver_detach+0x40/0x80 > bus_remove_driver+0x58/0xd0 > driver_unregister+0x2c/0x40 > pci_unregister_driver+0x36/0xb0 > i915_exit+0x1a/0x8b [i915] > SyS_delete_module+0x18c/0x1e0 > entry_SYSCALL_64_fastpath+0x1c/0xb1 > RIP: 0033:0x7f5df3286287 > RSP: 002b:00007fff8e107cc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0 > RAX: ffffffffffffffda RBX: ffffffff81493a03 RCX: 00007f5df3286287 > RDX: 0000000000000001 RSI: 0000000000000800 RDI: 0000564c7be02e48 > RBP: ffffc90000523f88 R08: 0000000000000000 R09: 0000000000000080 > R10: 00007f5df4d6e8c0 R11: 0000000000000246 R12: 0000000000000000 > R13: 00007fff8e107eb0 R14: 0000000000000000 R15: 0000000000000000 Or a GPF like: general protection fault: 0000 [#1] PREEMPT SMP Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid [last unloaded: snd_hda_intel] CPU: 0 PID: 82 Comm: kworker/0:1 Tainted: G U W 4.14.0-rc3-CI-CI_DRM_3186+ #1 Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017 Workqueue: events intel_dp_modeset_retry_work_fn [i915] task: ffff88045a5caa40 task.stack: ffffc90000378000 RIP: 0010:drm_setup_crtcs+0x143/0xbf0 RSP: 0018:ffffc9000037bd20 EFLAGS: 00010202 RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000002 RCX: 0000000000000001 RDX: 0000000000000001 RSI: 0000000000000780 RDI: 00000000ffffffff RBP: ffffc9000037bdb8 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000780 R11: 0000000000000000 R12: 0000000000000002 R13: ffff88044fbef4e8 R14: 0000000000000780 R15: 0000000000000438 FS: 0000000000000000(0000) GS:ffff88045d200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055ec51ee5168 CR3: 000000044c89d003 CR4: 00000000003606f0 Call Trace: drm_fb_helper_hotplug_event.part.18+0x7e/0xc0 drm_fb_helper_hotplug_event+0x1a/0x20 intel_fbdev_output_poll_changed+0x1a/0x20 [i915] drm_kms_helper_hotplug_event+0x27/0x30 intel_dp_modeset_retry_work_fn+0x77/0x80 [i915] process_one_work+0x233/0x660 worker_thread+0x206/0x3b0 kthread+0x152/0x190 ? process_one_work+0x660/0x660 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x27/0x40 Code: 06 00 00 45 8b 45 20 31 db 45 31 e4 45 85 c0 0f 8e 91 06 00 00 44 8b 75 94 44 8b 7d 90 49 8b 45 28 49 63 d4 44 89 f6 41 83 c4 01 <48> 8b 04 d0 44 89 fa 48 8b 38 48 8b 87 a8 01 00 00 ff 50 20 01 RIP: drm_setup_crtcs+0x143/0xbf0 RSP: ffffc9000037bd20 ---[ end trace 08901ff1a77d30c7 ]--- The second backtrace is also needed, since it shows the full picture of the race. retry_work_fn calling kms helper hotplug event, which ends up calling drm_fb_helper_hotplug_event. This is the race, and the reason why retry_work_fn has to be killed after poll_fini and before fbdev_fini. I agree with Chris though, having a separate function makes the dependency wrt poll_fini more clear. The connector isn't freed yet, but fbdev is at the point of the GPF. :) > Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> > Fixes: 9301397a63b3 ("drm/i915: Implement Link Rate fallback on Link training failure") > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Tony Cheng <tony.cheng@amd.com> > Cc: Harry Wentland <Harry.wentland@amd.com> > Cc: Jani Nikula <jani.nikula@linux.intel.com> > Cc: Daniel Vetter <daniel.vetter@intel.com> > Cc: Ville Syrjala <ville.syrjala@linux.intel.com> > Cc: Manasi Navare <manasi.d.navare@intel.com> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> > Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> > --- > drivers/gpu/drm/i915/i915_drv.h | 1 + > drivers/gpu/drm/i915/intel_display.c | 16 ++++++++++++++++ > 2 files changed, 17 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 366ba74..f81b073 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -4114,6 +4114,7 @@ extern void intel_modeset_gem_init(struct drm_device *dev); > extern void intel_modeset_cleanup(struct drm_device *dev); > extern int intel_connector_register(struct drm_connector *); > extern void intel_connector_unregister(struct drm_connector *); > +extern void intel_connector_work_fn_cleanup(struct drm_device *dev); > extern int intel_modeset_vga_set_state(struct drm_i915_private *dev_priv, > bool state); > extern void intel_display_resume(struct drm_device *dev); > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > index 0e493a1..44158ff 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -15200,6 +15200,19 @@ void intel_connector_unregister(struct drm_connector *connector) > intel_panel_destroy_backlight(connector); > } > > +void intel_connector_work_fn_cleanup(struct drm_device *dev) > +{ > + struct intel_connector *connector; > + struct drm_connector_list_iter conn_iter; > + > + drm_connector_list_iter_begin(dev, &conn_iter); > + for_each_intel_connector_iter(connector, &conn_iter) { > + if (connector->modeset_retry_work.func) > + cancel_work_sync(&connector->modeset_retry_work); > + } > + drm_connector_list_iter_end(&conn_iter); > +} > + > void intel_modeset_cleanup(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = to_i915(dev); > @@ -15222,6 +15235,9 @@ void intel_modeset_cleanup(struct drm_device *dev) > */ > drm_kms_helper_poll_fini(dev); > > + /* Cancel pending modeset retry work in intel_connector */ > + intel_connector_work_fn_cleanup(dev); > + > /* poll work can call into fbdev, hence clean that up afterwards */ > intel_fbdev_fini(dev_priv); >
On Thu, Oct 26, 2017 at 07:33:44AM +0100, Chris Wilson wrote: > Quoting Manasi Navare (2017-10-26 02:20:04) > > During modeset cleanup on driver unload we may have a pending > > hotplug work. This needs to be cancel early during the teardown > > so that it does not fire after we have freed the connector. > > We do this after drm_kms_helper_poll_fini(dev) since this might > > cause link retrain and before intel_fbdev_fini() since this tries to > > free the connector. > > > > If this is not done we may see something like: > > DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock)) > > ------------[ cut here ]------------ > > WARNING: CPU: 4 PID: 5010 at kernel/locking/mutex-debug.c:103 mutex_destroy+0x4e/0x60 > > Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178 > > +a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid > > +[last unloaded: snd_hda_intel] > > CPU: 4 PID: 5010 Comm: drv_module_relo Tainted: G U 4.14.0-rc3-CI-CI_DRM_3186+ #1 > > Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017 > > task: ffff8803c827aa40 task.stack: ffffc90000520000 > > RIP: 0010:mutex_destroy+0x4e/0x60 > > RSP: 0018:ffffc90000523d58 EFLAGS: 00010292 > > RAX: 000000000000002a RBX: ffff88044fbef648 RCX: 0000000000000000 > > RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffffff810f0cf0 > > RBP: ffffc90000523d60 R08: 0000000000000001 R09: 0000000000000001 > > R10: 000000000f21cb81 R11: 0000000000000000 R12: ffff88044f71efc8 > > R13: ffffffffa02b3d20 R14: ffffffffa02b3d90 R15: ffff880459b29308 > > FS: 00007f5df4d6e8c0(0000) GS:ffff88045d300000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 000055ec51f00a18 CR3: 0000000451782006 CR4: 00000000003606e0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > drm_fb_helper_fini+0xd9/0x130 > > intel_fbdev_destroy+0x12/0x60 [i915] > > intel_fbdev_fini+0x28/0x30 [i915] > > intel_modeset_cleanup+0x45/0xa0 [i915] > > i915_driver_unload+0x92/0x180 [i915] > > i915_pci_remove+0x19/0x30 [i915] > > i915_driver_unload+0x92/0x180 [i915] > > i915_pci_remove+0x19/0x30 [i915] > > pci_device_remove+0x39/0xb0 > > device_release_driver_internal+0x15d/0x220 > > driver_detach+0x40/0x80 > > bus_remove_driver+0x58/0xd0 > > driver_unregister+0x2c/0x40 > > pci_unregister_driver+0x36/0xb0 > > i915_exit+0x1a/0x8b [i915] > > SyS_delete_module+0x18c/0x1e0 > > entry_SYSCALL_64_fastpath+0x1c/0xb1 > > RIP: 0033:0x7f5df3286287 > > RSP: 002b:00007fff8e107cc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0 > > RAX: ffffffffffffffda RBX: ffffffff81493a03 RCX: 00007f5df3286287 > > RDX: 0000000000000001 RSI: 0000000000000800 RDI: 0000564c7be02e48 > > RBP: ffffc90000523f88 R08: 0000000000000000 R09: 0000000000000080 > > R10: 00007f5df4d6e8c0 R11: 0000000000000246 R12: 0000000000000000 > > R13: 00007fff8e107eb0 R14: 0000000000000000 R15: 0000000000000000 > > > > Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> > > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> > > Fixes: 9301397a63b3 ("drm/i915: Implement Link Rate fallback on Link training failure") > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Tony Cheng <tony.cheng@amd.com> > > Cc: Harry Wentland <Harry.wentland@amd.com> > > Cc: Jani Nikula <jani.nikula@linux.intel.com> > > Cc: Daniel Vetter <daniel.vetter@intel.com> > > Cc: Ville Syrjala <ville.syrjala@linux.intel.com> > > Cc: Manasi Navare <manasi.d.navare@intel.com> > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> > > Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> > > --- > > drivers/gpu/drm/i915/i915_drv.h | 1 + > > drivers/gpu/drm/i915/intel_display.c | 16 ++++++++++++++++ > > 2 files changed, 17 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > index 366ba74..f81b073 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.h > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > @@ -4114,6 +4114,7 @@ extern void intel_modeset_gem_init(struct drm_device *dev); > > extern void intel_modeset_cleanup(struct drm_device *dev); > > extern int intel_connector_register(struct drm_connector *); > > extern void intel_connector_unregister(struct drm_connector *); > > +extern void intel_connector_work_fn_cleanup(struct drm_device *dev); > > extern int intel_modeset_vga_set_state(struct drm_i915_private *dev_priv, > > bool state); > > extern void intel_display_resume(struct drm_device *dev); > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > > index 0e493a1..44158ff 100644 > > --- a/drivers/gpu/drm/i915/intel_display.c > > +++ b/drivers/gpu/drm/i915/intel_display.c > > @@ -15200,6 +15200,19 @@ void intel_connector_unregister(struct drm_connector *connector) > > intel_panel_destroy_backlight(connector); > > } > > > > +void intel_connector_work_fn_cleanup(struct drm_device *dev) > ^static > Yes will make this static > > +{ > > + struct intel_connector *connector; > > + struct drm_connector_list_iter conn_iter; > > + > > + drm_connector_list_iter_begin(dev, &conn_iter); > > + for_each_intel_connector_iter(connector, &conn_iter) { > > + if (connector->modeset_retry_work.func) > > + cancel_work_sync(&connector->modeset_retry_work); > > + } > > + drm_connector_list_iter_end(&conn_iter); > > +} > > + > > void intel_modeset_cleanup(struct drm_device *dev) > > { > > struct drm_i915_private *dev_priv = to_i915(dev); > > @@ -15222,6 +15235,9 @@ void intel_modeset_cleanup(struct drm_device *dev) > > */ > > drm_kms_helper_poll_fini(dev); > > > > + /* Cancel pending modeset retry work in intel_connector */ > > + intel_connector_work_fn_cleanup(dev); > > If you call this intel_hpd_poll_fini() and do both the helper_poll_fini > and the cancel_work_sync() afterwards, the coupling between the two is > more obvious. > -Chris So rename this function as intel_hpd_poll_fini() that will call drm_kms_helper_fini() first and then loop through the connector list and cancel_work_sync()? Manasi
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 366ba74..f81b073 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -4114,6 +4114,7 @@ extern void intel_modeset_gem_init(struct drm_device *dev); extern void intel_modeset_cleanup(struct drm_device *dev); extern int intel_connector_register(struct drm_connector *); extern void intel_connector_unregister(struct drm_connector *); +extern void intel_connector_work_fn_cleanup(struct drm_device *dev); extern int intel_modeset_vga_set_state(struct drm_i915_private *dev_priv, bool state); extern void intel_display_resume(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 0e493a1..44158ff 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -15200,6 +15200,19 @@ void intel_connector_unregister(struct drm_connector *connector) intel_panel_destroy_backlight(connector); } +void intel_connector_work_fn_cleanup(struct drm_device *dev) +{ + struct intel_connector *connector; + struct drm_connector_list_iter conn_iter; + + drm_connector_list_iter_begin(dev, &conn_iter); + for_each_intel_connector_iter(connector, &conn_iter) { + if (connector->modeset_retry_work.func) + cancel_work_sync(&connector->modeset_retry_work); + } + drm_connector_list_iter_end(&conn_iter); +} + void intel_modeset_cleanup(struct drm_device *dev) { struct drm_i915_private *dev_priv = to_i915(dev); @@ -15222,6 +15235,9 @@ void intel_modeset_cleanup(struct drm_device *dev) */ drm_kms_helper_poll_fini(dev); + /* Cancel pending modeset retry work in intel_connector */ + intel_connector_work_fn_cleanup(dev); + /* poll work can call into fbdev, hence clean that up afterwards */ intel_fbdev_fini(dev_priv);
During modeset cleanup on driver unload we may have a pending hotplug work. This needs to be cancel early during the teardown so that it does not fire after we have freed the connector. We do this after drm_kms_helper_poll_fini(dev) since this might cause link retrain and before intel_fbdev_fini() since this tries to free the connector. If this is not done we may see something like: DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock)) ------------[ cut here ]------------ WARNING: CPU: 4 PID: 5010 at kernel/locking/mutex-debug.c:103 mutex_destroy+0x4e/0x60 Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178 +a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid +[last unloaded: snd_hda_intel] CPU: 4 PID: 5010 Comm: drv_module_relo Tainted: G U 4.14.0-rc3-CI-CI_DRM_3186+ #1 Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017 task: ffff8803c827aa40 task.stack: ffffc90000520000 RIP: 0010:mutex_destroy+0x4e/0x60 RSP: 0018:ffffc90000523d58 EFLAGS: 00010292 RAX: 000000000000002a RBX: ffff88044fbef648 RCX: 0000000000000000 RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffffff810f0cf0 RBP: ffffc90000523d60 R08: 0000000000000001 R09: 0000000000000001 R10: 000000000f21cb81 R11: 0000000000000000 R12: ffff88044f71efc8 R13: ffffffffa02b3d20 R14: ffffffffa02b3d90 R15: ffff880459b29308 FS: 00007f5df4d6e8c0(0000) GS:ffff88045d300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055ec51f00a18 CR3: 0000000451782006 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: drm_fb_helper_fini+0xd9/0x130 intel_fbdev_destroy+0x12/0x60 [i915] intel_fbdev_fini+0x28/0x30 [i915] intel_modeset_cleanup+0x45/0xa0 [i915] i915_driver_unload+0x92/0x180 [i915] i915_pci_remove+0x19/0x30 [i915] i915_driver_unload+0x92/0x180 [i915] i915_pci_remove+0x19/0x30 [i915] pci_device_remove+0x39/0xb0 device_release_driver_internal+0x15d/0x220 driver_detach+0x40/0x80 bus_remove_driver+0x58/0xd0 driver_unregister+0x2c/0x40 pci_unregister_driver+0x36/0xb0 i915_exit+0x1a/0x8b [i915] SyS_delete_module+0x18c/0x1e0 entry_SYSCALL_64_fastpath+0x1c/0xb1 RIP: 0033:0x7f5df3286287 RSP: 002b:00007fff8e107cc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: ffffffff81493a03 RCX: 00007f5df3286287 RDX: 0000000000000001 RSI: 0000000000000800 RDI: 0000564c7be02e48 RBP: ffffc90000523f88 R08: 0000000000000000 R09: 0000000000000080 R10: 00007f5df4d6e8c0 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fff8e107eb0 R14: 0000000000000000 R15: 0000000000000000 Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Fixes: 9301397a63b3 ("drm/i915: Implement Link Rate fallback on Link training failure") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tony Cheng <tony.cheng@amd.com> Cc: Harry Wentland <Harry.wentland@amd.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/intel_display.c | 16 ++++++++++++++++ 2 files changed, 17 insertions(+)