Message ID | 20160831110905.31289-1-peter.ujfalusi@ti.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, 31 Aug 2016, Daniel Vetter <daniel@ffwll.ch> wrote: > On Wed, Aug 31, 2016 at 02:09:05PM +0300, Peter Ujfalusi wrote: >> drm_kms_helper_poll_enable_locked() should check if we have delayed event >> pending and if we have, schedule the work to run without delay. >> >> Currently the output_poll_work is only scheduled if any of the connectors >> have DRM_CONNECTOR_POLL_CONNECT or DRM_CONNECTOR_POLL_DISCONNECT with >> DRM_OUTPUT_POLL_PERIOD delay. It does not matter if we have delayed event >> already registered to be handled. The detection will be delayd by >> DRM_OUTPUT_POLL_PERIOD in any case. >> Furthermore if none of the connectors are marked as POLL_CONNECT or >> POLL_DISCONNECT because all connectors are either POLL_HPD or they are >> always connected: the output_poll_work will not run at all even if we >> have delayed event marked. >> >> When none of the connectors require polling, their initial status change >> from unknown to connected/disconnected is not going to be handled until >> the first kms application starts or if we have fb console enabled. >> >> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> >> --- >> Hi, >> >> Changes since v1: >> - dropped the last paragraph from the commit message. > > I added a few more words to the commit message to explain when exactly > this is a problem and applied your patch to drm-misc. Hi Peter, sadly looks like this regresses users out there [1]. Seems to be a reliable bisect. We need to have this fixed or reverted. BR, Jani. [1] https://bugs.freedesktop.org/show_bug.cgi?id=98690 > > Thanks, Daniel >> >> Regards, >> Peter >> >> drivers/gpu/drm/drm_probe_helper.c | 8 +++++++- >> 1 file changed, 7 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c >> index a0df377d7d1c..f6b64d7d3528 100644 >> --- a/drivers/gpu/drm/drm_probe_helper.c >> +++ b/drivers/gpu/drm/drm_probe_helper.c >> @@ -129,6 +129,7 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev) >> { >> bool poll = false; >> struct drm_connector *connector; >> + unsigned long delay = DRM_OUTPUT_POLL_PERIOD; >> >> WARN_ON(!mutex_is_locked(&dev->mode_config.mutex)); >> >> @@ -141,8 +142,13 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev) >> poll = true; >> } >> >> + if (dev->mode_config.delayed_event) { >> + poll = true; >> + delay = 0; >> + } >> + >> if (poll) >> - schedule_delayed_work(&dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD); >> + schedule_delayed_work(&dev->mode_config.output_poll_work, delay); >> } >> EXPORT_SYMBOL(drm_kms_helper_poll_enable_locked); >> >> -- >> 2.9.3 >>
On 12/19/2016 11:54 AM, Jani Nikula wrote: > On Wed, 31 Aug 2016, Daniel Vetter <daniel@ffwll.ch> wrote: >> On Wed, Aug 31, 2016 at 02:09:05PM +0300, Peter Ujfalusi wrote: >>> drm_kms_helper_poll_enable_locked() should check if we have delayed event >>> pending and if we have, schedule the work to run without delay. >>> >>> Currently the output_poll_work is only scheduled if any of the connectors >>> have DRM_CONNECTOR_POLL_CONNECT or DRM_CONNECTOR_POLL_DISCONNECT with >>> DRM_OUTPUT_POLL_PERIOD delay. It does not matter if we have delayed event >>> already registered to be handled. The detection will be delayd by >>> DRM_OUTPUT_POLL_PERIOD in any case. >>> Furthermore if none of the connectors are marked as POLL_CONNECT or >>> POLL_DISCONNECT because all connectors are either POLL_HPD or they are >>> always connected: the output_poll_work will not run at all even if we >>> have delayed event marked. >>> >>> When none of the connectors require polling, their initial status change >>> from unknown to connected/disconnected is not going to be handled until >>> the first kms application starts or if we have fb console enabled. >>> >>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> >>> --- >>> Hi, >>> >>> Changes since v1: >>> - dropped the last paragraph from the commit message. >> >> I added a few more words to the commit message to explain when exactly >> this is a problem and applied your patch to drm-misc. > > Hi Peter, sadly looks like this regresses users out there [1]. Seems to > be a reliable bisect. We need to have this fixed or reverted. When I sent the patch I did booted my laptop (with Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b)) and my old desktop with some nVidia Quadro using nouveau stack and seen no issue. But with 4.9 now on my Dell laptop I see lots of warnings during boot: [ 0.573119] ------------[ cut here ]------------ [ 0.573127] WARNING: CPU: 3 PID: 874 at drivers/gpu/drm/i915/intel_dp.c:1062 intel_dp_aux_transfer+0x1d5/0x210 [ 0.573132] WARN_ON(!msg->buffer != !msg->size) [ 0.573134] Modules linked in: [ 0.573140] CPU: 3 PID: 874 Comm: kworker/u8:3 Not tainted 4.9.0-gentoo #1 [ 0.573143] Hardware name: Dell Inc. Latitude E7440/06MFX3, BIOS A14 02/02/2015 [ 0.573150] Workqueue: events_unbound async_run_entry_fn [ 0.573154] ffffc900021afaf0 ffffffff813c3bad ffffc900021afb40 0000000000000000 [ 0.573160] ffffc900021afb30 ffffffff8106a596 0000042614b82908 ffffc900021afc00 [ 0.573166] ffff8802144040e0 0000000000000003 0000000000000000 ffff880214404158 [ 0.573172] Call Trace: [ 0.573178] [<ffffffff813c3bad>] dump_stack+0x4f/0x72 [ 0.573182] [<ffffffff8106a596>] __warn+0xc6/0xe0 [ 0.573185] [<ffffffff8106a5fa>] warn_slowpath_fmt+0x4a/0x50 [ 0.573189] [<ffffffff815926e9>] ? intel_dp_aux_transfer+0xc9/0x210 [ 0.573193] [<ffffffff815927f5>] intel_dp_aux_transfer+0x1d5/0x210 [ 0.573198] [<ffffffff817d3e53>] ? _raw_write_unlock_irqrestore+0x13/0x30 [ 0.573202] [<ffffffff817d3e79>] ? _raw_spin_unlock_irqrestore+0x9/0x10 [ 0.573207] [<ffffffff814b9e48>] drm_dp_dpcd_access+0x58/0xf0 [ 0.573210] [<ffffffff814b9ef6>] drm_dp_dpcd_write+0x16/0x20 [ 0.573214] [<ffffffff8158d8c3>] intel_dp_start_link_train+0x2b3/0x4a0 [ 0.573218] [<ffffffff8158ecd2>] intel_dp_check_link_status+0xb2/0xf0 [ 0.573222] [<ffffffff81593586>] intel_dp_detect+0x7d6/0xb40 [ 0.573226] [<ffffffff814bb1db>] drm_helper_probe_single_connector_modes+0x41b/0x4e0 [ 0.573233] [<ffffffff814c8d7c>] drm_fb_helper_initial_config+0x7c/0x3f0 [ 0.573237] [<ffffffff817d3e39>] ? _raw_spin_unlock_irq+0x9/0x10 [ 0.573242] [<ffffffff815858b3>] intel_fbdev_initial_config+0x13/0x30 [ 0.573245] [<ffffffff8108b832>] async_run_entry_fn+0x32/0xe0 [ 0.573249] [<ffffffff81083208>] process_one_work+0x148/0x4c0 [ 0.573253] [<ffffffff810835c3>] worker_thread+0x43/0x4e0 [ 0.573257] [<ffffffff81083580>] ? process_one_work+0x4c0/0x4c0 [ 0.573260] [<ffffffff81083580>] ? process_one_work+0x4c0/0x4c0 [ 0.573264] [<ffffffff8107f3c7>] ? call_usermodehelper_exec_async+0x137/0x140 [ 0.573269] [<ffffffff81088a45>] kthread+0xc5/0xe0 [ 0.573273] [<ffffffff81088980>] ? kthread_park+0x60/0x60 [ 0.573277] [<ffffffff8107f290>] ? umh_complete+0x40/0x40 [ 0.573280] [<ffffffff817d44f2>] ret_from_fork+0x22/0x30 [ 0.573285] ---[ end trace a544f5d689389b41 ]--- If I revert the patch in question I have tons of: [ 0.569127] ------------[ cut here ]------------ [ 0.569136] WARNING: CPU: 3 PID: 564 at drivers/gpu/drm/i915/intel_dp.c:1062 intel_dp_aux_transfer+0x1d5/0x210 [ 0.569141] WARN_ON(!msg->buffer != !msg->size) [ 0.569143] Modules linked in: [ 0.569149] CPU: 3 PID: 564 Comm: kworker/3:2 Not tainted 4.9.0-gentoo #2 [ 0.569152] Hardware name: Dell Inc. Latitude E7440/06MFX3, BIOS A14 02/02/2015 [ 0.569159] Workqueue: events i915_hotplug_work_func [ 0.569164] ffffc90001a93ba0 ffffffff813c3bad ffffc90001a93bf0 0000000000000000 [ 0.569174] ffffc90001a93be0 ffffffff8106a596 0000042614b12908 ffffc90001a93cb0 [ 0.569184] ffff88021482b0e0 0000000000000003 0000000000000000 ffff88021482b158 [ 0.569191] Call Trace: [ 0.569198] [<ffffffff813c3bad>] dump_stack+0x4f/0x72 [ 0.569205] [<ffffffff8106a596>] __warn+0xc6/0xe0 [ 0.569210] [<ffffffff8106a5fa>] warn_slowpath_fmt+0x4a/0x50 [ 0.569214] [<ffffffff815926c9>] ? intel_dp_aux_transfer+0xc9/0x210 [ 0.569218] [<ffffffff815927d5>] intel_dp_aux_transfer+0x1d5/0x210 [ 0.569223] [<ffffffff817d3e53>] ? _raw_write_unlock_irqrestore+0x13/0x30 [ 0.569224] loop: module loaded [ 0.569233] [<ffffffff817d3e79>] ? _raw_spin_unlock_irqrestore+0x9/0x10 [ 0.569241] [<ffffffff814b9e48>] drm_dp_dpcd_access+0x58/0xf0 [ 0.569248] [<ffffffff814b9ef6>] drm_dp_dpcd_write+0x16/0x20 [ 0.569254] [<ffffffff8158d8a3>] intel_dp_start_link_train+0x2b3/0x4a0 [ 0.569261] [<ffffffff8158ecb2>] intel_dp_check_link_status+0xb2/0xf0 [ 0.569268] [<ffffffff81593566>] intel_dp_detect+0x7d6/0xb40 [ 0.569275] [<ffffffff8157c739>] i915_hotplug_work_func+0x1d9/0x2a0 [ 0.569283] [<ffffffff81083208>] process_one_work+0x148/0x4c0 [ 0.569290] [<ffffffff810835c3>] worker_thread+0x43/0x4e0 [ 0.569296] [<ffffffff81083580>] ? process_one_work+0x4c0/0x4c0 [ 0.569302] [<ffffffff81083580>] ? process_one_work+0x4c0/0x4c0 [ 0.569309] [<ffffffff81088a45>] kthread+0xc5/0xe0 [ 0.569316] [<ffffffff81088980>] ? kthread_park+0x60/0x60 [ 0.569322] [<ffffffff817d44f2>] ret_from_fork+0x22/0x30 [ 0.569330] ---[ end trace bc0abba135d2cb5d ]--- So the behaviour is changed, no question about it. But I still believe that the patch itself fixes a valid bug (or shortcoming). atm I don't see how it can lock the kernel. The only thing the patch does is to schedule the output_poll_work w/o delay if we already have delayed_event to reduce the time to handle pending event(s) when the poll is enabled for the first time. From the log it is not clear for me why Xorg would lock up, but we have intel and nouveau drivers probed, it might be because we have two GPUs? > > BR, > Jani. > > > [1] https://bugs.freedesktop.org/show_bug.cgi?id=98690 > > > >> >> Thanks, Daniel >>> >>> Regards, >>> Peter >>> >>> drivers/gpu/drm/drm_probe_helper.c | 8 +++++++- >>> 1 file changed, 7 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c >>> index a0df377d7d1c..f6b64d7d3528 100644 >>> --- a/drivers/gpu/drm/drm_probe_helper.c >>> +++ b/drivers/gpu/drm/drm_probe_helper.c >>> @@ -129,6 +129,7 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev) >>> { >>> bool poll = false; >>> struct drm_connector *connector; >>> + unsigned long delay = DRM_OUTPUT_POLL_PERIOD; >>> >>> WARN_ON(!mutex_is_locked(&dev->mode_config.mutex)); >>> >>> @@ -141,8 +142,13 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev) >>> poll = true; >>> } >>> >>> + if (dev->mode_config.delayed_event) { >>> + poll = true; >>> + delay = 0; >>> + } >>> + >>> if (poll) >>> - schedule_delayed_work(&dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD); >>> + schedule_delayed_work(&dev->mode_config.output_poll_work, delay); >>> } >>> EXPORT_SYMBOL(drm_kms_helper_poll_enable_locked); >>> >>> -- >>> 2.9.3 >>> >
diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c index a0df377d7d1c..f6b64d7d3528 100644 --- a/drivers/gpu/drm/drm_probe_helper.c +++ b/drivers/gpu/drm/drm_probe_helper.c @@ -129,6 +129,7 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev) { bool poll = false; struct drm_connector *connector; + unsigned long delay = DRM_OUTPUT_POLL_PERIOD; WARN_ON(!mutex_is_locked(&dev->mode_config.mutex)); @@ -141,8 +142,13 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev) poll = true; } + if (dev->mode_config.delayed_event) { + poll = true; + delay = 0; + } + if (poll) - schedule_delayed_work(&dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD); + schedule_delayed_work(&dev->mode_config.output_poll_work, delay); } EXPORT_SYMBOL(drm_kms_helper_poll_enable_locked);
drm_kms_helper_poll_enable_locked() should check if we have delayed event pending and if we have, schedule the work to run without delay. Currently the output_poll_work is only scheduled if any of the connectors have DRM_CONNECTOR_POLL_CONNECT or DRM_CONNECTOR_POLL_DISCONNECT with DRM_OUTPUT_POLL_PERIOD delay. It does not matter if we have delayed event already registered to be handled. The detection will be delayd by DRM_OUTPUT_POLL_PERIOD in any case. Furthermore if none of the connectors are marked as POLL_CONNECT or POLL_DISCONNECT because all connectors are either POLL_HPD or they are always connected: the output_poll_work will not run at all even if we have delayed event marked. When none of the connectors require polling, their initial status change from unknown to connected/disconnected is not going to be handled until the first kms application starts or if we have fb console enabled. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> --- Hi, Changes since v1: - dropped the last paragraph from the commit message. Regards, Peter drivers/gpu/drm/drm_probe_helper.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) -- 2.9.3