diff mbox

drm: Schedule the output_poll_work with 1s delay if we have delayed event

Message ID 20170104120053.31882-1-peter.ujfalusi@ti.com (mailing list archive)
State New, archived
Headers show

Commit Message

Peter Ujfalusi Jan. 4, 2017, noon UTC
Instead of scheduling the work to handle the initial delayed event, use 1s
delay.

When the delayed event is handled w/o delay - in a similar matter when the
poll had been initialized before drm_helper_probe_single_connector_modes()
is called - it triggers a race in Optimus setups.

Fixes: 339fd36238dd ("drm: drm_probe_helper: Fix output_poll_work scheduling")
Cc: stable@vger.kernel.org   # v4.9
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
---
Hi,

related bug report: https://bugs.freedesktop.org/show_bug.cgi?id=98690

Regards,
Peter

 drivers/gpu/drm/drm_probe_helper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Daniel Vetter Jan. 5, 2017, 8:43 a.m. UTC | #1
On Wed, Jan 04, 2017 at 02:00:53PM +0200, Peter Ujfalusi wrote:
> Instead of scheduling the work to handle the initial delayed event, use 1s
> delay.
> 
> When the delayed event is handled w/o delay - in a similar matter when the
> poll had been initialized before drm_helper_probe_single_connector_modes()
> is called - it triggers a race in Optimus setups.
> 
> Fixes: 339fd36238dd ("drm: drm_probe_helper: Fix output_poll_work scheduling")
> Cc: stable@vger.kernel.org   # v4.9
> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
> ---
> Hi,
> 
> related bug report: https://bugs.freedesktop.org/show_bug.cgi?id=98690
> 
> Regards,
> Peter
> 
>  drivers/gpu/drm/drm_probe_helper.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
> index 98ed110e28ed..f30c14b0a72f 100644
> --- a/drivers/gpu/drm/drm_probe_helper.c
> +++ b/drivers/gpu/drm/drm_probe_helper.c
> @@ -146,8 +146,9 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev)
>  	drm_connector_list_iter_put(&conn_iter);
>  
>  	if (dev->mode_config.delayed_event) {
> +		/* Use short (1s) delay to handle the initial delayed event */
>  		poll = true;
> -		delay = 0;
> +		delay = HZ;

This smells a bit like duct-tape papering over some other race. Lack of
locking or something else. Most likely some drivers enable polling a bit
too early in their load sequence. And if we can't figure this out, then we
need some really big FIXME: here that this papers over driver races.
-Daniel

>  	}
>  
>  	if (poll)
> -- 
> 2.11.0
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Peter Ujfalusi Jan. 5, 2017, 10:40 a.m. UTC | #2
On 01/05/2017 10:43 AM, Daniel Vetter wrote:
> On Wed, Jan 04, 2017 at 02:00:53PM +0200, Peter Ujfalusi wrote:
>> Instead of scheduling the work to handle the initial delayed event, use 1s
>> delay.
>>
>> When the delayed event is handled w/o delay - in a similar matter when the
>> poll had been initialized before drm_helper_probe_single_connector_modes()
>> is called - it triggers a race in Optimus setups.
>>
>> Fixes: 339fd36238dd ("drm: drm_probe_helper: Fix output_poll_work scheduling")
>> Cc: stable@vger.kernel.org   # v4.9
>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
>> ---
>> Hi,
>>
>> related bug report: https://bugs.freedesktop.org/show_bug.cgi?id=98690
>>
>> Regards,
>> Peter
>>
>>  drivers/gpu/drm/drm_probe_helper.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
>> index 98ed110e28ed..f30c14b0a72f 100644
>> --- a/drivers/gpu/drm/drm_probe_helper.c
>> +++ b/drivers/gpu/drm/drm_probe_helper.c
>> @@ -146,8 +146,9 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev)
>>  	drm_connector_list_iter_put(&conn_iter);
>>  
>>  	if (dev->mode_config.delayed_event) {
>> +		/* Use short (1s) delay to handle the initial delayed event */
>>  		poll = true;
>> -		delay = 0;
>> +		delay = HZ;
> 
> This smells a bit like duct-tape papering over some other race.

Yes, I agree. We could revert 339fd36238dd also to put back the duct-tape.

> Lack of
> locking or something else. Most likely some drivers enable polling a bit
> too early in their load sequence.

All is pointing to Optimus. Intel and nouveau alone works, but we have
failure on Optimus laptops.
I have tried to narrow it down with another patch attached to the
bugzilla: https://bugs.freedesktop.org/attachment.cgi?id=128742

I expected it to fix the problem. But it did not. So I'm puzzled.

>And if we can't figure this out, then we
> need some really big FIXME: here that this papers over driver races.

Yeah, now we know that there is something wrong in some driver(s). We
can hide it with delaying the poll_work. It would be great if
nouveau/optimus guys would be able to take a look at this.

> -Daniel
> 
>>  	}
>>  
>>  	if (poll)
>> -- 
>> 2.11.0
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
Daniel Vetter Jan. 6, 2017, 10:09 a.m. UTC | #3
On Thu, Jan 05, 2017 at 12:40:29PM +0200, Peter Ujfalusi wrote:
> On 01/05/2017 10:43 AM, Daniel Vetter wrote:
> > On Wed, Jan 04, 2017 at 02:00:53PM +0200, Peter Ujfalusi wrote:
> >> Instead of scheduling the work to handle the initial delayed event, use 1s
> >> delay.
> >>
> >> When the delayed event is handled w/o delay - in a similar matter when the
> >> poll had been initialized before drm_helper_probe_single_connector_modes()
> >> is called - it triggers a race in Optimus setups.
> >>
> >> Fixes: 339fd36238dd ("drm: drm_probe_helper: Fix output_poll_work scheduling")
> >> Cc: stable@vger.kernel.org   # v4.9
> >> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
> >> ---
> >> Hi,
> >>
> >> related bug report: https://bugs.freedesktop.org/show_bug.cgi?id=98690
> >>
> >> Regards,
> >> Peter
> >>
> >>  drivers/gpu/drm/drm_probe_helper.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
> >> index 98ed110e28ed..f30c14b0a72f 100644
> >> --- a/drivers/gpu/drm/drm_probe_helper.c
> >> +++ b/drivers/gpu/drm/drm_probe_helper.c
> >> @@ -146,8 +146,9 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev)
> >>  	drm_connector_list_iter_put(&conn_iter);
> >>  
> >>  	if (dev->mode_config.delayed_event) {
> >> +		/* Use short (1s) delay to handle the initial delayed event */
> >>  		poll = true;
> >> -		delay = 0;
> >> +		delay = HZ;
> > 
> > This smells a bit like duct-tape papering over some other race.
> 
> Yes, I agree. We could revert 339fd36238dd also to put back the duct-tape.

Since the revert and this patch are the same kind of duct-tape (massive
delay) I prefer something like this patch ...

> 
> > Lack of
> > locking or something else. Most likely some drivers enable polling a bit
> > too early in their load sequence.
> 
> All is pointing to Optimus. Intel and nouveau alone works, but we have
> failure on Optimus laptops.
> I have tried to narrow it down with another patch attached to the
> bugzilla: https://bugs.freedesktop.org/attachment.cgi?id=128742
> 
> I expected it to fix the problem. But it did not. So I'm puzzled.
> 
> >And if we can't figure this out, then we
> > need some really big FIXME: here that this papers over driver races.
> 
> Yeah, now we know that there is something wrong in some driver(s). We
> can hide it with delaying the poll_work. It would be great if
> nouveau/optimus guys would be able to take a look at this.

... but with all the above information (optimus only, what blows up)
condensed into both the commit message and a big WARNING comment in the
code. Then I think this is reasonable to merge (but still a decent wtf),
since it addresses a rather serious regression.
-Daniel
diff mbox

Patch

diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
index 98ed110e28ed..f30c14b0a72f 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -146,8 +146,9 @@  void drm_kms_helper_poll_enable_locked(struct drm_device *dev)
 	drm_connector_list_iter_put(&conn_iter);
 
 	if (dev->mode_config.delayed_event) {
+		/* Use short (1s) delay to handle the initial delayed event */
 		poll = true;
-		delay = 0;
+		delay = HZ;
 	}
 
 	if (poll)