[drm,i915] Increase LSPCON timeout

Message ID 20180813175133.24910-1-fredrik.schon@gmail.com (mailing list archive)
State New, archived

Commit Message

Fredrik Schön Aug. 13, 2018, 5:51 p.m. UTC
100 ms is not enough time for the LSPCON adapter on Intel NUC devices to
settle. This causes dropped display modes at driver initialisation.

Increase timeout to 1000 ms.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
Signed-off-by: Fredrik Schön <fredrik.schon@gmail.com>
---
 drivers/gpu/drm/i915/intel_lspcon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Rodrigo Vivi Aug. 15, 2018, 10:39 p.m. UTC | #1
On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik Schön wrote:

First of all we need to fix the commit subject:

drm/i915: Increase LSPCON timeout

(this can be done when merging, no need to resend)

> 100 ms is not enough time for the LSPCON adapter on Intel NUC devices to
> settle. This causes dropped display modes at driver initialisation.
> 
> Increase timeout to 1000 ms.
> 
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1570392

Misuse of the "Fixes:" tag, please read

https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes

But also no need for resending... could be fixed when merging

The right one would be:

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for expected LSPCON mode to settle")
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: <stable@vger.kernel.org> # v4.11+

Since the initial 100 ms seemed to be empirical and this increase seems to
help other cases, I'm in favor of this move, so

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

However I will wait a bit before merging it
so Imre, Shashank, and/or Jani can take a look here...

> Signed-off-by: Fredrik Schön <fredrik.schon@gmail.com>
> ---
>  drivers/gpu/drm/i915/intel_lspcon.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lspcon.c b/drivers/gpu/drm/i915/intel_lspcon.c
> index 8ae8f42f430a..be1b08f589a4 100644
> --- a/drivers/gpu/drm/i915/intel_lspcon.c
> +++ b/drivers/gpu/drm/i915/intel_lspcon.c
> @@ -74,7 +74,7 @@ static enum drm_lspcon_mode lspcon_wait_mode(struct intel_lspcon *lspcon,
>  	DRM_DEBUG_KMS("Waiting for LSPCON mode %s to settle\n",
>  		      lspcon_mode_name(mode));
>  
> -	wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode, 100);
> +	wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode, 1000);
>  	if (current_mode != mode)
>  		DRM_ERROR("LSPCON mode hasn't settled\n");
>  
> -- 
> 2.17.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Rodrigo Vivi Aug. 15, 2018, 10:45 p.m. UTC | #2
On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo Vivi wrote:
> On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik Schön wrote:
> 
> First of all we need to fix the commit subject:
> 
> drm/i915: Increase LSPCON timeout
> 
> (this can be done when merging, no need to resend)
> 
> > 100 ms is not enough time for the LSPCON adapter on Intel NUC devices to
> > settle. This causes dropped display modes at driver initialisation.
> > 
> > Increase timeout to 1000 ms.
> > 
> > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> 
> Missusage of "Fixes:" tag, please read
> 
> https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes
> 
> But also no need for resending... could be fixed when merging
> 
> The right one would be:
> 
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for expected LSPCON mode to settle")
> Cc: Shashank Sharma <shashank.sharma@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: <stable@vger.kernel.org> # v4.11+
> 
> Since initial 100 seemed to be empirical and this increase seems to
> help other cases I'm in favor of this move so
> 
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> 
> However I will wait a bit before merging it
> so Imre, Shashank, and/or Jani can take a look here...

now, really cc'ing them...

> 
> > Signed-off-by: Fredrik Schön <fredrik.schon@gmail.com>
> > ---
> >  drivers/gpu/drm/i915/intel_lspcon.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_lspcon.c b/drivers/gpu/drm/i915/intel_lspcon.c
> > index 8ae8f42f430a..be1b08f589a4 100644
> > --- a/drivers/gpu/drm/i915/intel_lspcon.c
> > +++ b/drivers/gpu/drm/i915/intel_lspcon.c
> > @@ -74,7 +74,7 @@ static enum drm_lspcon_mode lspcon_wait_mode(struct intel_lspcon *lspcon,
> >  	DRM_DEBUG_KMS("Waiting for LSPCON mode %s to settle\n",
> >  		      lspcon_mode_name(mode));
> >  
> > -	wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode, 100);
> > +	wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode, 1000);
> >  	if (current_mode != mode)
> >  		DRM_ERROR("LSPCON mode hasn't settled\n");
> >  
> > -- 
> > 2.17.1
> > 
Jani Nikula Aug. 16, 2018, 7:17 a.m. UTC | #3
On Wed, 15 Aug 2018, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo Vivi wrote:
>> On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik Schön wrote:
>> 
>> First of all we need to fix the commit subject:
>> 
>> drm/i915: Increase LSPCON timeout
>> 
>> (this can be done when merging, no need to resend)
>> 
>> > 100 ms is not enough time for the LSPCON adapter on Intel NUC devices to
>> > settle. This causes dropped display modes at driver initialisation.
>> > 
>> > Increase timeout to 1000 ms.
>> > 
>> > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
>> 
>> Missusage of "Fixes:" tag, please read
>> 
>> https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes
>> 
>> But also no need for resending... could be fixed when merging
>> 
>> The right one would be:
>> 
>> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
>> Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for expected LSPCON mode to settle")
>> Cc: Shashank Sharma <shashank.sharma@intel.com>
>> Cc: Imre Deak <imre.deak@intel.com>
>> Cc: Jani Nikula <jani.nikula@intel.com>
>> Cc: <stable@vger.kernel.org> # v4.11+
>> 
>> Since initial 100 seemed to be empirical and this increase seems to
>> help other cases I'm in favor of this move so
>> 
>> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> 
>> However I will wait a bit before merging it
>> so Imre, Shashank, and/or Jani can take a look here...
>
> now, really cc'ing them...

Shashank? Does this slow down non-LSPCON paths?

BR,
Jani.


>
>> 
>> > Signed-off-by: Fredrik Schön <fredrik.schon@gmail.com>
>> > ---
>> >  drivers/gpu/drm/i915/intel_lspcon.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > 
>> > diff --git a/drivers/gpu/drm/i915/intel_lspcon.c b/drivers/gpu/drm/i915/intel_lspcon.c
>> > index 8ae8f42f430a..be1b08f589a4 100644
>> > --- a/drivers/gpu/drm/i915/intel_lspcon.c
>> > +++ b/drivers/gpu/drm/i915/intel_lspcon.c
>> > @@ -74,7 +74,7 @@ static enum drm_lspcon_mode lspcon_wait_mode(struct intel_lspcon *lspcon,
>> >  	DRM_DEBUG_KMS("Waiting for LSPCON mode %s to settle\n",
>> >  		      lspcon_mode_name(mode));
>> >  
>> > -	wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode, 100);
>> > +	wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode, 1000);
>> >  	if (current_mode != mode)
>> >  		DRM_ERROR("LSPCON mode hasn't settled\n");
>> >  
>> > -- 
>> > 2.17.1
>> > 
Sharma, Shashank Aug. 16, 2018, 7:33 a.m. UTC | #4
Regards

Shashank


On 8/16/2018 12:47 PM, Jani Nikula wrote:
> On Wed, 15 Aug 2018, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
>> On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo Vivi wrote:
>>> On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik Schön wrote:
>>>
>>> First of all we need to fix the commit subject:
>>>
>>> drm/i915: Increase LSPCON timeout
>>>
>>> (this can be done when merging, no need to resend)
>>>
>>>> 100 ms is not enough time for the LSPCON adapter on Intel NUC devices to
>>>> settle. This causes dropped display modes at driver initialisation.
>>>>
>>>> Increase timeout to 1000 ms.
>>>>
>>>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
>>> Missusage of "Fixes:" tag, please read
>>>
>>> https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes
>>>
>>> But also no need for resending... could be fixed when merging
>>>
>>> The right one would be:
>>>
>>> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
>>> Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for expected LSPCON mode to settle")
>>> Cc: Shashank Sharma <shashank.sharma@intel.com>
>>> Cc: Imre Deak <imre.deak@intel.com>
>>> Cc: Jani Nikula <jani.nikula@intel.com>
>>> Cc: <stable@vger.kernel.org> # v4.11+
>>>
>>> Since initial 100 seemed to be empirical and this increase seems to
>>> help other cases I'm in favor of this move so
>>>
>>> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>
>>> However I will wait a bit before merging it
>>> so Imre, Shashank, and/or Jani can take a look here...
>> now, really cc'ing them...
> Shashank? Does this slow down non-LSPCON paths?
This will slow down the LSPCON probing and resume parts, but both of them
happen only when an LSPCON device is found. So to answer your question,
this will not slow down the non-LSPCON path, but it will slow down the
LSPCON connector resume and probe time. I would recommend that, instead
of increasing it to 1000 ms in a single shot, we gradually pick this up
in a wake-and-check way.

- Shashank
>
> BR,
> Jani.
>
>
>>>> Signed-off-by: Fredrik Schön <fredrik.schon@gmail.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/intel_lspcon.c | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/intel_lspcon.c b/drivers/gpu/drm/i915/intel_lspcon.c
>>>> index 8ae8f42f430a..be1b08f589a4 100644
>>>> --- a/drivers/gpu/drm/i915/intel_lspcon.c
>>>> +++ b/drivers/gpu/drm/i915/intel_lspcon.c
>>>> @@ -74,7 +74,7 @@ static enum drm_lspcon_mode lspcon_wait_mode(struct intel_lspcon *lspcon,
>>>>   	DRM_DEBUG_KMS("Waiting for LSPCON mode %s to settle\n",
>>>>   		      lspcon_mode_name(mode));
>>>>   
>>>> -	wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode, 100);
>>>> +	wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode, 1000);
>>>>   	if (current_mode != mode)
>>>>   		DRM_ERROR("LSPCON mode hasn't settled\n");
>>>>   
>>>> -- 
>>>> 2.17.1
>>>>
Chris Wilson Aug. 16, 2018, 7:43 a.m. UTC | #5
Quoting Sharma, Shashank (2018-08-16 08:33:36)
> Regards
> 
> Shashank
> 
> 
> On 8/16/2018 12:47 PM, Jani Nikula wrote:
> > On Wed, 15 Aug 2018, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> >> On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo Vivi wrote:
> >>> On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik Schön wrote:
> >>>
> >>> First of all we need to fix the commit subject:
> >>>
> >>> drm/i915: Increase LSPCON timeout
> >>>
> >>> (this can be done when merging, no need to resend)
> >>>
> >>>> 100 ms is not enough time for the LSPCON adapter on Intel NUC devices to
> >>>> settle. This causes dropped display modes at driver initialisation.
> >>>>
> >>>> Increase timeout to 1000 ms.
> >>>>
> >>>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> >>> Missusage of "Fixes:" tag, please read
> >>>
> >>> https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes
> >>>
> >>> But also no need for resending... could be fixed when merging
> >>>
> >>> The right one would be:
> >>>
> >>> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> >>> Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for expected LSPCON mode to settle")
> >>> Cc: Shashank Sharma <shashank.sharma@intel.com>
> >>> Cc: Imre Deak <imre.deak@intel.com>
> >>> Cc: Jani Nikula <jani.nikula@intel.com>
> >>> Cc: <stable@vger.kernel.org> # v4.11+
> >>>
> >>> Since initial 100 seemed to be empirical and this increase seems to
> >>> help other cases I'm in favor of this move so
> >>>
> >>> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >>>
> >>> However I will wait a bit before merging it
> >>> so Imre, Shashank, and/or Jani can take a look here...
> >> now, really cc'ing them...
> > Shashank? Does this slow down non-LSPCON paths?
> This will slow down the lspcon probing and resume part, but both of them 
> happen only when LSPCON device is found. So to answer your question, 
> this will not slow down the non-lspcon path, but will slow down the 
> LSPCON connector resume and probe time. but I would recommend, instead 
> of increasing it to 1000 ms in a single shot, we might want to gradually 
> pick this up, on a wake-and-check way.

wait_for() checks every [10us, 1ms] until the condition is met, or it
times out. So as long as we don't enter this path for !LSPCON, where we
know that it will time out, the wait_for() will only take as long as is
required for the connector to settle.

Can we do other connectors at the same time, or does probing LSPCON
block the system?
-Chris
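
The early-exit behaviour described above can be sketched in user space.
This is only an illustration, not the i915 wait_for() macro itself: the
fake_lspcon struct, the simulated clock, and the 150 ms settle time used
in the usage note are all assumptions made for the example. The poll
interval starts at 10 us and doubles up to a 1 ms cap, matching the
[10us, 1ms] range mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device model: the mode "settles" at a fixed simulated time. */
struct fake_lspcon {
	uint64_t now_us;    /* simulated clock, advanced instead of sleeping */
	uint64_t settle_us; /* simulated time at which the mode has settled */
};

static bool mode_settled(const struct fake_lspcon *dev)
{
	return dev->now_us >= dev->settle_us;
}

/*
 * Poll until the mode settles or timeout_us elapses.
 * Returns the simulated time spent waiting; *ok reports success.
 */
static uint64_t poll_until_settled(struct fake_lspcon *dev,
				   uint64_t timeout_us, bool *ok)
{
	uint64_t start = dev->now_us;
	uint64_t interval = 10; /* first poll interval: 10 us */

	assert(ok != NULL);

	for (;;) {
		/* Sample the deadline before checking the condition. */
		bool expired = dev->now_us - start >= timeout_us;

		if (mode_settled(dev)) {
			*ok = true;
			break;
		}
		if (expired) {
			*ok = false;
			break;
		}

		dev->now_us += interval; /* stand-in for a real sleep */

		/* Back off exponentially, capped at 1 ms. */
		if (interval < 1000)
			interval *= 2;
		if (interval > 1000)
			interval = 1000;
	}

	return dev->now_us - start;
}
```

With a device that settles after a hypothetical 150 ms, a 100 ms timeout
fails shortly after 100 ms, while a 1000 ms timeout succeeds shortly after
150 ms. The larger timeout only costs extra time when the device never
settles.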
Sharma, Shashank Aug. 16, 2018, 8:15 a.m. UTC | #6
Hey Chris,


On 8/16/2018 1:13 PM, Chris Wilson wrote:
> Quoting Sharma, Shashank (2018-08-16 08:33:36)
>> Regards
>>
>> Shashank
>>
>>
>> On 8/16/2018 12:47 PM, Jani Nikula wrote:
>>> On Wed, 15 Aug 2018, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
>>>> On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo Vivi wrote:
>>>>> On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik Schön wrote:
>>>>>
>>>>> First of all we need to fix the commit subject:
>>>>>
>>>>> drm/i915: Increase LSPCON timeout
>>>>>
>>>>> (this can be done when merging, no need to resend)
>>>>>
>>>>>> 100 ms is not enough time for the LSPCON adapter on Intel NUC devices to
>>>>>> settle. This causes dropped display modes at driver initialisation.
>>>>>>
>>>>>> Increase timeout to 1000 ms.
>>>>>>
>>>>>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
>>>>> Missusage of "Fixes:" tag, please read
>>>>>
>>>>> https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes
>>>>>
>>>>> But also no need for resending... could be fixed when merging
>>>>>
>>>>> The right one would be:
>>>>>
>>>>> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
>>>>> Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for expected LSPCON mode to settle")
>>>>> Cc: Shashank Sharma <shashank.sharma@intel.com>
>>>>> Cc: Imre Deak <imre.deak@intel.com>
>>>>> Cc: Jani Nikula <jani.nikula@intel.com>
>>>>> Cc: <stable@vger.kernel.org> # v4.11+
>>>>>
>>>>> Since initial 100 seemed to be empirical and this increase seems to
>>>>> help other cases I'm in favor of this move so
>>>>>
>>>>> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>>>
>>>>> However I will wait a bit before merging it
>>>>> so Imre, Shashank, and/or Jani can take a look here...
>>>> now, really cc'ing them...
>>> Shashank? Does this slow down non-LSPCON paths?
>> This will slow down the lspcon probing and resume part, but both of them
>> happen only when LSPCON device is found. So to answer your question,
>> this will not slow down the non-lspcon path, but will slow down the
>> LSPCON connector resume and probe time. but I would recommend, instead
>> of increasing it to 1000 ms in a single shot, we might want to gradually
>> pick this up, on a wake-and-check way.
> wait_for() checks every [10us, 1ms] until the condition is met, or it
> times out. So, so long as we don't enter this path for !LPSCON where we
> know that it will timeout, the wait_for() will only take as long as is
> required for the connector to settle.
We won't hit the !LSPCON timeout case here, as we have already read the
dongle signature successfully by now. But I was thinking that, if the
spec recommends a max wait time of 100 ms (which of course doesn't seem
enough), and we can't detect i2c-over-aux after the first 500 ms, I guess we
won't be able to do that in the next 500 ms either. So is it really OK to wait
this long in the resume sequence?

I guess Fredrik can provide some input here on whether there are some
experiments behind this number of 1000 ms, or whether this is just a safe bet?
>
> Can we do other connectors at the same time, or does probing LSPCON
> block the system?
We can do other connectors at the same time, in the DRM layer at least;
LSPCON blocks only this connector. I was curious whether we are doing this
during the resume scenario, or in a sequential get_connector()
type of call?
- Shashank
> -Chris
Fredrik Schön Aug. 16, 2018, 8:36 a.m. UTC | #7
Shashank,

On Thu, 16 Aug 2018 at 10:15, Sharma, Shashank
<shashank.sharma@intel.com> wrote:
>
> Hey Chris,
>
>
> On 8/16/2018 1:13 PM, Chris Wilson wrote:
> > Quoting Sharma, Shashank (2018-08-16 08:33:36)
> >> Regards
> >>
> >> Shashank
> >>
> >>
> >> On 8/16/2018 12:47 PM, Jani Nikula wrote:
> >>> On Wed, 15 Aug 2018, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> >>>> On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo Vivi wrote:
> >>>>> On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik Schön wrote:
> >>>>>
> >>>>> First of all we need to fix the commit subject:
> >>>>>
> >>>>> drm/i915: Increase LSPCON timeout
> >>>>>
> >>>>> (this can be done when merging, no need to resend)
> >>>>>
> >>>>>> 100 ms is not enough time for the LSPCON adapter on Intel NUC devices to
> >>>>>> settle. This causes dropped display modes at driver initialisation.
> >>>>>>
> >>>>>> Increase timeout to 1000 ms.
> >>>>>>
> >>>>>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> >>>>> Missusage of "Fixes:" tag, please read
> >>>>>
> >>>>> https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes
> >>>>>
> >>>>> But also no need for resending... could be fixed when merging
> >>>>>
> >>>>> The right one would be:
> >>>>>
> >>>>> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> >>>>> Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for expected LSPCON mode to settle")
> >>>>> Cc: Shashank Sharma <shashank.sharma@intel.com>
> >>>>> Cc: Imre Deak <imre.deak@intel.com>
> >>>>> Cc: Jani Nikula <jani.nikula@intel.com>
> >>>>> Cc: <stable@vger.kernel.org> # v4.11+
> >>>>>
> >>>>> Since initial 100 seemed to be empirical and this increase seems to
> >>>>> help other cases I'm in favor of this move so
> >>>>>
> >>>>> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >>>>>
> >>>>> However I will wait a bit before merging it
> >>>>> so Imre, Shashank, and/or Jani can take a look here...
> >>>> now, really cc'ing them...
> >>> Shashank? Does this slow down non-LSPCON paths?
> >> This will slow down the lspcon probing and resume part, but both of them
> >> happen only when LSPCON device is found. So to answer your question,
> >> this will not slow down the non-lspcon path, but will slow down the
> >> LSPCON connector resume and probe time. but I would recommend, instead
> >> of increasing it to 1000 ms in a single shot, we might want to gradually
> >> pick this up, on a wake-and-check way.
> > wait_for() checks every [10us, 1ms] until the condition is met, or it
> > times out. So, so long as we don't enter this path for !LPSCON where we
> > know that it will timeout, the wait_for() will only take as long as is
> > required for the connector to settle.
> We wont hit !LSPCON timeout case here, as we have already read the
> dongle signature successfully by now.  But I was thinking that, if the
> spec recommends max wait time as 100ms (which is of course doesn't seem
> enough), if we can't detect i2c-over-aux after first 500ms, I guess we
> wont be able to do that in next 500ms too. So is it really ok to wait
> this long in the resume sequence ?
>
> I guess Fredrik can provide some inputs here on if there are some
> experiments behind this number of 1000ms, or this is just a safe bet ?
> >

My first patch attempt - which is attached to the Red Hat and FDO Bugzilla
bugs - added a retry loop around the original 100 ms timeout. The retry loop
did trigger, but never more than once in a row in my testing.

So possibly 200 ms would be a sufficient timeout, but as the wait_for() loop
terminates early on success I suggested a conservative value of 1000 ms.

> > Can we do other connectors at the same time, or does probing LSPCON
> > block the system?
> We can do other connectors at the same time in DRM layer at-least,
> LSPCON blocks only this connector. I was curious if are we doing this
> during the resume scenario or is this in the sequential get_connector()
> type of call  ?
> - Shashank
> > -Chris
/F
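
The retry-loop observation above can be modelled with a small sketch.
Everything here is hypothetical: wait_once() merely stands in for a single
wait with a fixed timeout against a device that settles at a given time,
and the settle times in the usage note are made-up values, not
measurements. The point is only that "one retry of a 100 ms wait" implies
a settle time somewhere between 100 ms and 200 ms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for one timed wait. The device settles at
 * settle_ms (measured from time 0); *elapsed_ms tracks total time spent.
 * Returns true if the device settled within this wait.
 */
static bool wait_once(unsigned int *elapsed_ms, unsigned int settle_ms,
		      unsigned int timeout_ms)
{
	unsigned int deadline;

	assert(elapsed_ms != NULL);
	deadline = *elapsed_ms + timeout_ms;

	if (settle_ms <= deadline) {
		/* Early exit: only wait as long as the device needs. */
		if (settle_ms > *elapsed_ms)
			*elapsed_ms = settle_ms;
		return true;
	}

	*elapsed_ms = deadline; /* the full timeout was consumed */
	return false;
}

/*
 * Retry the wait up to max_retries extra times.
 * Returns how many retries were needed, or -1 if it never settled.
 */
static int retries_needed(unsigned int settle_ms, unsigned int timeout_ms,
			  int max_retries)
{
	unsigned int elapsed = 0;
	int attempt;

	for (attempt = 0; attempt <= max_retries; attempt++) {
		if (wait_once(&elapsed, settle_ms, timeout_ms))
			return attempt;
	}

	return -1;
}
```

For an assumed settle time of 150 ms, retries_needed(150, 100, 5) reports
one retry, while retries_needed(150, 1000, 0) succeeds with none. A single
longer timeout and a retry loop around a short one cover the same window;
the single timeout is simply less code.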
Saarinen, Jani Aug. 16, 2018, 9:53 a.m. UTC | #8
Hi, 
> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf
> Of Jani Nikula
> Sent: Thursday, 16 August 2018 10:18
> To: Vivi, Rodrigo <rodrigo.vivi@intel.com>; Fredrik Schön
> <fredrikschon@gmail.com>
> Cc: intel-gfx@lists.freedesktop.org; Fredrik Schön
> <fredrik.schon@gmail.com>
> Subject: Re: [Intel-gfx] [PATCH] [drm][i915] Increase LSPCON timeout
> 
> On Wed, 15 Aug 2018, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> > On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo Vivi wrote:
> >> On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik Schön wrote:
> >>
> >> First of all we need to fix the commit subject:
> >>
> >> drm/i915: Increase LSPCON timeout
> >>
> >> (this can be done when merging, no need to resend)
> >>
> >> > 100 ms is not enough time for the LSPCON adapter on Intel NUC
> >> > devices to settle. This causes dropped display modes at driver
> initialisation.
> >> >
> >> > Increase timeout to 1000 ms.
> >> >
> >> > Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> >>
> >> Missusage of "Fixes:" tag, please read
> >>
> >> https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html
> >> #fixes
> >>
> >> But also no need for resending... could be fixed when merging
> >>
> >> The right one would be:
> >>
> >> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> >> Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for expected LSPCON mode
> >> to settle")
> >> Cc: Shashank Sharma <shashank.sharma@intel.com>
> >> Cc: Imre Deak <imre.deak@intel.com>
> >> Cc: Jani Nikula <jani.nikula@intel.com>
> >> Cc: <stable@vger.kernel.org> # v4.11+
> >>
> >> Since initial 100 seemed to be empirical and this increase seems to
> >> help other cases I'm in favor of this move so
> >>
> >> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >>
> >> However I will wait a bit before merging it so Imre, Shashank, and/or
> >> Jani can take a look here...
> >
> > now, really cc'ing them...
> 
> Shashank? Does this slow down non-LSPCON paths?
+ Shashank for real
> 
> BR,
> Jani.
> 
> 
> >
> >>
> >> > Signed-off-by: Fredrik Schön <fredrik.schon@gmail.com>
> >> > ---
> >> >  drivers/gpu/drm/i915/intel_lspcon.c | 2 +-
> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/intel_lspcon.c
> >> > b/drivers/gpu/drm/i915/intel_lspcon.c
> >> > index 8ae8f42f430a..be1b08f589a4 100644
> >> > --- a/drivers/gpu/drm/i915/intel_lspcon.c
> >> > +++ b/drivers/gpu/drm/i915/intel_lspcon.c
> >> > @@ -74,7 +74,7 @@ static enum drm_lspcon_mode
> lspcon_wait_mode(struct intel_lspcon *lspcon,
> >> >  	DRM_DEBUG_KMS("Waiting for LSPCON mode %s to settle\n",
> >> >  		      lspcon_mode_name(mode));
> >> >
> >> > -	wait_for((current_mode = lspcon_get_current_mode(lspcon)) ==
> mode, 100);
> >> > +	wait_for((current_mode = lspcon_get_current_mode(lspcon)) ==
> >> > +mode, 1000);
> >> >  	if (current_mode != mode)
> >> >  		DRM_ERROR("LSPCON mode hasn't settled\n");
> >> >
> >> > --
> >> > 2.17.1
> >> >
> 
> --
> Jani Nikula, Intel Open Source Graphics Center
Rodrigo Vivi Aug. 16, 2018, 6:23 p.m. UTC | #9
On Thu, Aug 16, 2018 at 10:36:43AM +0200, Fredrik Schön wrote:
> Shashank,
> 
> On Thu, 16 Aug 2018 at 10:15, Sharma, Shashank
> <shashank.sharma@intel.com> wrote:
> >
> > Hey Chris,
> >
> >
> > On 8/16/2018 1:13 PM, Chris Wilson wrote:
> > > Quoting Sharma, Shashank (2018-08-16 08:33:36)
> > >> Regards
> > >>
> > >> Shashank
> > >>
> > >>
> > >> On 8/16/2018 12:47 PM, Jani Nikula wrote:
> > >>> On Wed, 15 Aug 2018, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> > >>>> On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo Vivi wrote:
> > >>>>> On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik Schön wrote:
> > >>>>>
> > >>>>> First of all we need to fix the commit subject:
> > >>>>>
> > >>>>> drm/i915: Increase LSPCON timeout
> > >>>>>
> > >>>>> (this can be done when merging, no need to resend)
> > >>>>>
> > >>>>>> 100 ms is not enough time for the LSPCON adapter on Intel NUC devices to
> > >>>>>> settle. This causes dropped display modes at driver initialisation.
> > >>>>>>
> > >>>>>> Increase timeout to 1000 ms.
> > >>>>>>
> > >>>>>> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> > >>>>> Missusage of "Fixes:" tag, please read
> > >>>>>
> > >>>>> https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes
> > >>>>>
> > >>>>> But also no need for resending... could be fixed when merging
> > >>>>>
> > >>>>> The right one would be:
> > >>>>>
> > >>>>> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> > >>>>> Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for expected LSPCON mode to settle")
> > >>>>> Cc: Shashank Sharma <shashank.sharma@intel.com>
> > >>>>> Cc: Imre Deak <imre.deak@intel.com>
> > >>>>> Cc: Jani Nikula <jani.nikula@intel.com>
> > >>>>> Cc: <stable@vger.kernel.org> # v4.11+
> > >>>>>
> > >>>>> Since initial 100 seemed to be empirical and this increase seems to
> > >>>>> help other cases I'm in favor of this move so
> > >>>>>
> > >>>>> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > >>>>>
> > >>>>> However I will wait a bit before merging it
> > >>>>> so Imre, Shashank, and/or Jani can take a look here...
> > >>>> now, really cc'ing them...
> > >>> Shashank? Does this slow down non-LSPCON paths?
> > >> This will slow down the lspcon probing and resume part, but both of them
> > >> happen only when LSPCON device is found. So to answer your question,
> > >> this will not slow down the non-lspcon path, but will slow down the
> > >> LSPCON connector resume and probe time. but I would recommend, instead
> > >> of increasing it to 1000 ms in a single shot, we might want to gradually
> > >> pick this up, on a wake-and-check way.
> > > wait_for() checks every [10us, 1ms] until the condition is met, or it
> > > times out. So, so long as we don't enter this path for !LPSCON where we
> > > know that it will timeout, the wait_for() will only take as long as is
> > > required for the connector to settle.
> > We wont hit !LSPCON timeout case here, as we have already read the
> > dongle signature successfully by now.  But I was thinking that, if the
> > spec recommends max wait time as 100ms (which is of course doesn't seem
> > enough), if we can't detect i2c-over-aux after first 500ms, I guess we
> > wont be able to do that in next 500ms too. So is it really ok to wait
> > this long in the resume sequence ?
> >
> > I guess Fredrik can provide some inputs here on if there are some
> > experiments behind this number of 1000ms, or this is just a safe bet ?
> > >
> 
> My first patch attempt - which is attached to the Redhat and FDO Bugzilla
> bugs - added a retry loop around the original 100 ms timeout. The retry loop
> did trigger, but never more than once in a row in my testing.
> 
> So possibly 200 ms would be a sufficient timeout, but as the wait_for() loop
> terminates early on success I suggested a conservative value of 1000 ms.

Since Shashank mentioned that the 100 ms came from some spec, maybe double that is already
a conservative value.

Since there are concerns about delaying things when LSPCON fails,
and we are possibly looping on connectors somewhere/somehow, I believe we need
a balanced approach here.

Could you please try the 200 ms approach on your case there for a while and
see how it goes?

> 
> > > Can we do other connectors at the same time, or does probing LSPCON
> > > block the system?
> > We can do other connectors at the same time in DRM layer at-least,
> > LSPCON blocks only this connector. I was curious if are we doing this
> > during the resume scenario or is this in the sequential get_connector()
> > type of call  ?
> > - Shashank
> > > -Chris
> /F
Fredrik Schön Aug. 16, 2018, 9:35 p.m. UTC | #10
On Thu, 2018-08-16 at 11:23 -0700, Rodrigo Vivi wrote:
> On Thu, Aug 16, 2018 at 10:36:43AM +0200, Fredrik Schön wrote:
> > Shashank,
> > 
> > On Thu, 16 Aug 2018 at 10:15, Sharma, Shashank
> > <shashank.sharma@intel.com> wrote:
> > > 
> > > Hey Chris,
> > > 
> > > 
> > > On 8/16/2018 1:13 PM, Chris Wilson wrote:
> > > > Quoting Sharma, Shashank (2018-08-16 08:33:36)
> > > > > Regards
> > > > > 
> > > > > Shashank
> > > > > 
> > > > > 
> > > > > On 8/16/2018 12:47 PM, Jani Nikula wrote:
> > > > > > On Wed, 15 Aug 2018, Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > > wrote:
> > > > > > > On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo Vivi
> > > > > > > wrote:
> > > > > > > > On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik Schön
> > > > > > > > wrote:
> > > > > > > > 
> > > > > > > > First of all we need to fix the commit subject:
> > > > > > > > 
> > > > > > > > drm/i915: Increase LSPCON timeout
> > > > > > > > 
> > > > > > > > (this can be done when merging, no need to resend)
> > > > > > > > 
> > > > > > > > > 100 ms is not enough time for the LSPCON adapter on
> > > > > > > > > Intel NUC devices to
> > > > > > > > > settle. This causes dropped display modes at driver
> > > > > > > > > initialisation.
> > > > > > > > > 
> > > > > > > > > Increase timeout to 1000 ms.
> > > > > > > > > 
> > > > > > > > > Fixes: 
> > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> > > > > > > > 
> > > > > > > > Missusage of "Fixes:" tag, please read
> > > > > > > > 
> > > > > > > > https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes
> > > > > > > > 
> > > > > > > > But also no need for resending... could be fixed when
> > > > > > > > merging
> > > > > > > > 
> > > > > > > > The right one would be:
> > > > > > > > 
> > > > > > > > Bugzilla: 
> > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> > > > > > > > Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for
> > > > > > > > expected LSPCON mode to settle")
> > > > > > > > Cc: Shashank Sharma <shashank.sharma@intel.com>
> > > > > > > > Cc: Imre Deak <imre.deak@intel.com>
> > > > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > > > > > > Cc: <stable@vger.kernel.org> # v4.11+
> > > > > > > > 
> > > > > > > > Since initial 100 seemed to be empirical and this
> > > > > > > > increase seems to
> > > > > > > > help other cases I'm in favor of this move so
> > > > > > > > 
> > > > > > > > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > > > > 
> > > > > > > > However I will wait a bit before merging it
> > > > > > > > so Imre, Shashank, and/or Jani can take a look here...
> > > > > > > 
> > > > > > > now, really cc'ing them...
> > > > > > 
> > > > > > Shashank? Does this slow down non-LSPCON paths?
> > > > > 
> > > > > This will slow down the lspcon probing and resume part, but
> > > > > both of them
> > > > > happen only when LSPCON device is found. So to answer your
> > > > > question,
> > > > > this will not slow down the non-lspcon path, but will slow
> > > > > down the
> > > > > LSPCON connector resume and probe time. but I would
> > > > > recommend, instead
> > > > > of increasing it to 1000 ms in a single shot, we might want
> > > > > to gradually
> > > > > pick this up, on a wake-and-check way.
> > > > 
> > > > wait_for() checks every [10us, 1ms] until the condition is met,
> > > > or it
> > > > times out. So, so long as we don't enter this path for !LPSCON
> > > > where we
> > > > know that it will timeout, the wait_for() will only take as
> > > > long as is
> > > > required for the connector to settle.
> > > 
> > > We wont hit !LSPCON timeout case here, as we have already read
> > > the
> > > dongle signature successfully by now.  But I was thinking that,
> > > if the
> > > spec recommends max wait time as 100ms (which is of course
> > > doesn't seem
> > > enough), if we can't detect i2c-over-aux after first 500ms, I
> > > guess we
> > > wont be able to do that in next 500ms too. So is it really ok to
> > > wait
> > > this long in the resume sequence ?
> > > 
> > > I guess Fredrik can provide some inputs here on if there are some
> > > experiments behind this number of 1000ms, or this is just a safe
> > > bet ?
> > > > 
> > 
> > My first patch attempt - which is attached to the Redhat and FDO
> > Bugzilla
> > bugs - added a retry loop around the original 100 ms timeout. The
> > retry loop
> > did trigger, but never more than once in a row in my testing.
> > 
> > So possibly 200 ms would be a sufficient timeout, but as the
> > wait_for() loop
> > terminates early on success I suggested a conservative value of
> > 1000 ms.
> 
> Since Shashank mentioned that the 100 ms came from some spec, maybe
> double that is already a conservative value.
> 
> Since there is the concerns of delaying something when LSPCON fails
> and we are possibly looping on connectors somewhere/somehow I believe
> we need
> to have a balanced approach here.
> 
> could you please try the 200 ms approach on your case there for a
> while and
> see how it goes?
> 

I ran a few stress tests using Nicholas's test case from [1]. I can
quickly reproduce the failure with timeouts 100 ms, 110 ms, 130 ms, 150
ms and 170 ms. I am unable to reproduce any failures with timeouts 190
ms (n=18) and 200 ms (n=20+16).

So while 200 ms appears to work on my hardware with reasonable
confidence, I wouldn't call 200 conservative. But then again, I do not
know the specifications. I'm just being empirical.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=107503#c15

> > 
> > > > Can we do other connectors at the same time, or does probing
> > > > LSPCON
> > > > block the system?
> > > 
> > > We can do other connectors at the same time in DRM layer at-
> > > least,
> > > LSPCON blocks only this connector. I was curious if are we doing
> > > this
> > > during the resume scenario or is this in the sequential
> > > get_connector()
> > > type of call  ?
> > > - Shashank
> > > > -Chris
> > 
> > /F
Rodrigo Vivi Aug. 16, 2018, 9:43 p.m. UTC | #11
On Thu, Aug 16, 2018 at 11:35:26PM +0200, fredrikschon@gmail.com wrote:
> On Thu, 2018-08-16 at 11:23 -0700, Rodrigo Vivi wrote:
> > On Thu, Aug 16, 2018 at 10:36:43AM +0200, Fredrik Schön wrote:
> > > Shashank,
> > > 
> > > On Thu, 16 Aug 2018 at 10:15, Sharma, Shashank
> > > <shashank.sharma@intel.com> wrote:
> > > > 
> > > > Hey Chris,
> > > > 
> > > > 
> > > > On 8/16/2018 1:13 PM, Chris Wilson wrote:
> > > > > Quoting Sharma, Shashank (2018-08-16 08:33:36)
> > > > > > Regards
> > > > > > 
> > > > > > Shashank
> > > > > > 
> > > > > > 
> > > > > > On 8/16/2018 12:47 PM, Jani Nikula wrote:
> > > > > > > On Wed, 15 Aug 2018, Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > > > wrote:
> > > > > > > > On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo Vivi
> > > > > > > > wrote:
> > > > > > > > > On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik Schön
> > > > > > > > > wrote:
> > > > > > > > > 
> > > > > > > > > First of all we need to fix the commit subject:
> > > > > > > > > 
> > > > > > > > > drm/i915: Increase LSPCON timeout
> > > > > > > > > 
> > > > > > > > > (this can be done when merging, no need to resend)
> > > > > > > > > 
> > > > > > > > > > 100 ms is not enough time for the LSPCON adapter on
> > > > > > > > > > Intel NUC devices to
> > > > > > > > > > settle. This causes dropped display modes at driver
> > > > > > > > > > initialisation.
> > > > > > > > > > 
> > > > > > > > > > Increase timeout to 1000 ms.
> > > > > > > > > > 
> > > > > > > > > > Fixes: 
> > > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> > > > > > > > > 
> > > > > > > > > Missusage of "Fixes:" tag, please read
> > > > > > > > > 
> > > > > > > > > 
> https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes
> > > > > > > > > 
> > > > > > > > > But also no need for resending... could be fixed when
> > > > > > > > > merging
> > > > > > > > > 
> > > > > > > > > The right one would be:
> > > > > > > > > 
> > > > > > > > > Bugzilla: 
> > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> > > > > > > > > Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for
> > > > > > > > > expected LSPCON mode to settle")
> > > > > > > > > Cc: Shashank Sharma <shashank.sharma@intel.com>
> > > > > > > > > Cc: Imre Deak <imre.deak@intel.com>
> > > > > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > > > > > > > Cc: <stable@vger.kernel.org> # v4.11+
> > > > > > > > > 
> > > > > > > > > Since initial 100 seemed to be empirical and this
> > > > > > > > > increase seems to
> > > > > > > > > help other cases I'm in favor of this move so
> > > > > > > > > 
> > > > > > > > > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > > > > > 
> > > > > > > > > However I will wait a bit before merging it
> > > > > > > > > so Imre, Shashank, and/or Jani can take a look here...
> > > > > > > > 
> > > > > > > > now, really cc'ing them...
> > > > > > > 
> > > > > > > Shashank? Does this slow down non-LSPCON paths?
> > > > > > 
> > > > > > This will slow down the lspcon probing and resume part, but
> > > > > > both of them
> > > > > > happen only when LSPCON device is found. So to answer your
> > > > > > question,
> > > > > > this will not slow down the non-lspcon path, but will slow
> > > > > > down the
> > > > > > LSPCON connector resume and probe time. but I would
> > > > > > recommend, instead
> > > > > > of increasing it to 1000 ms in a single shot, we might want
> > > > > > to gradually
> > > > > > pick this up, on a wake-and-check way.
> > > > > 
> > > > > wait_for() checks every [10us, 1ms] until the condition is met,
> > > > > or it
> > > > > times out. So, so long as we don't enter this path for !LPSCON
> > > > > where we
> > > > > know that it will timeout, the wait_for() will only take as
> > > > > long as is
> > > > > required for the connector to settle.
> > > > 
> > > > We wont hit !LSPCON timeout case here, as we have already read
> > > > the
> > > > dongle signature successfully by now.  But I was thinking that,
> > > > if the
> > > > spec recommends max wait time as 100ms (which is of course
> > > > doesn't seem
> > > > enough), if we can't detect i2c-over-aux after first 500ms, I
> > > > guess we
> > > > wont be able to do that in next 500ms too. So is it really ok to
> > > > wait
> > > > this long in the resume sequence ?
> > > > 
> > > > I guess Fredrik can provide some inputs here on if there are some
> > > > experiments behind this number of 1000ms, or this is just a safe
> > > > bet ?
> > > > > 
> > > 
> > > My first patch attempt - which is attached to the Redhat and FDO
> > > Bugzilla
> > > bugs - added a retry loop around the original 100 ms timeout. The
> > > retry loop
> > > did trigger, but never more than once in a row in my testing.
> > > 
> > > So possibly 200 ms would be a sufficient timeout, but as the
> > > wait_for() loop
> > > terminates early on success I suggested a conservative value of
> > > 1000 ms.
> > 
> > Since Shashank mentioned that the 100 ms came from some spec, maybe
> > double that is already a conservative value.
> > 
> > Since there is the concerns of delaying something when LSPCON fails
> > and we are possibly looping on connectors somewhere/somehow I believe
> > we need
> > to have a balanced approach here.
> > 
> > could you please try the 200 ms approach on your case there for a
> > while and
> > see how it goes?
> > 
> 
> I ran a few stress tests using Nicholas test case from [1]. I can
> quickly reproduce the failure with timeouts 100 ms, 110 ms, 130 ms, 150
> ms and 170 ms. I am unable to reproduce any failures with timeouts 190
> ms (n=18) and 200 ms (n=20+16).
> 
> So while 200 ms appears to work on my hardware with reasonable
> confidence, I wouldn't call 200 conservative. But then again, I do not
> know the specifications. I'm just being empirical.

I don't know this specification either, and if it exists, the empirical
results show that either it is wrong or we have another bug somewhere else.

So... let's call 400 safe enough for now then?!

> 
> [1] https://bugs.freedesktop.org/show_bug.cgi?id=107503#c15
> 
> > > 
> > > > > Can we do other connectors at the same time, or does probing
> > > > > LSPCON
> > > > > block the system?
> > > > 
> > > > We can do other connectors at the same time in DRM layer at-
> > > > least,
> > > > LSPCON blocks only this connector. I was curious if are we doing
> > > > this
> > > > during the resume scenario or is this in the sequential
> > > > get_connector()
> > > > type of call  ?
> > > > - Shashank
> > > > > -Chris
> > > 
> > > /F
Fredrik Schön Aug. 16, 2018, 9:51 p.m. UTC | #12
On Thu, 2018-08-16 at 14:43 -0700, Rodrigo Vivi wrote:
> On Thu, Aug 16, 2018 at 11:35:26PM +0200, fredrikschon@gmail.com
> wrote:
> > On Thu, 2018-08-16 at 11:23 -0700, Rodrigo Vivi wrote:
> > > On Thu, Aug 16, 2018 at 10:36:43AM +0200, Fredrik Schön wrote:
> > > > Shashank,
> > > > 
> > > > On Thu, 16 Aug 2018 at 10:15, Sharma, Shashank
> > > > <shashank.sharma@intel.com> wrote:
> > > > > 
> > > > > Hey Chris,
> > > > > 
> > > > > 
> > > > > On 8/16/2018 1:13 PM, Chris Wilson wrote:
> > > > > > Quoting Sharma, Shashank (2018-08-16 08:33:36)
> > > > > > > Regards
> > > > > > > 
> > > > > > > Shashank
> > > > > > > 
> > > > > > > 
> > > > > > > On 8/16/2018 12:47 PM, Jani Nikula wrote:
> > > > > > > > On Wed, 15 Aug 2018, Rodrigo Vivi <
> > > > > > > > rodrigo.vivi@intel.com>
> > > > > > > > wrote:
> > > > > > > > > On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo
> > > > > > > > > Vivi
> > > > > > > > > wrote:
> > > > > > > > > > On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik
> > > > > > > > > > Schön
> > > > > > > > > > wrote:
> > > > > > > > > > 
> > > > > > > > > > First of all we need to fix the commit subject:
> > > > > > > > > > 
> > > > > > > > > > drm/i915: Increase LSPCON timeout
> > > > > > > > > > 
> > > > > > > > > > (this can be done when merging, no need to resend)
> > > > > > > > > > 
> > > > > > > > > > > 100 ms is not enough time for the LSPCON adapter
> > > > > > > > > > > on
> > > > > > > > > > > Intel NUC devices to
> > > > > > > > > > > settle. This causes dropped display modes at
> > > > > > > > > > > driver
> > > > > > > > > > > initialisation.
> > > > > > > > > > > 
> > > > > > > > > > > Increase timeout to 1000 ms.
> > > > > > > > > > > 
> > > > > > > > > > > Fixes: 
> > > > > > > > > > > 
https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> > > > > > > > > > 
> > > > > > > > > > Missusage of "Fixes:" tag, please read
> > > > > > > > > > 
> > > > > > > > > > 
> > 
> > 
https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes
> > > > > > > > > > 
> > > > > > > > > > But also no need for resending... could be fixed
> > > > > > > > > > when
> > > > > > > > > > merging
> > > > > > > > > > 
> > > > > > > > > > The right one would be:
> > > > > > > > > > 
> > > > > > > > > > Bugzilla: 
> > > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> > > > > > > > > > Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for
> > > > > > > > > > expected LSPCON mode to settle")
> > > > > > > > > > Cc: Shashank Sharma <shashank.sharma@intel.com>
> > > > > > > > > > Cc: Imre Deak <imre.deak@intel.com>
> > > > > > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > > > > > > > > Cc: <stable@vger.kernel.org> # v4.11+
> > > > > > > > > > 
> > > > > > > > > > Since initial 100 seemed to be empirical and this
> > > > > > > > > > increase seems to
> > > > > > > > > > help other cases I'm in favor of this move so
> > > > > > > > > > 
> > > > > > > > > > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > > > > > > 
> > > > > > > > > > However I will wait a bit before merging it
> > > > > > > > > > so Imre, Shashank, and/or Jani can take a look
> > > > > > > > > > here...
> > > > > > > > > 
> > > > > > > > > now, really cc'ing them...
> > > > > > > > 
> > > > > > > > Shashank? Does this slow down non-LSPCON paths?
> > > > > > > 
> > > > > > > This will slow down the lspcon probing and resume part,
> > > > > > > but
> > > > > > > both of them
> > > > > > > happen only when LSPCON device is found. So to answer
> > > > > > > your
> > > > > > > question,
> > > > > > > this will not slow down the non-lspcon path, but will
> > > > > > > slow
> > > > > > > down the
> > > > > > > LSPCON connector resume and probe time. but I would
> > > > > > > recommend, instead
> > > > > > > of increasing it to 1000 ms in a single shot, we might
> > > > > > > want
> > > > > > > to gradually
> > > > > > > pick this up, on a wake-and-check way.
> > > > > > 
> > > > > > wait_for() checks every [10us, 1ms] until the condition is
> > > > > > met,
> > > > > > or it
> > > > > > times out. So, so long as we don't enter this path for
> > > > > > !LPSCON
> > > > > > where we
> > > > > > know that it will timeout, the wait_for() will only take as
> > > > > > long as is
> > > > > > required for the connector to settle.
> > > > > 
> > > > > We wont hit !LSPCON timeout case here, as we have already
> > > > > read
> > > > > the
> > > > > dongle signature successfully by now.  But I was thinking
> > > > > that,
> > > > > if the
> > > > > spec recommends max wait time as 100ms (which is of course
> > > > > doesn't seem
> > > > > enough), if we can't detect i2c-over-aux after first 500ms, I
> > > > > guess we
> > > > > wont be able to do that in next 500ms too. So is it really ok
> > > > > to
> > > > > wait
> > > > > this long in the resume sequence ?
> > > > > 
> > > > > I guess Fredrik can provide some inputs here on if there are
> > > > > some
> > > > > experiments behind this number of 1000ms, or this is just a
> > > > > safe
> > > > > bet ?
> > > > > > 
> > > > 
> > > > My first patch attempt - which is attached to the Redhat and
> > > > FDO
> > > > Bugzilla
> > > > bugs - added a retry loop around the original 100 ms timeout.
> > > > The
> > > > retry loop
> > > > did trigger, but never more than once in a row in my testing.
> > > > 
> > > > So possibly 200 ms would be a sufficient timeout, but as the
> > > > wait_for() loop
> > > > terminates early on success I suggested a conservative value of
> > > > 1000 ms.
> > > 
> > > Since Shashank mentioned that the 100 ms came from some spec, maybe
> > > double that is already a conservative value.
> > > 
> > > Since there is the concerns of delaying something when LSPCON
> > > fails
> > > and we are possibly looping on connectors somewhere/somehow I
> > > believe
> > > we need
> > > to have a balanced approach here.
> > > 
> > > could you please try the 200 ms approach on your case there for a
> > > while and
> > > see how it goes?
> > > 
> > 
> > I ran a few stress tests using Nicholas test case from [1]. I can
> > quickly reproduce the failure with timeouts 100 ms, 110 ms, 130 ms,
> > 150
> > ms and 170 ms. I am unable to reproduce any failures with timeouts
> > 190
> > ms (n=18) and 200 ms (n=20+16).
> > 
> > So while 200 ms appears to work on my hardware with reasonable
> > confidence, I wouldn't call 200 conservative. But then again, I do
> > not
> > know the specifications. I'm just being empirical.
> 
> I don't know this specification either and if that exists the
> empirical
> shows that it is wrong or we have another bug somewhere else.
> 
> So... let's call 400 safe enough for now then?!
> 

Sounds reasonable. Do you want me to respin the patch?
> > 
> > [1] https://bugs.freedesktop.org/show_bug.cgi?id=107503#c15
> > 
> > > > 
> > > > > > Can we do other connectors at the same time, or does
> > > > > > probing
> > > > > > LSPCON
> > > > > > block the system?
> > > > > 
> > > > > We can do other connectors at the same time in DRM layer at-
> > > > > least,
> > > > > LSPCON blocks only this connector. I was curious if are we
> > > > > doing
> > > > > this
> > > > > during the resume scenario or is this in the sequential
> > > > > get_connector()
> > > > > type of call  ?
> > > > > - Shashank
> > > > > > -Chris
> > > > 
> > > > /F
Rodrigo Vivi Aug. 16, 2018, 9:58 p.m. UTC | #13
On Thu, Aug 16, 2018 at 11:51:07PM +0200, Fredrik Schön wrote:
> On Thu, 2018-08-16 at 14:43 -0700, Rodrigo Vivi wrote:
> > On Thu, Aug 16, 2018 at 11:35:26PM +0200, fredrikschon@gmail.com
> > wrote:
> > > On Thu, 2018-08-16 at 11:23 -0700, Rodrigo Vivi wrote:
> > > > On Thu, Aug 16, 2018 at 10:36:43AM +0200, Fredrik Schön wrote:
> > > > > Shashank,
> > > > > 
> > > > > On Thu, 16 Aug 2018 at 10:15, Sharma, Shashank
> > > > > <shashank.sharma@intel.com> wrote:
> > > > > > 
> > > > > > Hey Chris,
> > > > > > 
> > > > > > 
> > > > > > On 8/16/2018 1:13 PM, Chris Wilson wrote:
> > > > > > > Quoting Sharma, Shashank (2018-08-16 08:33:36)
> > > > > > > > Regards
> > > > > > > > 
> > > > > > > > Shashank
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On 8/16/2018 12:47 PM, Jani Nikula wrote:
> > > > > > > > > On Wed, 15 Aug 2018, Rodrigo Vivi <
> > > > > > > > > rodrigo.vivi@intel.com>
> > > > > > > > > wrote:
> > > > > > > > > > On Wed, Aug 15, 2018 at 03:39:40PM -0700, Rodrigo
> > > > > > > > > > Vivi
> > > > > > > > > > wrote:
> > > > > > > > > > > On Mon, Aug 13, 2018 at 07:51:33PM +0200, Fredrik
> > > > > > > > > > > Schön
> > > > > > > > > > > wrote:
> > > > > > > > > > > 
> > > > > > > > > > > First of all we need to fix the commit subject:
> > > > > > > > > > > 
> > > > > > > > > > > drm/i915: Increase LSPCON timeout
> > > > > > > > > > > 
> > > > > > > > > > > (this can be done when merging, no need to resend)
> > > > > > > > > > > 
> > > > > > > > > > > > 100 ms is not enough time for the LSPCON adapter
> > > > > > > > > > > > on
> > > > > > > > > > > > Intel NUC devices to
> > > > > > > > > > > > settle. This causes dropped display modes at
> > > > > > > > > > > > driver
> > > > > > > > > > > > initialisation.
> > > > > > > > > > > > 
> > > > > > > > > > > > Increase timeout to 1000 ms.
> > > > > > > > > > > > 
> > > > > > > > > > > > Fixes: 
> > > > > > > > > > > > 
> https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> > > > > > > > > > > 
> > > > > > > > > > > Missusage of "Fixes:" tag, please read
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > 
> > > 
> https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-intel.html#fixes
> > > > > > > > > > > 
> > > > > > > > > > > But also no need for resending... could be fixed
> > > > > > > > > > > when
> > > > > > > > > > > merging
> > > > > > > > > > > 
> > > > > > > > > > > The right one would be:
> > > > > > > > > > > 
> > > > > > > > > > > Bugzilla: 
> > > > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1570392
> > > > > > > > > > > Fixes: 357c0ae9198a ("drm/i915/lspcon: Wait for
> > > > > > > > > > > expected LSPCON mode to settle")
> > > > > > > > > > > Cc: Shashank Sharma <shashank.sharma@intel.com>
> > > > > > > > > > > Cc: Imre Deak <imre.deak@intel.com>
> > > > > > > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > > > > > > > > > Cc: <stable@vger.kernel.org> # v4.11+
> > > > > > > > > > > 
> > > > > > > > > > > Since initial 100 seemed to be empirical and this
> > > > > > > > > > > increase seems to
> > > > > > > > > > > help other cases I'm in favor of this move so
> > > > > > > > > > > 
> > > > > > > > > > > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > > > > > > > 
> > > > > > > > > > > However I will wait a bit before merging it
> > > > > > > > > > > so Imre, Shashank, and/or Jani can take a look
> > > > > > > > > > > here...
> > > > > > > > > > 
> > > > > > > > > > now, really cc'ing them...
> > > > > > > > > 
> > > > > > > > > Shashank? Does this slow down non-LSPCON paths?
> > > > > > > > 
> > > > > > > > This will slow down the lspcon probing and resume part,
> > > > > > > > but
> > > > > > > > both of them
> > > > > > > > happen only when LSPCON device is found. So to answer
> > > > > > > > your
> > > > > > > > question,
> > > > > > > > this will not slow down the non-lspcon path, but will
> > > > > > > > slow
> > > > > > > > down the
> > > > > > > > LSPCON connector resume and probe time. but I would
> > > > > > > > recommend, instead
> > > > > > > > of increasing it to 1000 ms in a single shot, we might
> > > > > > > > want
> > > > > > > > to gradually
> > > > > > > > pick this up, on a wake-and-check way.
> > > > > > > 
> > > > > > > wait_for() checks every [10us, 1ms] until the condition is
> > > > > > > met,
> > > > > > > or it
> > > > > > > times out. So, so long as we don't enter this path for
> > > > > > > !LPSCON
> > > > > > > where we
> > > > > > > know that it will timeout, the wait_for() will only take as
> > > > > > > long as is
> > > > > > > required for the connector to settle.
> > > > > > 
> > > > > > We wont hit !LSPCON timeout case here, as we have already
> > > > > > read
> > > > > > the
> > > > > > dongle signature successfully by now.  But I was thinking
> > > > > > that,
> > > > > > if the
> > > > > > spec recommends max wait time as 100ms (which is of course
> > > > > > doesn't seem
> > > > > > enough), if we can't detect i2c-over-aux after first 500ms, I
> > > > > > guess we
> > > > > > wont be able to do that in next 500ms too. So is it really ok
> > > > > > to
> > > > > > wait
> > > > > > this long in the resume sequence ?
> > > > > > 
> > > > > > I guess Fredrik can provide some inputs here on if there are
> > > > > > some
> > > > > > experiments behind this number of 1000ms, or this is just a
> > > > > > safe
> > > > > > bet ?
> > > > > > > 
> > > > > 
> > > > > My first patch attempt - which is attached to the Redhat and
> > > > > FDO
> > > > > Bugzilla
> > > > > bugs - added a retry loop around the original 100 ms timeout.
> > > > > The
> > > > > retry loop
> > > > > did trigger, but never more than once in a row in my testing.
> > > > > 
> > > > > So possibly 200 ms would be a sufficient timeout, but as the
> > > > > wait_for() loop
> > > > > terminates early on success I suggested a conservative value of
> > > > > 1000 ms.
> > > > 
> > > > Since Shashank mentioned that the 100 ms came from some spec, maybe
> > > > double that is already a conservative value.
> > > > 
> > > > Since there is the concerns of delaying something when LSPCON
> > > > fails
> > > > and we are possibly looping on connectors somewhere/somehow I
> > > > believe
> > > > we need
> > > > to have a balanced approach here.
> > > > 
> > > > could you please try the 200 ms approach on your case there for a
> > > > while and
> > > > see how it goes?
> > > > 
> > > 
> > > I ran a few stress tests using Nicholas test case from [1]. I can
> > > quickly reproduce the failure with timeouts 100 ms, 110 ms, 130 ms,
> > > 150
> > > ms and 170 ms. I am unable to reproduce any failures with timeouts
> > > 190
> > > ms (n=18) and 200 ms (n=20+16).
> > > 
> > > So while 200 ms appears to work on my hardware with reasonable
> > > confidence, I wouldn't call 200 conservative. But then again, I do
> > > not
> > > know the specifications. I'm just being empirical.
> > 
> > I don't know this specification either and if that exists the
> > empirical
> > shows that it is wrong or we have another bug somewhere else.
> > 
> > So... let's call 400 safe enough for now then?!
> > 
> 
> Sound reasonable. Do you want me to respin the patch?

We can give the folks in other time zones a day to ack, nack, or make
other suggestions. After that, yes, please send a v2.

Thanks,
Rodrigo.

> > > 
> > > [1] https://bugs.freedesktop.org/show_bug.cgi?id=107503#c15
> > > 
> > > > > 
> > > > > > > Can we do other connectors at the same time, or does
> > > > > > > probing
> > > > > > > LSPCON
> > > > > > > block the system?
> > > > > > 
> > > > > > We can do other connectors at the same time in DRM layer at-
> > > > > > least,
> > > > > > LSPCON blocks only this connector. I was curious if are we
> > > > > > doing
> > > > > > this
> > > > > > during the resume scenario or is this in the sequential
> > > > > > get_connector()
> > > > > > type of call  ?
> > > > > > - Shashank
> > > > > > > -Chris
> > > > > 
> > > > > /F
Patch

diff --git a/drivers/gpu/drm/i915/intel_lspcon.c b/drivers/gpu/drm/i915/intel_lspcon.c
index 8ae8f42f430a..be1b08f589a4 100644
--- a/drivers/gpu/drm/i915/intel_lspcon.c
+++ b/drivers/gpu/drm/i915/intel_lspcon.c
@@ -74,7 +74,7 @@  static enum drm_lspcon_mode lspcon_wait_mode(struct intel_lspcon *lspcon,
 	DRM_DEBUG_KMS("Waiting for LSPCON mode %s to settle\n",
 		      lspcon_mode_name(mode));
 
-	wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode, 100);
+	wait_for((current_mode = lspcon_get_current_mode(lspcon)) == mode, 1000);
 	if (current_mode != mode)
 		DRM_ERROR("LSPCON mode hasn't settled\n");
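
Chris's point in the thread, that a polling wait returns as soon as the
condition holds so a larger timeout only costs time in the failure case,
can be sketched in user space. This is not the i915 wait_for() macro:
fake_lspcon, fake_lspcon_start() and poll_until_settled() are hypothetical
names, and the 10 us to 1 ms backoff only mirrors the interval Chris quotes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the adapter: it reports the target mode once
 * settle_ms milliseconds have passed since fake_lspcon_start(). */
struct fake_lspcon {
	struct timespec start;
	long settle_ms;
};

/* Milliseconds elapsed on the monotonic clock since *since. */
static long elapsed_ms(const struct timespec *since)
{
	struct timespec now;
	long long ns;

	clock_gettime(CLOCK_MONOTONIC, &now);
	ns = (long long)(now.tv_sec - since->tv_sec) * 1000000000LL +
	     (now.tv_nsec - since->tv_nsec);
	return (long)(ns / 1000000LL);
}

static struct fake_lspcon fake_lspcon_start(long settle_ms)
{
	struct fake_lspcon dev = { .settle_ms = settle_ms };

	clock_gettime(CLOCK_MONOTONIC, &dev.start);
	return dev;
}

static bool fake_lspcon_settled(const struct fake_lspcon *dev)
{
	return elapsed_ms(&dev->start) >= dev->settle_ms;
}

/*
 * Poll until the device settles or timeout_ms expires. The sleep between
 * polls doubles from 10 us and is capped at 1 ms. Returns the elapsed
 * time in ms on success, or -1 on timeout.
 */
static long poll_until_settled(const struct fake_lspcon *dev, long timeout_ms)
{
	struct timespec begin;
	long wait_us = 10;

	assert(timeout_ms >= 0);
	clock_gettime(CLOCK_MONOTONIC, &begin);

	for (;;) {
		/* Sample expiry first so the condition gets one last check. */
		bool expired = elapsed_ms(&begin) >= timeout_ms;

		if (fake_lspcon_settled(dev))
			return elapsed_ms(&begin);
		if (expired)
			return -1;

		struct timespec nap = { 0, wait_us * 1000L };
		nanosleep(&nap, NULL);
		wait_us = (wait_us * 2 > 1000) ? 1000 : wait_us * 2;
	}
}
```

Under these assumptions, a device that settles after about 190 ms makes
poll_until_settled(&dev, 400) and poll_until_settled(&dev, 1000) return
after roughly the same delay; only a device that never settles pays the
full timeout, which is the cost Shashank raises for the resume path.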