diff mbox

drm/i915/dp: Give up link training clock recovery after 10 failed attempts

Message ID 1531503135-19153-1-git-send-email-nathan.d.ciobanu@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Nathan Ciobanu July 13, 2018, 5:32 p.m. UTC
Limit the link training clock recovery loop to 10 failed attempts at
LANEx_CR_DONE per DP 1.4 spec. Some USB-C MST hubs cause us to get
stuck in this loop on hot-plugging indefinitely as
drm_dp_clock_recovery_ok() never returns true and none of the
other conditions occur.

Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_dp_link_training.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Rodrigo Vivi July 13, 2018, 9:22 p.m. UTC | #1
On Fri, Jul 13, 2018 at 10:32:15AM -0700, Nathan Ciobanu wrote:
> Limit the link training clock recovery loop to 10 failed attempts at
> LANEx_CR_DONE per DP 1.4 spec.

Where exactly in the spec?

> Some USB-C MST hubs cause us to get
> stuck in this loop on hot-plugging indefinitely as
> drm_dp_clock_recovery_ok() never returns true and none of the
> other conditions occur.

Although it seems really bad situation that we need to avoid...

> 
> Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_dp_link_training.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c b/drivers/gpu/drm/i915/intel_dp_link_training.c
> index 4da6e33c7fa1..66c1a70343ba 100644
> --- a/drivers/gpu/drm/i915/intel_dp_link_training.c
> +++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
> @@ -129,7 +129,7 @@ static bool intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
>  intel_dp_link_training_clock_recovery(struct intel_dp *intel_dp)
>  {
>  	uint8_t voltage;
> -	int voltage_tries, max_vswing_tries;
> +	int voltage_tries, max_vswing_tries, cr_tries;
>  	uint8_t link_config[2];
>  	uint8_t link_bw, rate_select;
>  
> @@ -172,6 +172,7 @@ static bool intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
>  
>  	voltage_tries = 1;
>  	max_vswing_tries = 0;
> +	cr_tries = 0;
>  	for (;;) {
>  		uint8_t link_status[DP_LINK_STATUS_SIZE];
>  
> @@ -215,6 +216,11 @@ static bool intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
>  		if (intel_dp_link_max_vswing_reached(intel_dp))
>  			++max_vswing_tries;
>  
> +		if (cr_tries == 9) {
> +			DRM_ERROR("Failed clock recovery 10 times, giving up!\n");
> +			return false;
> +		}
> +		++cr_tries;

If I understood correctly this is a global thing for the for(;;) right?

Shouldn't we make then like a:

- for(;;)
+ for(cr_tries = 0; cr_tries < 10; cr_tries++)
	{
	}

+ DRM_ERROR("Failed clock recovery 10 times, giving up!\n"); 
+ return false;
}

Thanks,
Rodrigo.

>  	}
>  }
>  
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Navare, Manasi July 13, 2018, 10:18 p.m. UTC | #2
On Fri, Jul 13, 2018 at 03:23:41PM -0700, Dhinakaran Pandiyan wrote:
> On Fri, 2018-07-13 at 14:22 -0700, Rodrigo Vivi wrote:
> > On Fri, Jul 13, 2018 at 10:32:15AM -0700, Nathan Ciobanu wrote:
> > > 
> > > Limit the link training clock recovery loop to 10 failed attempts
> > > at
> > > LANEx_CR_DONE per DP 1.4 spec.
> > Where exactly in the spec?

I see that this is specified in 3.5.1.2.2 Clock Recovery Sequence section
in DP 1.4 spec. Might be a good idea to mention this expliitly in the commit message.

> > 
> > > 
> > > Some USB-C MST hubs cause us to get
> > > stuck in this loop on hot-plugging indefinitely as

This is likely to occur in case of USB Type C cases where only fewer lanes
might be connected whereas the max lane count returned is still 4.
If this is the case, explicitly mentioning this is also a good idea.

Doesnt this get recovered at all after the fallback when it will eventually
try with the lower lane count?

> 
> Also include the information (the vswing toggling part) about why it is
> stuck in the loop. 
> 
> > > drm_dp_clock_recovery_ok() never returns true and none of the
> > > other conditions occur.
> > Although it seems really bad situation that we need to avoid...
> > 
> > > 
> > > 
> > > Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/intel_dp_link_training.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c
> > > b/drivers/gpu/drm/i915/intel_dp_link_training.c
> > > index 4da6e33c7fa1..66c1a70343ba 100644
> > > --- a/drivers/gpu/drm/i915/intel_dp_link_training.c
> > > +++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
> > > @@ -129,7 +129,7 @@ static bool
> > > intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> > >  intel_dp_link_training_clock_recovery(struct intel_dp *intel_dp)
> > >  {
> > >  	uint8_t voltage;
> > > -	int voltage_tries, max_vswing_tries;
> > > +	int voltage_tries, max_vswing_tries, cr_tries;
> > >  	uint8_t link_config[2];
> > >  	uint8_t link_bw, rate_select;
> > >  
> > > @@ -172,6 +172,7 @@ static bool
> > > intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> > >  
> > >  	voltage_tries = 1;
> > >  	max_vswing_tries = 0;
> > > +	cr_tries = 0;
> > >  	for (;;) {
> > >  		uint8_t link_status[DP_LINK_STATUS_SIZE];
> > >  
> > > @@ -215,6 +216,11 @@ static bool
> > > intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> > >  		if (intel_dp_link_max_vswing_reached(intel_dp))
> > >  			++max_vswing_tries;
> > >  
> > > +		if (cr_tries == 9) {
> > > +			DRM_ERROR("Failed clock recovery 10 times,
> > > giving up!\n");
> > > +			return false;
> > > +		}
> > > +		++cr_tries;
> > If I understood correctly this is a global thing for the for(;;)
> > right?
> > 
> > Shouldn't we make then like a:
> > 
> > - for(;;)
> > + for(cr_tries = 0; cr_tries < 10; cr_tries++)
> > 	{
> > 	}

I think incrementing it at the end of the loop makes sense because we
are checking for the 10 total read requests for lane_cr_done
and adjust_request_lane.
But if the max voltage swing is reached before that it should exit earlier.

Manasi

> > 
> > + DRM_ERROR("Failed clock recovery 10 times, giving up!\n"); 
> > + return false;
> > }
> > 
> > Thanks,
> > Rodrigo.
> > 
> > > 
> > >  	}
> > >  }
> > >  
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Dhinakaran Pandiyan July 13, 2018, 10:23 p.m. UTC | #3
On Fri, 2018-07-13 at 14:22 -0700, Rodrigo Vivi wrote:
> On Fri, Jul 13, 2018 at 10:32:15AM -0700, Nathan Ciobanu wrote:
> > 
> > Limit the link training clock recovery loop to 10 failed attempts
> > at
> > LANEx_CR_DONE per DP 1.4 spec.
> Where exactly in the spec?
> 
> > 
> > Some USB-C MST hubs cause us to get
> > stuck in this loop on hot-plugging indefinitely as

Also include the information (the vswing toggling part) about why it is
stuck in the loop. 

> > drm_dp_clock_recovery_ok() never returns true and none of the
> > other conditions occur.
> Although it seems really bad situation that we need to avoid...
> 
> > 
> > 
> > Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_dp_link_training.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c
> > b/drivers/gpu/drm/i915/intel_dp_link_training.c
> > index 4da6e33c7fa1..66c1a70343ba 100644
> > --- a/drivers/gpu/drm/i915/intel_dp_link_training.c
> > +++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
> > @@ -129,7 +129,7 @@ static bool
> > intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> >  intel_dp_link_training_clock_recovery(struct intel_dp *intel_dp)
> >  {
> >  	uint8_t voltage;
> > -	int voltage_tries, max_vswing_tries;
> > +	int voltage_tries, max_vswing_tries, cr_tries;
> >  	uint8_t link_config[2];
> >  	uint8_t link_bw, rate_select;
> >  
> > @@ -172,6 +172,7 @@ static bool
> > intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> >  
> >  	voltage_tries = 1;
> >  	max_vswing_tries = 0;
> > +	cr_tries = 0;
> >  	for (;;) {
> >  		uint8_t link_status[DP_LINK_STATUS_SIZE];
> >  
> > @@ -215,6 +216,11 @@ static bool
> > intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> >  		if (intel_dp_link_max_vswing_reached(intel_dp))
> >  			++max_vswing_tries;
> >  
> > +		if (cr_tries == 9) {
> > +			DRM_ERROR("Failed clock recovery 10 times,
> > giving up!\n");
> > +			return false;
> > +		}
> > +		++cr_tries;
> If I understood correctly this is a global thing for the for(;;)
> right?
> 
> Shouldn't we make then like a:
> 
> - for(;;)
> + for(cr_tries = 0; cr_tries < 10; cr_tries++)
> 	{
> 	}
> 
> + DRM_ERROR("Failed clock recovery 10 times, giving up!\n"); 
> + return false;
> }
> 
> Thanks,
> Rodrigo.
> 
> > 
> >  	}
> >  }
> >
Nathan Ciobanu July 13, 2018, 10:49 p.m. UTC | #4
On Fri, Jul 13, 2018 at 02:22:03PM -0700, Rodrigo Vivi wrote:
> On Fri, Jul 13, 2018 at 10:32:15AM -0700, Nathan Ciobanu wrote:
> > Limit the link training clock recovery loop to 10 failed attempts at
> > LANEx_CR_DONE per DP 1.4 spec.
> 
> Where exactly in the spec?
I'll add the section number to the commit message.
> 
> > Some USB-C MST hubs cause us to get
> > stuck in this loop on hot-plugging indefinitely as
> > drm_dp_clock_recovery_ok() never returns true and none of the
> > other conditions occur.
> 
> Although it seems really bad situation that we need to avoid...
> 
> > 
> > Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_dp_link_training.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c b/drivers/gpu/drm/i915/intel_dp_link_training.c
> > index 4da6e33c7fa1..66c1a70343ba 100644
> > --- a/drivers/gpu/drm/i915/intel_dp_link_training.c
> > +++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
> > @@ -129,7 +129,7 @@ static bool intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> >  intel_dp_link_training_clock_recovery(struct intel_dp *intel_dp)
> >  {
> >  	uint8_t voltage;
> > -	int voltage_tries, max_vswing_tries;
> > +	int voltage_tries, max_vswing_tries, cr_tries;
> >  	uint8_t link_config[2];
> >  	uint8_t link_bw, rate_select;
> >  
> > @@ -172,6 +172,7 @@ static bool intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> >  
> >  	voltage_tries = 1;
> >  	max_vswing_tries = 0;
> > +	cr_tries = 0;
> >  	for (;;) {
> >  		uint8_t link_status[DP_LINK_STATUS_SIZE];
> >  
> > @@ -215,6 +216,11 @@ static bool intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> >  		if (intel_dp_link_max_vswing_reached(intel_dp))
> >  			++max_vswing_tries;
> >  
> > +		if (cr_tries == 9) {
> > +			DRM_ERROR("Failed clock recovery 10 times, giving up!\n");
> > +			return false;
> > +		}
> > +		++cr_tries;
> 
> If I understood correctly this is a global thing for the for(;;) right?
> 
> Shouldn't we make then like a:
> 
> - for(;;)
> + for(cr_tries = 0; cr_tries < 10; cr_tries++)
> 	{
> 	}
> 
> + DRM_ERROR("Failed clock recovery 10 times, giving up!\n"); 
> + return false;
> }
That was my thought initially as well but I was worried that
it will not be immediately obvious why I'm returning false after
the loop - although the error message tells you why. I'll change
this. 

-Nathan
> 
> Thanks,
> Rodrigo.
> 
> >  	}
> >  }
> >  
> > -- 
> > 1.9.1
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
Nathan Ciobanu July 13, 2018, 11:17 p.m. UTC | #5
On Fri, Jul 13, 2018 at 03:18:32PM -0700, Manasi Navare wrote:
> On Fri, Jul 13, 2018 at 03:23:41PM -0700, Dhinakaran Pandiyan wrote:
> > On Fri, 2018-07-13 at 14:22 -0700, Rodrigo Vivi wrote:
> > > On Fri, Jul 13, 2018 at 10:32:15AM -0700, Nathan Ciobanu wrote:
> > > > 
> > > > Limit the link training clock recovery loop to 10 failed attempts
> > > > at
> > > > LANEx_CR_DONE per DP 1.4 spec.
> > > Where exactly in the spec?
> 
> I see that this is specified in 3.5.1.2.2 Clock Recovery Sequence section
> in DP 1.4 spec. Might be a good idea to mention this expliitly in the commit message.
Right, I'll do that.
> 
> > > 
> > > > 
> > > > Some USB-C MST hubs cause us to get
> > > > stuck in this loop on hot-plugging indefinitely as
> 
> This is likely to occur in case of USB Type C cases where only fewer lanes
> might be connected whereas the max lane count returned is still 4.
> If this is the case, explicitly mentioning this is also a good idea.
I'm pretty sure this device uses 4 lanes. I am getting voltage swing/pre-emphasis
values and status on all 4.
> 
> Doesnt this get recovered at all after the fallback when it will eventually
> try with the lower lane count?
Yes it does recover after the 10 attempts.
> 
> > 
> > Also include the information (the vswing toggling part) about why it is
> > stuck in the loop. 
> > 
> > > > drm_dp_clock_recovery_ok() never returns true and none of the
> > > > other conditions occur.
> > > Although it seems really bad situation that we need to avoid...
> > > 
> > > > 
> > > > 
> > > > Signed-off-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_dp_link_training.c | 8 +++++++-
> > > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c
> > > > b/drivers/gpu/drm/i915/intel_dp_link_training.c
> > > > index 4da6e33c7fa1..66c1a70343ba 100644
> > > > --- a/drivers/gpu/drm/i915/intel_dp_link_training.c
> > > > +++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
> > > > @@ -129,7 +129,7 @@ static bool
> > > > intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> > > >  intel_dp_link_training_clock_recovery(struct intel_dp *intel_dp)
> > > >  {
> > > >  	uint8_t voltage;
> > > > -	int voltage_tries, max_vswing_tries;
> > > > +	int voltage_tries, max_vswing_tries, cr_tries;
> > > >  	uint8_t link_config[2];
> > > >  	uint8_t link_bw, rate_select;
> > > >  
> > > > @@ -172,6 +172,7 @@ static bool
> > > > intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> > > >  
> > > >  	voltage_tries = 1;
> > > >  	max_vswing_tries = 0;
> > > > +	cr_tries = 0;
> > > >  	for (;;) {
> > > >  		uint8_t link_status[DP_LINK_STATUS_SIZE];
> > > >  
> > > > @@ -215,6 +216,11 @@ static bool
> > > > intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
> > > >  		if (intel_dp_link_max_vswing_reached(intel_dp))
> > > >  			++max_vswing_tries;
> > > >  
> > > > +		if (cr_tries == 9) {
> > > > +			DRM_ERROR("Failed clock recovery 10 times,
> > > > giving up!\n");
> > > > +			return false;
> > > > +		}
> > > > +		++cr_tries;
> > > If I understood correctly this is a global thing for the for(;;)
> > > right?
> > > 
> > > Shouldn't we make then like a:
> > > 
> > > - for(;;)
> > > + for(cr_tries = 0; cr_tries < 10; cr_tries++)
> > > 	{
> > > 	}
> 
> I think incrementing it at the end of the loop makes sense because we
> are checking for the 10 total read requests for lane_cr_done
> and adjust_request_lane.
> But if the max voltage swing is reached before that it should exit earlier.
> 
> Manasi
> 
> > > 
> > > + DRM_ERROR("Failed clock recovery 10 times, giving up!\n"); 
> > > + return false;
> > > }
> > > 
> > > Thanks,
> > > Rodrigo.
> > > 
> > > > 
> > > >  	}
> > > >  }
> > > >  
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c b/drivers/gpu/drm/i915/intel_dp_link_training.c
index 4da6e33c7fa1..66c1a70343ba 100644
--- a/drivers/gpu/drm/i915/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
@@ -129,7 +129,7 @@  static bool intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
 intel_dp_link_training_clock_recovery(struct intel_dp *intel_dp)
 {
 	uint8_t voltage;
-	int voltage_tries, max_vswing_tries;
+	int voltage_tries, max_vswing_tries, cr_tries;
 	uint8_t link_config[2];
 	uint8_t link_bw, rate_select;
 
@@ -172,6 +172,7 @@  static bool intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
 
 	voltage_tries = 1;
 	max_vswing_tries = 0;
+	cr_tries = 0;
 	for (;;) {
 		uint8_t link_status[DP_LINK_STATUS_SIZE];
 
@@ -215,6 +216,11 @@  static bool intel_dp_link_max_vswing_reached(struct intel_dp *intel_dp)
 		if (intel_dp_link_max_vswing_reached(intel_dp))
 			++max_vswing_tries;
 
+		if (cr_tries == 9) {
+			DRM_ERROR("Failed clock recovery 10 times, giving up!\n");
+			return false;
+		}
+		++cr_tries;
 	}
 }