diff mbox

[v5] drm/i915/psr: Account for sink CRC raciness on some panels

Message ID 1502311200-28033-1-git-send-email-jim.bride@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

jim.bride@linux.intel.com Aug. 9, 2017, 8:40 p.m. UTC
According to the eDP spec, when the count field in TEST_SINK_MISC
increments then the six bytes of sink CRC information in the DPCD
should be valid.  Unfortunately, this doesn't seem to be the case
on some panels, and as a result we get some incorrect and inconsistent
values from the sink CRC DPCD locations at times.  This problem exhibits
itself more on faster processors (relative failure rates HSW < SKL < KBL.)
In order to try and account for this, we try a lot harder to read the sink
CRC until we get consistent values twice in a row before returning what we
read and delay for a time before trying to read.  We still see some
occasional failures, but reading the sink CRC is much more reliable,
particularly on SKL, with these changes than without.

v2: * Reduce number of retries when reading the sink CRC (Jani)
    * Refactor to minimize changes to the code (Jani)
    * Rebase
v3: * Rebase
v4: * Switch from do-while to for loop when reading CRC values (Jani)
    * Rebase
v5: * Checkpatch cleanup and commit message tweaks
    * Rebase
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jim Bride <jim.bride@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_dp.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

Comments

jim.bride@linux.intel.com Aug. 15, 2017, 4:58 p.m. UTC | #1
On Wed, Aug 09, 2017 at 01:40:00PM -0700, Jim Bride wrote:
> According to the eDP spec, when the count field in TEST_SINK_MISC
> increments then the six bytes of sink CRC information in the DPCD
> should be valid.  Unfortunately, this doesn't seem to be the case
> on some panels, and as a result we get some incorrect and inconsistent
> values from the sink CRC DPCD locations at times.  This problem exhibits
> itself more on faster processors (relative failure rates HSW < SKL < KBL.)
> In order to try and account for this, we try a lot harder to read the sink
> CRC until we get consistent values twice in a row before returning what we
> read and delay for a time before trying to read.  We still see some
> occasional failures, but reading the sink CRC is much more reliable,
> particularly on SKL, with these changes than without.
> 
> v2: * Reduce number of retries when reading the sink CRC (Jani)
>     * Refactor to minimize changes to the code (Jani)
>     * Rebase
> v3: * Rebase
> v4: * Switch from do-while to for loop when reading CRC values (Jani)
>     * Rebase
> v5: * Checkpatch cleanup and commit message tweaks
>     * Rebase
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Cc: Jani Nikula <jani.nikula@intel.com>

I think I addressed all previous review comments for this patch.  Any
thoughts?

Jim

> Signed-off-by: Jim Bride <jim.bride@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_dp.c | 32 +++++++++++++++++++++++++++++---
>  1 file changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 76c8a0b..b64757c 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -3906,6 +3906,10 @@ int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc)
>  	u8 buf;
>  	int count, ret;
>  	int attempts = 6;
> +	u8 old_crc[6];
> +
> +	if (crc == NULL)
> +		return -ENOMEM;
>  
>  	ret = intel_dp_sink_crc_start(intel_dp);
>  	if (ret)
> @@ -3929,11 +3933,33 @@ int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc)
>  		goto stop;
>  	}
>  
> -	if (drm_dp_dpcd_read(&intel_dp->aux, DP_TEST_CRC_R_CR, crc, 6) < 0) {
> -		ret = -EIO;
> -		goto stop;
> +	/*
> +	 * Sometimes it takes a while for the "real" CRC values to land in
> +	 * the DPCD, so try several times until we get two reads in a row
> +	 * that are the same.  If we're an eDP panel, delay between reads
> +	 * for a while since the values take a bit longer to propagate.
> +	 */
> +	for (attempts = 0; attempts < 6; attempts++) {
> +		intel_wait_for_vblank(dev_priv, intel_crtc->pipe);
> +
> +		if (drm_dp_dpcd_read(&intel_dp->aux, DP_TEST_CRC_R_CR,
> +				     crc, 6) < 0) {
> +			ret = -EIO;
> +			break;
> +		}
> +
> +		if (attempts && memcmp(old_crc, crc, 6) == 0)
> +			break;
> +		memcpy(old_crc, crc, 6);
> +
> +		if (is_edp(intel_dp))
> +			usleep_range(20000, 25000);
>  	}
>  
> +	if (attempts == 6) {
> +		DRM_DEBUG_KMS("Failed to get CRC after 6 attempts.\n");
> +		ret = -ETIMEDOUT;
> +	}
>  stop:
>  	intel_dp_sink_crc_stop(intel_dp);
>  	return ret;
> -- 
> 2.7.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Rodrigo Vivi Aug. 15, 2017, 11:41 p.m. UTC | #2
On Tue, Aug 15, 2017 at 9:58 AM, Jim Bride <jim.bride@linux.intel.com> wrote:
> On Wed, Aug 09, 2017 at 01:40:00PM -0700, Jim Bride wrote:
>> According to the eDP spec, when the count field in TEST_SINK_MISC
>> increments then the six bytes of sink CRC information in the DPCD
>> should be valid.  Unfortunately, this doesn't seem to be the case
>> on some panels, and as a result we get some incorrect and inconsistent
>> values from the sink CRC DPCD locations at times.  This problem exhibits
>> itself more on faster processors (relative failure rates HSW < SKL < KBL.)
>> In order to try and account for this, we try a lot harder to read the sink
>> CRC until we get consistent values twice in a row before returning what we
>> read and delay for a time before trying to read.  We still see some
>> occasional failures, but reading the sink CRC is much more reliable,
>> particularly on SKL, with these changes than without.
>>
>> v2: * Reduce number of retries when reading the sink CRC (Jani)
>>     * Refactor to minimize changes to the code (Jani)
>>     * Rebase
>> v3: * Rebase
>> v4: * Switch from do-while to for loop when reading CRC values (Jani)
>>     * Rebase
>> v5: * Checkpatch cleanup and commit message tweaks
>>     * Rebase
>> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> Cc: Jani Nikula <jani.nikula@intel.com>
>
> I think I addressed all previous review comments for this patch.  Any
> thoughts?

I suffered a lot with this unreliable sink crcs in the past. As you I tried many
different things like this, but they are still unreliable.

So I believe I'm in favor of one of DK's suggestion:
" On the other hand, since the only consumers of dp sink crc are tests,
why can't the kernel just dump what it reads to debugfs and let the test
deal with erroneous results?
"
So let's keep the kernel doing the right thing by the spec and try to
change test cases to deal with this bad values.

Or let's find some other way to test this without sink crc... and
anyways I believe that we should just drop this patch.

>
> Jim
>
>> Signed-off-by: Jim Bride <jim.bride@linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/intel_dp.c | 32 +++++++++++++++++++++++++++++---
>>  1 file changed, 29 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
>> index 76c8a0b..b64757c 100644
>> --- a/drivers/gpu/drm/i915/intel_dp.c
>> +++ b/drivers/gpu/drm/i915/intel_dp.c
>> @@ -3906,6 +3906,10 @@ int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc)
>>       u8 buf;
>>       int count, ret;
>>       int attempts = 6;
>> +     u8 old_crc[6];
>> +
>> +     if (crc == NULL)
>> +             return -ENOMEM;
>>
>>       ret = intel_dp_sink_crc_start(intel_dp);
>>       if (ret)
>> @@ -3929,11 +3933,33 @@ int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc)
>>               goto stop;
>>       }
>>
>> -     if (drm_dp_dpcd_read(&intel_dp->aux, DP_TEST_CRC_R_CR, crc, 6) < 0) {
>> -             ret = -EIO;
>> -             goto stop;
>> +     /*
>> +      * Sometimes it takes a while for the "real" CRC values to land in
>> +      * the DPCD, so try several times until we get two reads in a row
>> +      * that are the same.  If we're an eDP panel, delay between reads
>> +      * for a while since the values take a bit longer to propagate.
>> +      */
>> +     for (attempts = 0; attempts < 6; attempts++) {
>> +             intel_wait_for_vblank(dev_priv, intel_crtc->pipe);
>> +
>> +             if (drm_dp_dpcd_read(&intel_dp->aux, DP_TEST_CRC_R_CR,
>> +                                  crc, 6) < 0) {
>> +                     ret = -EIO;
>> +                     break;
>> +             }
>> +
>> +             if (attempts && memcmp(old_crc, crc, 6) == 0)
>> +                     break;
>> +             memcpy(old_crc, crc, 6);
>> +
>> +             if (is_edp(intel_dp))
>> +                     usleep_range(20000, 25000);
>>       }
>>
>> +     if (attempts == 6) {
>> +             DRM_DEBUG_KMS("Failed to get CRC after 6 attempts.\n");
>> +             ret = -ETIMEDOUT;
>> +     }
>>  stop:
>>       intel_dp_sink_crc_stop(intel_dp);
>>       return ret;
>> --
>> 2.7.4
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
jim.bride@linux.intel.com Aug. 16, 2017, 5:32 p.m. UTC | #3
On Tue, Aug 15, 2017 at 04:41:52PM -0700, Rodrigo Vivi wrote:
> On Tue, Aug 15, 2017 at 9:58 AM, Jim Bride <jim.bride@linux.intel.com> wrote:
> > On Wed, Aug 09, 2017 at 01:40:00PM -0700, Jim Bride wrote:
> >> According to the eDP spec, when the count field in TEST_SINK_MISC
> >> increments then the six bytes of sink CRC information in the DPCD
> >> should be valid.  Unfortunately, this doesn't seem to be the case
> >> on some panels, and as a result we get some incorrect and inconsistent
> >> values from the sink CRC DPCD locations at times.  This problem exhibits
> >> itself more on faster processors (relative failure rates HSW < SKL < KBL.)
> >> In order to try and account for this, we try a lot harder to read the sink
> >> CRC until we get consistent values twice in a row before returning what we
> >> read and delay for a time before trying to read.  We still see some
> >> occasional failures, but reading the sink CRC is much more reliable,
> >> particularly on SKL, with these changes than without.
> >>
> >> v2: * Reduce number of retries when reading the sink CRC (Jani)
> >>     * Refactor to minimize changes to the code (Jani)
> >>     * Rebase
> >> v3: * Rebase
> >> v4: * Switch from do-while to for loop when reading CRC values (Jani)
> >>     * Rebase
> >> v5: * Checkpatch cleanup and commit message tweaks
> >>     * Rebase
> >> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >> Cc: Jani Nikula <jani.nikula@intel.com>
> >
> > I think I addressed all previous review comments for this patch.  Any
> > thoughts?
> 
> I suffered a lot with this unreliable sink crcs in the past. As you I tried many
> different things like this, but they are still unreliable.
> 
> So I believe I'm in favor of one of DK's suggestion:
> " On the other hand, since the only consumers of dp sink crc are tests,
> why can't the kernel just dump what it reads to debugfs and let the test
> deal with erroneous results?
> "
> So let's keep the kernel doing the right thing by the spec and try to
> change test cases to deal with this bad values.

Ok.  I moved the logic into the IGT library's call for reading sink
crcs.

> Or let's find some other way to test this without sink crc... and
> anyways I believe that we should just drop this patch.

This is the longer-term plan.   We need similar tests for PSR 2,
which doesn't support sink crcs, anyhow.

Jim


> >
> > Jim
> >
> >> Signed-off-by: Jim Bride <jim.bride@linux.intel.com>
> >> ---
> >>  drivers/gpu/drm/i915/intel_dp.c | 32 +++++++++++++++++++++++++++++---
> >>  1 file changed, 29 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> >> index 76c8a0b..b64757c 100644
> >> --- a/drivers/gpu/drm/i915/intel_dp.c
> >> +++ b/drivers/gpu/drm/i915/intel_dp.c
> >> @@ -3906,6 +3906,10 @@ int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc)
> >>       u8 buf;
> >>       int count, ret;
> >>       int attempts = 6;
> >> +     u8 old_crc[6];
> >> +
> >> +     if (crc == NULL)
> >> +             return -ENOMEM;
> >>
> >>       ret = intel_dp_sink_crc_start(intel_dp);
> >>       if (ret)
> >> @@ -3929,11 +3933,33 @@ int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc)
> >>               goto stop;
> >>       }
> >>
> >> -     if (drm_dp_dpcd_read(&intel_dp->aux, DP_TEST_CRC_R_CR, crc, 6) < 0) {
> >> -             ret = -EIO;
> >> -             goto stop;
> >> +     /*
> >> +      * Sometimes it takes a while for the "real" CRC values to land in
> >> +      * the DPCD, so try several times until we get two reads in a row
> >> +      * that are the same.  If we're an eDP panel, delay between reads
> >> +      * for a while since the values take a bit longer to propagate.
> >> +      */
> >> +     for (attempts = 0; attempts < 6; attempts++) {
> >> +             intel_wait_for_vblank(dev_priv, intel_crtc->pipe);
> >> +
> >> +             if (drm_dp_dpcd_read(&intel_dp->aux, DP_TEST_CRC_R_CR,
> >> +                                  crc, 6) < 0) {
> >> +                     ret = -EIO;
> >> +                     break;
> >> +             }
> >> +
> >> +             if (attempts && memcmp(old_crc, crc, 6) == 0)
> >> +                     break;
> >> +             memcpy(old_crc, crc, 6);
> >> +
> >> +             if (is_edp(intel_dp))
> >> +                     usleep_range(20000, 25000);
> >>       }
> >>
> >> +     if (attempts == 6) {
> >> +             DRM_DEBUG_KMS("Failed to get CRC after 6 attempts.\n");
> >> +             ret = -ETIMEDOUT;
> >> +     }
> >>  stop:
> >>       intel_dp_sink_crc_stop(intel_dp);
> >>       return ret;
> >> --
> >> 2.7.4
> >>
> >> _______________________________________________
> >> Intel-gfx mailing list
> >> Intel-gfx@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> 
> 
> -- 
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 76c8a0b..b64757c 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -3906,6 +3906,10 @@  int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc)
 	u8 buf;
 	int count, ret;
 	int attempts = 6;
+	u8 old_crc[6];
+
+	if (crc == NULL)
+		return -ENOMEM;
 
 	ret = intel_dp_sink_crc_start(intel_dp);
 	if (ret)
@@ -3929,11 +3933,33 @@  int intel_dp_sink_crc(struct intel_dp *intel_dp, u8 *crc)
 		goto stop;
 	}
 
-	if (drm_dp_dpcd_read(&intel_dp->aux, DP_TEST_CRC_R_CR, crc, 6) < 0) {
-		ret = -EIO;
-		goto stop;
+	/*
+	 * Sometimes it takes a while for the "real" CRC values to land in
+	 * the DPCD, so try several times until we get two reads in a row
+	 * that are the same.  If we're an eDP panel, delay between reads
+	 * for a while since the values take a bit longer to propagate.
+	 */
+	for (attempts = 0; attempts < 6; attempts++) {
+		intel_wait_for_vblank(dev_priv, intel_crtc->pipe);
+
+		if (drm_dp_dpcd_read(&intel_dp->aux, DP_TEST_CRC_R_CR,
+				     crc, 6) < 0) {
+			ret = -EIO;
+			break;
+		}
+
+		if (attempts && memcmp(old_crc, crc, 6) == 0)
+			break;
+		memcpy(old_crc, crc, 6);
+
+		if (is_edp(intel_dp))
+			usleep_range(20000, 25000);
 	}
 
+	if (attempts == 6) {
+		DRM_DEBUG_KMS("Failed to get CRC after 6 attempts.\n");
+		ret = -ETIMEDOUT;
+	}
 stop:
 	intel_dp_sink_crc_stop(intel_dp);
 	return ret;