diff mbox

drm/edid: Fix DDC probe for passive DP dongles

Message ID 1432167736-3903-1-git-send-email-tprevite@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Todd Previte May 21, 2015, 12:22 a.m. UTC
Passive DP->DVI/HDMI dongles show up to the system as HDMI devices, as they
do not have a sink device in them to respond to any AUX traffic. When
probing these dongles over the DDC, sometimes they will NAK the first attempt
even though the transaction is valid and they support the DDC protocol. The
retry loop inside of drm_do_probe_ddc_edid() would normally catch this case
and try the transaction again, resulting in success.

That, however, was thwarted by the fix for fdo.org bug #41059. The patch is:
	commit 9292f37e1f5c79400254dca46f83313488093825
	Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
	Date:   Thu Jan 5 09:34:28 2012 -0200

	drm: give up on edid retries when i2c bus is not responding

This added code to exit immediately if the return code from the
i2c_transfer function was -ENXIO in order to reduce the amount of time spent
in waiting for unresponsive or disconnected devices. For the DP dongles,
this means that the second retry never happens which results in a failed
EDID probe and a black screen.

To work around this problem without undoing the fix for bug #41059, the
number of retries is checked along with the return code. This allows for a
device to NAK once and still continue operations. A second NAK will result
in breaking the loop as it would have before and stopping the DDC probe.

Signed-off-by: Todd Previte <tprevite@gmail.com>
Cc: intel-gfx@lists.freedesktop.org
---
 drivers/gpu/drm/drm_edid.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Jani Nikula May 21, 2015, 8:28 a.m. UTC | #1
On Thu, 21 May 2015, Todd Previte <tprevite@gmail.com> wrote:
> Passive DP->DVI/HDMI dongles show up to the system as HDMI devices, as they
> do not have a sink device in them to respond to any AUX traffic. When
> probing these dongles over the DDC, sometimes they will NAK the first attempt
> even though the transaction is valid and they support the DDC protocol. The
> retry loop inside of drm_do_probe_ddc_edid() would normally catch this case
> and try the transaction again, resulting in success.
>
> That, however, was thwarted by the fix for fdo.org bug #41059. The patch is:
> 	commit 9292f37e1f5c79400254dca46f83313488093825
> 	Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
> 	Date:   Thu Jan 5 09:34:28 2012 -0200
>
> 	drm: give up on edid retries when i2c bus is not responding

Some extra background:

That commit refers to the i2c bit banging code, while i915 now prefers
gmbus, and only falls back to big banging on certain failures. (See
gmbux_xfer() in i915/intel_i2c.c). This means that in most cases i915 is
no longer susceptible to the 5*3 timeout loops, but it also means we
don't have the i2c bit banging retry at all on -ENXIO, like Todd notes.

The questions are, is one retry after -ENXIO in drm_do_probe_ddc_edid
enough now? Should we revert the original commit instead since the
underlying algorithm has changed? Or should we return something other
than -ENXIO from our gmbus code to not hit this exit with no retries
path?

> This added code to exit immediately if the return code from the
> i2c_transfer function was -ENXIO in order to reduce the amount of time spent
> in waiting for unresponsive or disconnected devices. For the DP dongles,
> this means that the second retry never happens which results in a failed
> EDID probe and a black screen.
>
> To work around this problem without undoing the fix for bug #41059, the
> number of retries is checked along with the return code. This allows for a
> device to NAK once and still continue operations. A second NAK will result
> in breaking the loop as it would have before and stopping the DDC probe.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85924

Maybe throw this at other dongle bugs you can find too?

We're going to need Tested-bys though.

BR,
Jani.


> Signed-off-by: Todd Previte <tprevite@gmail.com>
> Cc: intel-gfx@lists.freedesktop.org
> ---
>  drivers/gpu/drm/drm_edid.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> index 7087da3..e8047bd 100644
> --- a/drivers/gpu/drm/drm_edid.c
> +++ b/drivers/gpu/drm/drm_edid.c
> @@ -1238,7 +1238,10 @@ drm_do_probe_ddc_edid(void *data, u8 *buf, unsigned int block, size_t len)
>  		 */
>  		ret = i2c_transfer(adapter, &msgs[3 - xfers], xfers);
>  
> -		if (ret == -ENXIO) {
> +		/* Passive DP->DVI/HDMI dongles sometimes NAK the first probe
> +		 * Try to probe again but if it NAKs, stop trying
> +		 */
> +		if (ret == -ENXIO && retries < 5) {
>  			DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n",
>  					adapter->name);
>  			break;
> -- 
> 1.9.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Daniel Vetter May 21, 2015, 12:16 p.m. UTC | #2
On Thu, May 21, 2015 at 11:28:48AM +0300, Jani Nikula wrote:
> On Thu, 21 May 2015, Todd Previte <tprevite@gmail.com> wrote:
> > Passive DP->DVI/HDMI dongles show up to the system as HDMI devices, as they
> > do not have a sink device in them to respond to any AUX traffic. When
> > probing these dongles over the DDC, sometimes they will NAK the first attempt
> > even though the transaction is valid and they support the DDC protocol. The
> > retry loop inside of drm_do_probe_ddc_edid() would normally catch this case
> > and try the transaction again, resulting in success.
> >
> > That, however, was thwarted by the fix for fdo.org bug #41059. The patch is:
> > 	commit 9292f37e1f5c79400254dca46f83313488093825
> > 	Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
> > 	Date:   Thu Jan 5 09:34:28 2012 -0200
> >
> > 	drm: give up on edid retries when i2c bus is not responding
> 
> Some extra background:
> 
> That commit refers to the i2c bit banging code, while i915 now prefers
> gmbus, and only falls back to big banging on certain failures. (See
> gmbux_xfer() in i915/intel_i2c.c). This means that in most cases i915 is
> no longer susceptible to the 5*3 timeout loops, but it also means we
> don't have the i2c bit banging retry at all on -ENXIO, like Todd notes.
> 
> The questions are, is one retry after -ENXIO in drm_do_probe_ddc_edid
> enough now? Should we revert the original commit instead since the
> underlying algorithm has changed? Or should we return something other
> than -ENXIO from our gmbus code to not hit this exit with no retries
> path?

Jani&I dug around a bit on this on irc and it looks like a proper i2c
implementation should indeed retry on the first ENXIO, except when the
I2C_M_IGNORE_NAK flag is set. There's piles of examples how to do it, but
it's clear that we've never done this properly in our gmbus driver.

Can you please test whether fixing that in intel_i2c.c does resolve the
issue too? It would at least explain a lot of the fallback troubles we've
had with gmbus and i2c slaves not waking up in time, which all got
magically resolved by falling back to the bit-banging interface (which
does this retrying already).

Thanks, Daniel

> 
> > This added code to exit immediately if the return code from the
> > i2c_transfer function was -ENXIO in order to reduce the amount of time spent
> > in waiting for unresponsive or disconnected devices. For the DP dongles,
> > this means that the second retry never happens which results in a failed
> > EDID probe and a black screen.
> >
> > To work around this problem without undoing the fix for bug #41059, the
> > number of retries is checked along with the return code. This allows for a
> > device to NAK once and still continue operations. A second NAK will result
> > in breaking the loop as it would have before and stopping the DDC probe.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85924
> 
> Maybe throw this at other dongle bugs you can find too?
> 
> We're going to need Tested-bys though.
> 
> BR,
> Jani.
> 
> 
> > Signed-off-by: Todd Previte <tprevite@gmail.com>
> > Cc: intel-gfx@lists.freedesktop.org
> > ---
> >  drivers/gpu/drm/drm_edid.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> > index 7087da3..e8047bd 100644
> > --- a/drivers/gpu/drm/drm_edid.c
> > +++ b/drivers/gpu/drm/drm_edid.c
> > @@ -1238,7 +1238,10 @@ drm_do_probe_ddc_edid(void *data, u8 *buf, unsigned int block, size_t len)
> >  		 */
> >  		ret = i2c_transfer(adapter, &msgs[3 - xfers], xfers);
> >  
> > -		if (ret == -ENXIO) {
> > +		/* Passive DP->DVI/HDMI dongles sometimes NAK the first probe
> > +		 * Try to probe again but if it NAKs, stop trying
> > +		 */
> > +		if (ret == -ENXIO && retries < 5) {
> >  			DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n",
> >  					adapter->name);
> >  			break;
> > -- 
> > 1.9.1
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Jani Nikula, Intel Open Source Technology Center
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Shuang He May 22, 2015, 1:13 a.m. UTC | #3
Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6446
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                                  234/234              234/234
ILK                                  262/262              262/262
SNB                 -1              282/282              281/282
IVB                                  300/300              300/300
BYT                                  254/254              254/254
BDW                                  275/275              275/275
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
 SNB  igt@pm_rpm@dpms-mode-unset-non-lpsp      DMESG_WARN(18)PASS(1)      DMESG_WARN(1)
(dmesg patch applied)WARNING:at_drivers/gpu/drm/i915/intel_uncore.c:#assert_device_not_suspended[i915]()@WARNING:.* at .* assert_device_not_suspended+0x
Note: You need to pay more attention to line start with '*'
Todd Previte May 22, 2015, 3:04 p.m. UTC | #4
On 5/21/2015 1:28 AM, Jani Nikula wrote:
> On Thu, 21 May 2015, Todd Previte <tprevite@gmail.com> wrote:
>> Passive DP->DVI/HDMI dongles show up to the system as HDMI devices, as they
>> do not have a sink device in them to respond to any AUX traffic. When
>> probing these dongles over the DDC, sometimes they will NAK the first attempt
>> even though the transaction is valid and they support the DDC protocol. The
>> retry loop inside of drm_do_probe_ddc_edid() would normally catch this case
>> and try the transaction again, resulting in success.
>>
>> That, however, was thwarted by the fix for fdo.org bug #41059. The patch is:
>> 	commit 9292f37e1f5c79400254dca46f83313488093825
>> 	Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
>> 	Date:   Thu Jan 5 09:34:28 2012 -0200
>>
>> 	drm: give up on edid retries when i2c bus is not responding
> Some extra background:
>
> That commit refers to the i2c bit banging code, while i915 now prefers
> gmbus, and only falls back to big banging on certain failures. (See
> gmbux_xfer() in i915/intel_i2c.c). This means that in most cases i915 is
> no longer susceptible to the 5*3 timeout loops, but it also means we
> don't have the i2c bit banging retry at all on -ENXIO, like Todd notes.
>
> The questions are, is one retry after -ENXIO in drm_do_probe_ddc_edid
> enough now? Should we revert the original commit instead since the
> underlying algorithm has changed? Or should we return something other
> than -ENXIO from our gmbus code to not hit this exit with no retries
> path?
Question #1:
During development and testing, all of the passive dongles used would 
NAK once before operating correctly. I had an additional dongle here 
that I tested as well and it exhibited the exact same behavior. While I 
believe that more testing certainly needs to be done (as you mention 
below) at this point I would say allowing a single NAK is sufficient to 
solve this problem.

Question #2:
While I agree to some extent that this patch is dated with respect to 
the underlying algorithm changes, there still appears to be some value 
in keeping it since it does allow for more efficient processing of 
disconnected ports. So I would recommend against reverting this commit 
right now.

Question #3:
I'm looking into this now. I'll have more information by the end of the day.

>> This added code to exit immediately if the return code from the
>> i2c_transfer function was -ENXIO in order to reduce the amount of time spent
>> in waiting for unresponsive or disconnected devices. For the DP dongles,
>> this means that the second retry never happens which results in a failed
>> EDID probe and a black screen.
>>
>> To work around this problem without undoing the fix for bug #41059, the
>> number of retries is checked along with the return code. This allows for a
>> device to NAK once and still continue operations. A second NAK will result
>> in breaking the loop as it would have before and stopping the DDC probe.
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85924
>
> Maybe throw this at other dongle bugs you can find too?
>
> We're going to need Tested-bys though.
>
> BR,
> Jani.
Definitely - I'll go wrangle up the dongle bugs I can find on kernel.org 
and FDO and see if this affects them.

As for testing, the more the merrier on this front. If anyone has 
passive DP->DVI/HDMI dongles and the ability to test them, please see 
the bug Jani referenced above and report your findings there.


-T
>
>> Signed-off-by: Todd Previte <tprevite@gmail.com>
>> Cc: intel-gfx@lists.freedesktop.org
>> ---
>>   drivers/gpu/drm/drm_edid.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
>> index 7087da3..e8047bd 100644
>> --- a/drivers/gpu/drm/drm_edid.c
>> +++ b/drivers/gpu/drm/drm_edid.c
>> @@ -1238,7 +1238,10 @@ drm_do_probe_ddc_edid(void *data, u8 *buf, unsigned int block, size_t len)
>>   		 */
>>   		ret = i2c_transfer(adapter, &msgs[3 - xfers], xfers);
>>   
>> -		if (ret == -ENXIO) {
>> +		/* Passive DP->DVI/HDMI dongles sometimes NAK the first probe
>> +		 * Try to probe again but if it NAKs, stop trying
>> +		 */
>> +		if (ret == -ENXIO && retries < 5) {
>>   			DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n",
>>   					adapter->name);
>>   			break;
>> -- 
>> 1.9.1
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
diff mbox

Patch

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 7087da3..e8047bd 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -1238,7 +1238,10 @@  drm_do_probe_ddc_edid(void *data, u8 *buf, unsigned int block, size_t len)
 		 */
 		ret = i2c_transfer(adapter, &msgs[3 - xfers], xfers);
 
-		if (ret == -ENXIO) {
+		/* Passive DP->DVI/HDMI dongles sometimes NAK the first probe
+		 * Try to probe again but if it NAKs, stop trying
+		 */
+		if (ret == -ENXIO && retries < 5) {
 			DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n",
 					adapter->name);
 			break;