diff mbox

[04/13] drm/i915: Add EDID read in intel_dp_check_link_status() for Link CTS 4.2.2.1

Message ID 1428936789-13435-1-git-send-email-tprevite@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Todd Previte April 13, 2015, 2:53 p.m. UTC
Adds in an EDID read after the DPCD read to accommodate test 4.2.2.1 in the
Displayport Link CTS Core 1.2 rev1.1. This test requires an EDID read for
all HPD plug events. To reduce the amount of code, this EDID read is also
used for Link CTS tests 4.2.2.3, 4.2.2.4, 4.2.2.5 and 4.2.2.6. Actual
support for these tests is implemented in later patches in this series.

V2:
- Fixed compilation error introduced during rework

Signed-off-by: Todd Previte <tprevite@gmail.com>
---
 drivers/gpu/drm/i915/intel_dp.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Paulo Zanoni April 14, 2015, 4:53 p.m. UTC | #1
2015-04-13 11:53 GMT-03:00 Todd Previte <tprevite@gmail.com>:
> Adds in an EDID read after the DPCD read to accommodate test 4.2.2.1 in the
> Displayport Link CTS Core 1.2 rev1.1. This test requires an EDID read for
> all HPD plug events. To reduce the amount of code, this EDID read is also
> used for Link CTS tests 4.2.2.3, 4.2.2.4, 4.2.2.5 and 4.2.2.6. Actual
> support for these tests is implemented in later patches in this series.
>
> V2:
> - Fixed compilation error introduced during rework
>
> Signed-off-by: Todd Previte <tprevite@gmail.com>
> ---
>  drivers/gpu/drm/i915/intel_dp.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 23184b0..75df3e2 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -3890,6 +3890,9 @@ intel_dp_check_link_status(struct intel_dp *intel_dp)
>  {
>         struct drm_device *dev = intel_dp_to_dev(intel_dp);
>         struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base;
> +       struct drm_connector *connector = &intel_dp->attached_connector->base;
> +       struct i2c_adapter *adapter = &intel_dp->aux.ddc;
> +       struct edid *edid_read = NULL;
>         u8 sink_irq_vector;
>         u8 link_status[DP_LINK_STATUS_SIZE];
>
> @@ -3906,6 +3909,14 @@ intel_dp_check_link_status(struct intel_dp *intel_dp)
>                 return;
>         }
>
> +       /* Displayport Link CTS Core 1.2 rev1.1 EDID testing
> +        * 4.2.2.1 - EDID read required for all HPD events
> +         */
> +        edid_read = drm_get_edid(connector, adapter);
> +        if (!edid_read) {
> +                DRM_DEBUG_DRIVER("Invalid EDID detected\n");
> +        }
> +

We already briefly discussed this patch in private, so I'm going to
summarize the discussion and also add some more points here.

Frist, the actual detailed review: the indentation here is using
spaces and we're leaking the EDID. This will cause rebases to a few of
the next patches.

Back to the hight level architecture: your initial versions of the
series contained just 1 extra EDID read, and it was contained inside
the compliance testing function. Then the versions submitted a few
days ago had 2 extra EDID reads, then after some discussion you
reduced to 1 extra EDID read (the one on this patch). I previously
asked "But what about the automatic EDID read we do when we get a
hotplug? Can't we just rely on it?". I got some answers to the
question, but I was not really convinced.

Yesterday I was arguing that this extra EDID read is going to add a
small delay to every hotplug event we get, so my initial suggestion
was to organize the compliance testing in a way that would require the
user space program to call the GetResources() IOCTL to force the EDID
when needed. Your argument was that then the DP compliance testing
procedure would be testing our app for compliance, not the Kernel.

But today I decided to finally do some debugging regarding this, and I
was able to confirm that we do follow the DP requirements: we do have
an automatic EDID read done by the Kernel whenever we do a hotplug:
i915_hotplug_work_func() calls intel_dp_detect(), which ends calling
drm_get_edid() at some point. This function also does other stuff that
is required by the compliance testing, such as the DPCD reads.

Now there's a problem with using i915_hotplug_work_func(), which could
the reason why you rejected it: it only happens after
intel_dp_hpd_pulse(), which means that we only really do the EDID read
after intel_dp_handle_test_request().

I consider i915_hotplug_work_func() a fundamental part of our DP
framework, and the DP compliance testing seems to be just ignoring its
existence. So my idea for a solution here would be to make
intel_dp_handle_test_request() run on its own delayed work function.
It would wait for both i915_digport_work_func() and
i915_hotplug_work_func() to finish, and only then it would do the
normal processing. With this, we would be able to avoid the edid read
on this patch, we would maybe be able to avoid at least part of patch
2, we would maybe be able to completely avoid patch 7, and then on
patch 8 we would start touching intel_dp_get_edid() instead.

I know this is sort of a fundamental change that is being requested a
little late in the review process, and it can be frustrating, but this
aspect of the code only recently changed (I was fine with the EDID
reads just in the compliance testing function), and since the DP
compliance code is quite complex, it took me a while to realize
everything that's going on and what is the purpose of each piece. I
also think that, since this idea will allow the compliance testing to
take into consideration the work done by i915_hotplug_work_func(),
compliance testing will better reflect the behavior that is actually
done by the Kernel when DP devices are plugged/unplugged. And I did
ask about those new EDID reads as soon as I started reviewing the
patch that introduced them.

Now, since I know how frustrating it is to have to change a
significant portion of the code once again, I will leave to the
maintainers the decision of whether the current proposed
implementation is acceptable or if we want to make the DP compliance
testing code take into consideration the work done by
i915_hotplug_work_func(). I would also like to know your opinion on
this. Maybe my idea just doesn't make sense because of something else
I didn't realize :)

>         /* Try to read the source of the interrupt */
>         if (intel_dp->dpcd[DP_DPCD_REV] >= 0x11 &&
>             intel_dp_get_sink_irq(intel_dp, &sink_irq_vector)) {
> --
> 1.9.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Daniel Vetter April 15, 2015, 8:48 a.m. UTC | #2
On Tue, Apr 14, 2015 at 01:53:21PM -0300, Paulo Zanoni wrote:
> ... I will leave to the > maintainers the decision of whether the
> current proposed implementation is acceptable or [not] ...

So this is pretty much a lose-lose proposition for me. I chatted a bit
with Todd about this topic and I don't really have a lot of clue here. So
me making a call will mostly end up in a coin-tossing but and so will
definitely piss off at least someone.

I need to ponder this some more, but I'd really prefer if you can just
come up with a plan yourself. Without knowing all the details I'd guess
adding a FIXME comment and investigating this after patches have landed
(assuming it does indeed happen) seems acceptable to me. But that's really
just my totally ignorant opinion ;-)
-Daniel
Todd Previte April 15, 2015, 3:37 p.m. UTC | #3
On 4/14/2015 9:53 AM, Paulo Zanoni wrote:
> 2015-04-13 11:53 GMT-03:00 Todd Previte <tprevite@gmail.com>:
>> Adds in an EDID read after the DPCD read to accommodate test 4.2.2.1 in the
>> Displayport Link CTS Core 1.2 rev1.1. This test requires an EDID read for
>> all HPD plug events. To reduce the amount of code, this EDID read is also
>> used for Link CTS tests 4.2.2.3, 4.2.2.4, 4.2.2.5 and 4.2.2.6. Actual
>> support for these tests is implemented in later patches in this series.
>>
>> V2:
>> - Fixed compilation error introduced during rework
>>
>> Signed-off-by: Todd Previte <tprevite@gmail.com>
>> ---
>>   drivers/gpu/drm/i915/intel_dp.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
>> index 23184b0..75df3e2 100644
>> --- a/drivers/gpu/drm/i915/intel_dp.c
>> +++ b/drivers/gpu/drm/i915/intel_dp.c
>> @@ -3890,6 +3890,9 @@ intel_dp_check_link_status(struct intel_dp *intel_dp)
>>   {
>>          struct drm_device *dev = intel_dp_to_dev(intel_dp);
>>          struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base;
>> +       struct drm_connector *connector = &intel_dp->attached_connector->base;
>> +       struct i2c_adapter *adapter = &intel_dp->aux.ddc;
>> +       struct edid *edid_read = NULL;
>>          u8 sink_irq_vector;
>>          u8 link_status[DP_LINK_STATUS_SIZE];
>>
>> @@ -3906,6 +3909,14 @@ intel_dp_check_link_status(struct intel_dp *intel_dp)
>>                  return;
>>          }
>>
>> +       /* Displayport Link CTS Core 1.2 rev1.1 EDID testing
>> +        * 4.2.2.1 - EDID read required for all HPD events
>> +         */
>> +        edid_read = drm_get_edid(connector, adapter);
>> +        if (!edid_read) {
>> +                DRM_DEBUG_DRIVER("Invalid EDID detected\n");
>> +        }
>> +
> We already briefly discussed this patch in private, so I'm going to
> summarize the discussion and also add some more points here.
>
> Frist, the actual detailed review: the indentation here is using
> spaces and we're leaking the EDID. This will cause rebases to a few of
> the next patches.
>
> Back to the hight level architecture: your initial versions of the
> series contained just 1 extra EDID read, and it was contained inside
> the compliance testing function. Then the versions submitted a few
> days ago had 2 extra EDID reads, then after some discussion you
> reduced to 1 extra EDID read (the one on this patch). I previously
> asked "But what about the automatic EDID read we do when we get a
> hotplug? Can't we just rely on it?". I got some answers to the
> question, but I was not really convinced.
>
> Yesterday I was arguing that this extra EDID read is going to add a
> small delay to every hotplug event we get, so my initial suggestion
> was to organize the compliance testing in a way that would require the
> user space program to call the GetResources() IOCTL to force the EDID
> when needed. Your argument was that then the DP compliance testing
> procedure would be testing our app for compliance, not the Kernel.
>
> But today I decided to finally do some debugging regarding this, and I
> was able to confirm that we do follow the DP requirements: we do have
> an automatic EDID read done by the Kernel whenever we do a hotplug:
> i915_hotplug_work_func() calls intel_dp_detect(), which ends calling
> drm_get_edid() at some point. This function also does other stuff that
> is required by the compliance testing, such as the DPCD reads.
>
> Now there's a problem with using i915_hotplug_work_func(), which could
> the reason why you rejected it: it only happens after
> intel_dp_hpd_pulse(), which means that we only really do the EDID read
> after intel_dp_handle_test_request().
>
> I consider i915_hotplug_work_func() a fundamental part of our DP
> framework, and the DP compliance testing seems to be just ignoring its
> existence. So my idea for a solution here would be to make
> intel_dp_handle_test_request() run on its own delayed work function.
> It would wait for both i915_digport_work_func() and
> i915_hotplug_work_func() to finish, and only then it would do the
> normal processing. With this, we would be able to avoid the edid read
> on this patch, we would maybe be able to avoid at least part of patch
> 2, we would maybe be able to completely avoid patch 7, and then on
> patch 8 we would start touching intel_dp_get_edid() instead.
>
> I know this is sort of a fundamental change that is being requested a
> little late in the review process, and it can be frustrating, but this
> aspect of the code only recently changed (I was fine with the EDID
> reads just in the compliance testing function), and since the DP
> compliance code is quite complex, it took me a while to realize
> everything that's going on and what is the purpose of each piece. I
> also think that, since this idea will allow the compliance testing to
> take into consideration the work done by i915_hotplug_work_func(),
> compliance testing will better reflect the behavior that is actually
> done by the Kernel when DP devices are plugged/unplugged. And I did
> ask about those new EDID reads as soon as I started reviewing the
> patch that introduced them.
>
> Now, since I know how frustrating it is to have to change a
> significant portion of the code once again, I will leave to the
> maintainers the decision of whether the current proposed
> implementation is acceptable or if we want to make the DP compliance
> testing code take into consideration the work done by
> i915_hotplug_work_func(). I would also like to know your opinion on
> this. Maybe my idea just doesn't make sense because of something else
> I didn't realize :)
I don't think this is a good idea. The work loop aspect seems like a 
very complex solution solution to a problem that is relatively simple. 
In a discussion with Daniel, he indicated that adding a work loop is 
something to be avoided unless it's *really* necessary, as they are 
prone to race conditions. In this case, I just don't see that it's 
necessary. Additionally, you have been keen to note invasive solutions 
and this seems like a highly invasive solution especially in light of 
having a viable one right here.

This solution adds very little overhead along the HPD path and that 
overhead is a single read of the EDID for each event. To further address 
any concerns about performance, for V6 I've propagated the long_hpd flag 
forward into check_link_status so that the EDID read is only called for 
a hot plug event, not short pulses. Another plus to this solution is 
that if/when this needs to moved to userspace, this one is a lot easier 
to unwind and pull out than the work loop you're suggesting.

With respect to the user space solution, that's something worth 
investigating in the future. I have some concerns about doing that, the 
primary one being the ability to detect the corrupted header, but I 
think it's something to consider for a follow-on update.


>>          /* Try to read the source of the interrupt */
>>          if (intel_dp->dpcd[DP_DPCD_REV] >= 0x11 &&
>>              intel_dp_get_sink_irq(intel_dp, &sink_irq_vector)) {
>> --
>> 1.9.1
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
>
Paulo Zanoni April 15, 2015, 5:42 p.m. UTC | #4
2015-04-15 12:37 GMT-03:00 Todd Previte <tprevite@gmail.com>:
>
>
> On 4/14/2015 9:53 AM, Paulo Zanoni wrote:
>>
>> 2015-04-13 11:53 GMT-03:00 Todd Previte <tprevite@gmail.com>:
>>>
>>> Adds in an EDID read after the DPCD read to accommodate test 4.2.2.1 in
>>> the
>>> Displayport Link CTS Core 1.2 rev1.1. This test requires an EDID read for
>>> all HPD plug events. To reduce the amount of code, this EDID read is also
>>> used for Link CTS tests 4.2.2.3, 4.2.2.4, 4.2.2.5 and 4.2.2.6. Actual
>>> support for these tests is implemented in later patches in this series.
>>>
>>> V2:
>>> - Fixed compilation error introduced during rework
>>>
>>> Signed-off-by: Todd Previte <tprevite@gmail.com>
>>> ---
>>>   drivers/gpu/drm/i915/intel_dp.c | 11 +++++++++++
>>>   1 file changed, 11 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_dp.c
>>> b/drivers/gpu/drm/i915/intel_dp.c
>>> index 23184b0..75df3e2 100644
>>> --- a/drivers/gpu/drm/i915/intel_dp.c
>>> +++ b/drivers/gpu/drm/i915/intel_dp.c
>>> @@ -3890,6 +3890,9 @@ intel_dp_check_link_status(struct intel_dp
>>> *intel_dp)
>>>   {
>>>          struct drm_device *dev = intel_dp_to_dev(intel_dp);
>>>          struct intel_encoder *intel_encoder =
>>> &dp_to_dig_port(intel_dp)->base;
>>> +       struct drm_connector *connector =
>>> &intel_dp->attached_connector->base;
>>> +       struct i2c_adapter *adapter = &intel_dp->aux.ddc;
>>> +       struct edid *edid_read = NULL;
>>>          u8 sink_irq_vector;
>>>          u8 link_status[DP_LINK_STATUS_SIZE];
>>>
>>> @@ -3906,6 +3909,14 @@ intel_dp_check_link_status(struct intel_dp
>>> *intel_dp)
>>>                  return;
>>>          }
>>>
>>> +       /* Displayport Link CTS Core 1.2 rev1.1 EDID testing
>>> +        * 4.2.2.1 - EDID read required for all HPD events
>>> +         */
>>> +        edid_read = drm_get_edid(connector, adapter);
>>> +        if (!edid_read) {
>>> +                DRM_DEBUG_DRIVER("Invalid EDID detected\n");
>>> +        }
>>> +
>>
>> We already briefly discussed this patch in private, so I'm going to
>> summarize the discussion and also add some more points here.
>>
>> Frist, the actual detailed review: the indentation here is using
>> spaces and we're leaking the EDID. This will cause rebases to a few of
>> the next patches.
>>
>> Back to the hight level architecture: your initial versions of the
>> series contained just 1 extra EDID read, and it was contained inside
>> the compliance testing function. Then the versions submitted a few
>> days ago had 2 extra EDID reads, then after some discussion you
>> reduced to 1 extra EDID read (the one on this patch). I previously
>> asked "But what about the automatic EDID read we do when we get a
>> hotplug? Can't we just rely on it?". I got some answers to the
>> question, but I was not really convinced.
>>
>> Yesterday I was arguing that this extra EDID read is going to add a
>> small delay to every hotplug event we get, so my initial suggestion
>> was to organize the compliance testing in a way that would require the
>> user space program to call the GetResources() IOCTL to force the EDID
>> when needed. Your argument was that then the DP compliance testing
>> procedure would be testing our app for compliance, not the Kernel.
>>
>> But today I decided to finally do some debugging regarding this, and I
>> was able to confirm that we do follow the DP requirements: we do have
>> an automatic EDID read done by the Kernel whenever we do a hotplug:
>> i915_hotplug_work_func() calls intel_dp_detect(), which ends calling
>> drm_get_edid() at some point. This function also does other stuff that
>> is required by the compliance testing, such as the DPCD reads.
>>
>> Now there's a problem with using i915_hotplug_work_func(), which could
>> the reason why you rejected it: it only happens after
>> intel_dp_hpd_pulse(), which means that we only really do the EDID read
>> after intel_dp_handle_test_request().
>>
>> I consider i915_hotplug_work_func() a fundamental part of our DP
>> framework, and the DP compliance testing seems to be just ignoring its
>> existence. So my idea for a solution here would be to make
>> intel_dp_handle_test_request() run on its own delayed work function.
>> It would wait for both i915_digport_work_func() and
>> i915_hotplug_work_func() to finish, and only then it would do the
>> normal processing. With this, we would be able to avoid the edid read
>> on this patch, we would maybe be able to avoid at least part of patch
>> 2, we would maybe be able to completely avoid patch 7, and then on
>> patch 8 we would start touching intel_dp_get_edid() instead.
>>
>> I know this is sort of a fundamental change that is being requested a
>> little late in the review process, and it can be frustrating, but this
>> aspect of the code only recently changed (I was fine with the EDID
>> reads just in the compliance testing function), and since the DP
>> compliance code is quite complex, it took me a while to realize
>> everything that's going on and what is the purpose of each piece. I
>> also think that, since this idea will allow the compliance testing to
>> take into consideration the work done by i915_hotplug_work_func(),
>> compliance testing will better reflect the behavior that is actually
>> done by the Kernel when DP devices are plugged/unplugged. And I did
>> ask about those new EDID reads as soon as I started reviewing the
>> patch that introduced them.
>>
>> Now, since I know how frustrating it is to have to change a
>> significant portion of the code once again, I will leave to the
>> maintainers the decision of whether the current proposed
>> implementation is acceptable or if we want to make the DP compliance
>> testing code take into consideration the work done by
>> i915_hotplug_work_func(). I would also like to know your opinion on
>> this. Maybe my idea just doesn't make sense because of something else
>> I didn't realize :)
>
> I don't think this is a good idea. The work loop aspect seems like a very
> complex solution solution to a problem that is relatively simple. In a
> discussion with Daniel, he indicated that adding a work loop is something to
> be avoided unless it's *really* necessary, as they are prone to race
> conditions. In this case, I just don't see that it's necessary.

The workqueue thing was just an idea to implement a solution for the
real problem. I think we should be focusing about discussing the fact
that we're not taking i915_hotplug_work_func() into consideration when
doing the compliance testing, not on the fact that one of the possible
implementations could use a workqueue. I'd still like to hear your
arguments on that.


> Additionally, you have been keen to note invasive solutions and this seems
> like a highly invasive solution especially in light of having a viable one
> right here.
>
> This solution adds very little overhead along the HPD path and that overhead
> is a single read of the EDID for each event. To further address any concerns
> about performance, for V6 I've propagated the long_hpd flag forward into
> check_link_status so that the EDID read is only called for a hot plug event,
> not short pulses. Another plus to this solution is that if/when this needs
> to moved to userspace, this one is a lot easier to unwind and pull out than
> the work loop you're suggesting.
>
> With respect to the user space solution, that's something worth
> investigating in the future. I have some concerns about doing that, the
> primary one being the ability to detect the corrupted header, but I think
> it's something to consider for a follow-on update.
>
>
>
>>>          /* Try to read the source of the interrupt */
>>>          if (intel_dp->dpcd[DP_DPCD_REV] >= 0x11 &&
>>>              intel_dp_get_sink_irq(intel_dp, &sink_irq_vector)) {
>>> --
>>> 1.9.1
>>>
>>> _______________________________________________
>>> Intel-gfx mailing list
>>> Intel-gfx@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>
>>
>>
>
Todd Previte April 16, 2015, 3:41 p.m. UTC | #5
On 4/15/2015 10:42 AM, Paulo Zanoni wrote:
> 2015-04-15 12:37 GMT-03:00 Todd Previte <tprevite@gmail.com>:
>>
>> On 4/14/2015 9:53 AM, Paulo Zanoni wrote:
>>> 2015-04-13 11:53 GMT-03:00 Todd Previte <tprevite@gmail.com>:
>>>> Adds in an EDID read after the DPCD read to accommodate test 4.2.2.1 in
>>>> the
>>>> Displayport Link CTS Core 1.2 rev1.1. This test requires an EDID read for
>>>> all HPD plug events. To reduce the amount of code, this EDID read is also
>>>> used for Link CTS tests 4.2.2.3, 4.2.2.4, 4.2.2.5 and 4.2.2.6. Actual
>>>> support for these tests is implemented in later patches in this series.
>>>>
>>>> V2:
>>>> - Fixed compilation error introduced during rework
>>>>
>>>> Signed-off-by: Todd Previte <tprevite@gmail.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/intel_dp.c | 11 +++++++++++
>>>>    1 file changed, 11 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/intel_dp.c
>>>> b/drivers/gpu/drm/i915/intel_dp.c
>>>> index 23184b0..75df3e2 100644
>>>> --- a/drivers/gpu/drm/i915/intel_dp.c
>>>> +++ b/drivers/gpu/drm/i915/intel_dp.c
>>>> @@ -3890,6 +3890,9 @@ intel_dp_check_link_status(struct intel_dp
>>>> *intel_dp)
>>>>    {
>>>>           struct drm_device *dev = intel_dp_to_dev(intel_dp);
>>>>           struct intel_encoder *intel_encoder =
>>>> &dp_to_dig_port(intel_dp)->base;
>>>> +       struct drm_connector *connector =
>>>> &intel_dp->attached_connector->base;
>>>> +       struct i2c_adapter *adapter = &intel_dp->aux.ddc;
>>>> +       struct edid *edid_read = NULL;
>>>>           u8 sink_irq_vector;
>>>>           u8 link_status[DP_LINK_STATUS_SIZE];
>>>>
>>>> @@ -3906,6 +3909,14 @@ intel_dp_check_link_status(struct intel_dp
>>>> *intel_dp)
>>>>                   return;
>>>>           }
>>>>
>>>> +       /* Displayport Link CTS Core 1.2 rev1.1 EDID testing
>>>> +        * 4.2.2.1 - EDID read required for all HPD events
>>>> +         */
>>>> +        edid_read = drm_get_edid(connector, adapter);
>>>> +        if (!edid_read) {
>>>> +                DRM_DEBUG_DRIVER("Invalid EDID detected\n");
>>>> +        }
>>>> +
>>> We already briefly discussed this patch in private, so I'm going to
>>> summarize the discussion and also add some more points here.
>>>
>>> Frist, the actual detailed review: the indentation here is using
>>> spaces and we're leaking the EDID. This will cause rebases to a few of
>>> the next patches.
>>>
>>> Back to the hight level architecture: your initial versions of the
>>> series contained just 1 extra EDID read, and it was contained inside
>>> the compliance testing function. Then the versions submitted a few
>>> days ago had 2 extra EDID reads, then after some discussion you
>>> reduced to 1 extra EDID read (the one on this patch). I previously
>>> asked "But what about the automatic EDID read we do when we get a
>>> hotplug? Can't we just rely on it?". I got some answers to the
>>> question, but I was not really convinced.
>>>
>>> Yesterday I was arguing that this extra EDID read is going to add a
>>> small delay to every hotplug event we get, so my initial suggestion
>>> was to organize the compliance testing in a way that would require the
>>> user space program to call the GetResources() IOCTL to force the EDID
>>> when needed. Your argument was that then the DP compliance testing
>>> procedure would be testing our app for compliance, not the Kernel.
>>>
>>> But today I decided to finally do some debugging regarding this, and I
>>> was able to confirm that we do follow the DP requirements: we do have
>>> an automatic EDID read done by the Kernel whenever we do a hotplug:
>>> i915_hotplug_work_func() calls intel_dp_detect(), which ends calling
>>> drm_get_edid() at some point. This function also does other stuff that
>>> is required by the compliance testing, such as the DPCD reads.
>>>
>>> Now there's a problem with using i915_hotplug_work_func(), which could
>>> the reason why you rejected it: it only happens after
>>> intel_dp_hpd_pulse(), which means that we only really do the EDID read
>>> after intel_dp_handle_test_request().
>>>
>>> I consider i915_hotplug_work_func() a fundamental part of our DP
>>> framework, and the DP compliance testing seems to be just ignoring its
>>> existence. So my idea for a solution here would be to make
>>> intel_dp_handle_test_request() run on its own delayed work function.
>>> It would wait for both i915_digport_work_func() and
>>> i915_hotplug_work_func() to finish, and only then it would do the
>>> normal processing. With this, we would be able to avoid the edid read
>>> on this patch, we would maybe be able to avoid at least part of patch
>>> 2, we would maybe be able to completely avoid patch 7, and then on
>>> patch 8 we would start touching intel_dp_get_edid() instead.
>>>
>>> I know this is sort of a fundamental change that is being requested a
>>> little late in the review process, and it can be frustrating, but this
>>> aspect of the code only recently changed (I was fine with the EDID
>>> reads just in the compliance testing function), and since the DP
>>> compliance code is quite complex, it took me a while to realize
>>> everything that's going on and what is the purpose of each piece. I
>>> also think that, since this idea will allow the compliance testing to
>>> take into consideration the work done by i915_hotplug_work_func(),
>>> compliance testing will better reflect the behavior that is actually
>>> done by the Kernel when DP devices are plugged/unplugged. And I did
>>> ask about those new EDID reads as soon as I started reviewing the
>>> patch that introduced them.
>>>
>>> Now, since I know how frustrating it is to have to change a
>>> significant portion of the code once again, I will leave to the
>>> maintainers the decision of whether the current proposed
>>> implementation is acceptable or if we want to make the DP compliance
>>> testing code take into consideration the work done by
>>> i915_hotplug_work_func(). I would also like to know your opinion on
>>> this. Maybe my idea just doesn't make sense because of something else
>>> I didn't realize :)
>> I don't think this is a good idea. The work loop aspect seems like a very
>> complex solution solution to a problem that is relatively simple. In a
>> discussion with Daniel, he indicated that adding a work loop is something to
>> be avoided unless it's *really* necessary, as they are prone to race
>> conditions. In this case, I just don't see that it's necessary.
> The workqueue thing was just an idea to implement a solution for the
> real problem. I think we should be focusing about discussing the fact
> that we're not taking i915_hotplug_work_func() into consideration when
> doing the compliance testing, not on the fact that one of the possible
> implementations could use a workqueue. I'd still like to hear your
> arguments on that.
Fair enough.

So I've been looking into this and why the i915_hotplug_work_func wasn't 
part of this. It is, as you said, a relatively fundamental code path for 
Displayport through the driver. What I discovered was that this function 
is never called on HSW (my primary test vehicle), mainly because 
check_link_status() returns IRQ_HANDLED instead of IRQ_NONE. The work 
function for HSW is i915_digport_work_func, so when it gets the 
IRQ_HANDLED return code, it doesn't fall through to the legacy 
i915_hotplug_work_func handler. This is important because this handler 
calls intel_hpd_irq_event which is where the ->detect connector function 
is called. And intel_dp_detect() is where all the happy goodness for 
Displayport begins.

Up until I discovered this, I had mistakenly propagated that problem 
forward in to the SST case in intel_dp_hot_pulse() in patch 6 by 
returning IRQ_HANDLED instead of IRQ_NONE, which is what the code was 
doing for SST prior to patch 6. With this problem corrected (as it is in 
the latest update in patch set V6) the work functions are now called as 
they should be. The point being that this opens up the possibility of 
using elements along this path to pass compliance testing, thereby 
creating a more valid test case.

With this in mind, I am not opposed to using elements along that path to 
satisfy compliance requirements (that's the spirit of the tests, 
anyways) but as I've indicated, there are cases where we need to take 
special steps (like the edid_corrupt flag) in order to do the right 
things to pass the tests. I have concerns about trying to do that at 
this point, as it requires substantial rework to that code path that 
have a significant chance of breaking things. So to avoid that,  I 
propose that this patch be merged now so that a working solution is in 
place. This discussion should continue and we can decide where to put 
things in the hotplug_work path to satisfy the compliance requires over 
the course of some followup patches.

>> Additionally, you have been keen to note invasive solutions and this seems
>> like a highly invasive solution especially in light of having a viable one
>> right here.
>>
>> This solution adds very little overhead along the HPD path and that overhead
>> is a single read of the EDID for each event. To further address any concerns
>> about performance, for V6 I've propagated the long_hpd flag forward into
>> check_link_status so that the EDID read is only called for a hot plug event,
>> not short pulses. Another plus to this solution is that if/when this needs
>> to moved to userspace, this one is a lot easier to unwind and pull out than
>> the work loop you're suggesting.
>>
>> With respect to the user space solution, that's something worth
>> investigating in the future. I have some concerns about doing that, the
>> primary one being the ability to detect the corrupted header, but I think
>> it's something to consider for a follow-on update.
>>
>>
>>
>>>>           /* Try to read the source of the interrupt */
>>>>           if (intel_dp->dpcd[DP_DPCD_REV] >= 0x11 &&
>>>>               intel_dp_get_sink_irq(intel_dp, &sink_irq_vector)) {
>>>> --
>>>> 1.9.1
>>>>
>>>> _______________________________________________
>>>> Intel-gfx mailing list
>>>> Intel-gfx@lists.freedesktop.org
>>>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>>
>>>
>
>
Daniel Vetter April 16, 2015, 4:31 p.m. UTC | #6
On Thu, Apr 16, 2015 at 08:41:33AM -0700, Todd Previte wrote:
> On 4/15/2015 10:42 AM, Paulo Zanoni wrote:
> >2015-04-15 12:37 GMT-03:00 Todd Previte <tprevite@gmail.com>:
> >>On 4/14/2015 9:53 AM, Paulo Zanoni wrote:
> >>>2015-04-13 11:53 GMT-03:00 Todd Previte <tprevite@gmail.com>:
> >>>>Adds in an EDID read after the DPCD read to accommodate test 4.2.2.1 in
> >>>>the
> >>>>Displayport Link CTS Core 1.2 rev1.1. This test requires an EDID read for
> >>>>all HPD plug events. To reduce the amount of code, this EDID read is also
> >>>>used for Link CTS tests 4.2.2.3, 4.2.2.4, 4.2.2.5 and 4.2.2.6. Actual
> >>>>support for these tests is implemented in later patches in this series.
> >>>>
> >>>>V2:
> >>>>- Fixed compilation error introduced during rework
> >>>>
> >>>>Signed-off-by: Todd Previte <tprevite@gmail.com>
> >>>>---
> >>>>   drivers/gpu/drm/i915/intel_dp.c | 11 +++++++++++
> >>>>   1 file changed, 11 insertions(+)
> >>>>
> >>>>diff --git a/drivers/gpu/drm/i915/intel_dp.c
> >>>>b/drivers/gpu/drm/i915/intel_dp.c
> >>>>index 23184b0..75df3e2 100644
> >>>>--- a/drivers/gpu/drm/i915/intel_dp.c
> >>>>+++ b/drivers/gpu/drm/i915/intel_dp.c
> >>>>@@ -3890,6 +3890,9 @@ intel_dp_check_link_status(struct intel_dp
> >>>>*intel_dp)
> >>>>   {
> >>>>          struct drm_device *dev = intel_dp_to_dev(intel_dp);
> >>>>          struct intel_encoder *intel_encoder =
> >>>>&dp_to_dig_port(intel_dp)->base;
> >>>>+       struct drm_connector *connector =
> >>>>&intel_dp->attached_connector->base;
> >>>>+       struct i2c_adapter *adapter = &intel_dp->aux.ddc;
> >>>>+       struct edid *edid_read = NULL;
> >>>>          u8 sink_irq_vector;
> >>>>          u8 link_status[DP_LINK_STATUS_SIZE];
> >>>>
> >>>>@@ -3906,6 +3909,14 @@ intel_dp_check_link_status(struct intel_dp
> >>>>*intel_dp)
> >>>>                  return;
> >>>>          }
> >>>>
> >>>>+       /* Displayport Link CTS Core 1.2 rev1.1 EDID testing
> >>>>+        * 4.2.2.1 - EDID read required for all HPD events
> >>>>+         */
> >>>>+        edid_read = drm_get_edid(connector, adapter);
> >>>>+        if (!edid_read) {
> >>>>+                DRM_DEBUG_DRIVER("Invalid EDID detected\n");
> >>>>+        }
> >>>>+
> >>>We already briefly discussed this patch in private, so I'm going to
> >>>summarize the discussion and also add some more points here.
> >>>
> >>>Frist, the actual detailed review: the indentation here is using
> >>>spaces and we're leaking the EDID. This will cause rebases to a few of
> >>>the next patches.
> >>>
> >>>Back to the hight level architecture: your initial versions of the
> >>>series contained just 1 extra EDID read, and it was contained inside
> >>>the compliance testing function. Then the versions submitted a few
> >>>days ago had 2 extra EDID reads, then after some discussion you
> >>>reduced to 1 extra EDID read (the one on this patch). I previously
> >>>asked "But what about the automatic EDID read we do when we get a
> >>>hotplug? Can't we just rely on it?". I got some answers to the
> >>>question, but I was not really convinced.
> >>>
> >>>Yesterday I was arguing that this extra EDID read is going to add a
> >>>small delay to every hotplug event we get, so my initial suggestion
> >>>was to organize the compliance testing in a way that would require the
> >>>user space program to call the GetResources() IOCTL to force the EDID
> >>>when needed. Your argument was that then the DP compliance testing
> >>>procedure would be testing our app for compliance, not the Kernel.
> >>>
> >>>But today I decided to finally do some debugging regarding this, and I
> >>>was able to confirm that we do follow the DP requirements: we do have
> >>>an automatic EDID read done by the Kernel whenever we do a hotplug:
> >>>i915_hotplug_work_func() calls intel_dp_detect(), which ends calling
> >>>drm_get_edid() at some point. This function also does other stuff that
> >>>is required by the compliance testing, such as the DPCD reads.
> >>>
> >>>Now there's a problem with using i915_hotplug_work_func(), which could
> >>>the reason why you rejected it: it only happens after
> >>>intel_dp_hpd_pulse(), which means that we only really do the EDID read
> >>>after intel_dp_handle_test_request().
> >>>
> >>>I consider i915_hotplug_work_func() a fundamental part of our DP
> >>>framework, and the DP compliance testing seems to be just ignoring its
> >>>existence. So my idea for a solution here would be to make
> >>>intel_dp_handle_test_request() run on its own delayed work function.
> >>>It would wait for both i915_digport_work_func() and
> >>>i915_hotplug_work_func() to finish, and only then it would do the
> >>>normal processing. With this, we would be able to avoid the edid read
> >>>on this patch, we would maybe be able to avoid at least part of patch
> >>>2, we would maybe be able to completely avoid patch 7, and then on
> >>>patch 8 we would start touching intel_dp_get_edid() instead.
> >>>
> >>>I know this is sort of a fundamental change that is being requested a
> >>>little late in the review process, and it can be frustrating, but this
> >>>aspect of the code only recently changed (I was fine with the EDID
> >>>reads just in the compliance testing function), and since the DP
> >>>compliance code is quite complex, it took me a while to realize
> >>>everything that's going on and what is the purpose of each piece. I
> >>>also think that, since this idea will allow the compliance testing to
> >>>take into consideration the work done by i915_hotplug_work_func(),
> >>>compliance testing will better reflect the behavior that is actually
> >>>done by the Kernel when DP devices are plugged/unplugged. And I did
> >>>ask about those new EDID reads as soon as I started reviewing the
> >>>patch that introduced them.
> >>>
> >>>Now, since I know how frustrating it is to have to change a
> >>>significant portion of the code once again, I will leave to the
> >>>maintainers the decision of whether the current proposed
> >>>implementation is acceptable or if we want to make the DP compliance
> >>>testing code take into consideration the work done by
> >>>i915_hotplug_work_func(). I would also like to know your opinion on
> >>>this. Maybe my idea just doesn't make sense because of something else
> >>>I didn't realize :)
> >>I don't think this is a good idea. The work loop aspect seems like a very
> >>complex solution solution to a problem that is relatively simple. In a
> >>discussion with Daniel, he indicated that adding a work loop is something to
> >>be avoided unless it's *really* necessary, as they are prone to race
> >>conditions. In this case, I just don't see that it's necessary.
> >The workqueue thing was just an idea to implement a solution for the
> >real problem. I think we should be focusing about discussing the fact
> >that we're not taking i915_hotplug_work_func() into consideration when
> >doing the compliance testing, not on the fact that one of the possible
> >implementations could use a workqueue. I'd still like to hear your
> >arguments on that.
> Fair enough.
> 
> So I've been looking into this and why the i915_hotplug_work_func wasn't
> part of this. It is, as you said, a relatively fundamental code path for
> Displayport through the driver. What I discovered was that this function is
> never called on HSW (my primary test vehicle), mainly because
> check_link_status() returns IRQ_HANDLED instead of IRQ_NONE. The work
> function for HSW is i915_digport_work_func, so when it gets the IRQ_HANDLED
> return code, it doesn't fall through to the legacy i915_hotplug_work_func
> handler. This is important because this handler calls intel_hpd_irq_event
> which is where the ->detect connector function is called. And
> intel_dp_detect() is where all the happy goodness for Displayport begins.
> 
> Up until I discovered this, I had mistakenly propagated that problem forward
> in to the SST case in intel_dp_hot_pulse() in patch 6 by returning
> IRQ_HANDLED instead of IRQ_NONE, which is what the code was doing for SST
> prior to patch 6. With this problem corrected (as it is in the latest update
> in patch set V6) the work functions are now called as they should be. The
> point being that this opens up the possibility of using elements along this
> path to pass compliance testing, thereby creating a more valid test case.
> 
> With this in mind, I am not opposed to using elements along that path to
> satisfy compliance requirements (that's the spirit of the tests, anyways)
> but as I've indicated, there are cases where we need to take special steps
> (like the edid_corrupt flag) in order to do the right things to pass the
> tests. I have concerns about trying to do that at this point, as it requires
> substantial rework to that code path that have a significant chance of
> breaking things. So to avoid that,  I propose that this patch be merged now
> so that a working solution is in place. This discussion should continue and
> we can decide where to put things in the hotplug_work path to satisfy the
> compliance requires over the course of some followup patches.

I've looked a bit at all this and I think the other issue here is the
placement of intel_dp_handle_test_request in check_link_status. This has
been done almost 4 years ago in

commit a60f0e38d72a5e24085d6e7e27a4cadc20ae268a
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Thu Oct 20 15:09:17 2011 -0700

    drm/i915: add DP test request handling

but never contained (up to this point) any functional code. It was however
dutifully moved around together with the other code. And way back the
placement even made some sense - check_link_status was called
unconditionally from our hot_plug handler. But since the MST rework (and a
few other things) happened that has been changed pretty radically and the
current place where the test request handler is called just doesn't seem
to make that much sense any more.

While I started pulling in patches I also noticed other places where we
duplicate existing logic (e.g. the dpcd read), so this isn't just about
the edid.

The other aspect here is that nowadays we do cache the edid for dp ports
aggressively, which means if we don't read the edid the kernel will indeed
keep on using a stale one. Hence there's a good risk that we don't just
have a minor piece of duct-tape to keep the somewhat strange expectations
of DP compliance testers happy but might be hiding a real bug in our DP
code. Giving how many we've had that seems fairly likely and I'm not happy
with sweeping this under the rug. This definitely needs a solid
analysis and explanation either way (i.e. whether this is a bug or just an
overly strict dp compliance tester requirement).

Finally there's the dp hotplug handling. Ever since MST support was merged
this has a been a lot of fun and took us a while to make it work correctly
- lots of deadlocks and other issues. And given the above we still seem to
have regressions due to MST support, or at least evidence for such. Since
this is a fairly fragile piece of code, which is also not that well-tested
(we don't have any MST hw in our test matrix yet) I prefer to keep changes
to a minimum. Merging these patches here first and then potentially
undoing them again because the bug has been the (mis)placement of the test
request handler in the MST patches feels too risky.

Given all that I'd like to hold off merging these patches that rework the
code around the check_link_status function until we have clarity here.

I've pulled in the other patches meanwhile which are reviewed and ready. I
also think we can pull in the drm_edid.c patch with the statistics code we
need for compliance testing ahead of resolving the above opens. And it
would be good to get some feedback from other (non-intel) drm developers
beforehand, the changes are quite invasive in some parts.

I've chatted a lot with Todd and Paulo and I think my decision here and
the rough plan laid out is the best choice I have from a technical point
of view.

Thanks, Daniel
Todd Previte April 16, 2015, 5:32 p.m. UTC | #7
On 4/16/2015 9:31 AM, Daniel Vetter wrote:
> On Thu, Apr 16, 2015 at 08:41:33AM -0700, Todd Previte wrote:
>> On 4/15/2015 10:42 AM, Paulo Zanoni wrote:
>>> 2015-04-15 12:37 GMT-03:00 Todd Previte <tprevite@gmail.com>:
>>>> On 4/14/2015 9:53 AM, Paulo Zanoni wrote:
>>>>> 2015-04-13 11:53 GMT-03:00 Todd Previte <tprevite@gmail.com>:
>>>>>> Adds in an EDID read after the DPCD read to accommodate test 4.2.2.1 in
>>>>>> the
>>>>>> Displayport Link CTS Core 1.2 rev1.1. This test requires an EDID read for
>>>>>> all HPD plug events. To reduce the amount of code, this EDID read is also
>>>>>> used for Link CTS tests 4.2.2.3, 4.2.2.4, 4.2.2.5 and 4.2.2.6. Actual
>>>>>> support for these tests is implemented in later patches in this series.
>>>>>>
>>>>>> V2:
>>>>>> - Fixed compilation error introduced during rework
>>>>>>
>>>>>> Signed-off-by: Todd Previte <tprevite@gmail.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/i915/intel_dp.c | 11 +++++++++++
>>>>>>    1 file changed, 11 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/intel_dp.c
>>>>>> b/drivers/gpu/drm/i915/intel_dp.c
>>>>>> index 23184b0..75df3e2 100644
>>>>>> --- a/drivers/gpu/drm/i915/intel_dp.c
>>>>>> +++ b/drivers/gpu/drm/i915/intel_dp.c
>>>>>> @@ -3890,6 +3890,9 @@ intel_dp_check_link_status(struct intel_dp
>>>>>> *intel_dp)
>>>>>>    {
>>>>>>           struct drm_device *dev = intel_dp_to_dev(intel_dp);
>>>>>>           struct intel_encoder *intel_encoder =
>>>>>> &dp_to_dig_port(intel_dp)->base;
>>>>>> +       struct drm_connector *connector =
>>>>>> &intel_dp->attached_connector->base;
>>>>>> +       struct i2c_adapter *adapter = &intel_dp->aux.ddc;
>>>>>> +       struct edid *edid_read = NULL;
>>>>>>           u8 sink_irq_vector;
>>>>>>           u8 link_status[DP_LINK_STATUS_SIZE];
>>>>>>
>>>>>> @@ -3906,6 +3909,14 @@ intel_dp_check_link_status(struct intel_dp
>>>>>> *intel_dp)
>>>>>>                   return;
>>>>>>           }
>>>>>>
>>>>>> +       /* Displayport Link CTS Core 1.2 rev1.1 EDID testing
>>>>>> +        * 4.2.2.1 - EDID read required for all HPD events
>>>>>> +         */
>>>>>> +        edid_read = drm_get_edid(connector, adapter);
>>>>>> +        if (!edid_read) {
>>>>>> +                DRM_DEBUG_DRIVER("Invalid EDID detected\n");
>>>>>> +        }
>>>>>> +
>>>>> We already briefly discussed this patch in private, so I'm going to
>>>>> summarize the discussion and also add some more points here.
>>>>>
>>>>> Frist, the actual detailed review: the indentation here is using
>>>>> spaces and we're leaking the EDID. This will cause rebases to a few of
>>>>> the next patches.
>>>>>
>>>>> Back to the hight level architecture: your initial versions of the
>>>>> series contained just 1 extra EDID read, and it was contained inside
>>>>> the compliance testing function. Then the versions submitted a few
>>>>> days ago had 2 extra EDID reads, then after some discussion you
>>>>> reduced to 1 extra EDID read (the one on this patch). I previously
>>>>> asked "But what about the automatic EDID read we do when we get a
>>>>> hotplug? Can't we just rely on it?". I got some answers to the
>>>>> question, but I was not really convinced.
>>>>>
>>>>> Yesterday I was arguing that this extra EDID read is going to add a
>>>>> small delay to every hotplug event we get, so my initial suggestion
>>>>> was to organize the compliance testing in a way that would require the
>>>>> user space program to call the GetResources() IOCTL to force the EDID
>>>>> when needed. Your argument was that then the DP compliance testing
>>>>> procedure would be testing our app for compliance, not the Kernel.
>>>>>
>>>>> But today I decided to finally do some debugging regarding this, and I
>>>>> was able to confirm that we do follow the DP requirements: we do have
>>>>> an automatic EDID read done by the Kernel whenever we do a hotplug:
>>>>> i915_hotplug_work_func() calls intel_dp_detect(), which ends calling
>>>>> drm_get_edid() at some point. This function also does other stuff that
>>>>> is required by the compliance testing, such as the DPCD reads.
>>>>>
>>>>> Now there's a problem with using i915_hotplug_work_func(), which could
>>>>> the reason why you rejected it: it only happens after
>>>>> intel_dp_hpd_pulse(), which means that we only really do the EDID read
>>>>> after intel_dp_handle_test_request().
>>>>>
>>>>> I consider i915_hotplug_work_func() a fundamental part of our DP
>>>>> framework, and the DP compliance testing seems to be just ignoring its
>>>>> existence. So my idea for a solution here would be to make
>>>>> intel_dp_handle_test_request() run on its own delayed work function.
>>>>> It would wait for both i915_digport_work_func() and
>>>>> i915_hotplug_work_func() to finish, and only then it would do the
>>>>> normal processing. With this, we would be able to avoid the edid read
>>>>> on this patch, we would maybe be able to avoid at least part of patch
>>>>> 2, we would maybe be able to completely avoid patch 7, and then on
>>>>> patch 8 we would start touching intel_dp_get_edid() instead.
>>>>>
>>>>> I know this is sort of a fundamental change that is being requested a
>>>>> little late in the review process, and it can be frustrating, but this
>>>>> aspect of the code only recently changed (I was fine with the EDID
>>>>> reads just in the compliance testing function), and since the DP
>>>>> compliance code is quite complex, it took me a while to realize
>>>>> everything that's going on and what is the purpose of each piece. I
>>>>> also think that, since this idea will allow the compliance testing to
>>>>> take into consideration the work done by i915_hotplug_work_func(),
>>>>> compliance testing will better reflect the behavior that is actually
>>>>> done by the Kernel when DP devices are plugged/unplugged. And I did
>>>>> ask about those new EDID reads as soon as I started reviewing the
>>>>> patch that introduced them.
>>>>>
>>>>> Now, since I know how frustrating it is to have to change a
>>>>> significant portion of the code once again, I will leave to the
>>>>> maintainers the decision of whether the current proposed
>>>>> implementation is acceptable or if we want to make the DP compliance
>>>>> testing code take into consideration the work done by
>>>>> i915_hotplug_work_func(). I would also like to know your opinion on
>>>>> this. Maybe my idea just doesn't make sense because of something else
>>>>> I didn't realize :)
>>>> I don't think this is a good idea. The work loop aspect seems like a very
>>>> complex solution solution to a problem that is relatively simple. In a
>>>> discussion with Daniel, he indicated that adding a work loop is something to
>>>> be avoided unless it's *really* necessary, as they are prone to race
>>>> conditions. In this case, I just don't see that it's necessary.
>>> The workqueue thing was just an idea to implement a solution for the
>>> real problem. I think we should be focusing about discussing the fact
>>> that we're not taking i915_hotplug_work_func() into consideration when
>>> doing the compliance testing, not on the fact that one of the possible
>>> implementations could use a workqueue. I'd still like to hear your
>>> arguments on that.
>> Fair enough.
>>
>> So I've been looking into this and why the i915_hotplug_work_func wasn't
>> part of this. It is, as you said, a relatively fundamental code path for
>> Displayport through the driver. What I discovered was that this function is
>> never called on HSW (my primary test vehicle), mainly because
>> check_link_status() returns IRQ_HANDLED instead of IRQ_NONE. The work
>> function for HSW is i915_digport_work_func, so when it gets the IRQ_HANDLED
>> return code, it doesn't fall through to the legacy i915_hotplug_work_func
>> handler. This is important because this handler calls intel_hpd_irq_event
>> which is where the ->detect connector function is called. And
>> intel_dp_detect() is where all the happy goodness for Displayport begins.
>>
>> Up until I discovered this, I had mistakenly propagated that problem forward
>> in to the SST case in intel_dp_hot_pulse() in patch 6 by returning
>> IRQ_HANDLED instead of IRQ_NONE, which is what the code was doing for SST
>> prior to patch 6. With this problem corrected (as it is in the latest update
>> in patch set V6) the work functions are now called as they should be. The
>> point being that this opens up the possibility of using elements along this
>> path to pass compliance testing, thereby creating a more valid test case.
>>
>> With this in mind, I am not opposed to using elements along that path to
>> satisfy compliance requirements (that's the spirit of the tests, anyways)
>> but as I've indicated, there are cases where we need to take special steps
>> (like the edid_corrupt flag) in order to do the right things to pass the
>> tests. I have concerns about trying to do that at this point, as it requires
>> substantial rework to that code path that have a significant chance of
>> breaking things. So to avoid that,  I propose that this patch be merged now
>> so that a working solution is in place. This discussion should continue and
>> we can decide where to put things in the hotplug_work path to satisfy the
>> compliance requires over the course of some followup patches.
> I've looked a bit at all this and I think the other issue here is the
> placement of intel_dp_handle_test_request in check_link_status. This has
> been done almost 4 years ago in
>
> commit a60f0e38d72a5e24085d6e7e27a4cadc20ae268a
> Author: Jesse Barnes <jbarnes@virtuousgeek.org>
> Date:   Thu Oct 20 15:09:17 2011 -0700
>
>      drm/i915: add DP test request handling
>
> but never contained (up to this point) any functional code. It was however
> dutifully moved around together with the other code. And way back the
> placement even made some sense - check_link_status was called
> unconditionally from our hot_plug handler. But since the MST rework (and a
> few other things) happened that has been changed pretty radically and the
> current place where the test request handler is called just doesn't seem
> to make that much sense any more.
I will agree with you on this. In light of my earlier findings, I think 
that check_link_status may have become a superfluous function call, much 
like the old intel_dp_hot_plug() function. However, there is some code 
that needs to be carried forward into the normal work func path to make 
compliance testing work properly. Those things are as follows:
     - A fresh EDID read early in the HPD process
     - Detection of a corrupted EDID at the same point
     - A check of the Displayport IRQ_VECTOR register to determine if a 
test request has occurred

There is also one change that must be made to intel_dp_hpd_pulse(), and 
this is to make sure that it returns IRQ_NONE for the SST case such that 
the hotplug_work_func gets invoked. I believe that is the key to 
restoring working status to this code path for the reasons I detailed 
above.

As an aside, I think the obsolete functions should be removed instead of 
remaining in the code base. This should be done as a separate patch so 
that it's clear and bisectable should that become an issue in the future.

> While I started pulling in patches I also noticed other places where we
> duplicate existing logic (e.g. the dpcd read), so this isn't just about
> the edid.
Paulo noticed this as well and I was addressing it as it came up in 
review. In the patch churn some of those changes may have been lost or 
reverted, but this is something I will continue to fix as I work out 
this code.
> The other aspect here is that nowadays we do cache the edid for dp ports
> aggressively, which means if we don't read the edid the kernel will indeed
> keep on using a stale one. Hence there's a good risk that we don't just
> have a minor piece of duct-tape to keep the somewhat strange expectations
> of DP compliance testers happy but might be hiding a real bug in our DP
> code. Giving how many we've had that seems fairly likely and I'm not happy
> with sweeping this under the rug. This definitely needs a solid
> analysis and explanation either way (i.e. whether this is a bug or just an
> overly strict dp compliance tester requirement).
I think this can be resolved by starting to rework this in 
intel_dp_detect(). The DPCD and EDID reads/updates happen here, so this 
is the most logical place to start integrating the code from 
check_link_status().  I believe it can be done such that it satisfies 
both real world operations and compliance testing, with a minimum of 
special cases to accommodate the test apparatus.

>
> Finally there's the dp hotplug handling. Ever since MST support was merged
> this has a been a lot of fun and took us a while to make it work correctly
> - lots of deadlocks and other issues. And given the above we still seem to
> have regressions due to MST support, or at least evidence for such. Since
> this is a fairly fragile piece of code, which is also not that well-tested
> (we don't have any MST hw in our test matrix yet) I prefer to keep changes
> to a minimum. Merging these patches here first and then potentially
> undoing them again because the bug has been the (mis)placement of the test
> request handler in the MST patches feels too risky.
I can understand your point here. As I indicated above, I think the only 
real change required is to ensure that IRQ_NONE is returned for the SST 
case such that the correct work functions are invoked.
> Given all that I'd like to hold off merging these patches that rework the
> code around the check_link_status function until we have clarity here.
>
> I've pulled in the other patches meanwhile which are reviewed and ready. I
> also think we can pull in the drm_edid.c patch with the statistics code we
> need for compliance testing ahead of resolving the above opens. And it
> would be good to get some feedback from other (non-intel) drm developers
> beforehand, the changes are quite invasive in some parts.
>
> I've chatted a lot with Todd and Paulo and I think my decision here and
> the rough plan laid out is the best choice I have from a technical point
> of view.
>
> Thanks, Daniel

Thank you for the feedback, Daniel. I think this discussion needs to be 
carried forward into the V6 of the patch set to keep things current 
there. I'm going to begin working on the solution for this now.
Todd Previte April 20, 2015, 5:10 a.m. UTC | #8
On 4/16/15 9:31 AM, Daniel Vetter wrote:
> On Thu, Apr 16, 2015 at 08:41:33AM -0700, Todd Previte wrote:
>> On 4/15/2015 10:42 AM, Paulo Zanoni wrote:
>>> 2015-04-15 12:37 GMT-03:00 Todd Previte <tprevite@gmail.com>:
>>>> On 4/14/2015 9:53 AM, Paulo Zanoni wrote:
>>>>> 2015-04-13 11:53 GMT-03:00 Todd Previte <tprevite@gmail.com>:
>>>>>> Adds in an EDID read after the DPCD read to accommodate test 4.2.2.1 in
>>>>>> the
>>>>>> Displayport Link CTS Core 1.2 rev1.1. This test requires an EDID read for
>>>>>> all HPD plug events. To reduce the amount of code, this EDID read is also
>>>>>> used for Link CTS tests 4.2.2.3, 4.2.2.4, 4.2.2.5 and 4.2.2.6. Actual
>>>>>> support for these tests is implemented in later patches in this series.
>>>>>>
>>>>>> V2:
>>>>>> - Fixed compilation error introduced during rework
>>>>>>
>>>>>> Signed-off-by: Todd Previte <tprevite@gmail.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/i915/intel_dp.c | 11 +++++++++++
>>>>>>    1 file changed, 11 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/intel_dp.c
>>>>>> b/drivers/gpu/drm/i915/intel_dp.c
>>>>>> index 23184b0..75df3e2 100644
>>>>>> --- a/drivers/gpu/drm/i915/intel_dp.c
>>>>>> +++ b/drivers/gpu/drm/i915/intel_dp.c
>>>>>> @@ -3890,6 +3890,9 @@ intel_dp_check_link_status(struct intel_dp
>>>>>> *intel_dp)
>>>>>>    {
>>>>>>           struct drm_device *dev = intel_dp_to_dev(intel_dp);
>>>>>>           struct intel_encoder *intel_encoder =
>>>>>> &dp_to_dig_port(intel_dp)->base;
>>>>>> +       struct drm_connector *connector =
>>>>>> &intel_dp->attached_connector->base;
>>>>>> +       struct i2c_adapter *adapter = &intel_dp->aux.ddc;
>>>>>> +       struct edid *edid_read = NULL;
>>>>>>           u8 sink_irq_vector;
>>>>>>           u8 link_status[DP_LINK_STATUS_SIZE];
>>>>>>
>>>>>> @@ -3906,6 +3909,14 @@ intel_dp_check_link_status(struct intel_dp
>>>>>> *intel_dp)
>>>>>>                   return;
>>>>>>           }
>>>>>>
>>>>>> +       /* Displayport Link CTS Core 1.2 rev1.1 EDID testing
>>>>>> +        * 4.2.2.1 - EDID read required for all HPD events
>>>>>> +         */
>>>>>> +        edid_read = drm_get_edid(connector, adapter);
>>>>>> +        if (!edid_read) {
>>>>>> +                DRM_DEBUG_DRIVER("Invalid EDID detected\n");
>>>>>> +        }
>>>>>> +
>>>>> We already briefly discussed this patch in private, so I'm going to
>>>>> summarize the discussion and also add some more points here.
>>>>>
>>>>> Frist, the actual detailed review: the indentation here is using
>>>>> spaces and we're leaking the EDID. This will cause rebases to a few of
>>>>> the next patches.
>>>>>
>>>>> Back to the hight level architecture: your initial versions of the
>>>>> series contained just 1 extra EDID read, and it was contained inside
>>>>> the compliance testing function. Then the versions submitted a few
>>>>> days ago had 2 extra EDID reads, then after some discussion you
>>>>> reduced to 1 extra EDID read (the one on this patch). I previously
>>>>> asked "But what about the automatic EDID read we do when we get a
>>>>> hotplug? Can't we just rely on it?". I got some answers to the
>>>>> question, but I was not really convinced.
>>>>>
>>>>> Yesterday I was arguing that this extra EDID read is going to add a
>>>>> small delay to every hotplug event we get, so my initial suggestion
>>>>> was to organize the compliance testing in a way that would require the
>>>>> user space program to call the GetResources() IOCTL to force the EDID
>>>>> when needed. Your argument was that then the DP compliance testing
>>>>> procedure would be testing our app for compliance, not the Kernel.
>>>>>
>>>>> But today I decided to finally do some debugging regarding this, and I
>>>>> was able to confirm that we do follow the DP requirements: we do have
>>>>> an automatic EDID read done by the Kernel whenever we do a hotplug:
>>>>> i915_hotplug_work_func() calls intel_dp_detect(), which ends calling
>>>>> drm_get_edid() at some point. This function also does other stuff that
>>>>> is required by the compliance testing, such as the DPCD reads.
>>>>>
>>>>> Now there's a problem with using i915_hotplug_work_func(), which could
>>>>> the reason why you rejected it: it only happens after
>>>>> intel_dp_hpd_pulse(), which means that we only really do the EDID read
>>>>> after intel_dp_handle_test_request().
>>>>>
>>>>> I consider i915_hotplug_work_func() a fundamental part of our DP
>>>>> framework, and the DP compliance testing seems to be just ignoring its
>>>>> existence. So my idea for a solution here would be to make
>>>>> intel_dp_handle_test_request() run on its own delayed work function.
>>>>> It would wait for both i915_digport_work_func() and
>>>>> i915_hotplug_work_func() to finish, and only then it would do the
>>>>> normal processing. With this, we would be able to avoid the edid read
>>>>> on this patch, we would maybe be able to avoid at least part of patch
>>>>> 2, we would maybe be able to completely avoid patch 7, and then on
>>>>> patch 8 we would start touching intel_dp_get_edid() instead.
>>>>>
>>>>> I know this is sort of a fundamental change that is being requested a
>>>>> little late in the review process, and it can be frustrating, but this
>>>>> aspect of the code only recently changed (I was fine with the EDID
>>>>> reads just in the compliance testing function), and since the DP
>>>>> compliance code is quite complex, it took me a while to realize
>>>>> everything that's going on and what is the purpose of each piece. I
>>>>> also think that, since this idea will allow the compliance testing to
>>>>> take into consideration the work done by i915_hotplug_work_func(),
>>>>> compliance testing will better reflect the behavior that is actually
>>>>> done by the Kernel when DP devices are plugged/unplugged. And I did
>>>>> ask about those new EDID reads as soon as I started reviewing the
>>>>> patch that introduced them.
>>>>>
>>>>> Now, since I know how frustrating it is to have to change a
>>>>> significant portion of the code once again, I will leave to the
>>>>> maintainers the decision of whether the current proposed
>>>>> implementation is acceptable or if we want to make the DP compliance
>>>>> testing code take into consideration the work done by
>>>>> i915_hotplug_work_func(). I would also like to know your opinion on
>>>>> this. Maybe my idea just doesn't make sense because of something else
>>>>> I didn't realize :)
>>>> I don't think this is a good idea. The work loop aspect seems like a very
>>>> complex solution solution to a problem that is relatively simple. In a
>>>> discussion with Daniel, he indicated that adding a work loop is something to
>>>> be avoided unless it's *really* necessary, as they are prone to race
>>>> conditions. In this case, I just don't see that it's necessary.
>>> The workqueue thing was just an idea to implement a solution for the
>>> real problem. I think we should be focusing about discussing the fact
>>> that we're not taking i915_hotplug_work_func() into consideration when
>>> doing the compliance testing, not on the fact that one of the possible
>>> implementations could use a workqueue. I'd still like to hear your
>>> arguments on that.
>> Fair enough.
>>
>> So I've been looking into this and why the i915_hotplug_work_func wasn't
>> part of this. It is, as you said, a relatively fundamental code path for
>> Displayport through the driver. What I discovered was that this function is
>> never called on HSW (my primary test vehicle), mainly because
>> check_link_status() returns IRQ_HANDLED instead of IRQ_NONE. The work
>> function for HSW is i915_digport_work_func, so when it gets the IRQ_HANDLED
>> return code, it doesn't fall through to the legacy i915_hotplug_work_func
>> handler. This is important because this handler calls intel_hpd_irq_event
>> which is where the ->detect connector function is called. And
>> intel_dp_detect() is where all the happy goodness for Displayport begins.
>>
>> Up until I discovered this, I had mistakenly propagated that problem forward
>> in to the SST case in intel_dp_hot_pulse() in patch 6 by returning
>> IRQ_HANDLED instead of IRQ_NONE, which is what the code was doing for SST
>> prior to patch 6. With this problem corrected (as it is in the latest update
>> in patch set V6) the work functions are now called as they should be. The
>> point being that this opens up the possibility of using elements along this
>> path to pass compliance testing, thereby creating a more valid test case.
>>
>> With this in mind, I am not opposed to using elements along that path to
>> satisfy compliance requirements (that's the spirit of the tests, anyways)
>> but as I've indicated, there are cases where we need to take special steps
>> (like the edid_corrupt flag) in order to do the right things to pass the
>> tests. I have concerns about trying to do that at this point, as it requires
>> substantial rework to that code path that have a significant chance of
>> breaking things. So to avoid that,  I propose that this patch be merged now
>> so that a working solution is in place. This discussion should continue and
>> we can decide where to put things in the hotplug_work path to satisfy the
>> compliance requires over the course of some followup patches.
> I've looked a bit at all this and I think the other issue here is the
> placement of intel_dp_handle_test_request in check_link_status. This has
> been done almost 4 years ago in
>
> commit a60f0e38d72a5e24085d6e7e27a4cadc20ae268a
> Author: Jesse Barnes <jbarnes@virtuousgeek.org>
> Date:   Thu Oct 20 15:09:17 2011 -0700
>
>      drm/i915: add DP test request handling
>
> but never contained (up to this point) any functional code. It was however
> dutifully moved around together with the other code. And way back the
> placement even made some sense - check_link_status was called
> unconditionally from our hot_plug handler. But since the MST rework (and a
> few other things) happened that has been changed pretty radically and the
> current place where the test request handler is called just doesn't seem
> to make that much sense any more.
Agreed. For V7, I moved it from here into intel_dp_detect. It's placed 
so that all the necessary operations to satisfy compliance testing 
happen before the test handler is called, so there's virtually no 
duplicated code at all now.
>
> While I started pulling in patches I also noticed other places where we
> duplicate existing logic (e.g. the dpcd read), so this isn't just about
> the edid.
>
> The other aspect here is that nowadays we do cache the edid for dp ports
> aggressively, which means if we don't read the edid the kernel will indeed
> keep on using a stale one. Hence there's a good risk that we don't just
> have a minor piece of duct-tape to keep the somewhat strange expectations
> of DP compliance testers happy but might be hiding a real bug in our DP
> code. Giving how many we've had that seems fairly likely and I'm not happy
> with sweeping this under the rug. This definitely needs a solid
> analysis and explanation either way (i.e. whether this is a bug or just an
> overly strict dp compliance tester requirement).
With the test handler being called from the regular hot plug path as 
above, this isn't really an issue anymore. All the EDID reads are real 
reads because they're occurring during a long pulse. And in that case, 
when intel_dp_detect is called, we unset the edid and read a fresh copy 
from the device. The compliance test handler uses also the real data 
from that instead of intermediate values as it did before.
> Finally there's the dp hotplug handling. Ever since MST support was merged
> this has a been a lot of fun and took us a while to make it work correctly
> - lots of deadlocks and other issues. And given the above we still seem to
> have regressions due to MST support, or at least evidence for such. Since
> this is a fairly fragile piece of code, which is also not that well-tested
> (we don't have any MST hw in our test matrix yet) I prefer to keep changes
> to a minimum. Merging these patches here first and then potentially
> undoing them again because the bug has been the (mis)placement of the test
> request handler in the MST patches feels too risky.
For V7, I removed all the changes to intel_dp_hpd_pulse as I found they 
were unnecessary once the test request code was moved into 
intel_dp_detect. The move of the test request handling from 
check_link_status to intel_dp_detect is the only code change on that 
path now.
> Given all that I'd like to hold off merging these patches that rework the
> code around the check_link_status function until we have clarity here.
>
> I've pulled in the other patches meanwhile which are reviewed and ready. I
> also think we can pull in the drm_edid.c patch with the statistics code we
> need for compliance testing ahead of resolving the above opens. And it
> would be good to get some feedback from other (non-intel) drm developers
> beforehand, the changes are quite invasive in some parts.
>
> I've chatted a lot with Todd and Paulo and I think my decision here and
> the rough plan laid out is the best choice I have from a technical point
> of view.
>
> Thanks, Daniel
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 23184b0..75df3e2 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -3890,6 +3890,9 @@  intel_dp_check_link_status(struct intel_dp *intel_dp)
 {
 	struct drm_device *dev = intel_dp_to_dev(intel_dp);
 	struct intel_encoder *intel_encoder = &dp_to_dig_port(intel_dp)->base;
+	struct drm_connector *connector = &intel_dp->attached_connector->base;
+	struct i2c_adapter *adapter = &intel_dp->aux.ddc;
+	struct edid *edid_read = NULL;
 	u8 sink_irq_vector;
 	u8 link_status[DP_LINK_STATUS_SIZE];
 
@@ -3906,6 +3909,14 @@  intel_dp_check_link_status(struct intel_dp *intel_dp)
 		return;
 	}
 
+	/* Displayport Link CTS Core 1.2 rev1.1 EDID testing
+	 * 4.2.2.1 - EDID read required for all HPD events
+         */
+        edid_read = drm_get_edid(connector, adapter);
+        if (!edid_read) {
+                DRM_DEBUG_DRIVER("Invalid EDID detected\n");
+        }
+
 	/* Try to read the source of the interrupt */
 	if (intel_dp->dpcd[DP_DPCD_REV] >= 0x11 &&
 	    intel_dp_get_sink_irq(intel_dp, &sink_irq_vector)) {