
drm/edid: limit printk when facing bad edid

Message ID 1344525951-5907-1-git-send-email-j.glisse@gmail.com (mailing list archive)
State New, archived

Commit Message

Jerome Glisse Aug. 9, 2012, 3:25 p.m. UTC
From: Jerome Glisse <jglisse@redhat.com>

Limit printing of bad EDID information to once per connector.
A connector that is connected to a bad monitor/KVM will likely
stay connected to the same bad monitor/KVM, so it makes no
sense to keep printing the bad EDID message.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
---
 drivers/gpu/drm/drm_edid.c      | 22 ++++++++++++++--------
 drivers/gpu/drm/drm_edid_load.c |  6 ++++--
 include/drm/drm_crtc.h          |  3 ++-
 3 files changed, 20 insertions(+), 11 deletions(-)

Comments

Adam Jackson Aug. 14, 2012, 2:54 p.m. UTC | #1
On 8/9/12 11:25 AM, j.glisse@gmail.com wrote:
> From: Jerome Glisse <jglisse@redhat.com>
>
> Limit printing bad edid information at one time per connector.
> Connector that are connected to a bad monitor/kvm will likely
> stay connected to the same bad monitor/kvm and it makes no
> sense to keep printing the bad edid message.
>
> Signed-off-by: Jerome Glisse <jglisse@redhat.com>

I guess.  I don't see why we don't just move it into DRM_DEBUG_KMS if 
we're going to suppress it, but this does what it says on the box.

Reviewed-by: Adam Jackson <ajax@redhat.com>

- ajax
Jerome Glisse Aug. 14, 2012, 3:01 p.m. UTC | #2
On Tue, Aug 14, 2012 at 10:54 AM, Adam Jackson <ajax@redhat.com> wrote:
> On 8/9/12 11:25 AM, j.glisse@gmail.com wrote:
>>
>> From: Jerome Glisse <jglisse@redhat.com>
>>
>> Limit printing bad edid information at one time per connector.
>> Connector that are connected to a bad monitor/kvm will likely
>> stay connected to the same bad monitor/kvm and it makes no
>> sense to keep printing the bad edid message.
>>
>> Signed-off-by: Jerome Glisse <jglisse@redhat.com>
>
>
> I guess.  I don't see why we don't just move it into DRM_DEBUG_KMS if we're
> going to suppress it, but this does what it says on the box.
>
> Reviewed-by: Adam Jackson <ajax@redhat.com>
>
> - ajax
>

I think there is still value in printing the bad EDID at least once.

Cheers,
Jerome
Jani Nikula Aug. 16, 2012, 3:13 p.m. UTC | #3
There's a bug [1] where the faster GMBUS transmissions fail with some
CRTs, and the fix [2] is to fallback to GPIO bit-banging upon errors. As
noted in the bug, the fix still leaves plenty of EDID dumps in dmesg, so
some measures to reduce the EDID error messages would be most welcome.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=45881
[2] http://thread.gmane.org/gmane.linux.kernel/1332810/focus=1341912

On Tue, 14 Aug 2012, Jerome Glisse <j.glisse@gmail.com> wrote:
> On Tue, Aug 14, 2012 at 10:54 AM, Adam Jackson <ajax@redhat.com> wrote:
>> On 8/9/12 11:25 AM, j.glisse@gmail.com wrote:
>>>
>>> From: Jerome Glisse <jglisse@redhat.com>
>>>
>>> Limit printing bad edid information at one time per connector.
>>> Connector that are connected to a bad monitor/kvm will likely
>>> stay connected to the same bad monitor/kvm and it makes no
>>> sense to keep printing the bad edid message.

Do I understand correctly that bad_edid_counter is only reset when you
reboot or reload the module? So if you have a laptop that you connect to
the monitor at home, the monitor at the office, the projector in the
meeting room, and to a TV somewhere else, etc, the message about bad
EDID will only be printed once? I don't think that's good. But please do
correct me if I'm wrong.

>>>
>>> Signed-off-by: Jerome Glisse <jglisse@redhat.com>
>>
>>
>> I guess.  I don't see why we don't just move it into DRM_DEBUG_KMS if we're
>> going to suppress it, but this does what it says on the box.
>>
>> Reviewed-by: Adam Jackson <ajax@redhat.com>
>>
>> - ajax
>>
>
> I think there is still value in getting at least once the bad edid.

I think the raw edid dumps should be DEBUG level no matter what. Perhaps
some of the other messages could use WARNING/DEBUG too. And with that,
and my comment above, I'm not sure there really needs to be all that logic
to count errors and act differently further on.


BR,
Jani.
Jerome Glisse Aug. 16, 2012, 4:07 p.m. UTC | #4
On Thu, Aug 16, 2012 at 11:13 AM, Jani Nikula
<jani.nikula@linux.intel.com> wrote:
>
> There's a bug [1] where the faster GMBUS transmissions fail with some
> CRTs, and the fix [2] is to fallback to GPIO bit-banging upon errors. As
> noted in the bug, the fix still leaves plenty of EDID dumps in dmesg, so
> some measures to reduce the EDID error messages would be most welcome.
>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=45881
> [2] http://thread.gmane.org/gmane.linux.kernel/1332810/focus=1341912
>
> On Tue, 14 Aug 2012, Jerome Glisse <j.glisse@gmail.com> wrote:
>> On Tue, Aug 14, 2012 at 10:54 AM, Adam Jackson <ajax@redhat.com> wrote:
>>> On 8/9/12 11:25 AM, j.glisse@gmail.com wrote:
>>>>
>>>> From: Jerome Glisse <jglisse@redhat.com>
>>>>
>>>> Limit printing bad edid information at one time per connector.
>>>> Connector that are connected to a bad monitor/kvm will likely
>>>> stay connected to the same bad monitor/kvm and it makes no
>>>> sense to keep printing the bad edid message.
>
> Do I understand correctly that bad_edid_counter is only reset when you
> reboot or reload the module? So if you have a laptop that you connect to
> the monitor at home, the monitor at the office, the projector in the
> meeting room, and to a TV somewhere else, etc, the message about bad
> EDID will only printed once? I don't think that's good. But please do
> correct me if I'm wrong.

I wanted to reset the counter any time the connector is connected to
something with a good EDID, but I did not do that in the end. I can do a
patch on top if you think it would be nicer. That way each device with a
bad EDID would be reported once, and as long as you don't repeatedly
alternate between good and bad EDID devices you would not get spam.
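
A minimal sketch of what that follow-up could look like, not code from
this patch: struct connector_model and note_edid_result() are
hypothetical names, and only the bad_edid_counter field comes from the
patch. A good read re-arms the one-time message; only the first failure
after that is reported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the one struct drm_connector field used here. */
struct connector_model {
	unsigned bad_edid_counter;
};

/*
 * Record the outcome of one EDID read and report whether a bad-EDID
 * message should be printed for it.
 */
static bool note_edid_result(struct connector_model *c, bool edid_ok)
{
	bool print;

	assert(c != NULL);

	if (edid_ok) {
		/* A good EDID re-arms the one-time message. */
		c->bad_edid_counter = 0;
		return false;
	}

	/* Only the first failure since the last good read is printed. */
	print = (c->bad_edid_counter == 0);
	c->bad_edid_counter++;
	return print;
}
```

With this shape, a connector that flips between a good and a bad device
on every probe would still print each time, which is the case mentioned
above.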

>>>>
>>>> Signed-off-by: Jerome Glisse <jglisse@redhat.com>
>>>
>>>
>>> I guess.  I don't see why we don't just move it into DRM_DEBUG_KMS if we're
>>> going to suppress it, but this does what it says on the box.
>>>
>>> Reviewed-by: Adam Jackson <ajax@redhat.com>
>>>
>>> - ajax
>>>
>>
>> I think there is still value in getting at least once the bad edid.
>
> I think the raw edid dumps should be DEBUG level no matter what. Perhaps
> some of the other messages could use WARNING/DEBUG too. And with that,
> and my comment above, I not sure there really needs to be all that logic
> to count errors and act differently further on.
>

No, I do think we want the bad EDID in the normal log at least once per
connector, and we definitely don't want to spam the log.

Cheers,
Jerome
Jani Nikula Aug. 17, 2012, 7:01 a.m. UTC | #5
On Thu, 16 Aug 2012, Jerome Glisse <j.glisse@gmail.com> wrote:
> On Thu, Aug 16, 2012 at 11:13 AM, Jani Nikula
> <jani.nikula@linux.intel.com> wrote:
>>
>> There's a bug [1] where the faster GMBUS transmissions fail with some
>> CRTs, and the fix [2] is to fallback to GPIO bit-banging upon errors. As
>> noted in the bug, the fix still leaves plenty of EDID dumps in dmesg, so
>> some measures to reduce the EDID error messages would be most welcome.
>>
>> [1] https://bugzilla.kernel.org/show_bug.cgi?id=45881
>> [2] http://thread.gmane.org/gmane.linux.kernel/1332810/focus=1341912
>>
>> On Tue, 14 Aug 2012, Jerome Glisse <j.glisse@gmail.com> wrote:
>>> On Tue, Aug 14, 2012 at 10:54 AM, Adam Jackson <ajax@redhat.com> wrote:
>>>> On 8/9/12 11:25 AM, j.glisse@gmail.com wrote:
>>>>>
>>>>> From: Jerome Glisse <jglisse@redhat.com>
>>>>>
>>>>> Limit printing bad edid information at one time per connector.
>>>>> Connector that are connected to a bad monitor/kvm will likely
>>>>> stay connected to the same bad monitor/kvm and it makes no
>>>>> sense to keep printing the bad edid message.
>>
>> Do I understand correctly that bad_edid_counter is only reset when you
>> reboot or reload the module? So if you have a laptop that you connect to
>> the monitor at home, the monitor at the office, the projector in the
>> meeting room, and to a TV somewhere else, etc, the message about bad
>> EDID will only printed once? I don't think that's good. But please do
>> correct me if I'm wrong.
>
> I wanted to reset the counter any time the connector is connected to
> something with good edid but i did not do that in the end. I can do a
> patch on top if you think it would be nicer. That way only thing with
> bad edid will be printed once and assuming you don't repeatly
> alternate btw good and bad edid device you would not get spam.

IMHO this is, with or without the additional patch, overengineering the
logic to print out error messages.

For me, as a developer, this would be annoying because, to debug an EDID
issue, I'd have to reload the module, connect to something with "good"
EDID in between, or patch this out, to repeat a problem. I might have to
ask a tester or a bug reporter to do the same.

Also, in the referenced bug, the first problem with GMBUS would disable
error messages, and the fallback retry with GPIO bit-banging would no
longer produce messages even if that failed too.

>
>>>>>
>>>>> Signed-off-by: Jerome Glisse <jglisse@redhat.com>
>>>>
>>>>
>>>> I guess.  I don't see why we don't just move it into DRM_DEBUG_KMS if we're
>>>> going to suppress it, but this does what it says on the box.
>>>>
>>>> Reviewed-by: Adam Jackson <ajax@redhat.com>
>>>>
>>>> - ajax
>>>>
>>>
>>> I think there is still value in getting at least once the bad edid.
>>
>> I think the raw edid dumps should be DEBUG level no matter what. Perhaps
>> some of the other messages could use WARNING/DEBUG too. And with that,
>> and my comment above, I not sure there really needs to be all that logic
>> to count errors and act differently further on.
>>
>
> No, i do think we want bad edid as normal log at least once per
> connector and we definitely don't want to spam bomb the log messages.

Well, at least we agree on silencing the dmesg here. ;)

I'd be happy with very simply adjusting the loglevel of the messages, so
that I can also very simply adjust the amount of messages I get. Or ask
a bug reporter to set drm.debug=0xe instead of trying to ensure they
follow a bunch of steps in between. But I'm not going to bikeshed
further if others think the patch here is the way to go.


BR,
Jani.
Jerome Glisse Aug. 17, 2012, 2:50 p.m. UTC | #6
On Fri, Aug 17, 2012 at 3:01 AM, Jani Nikula
<jani.nikula@linux.intel.com> wrote:
> On Thu, 16 Aug 2012, Jerome Glisse <j.glisse@gmail.com> wrote:
>> On Thu, Aug 16, 2012 at 11:13 AM, Jani Nikula
>> <jani.nikula@linux.intel.com> wrote:
>>>
>>> There's a bug [1] where the faster GMBUS transmissions fail with some
>>> CRTs, and the fix [2] is to fallback to GPIO bit-banging upon errors. As
>>> noted in the bug, the fix still leaves plenty of EDID dumps in dmesg, so
>>> some measures to reduce the EDID error messages would be most welcome.
>>>
>>> [1] https://bugzilla.kernel.org/show_bug.cgi?id=45881
>>> [2] http://thread.gmane.org/gmane.linux.kernel/1332810/focus=1341912
>>>
>>> On Tue, 14 Aug 2012, Jerome Glisse <j.glisse@gmail.com> wrote:
>>>> On Tue, Aug 14, 2012 at 10:54 AM, Adam Jackson <ajax@redhat.com> wrote:
>>>>> On 8/9/12 11:25 AM, j.glisse@gmail.com wrote:
>>>>>>
>>>>>> From: Jerome Glisse <jglisse@redhat.com>
>>>>>>
>>>>>> Limit printing bad edid information at one time per connector.
>>>>>> Connector that are connected to a bad monitor/kvm will likely
>>>>>> stay connected to the same bad monitor/kvm and it makes no
>>>>>> sense to keep printing the bad edid message.
>>>
>>> Do I understand correctly that bad_edid_counter is only reset when you
>>> reboot or reload the module? So if you have a laptop that you connect to
>>> the monitor at home, the monitor at the office, the projector in the
>>> meeting room, and to a TV somewhere else, etc, the message about bad
>>> EDID will only printed once? I don't think that's good. But please do
>>> correct me if I'm wrong.
>>
>> I wanted to reset the counter any time the connector is connected to
>> something with good edid but i did not do that in the end. I can do a
>> patch on top if you think it would be nicer. That way only thing with
>> bad edid will be printed once and assuming you don't repeatly
>> alternate btw good and bad edid device you would not get spam.
>
> IMHO this is, with or without the additional patch, overengineering the
> logic to print out error messages.
>
> For me, as a developer, this would be annoying because, to debug an EDID
> issue, I'd have to reload the module, connect to something with "good"
> EDID in between, or patch this out, to repeat a problem. I might have to
> ask a tester or a bug reporter to do the same.
>
> Also, in the referenced bug, the first problem with GMBUS would disable
> error messages, and the fallback retry with GPIO bit-banging would no
> longer produce messages even if that failed too.

If you set drm.debug=4, my patch will keep printing the bad EDID and
possibly spam your log, so it's not harder for developers. I restricted
the spamming to when someone is asking for debug output.
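
As a rough userspace model of that rule (a sketch only): the expression
in should_print_bad_edid() matches the one the patch evaluates on entry
to drm_do_get_edid(), while struct connector_state, UT_KMS and
failed_probe() are illustrative names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define UT_KMS 0x04 /* mirrors DRM_UT_KMS, the bit selected by drm.debug=4 */

/* Hypothetical stand-in for the connector's failure counter. */
struct connector_state {
	unsigned bad_edid_counter;
};

/* Print on the first failure, or always when KMS debugging is enabled. */
static bool should_print_bad_edid(const struct connector_state *c,
				  unsigned debug_mask)
{
	assert(c != NULL);
	return !c->bad_edid_counter || (debug_mask & UT_KMS);
}

/* One failed probe: decide on printing first, then count the failure. */
static bool failed_probe(struct connector_state *c, unsigned debug_mask)
{
	bool print = should_print_bad_edid(c, debug_mask);

	c->bad_edid_counter++;
	return print;
}
```

With the mask at 0, a second failed probe on the same connector stays
quiet, which is the GMBUS-then-GPIO case raised above; with UT_KMS set,
every failure is printed again.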

>>
>>>>>>
>>>>>> Signed-off-by: Jerome Glisse <jglisse@redhat.com>
>>>>>
>>>>>
>>>>> I guess.  I don't see why we don't just move it into DRM_DEBUG_KMS if we're
>>>>> going to suppress it, but this does what it says on the box.
>>>>>
>>>>> Reviewed-by: Adam Jackson <ajax@redhat.com>
>>>>>
>>>>> - ajax
>>>>>
>>>>
>>>> I think there is still value in getting at least once the bad edid.
>>>
>>> I think the raw edid dumps should be DEBUG level no matter what. Perhaps
>>> some of the other messages could use WARNING/DEBUG too. And with that,
>>> and my comment above, I not sure there really needs to be all that logic
>>> to count errors and act differently further on.
>>>
>>
>> No, i do think we want bad edid as normal log at least once per
>> connector and we definitely don't want to spam bomb the log messages.
>
> Well, at least we agree on silencing the dmesg here. ;)
>
> I'd be happy with very simply adjusting the loglevel of the messages, so
> that I can also very simply adjust the amount of messages I get. Or ask
> a bug reported to do drm.debug=0xe instead of trying to ensure they
> follow a bunch of steps in between. But I'm not going to bikeshed
> further if others think the patch here is the way to go.
>

The patch is aimed mostly at servers, where we don't want to spam the
error log but still want to be able to get the bad EDID without having
to ask for a reboot; avoiding reboots is very much appreciated on servers.

Cheers,
Jerome
Patch

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index a8743c3..7380ee3 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -158,7 +158,7 @@  MODULE_PARM_DESC(edid_fixup,
  * Sanity check the EDID block (base or extension).  Return 0 if the block
  * doesn't check out, or 1 if it's valid.
  */
-bool drm_edid_block_valid(u8 *raw_edid, int block)
+bool drm_edid_block_valid(u8 *raw_edid, int block, bool print_bad_edid)
 {
 	int i;
 	u8 csum = 0;
@@ -181,7 +181,9 @@  bool drm_edid_block_valid(u8 *raw_edid, int block)
 	for (i = 0; i < EDID_LENGTH; i++)
 		csum += raw_edid[i];
 	if (csum) {
-		DRM_ERROR("EDID checksum is invalid, remainder is %d\n", csum);
+		if (print_bad_edid) {
+			DRM_ERROR("EDID checksum is invalid, remainder is %d\n", csum);
+		}
 
 		/* allow CEA to slide through, switches mangle this */
 		if (raw_edid[0] != 0x02)
@@ -207,7 +209,7 @@  bool drm_edid_block_valid(u8 *raw_edid, int block)
 	return 1;
 
 bad:
-	if (raw_edid) {
+	if (raw_edid && print_bad_edid) {
 		printk(KERN_ERR "Raw EDID:\n");
 		print_hex_dump(KERN_ERR, " \t", DUMP_PREFIX_NONE, 16, 1,
 			       raw_edid, EDID_LENGTH, false);
@@ -231,7 +233,7 @@  bool drm_edid_is_valid(struct edid *edid)
 		return false;
 
 	for (i = 0; i <= edid->extensions; i++)
-		if (!drm_edid_block_valid(raw + i * EDID_LENGTH, i))
+		if (!drm_edid_block_valid(raw + i * EDID_LENGTH, i, true))
 			return false;
 
 	return true;
@@ -303,6 +305,7 @@  drm_do_get_edid(struct drm_connector *connector, struct i2c_adapter *adapter)
 {
 	int i, j = 0, valid_extensions = 0;
 	u8 *block, *new;
+	bool print_bad_edid = !connector->bad_edid_counter || (drm_debug & DRM_UT_KMS);
 
 	if ((block = kmalloc(EDID_LENGTH, GFP_KERNEL)) == NULL)
 		return NULL;
@@ -311,7 +314,7 @@  drm_do_get_edid(struct drm_connector *connector, struct i2c_adapter *adapter)
 	for (i = 0; i < 4; i++) {
 		if (drm_do_probe_ddc_edid(adapter, block, 0, EDID_LENGTH))
 			goto out;
-		if (drm_edid_block_valid(block, 0))
+		if (drm_edid_block_valid(block, 0, print_bad_edid))
 			break;
 		if (i == 0 && drm_edid_is_zero(block, EDID_LENGTH)) {
 			connector->null_edid_counter++;
@@ -336,7 +339,7 @@  drm_do_get_edid(struct drm_connector *connector, struct i2c_adapter *adapter)
 				  block + (valid_extensions + 1) * EDID_LENGTH,
 				  j, EDID_LENGTH))
 				goto out;
-			if (drm_edid_block_valid(block + (valid_extensions + 1) * EDID_LENGTH, j)) {
+			if (drm_edid_block_valid(block + (valid_extensions + 1) * EDID_LENGTH, j, print_bad_edid)) {
 				valid_extensions++;
 				break;
 			}
@@ -359,8 +362,11 @@  drm_do_get_edid(struct drm_connector *connector, struct i2c_adapter *adapter)
 	return block;
 
 carp:
-	dev_warn(connector->dev->dev, "%s: EDID block %d invalid.\n",
-		 drm_get_connector_name(connector), j);
+	if (print_bad_edid) {
+		dev_warn(connector->dev->dev, "%s: EDID block %d invalid.\n",
+			 drm_get_connector_name(connector), j);
+	}
+	connector->bad_edid_counter++;
 
 out:
 	kfree(block);
diff --git a/drivers/gpu/drm/drm_edid_load.c b/drivers/gpu/drm/drm_edid_load.c
index 66d4a28..14f46dd 100644
--- a/drivers/gpu/drm/drm_edid_load.c
+++ b/drivers/gpu/drm/drm_edid_load.c
@@ -123,6 +123,7 @@  static int edid_load(struct drm_connector *connector, char *name,
 	int fwsize, expected;
 	int builtin = 0, err = 0;
 	int i, valid_extensions = 0;
+	bool print_bad_edid = !connector->bad_edid_counter || (drm_debug & DRM_UT_KMS);
 
 	pdev = platform_device_register_simple(connector_name, -1, NULL, 0);
 	if (IS_ERR(pdev)) {
@@ -173,7 +174,8 @@  static int edid_load(struct drm_connector *connector, char *name,
 	}
 	memcpy(edid, fwdata, fwsize);
 
-	if (!drm_edid_block_valid(edid, 0)) {
+	if (!drm_edid_block_valid(edid, 0, print_bad_edid)) {
+		connector->bad_edid_counter++;
 		DRM_ERROR("Base block of EDID firmware \"%s\" is invalid ",
 		    name);
 		kfree(edid);
@@ -185,7 +187,7 @@  static int edid_load(struct drm_connector *connector, char *name,
 		if (i != valid_extensions + 1)
 			memcpy(edid + (valid_extensions + 1) * EDID_LENGTH,
 			    edid + i * EDID_LENGTH, EDID_LENGTH);
-		if (drm_edid_block_valid(edid + i * EDID_LENGTH, i))
+		if (drm_edid_block_valid(edid + i * EDID_LENGTH, i, print_bad_edid))
 			valid_extensions++;
 	}
 
diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
index bac55c2..1a2987b 100644
--- a/include/drm/drm_crtc.h
+++ b/include/drm/drm_crtc.h
@@ -594,6 +594,7 @@  struct drm_connector {
 	int video_latency[2];	/* [0]: progressive, [1]: interlaced */
 	int audio_latency[2];
 	int null_edid_counter; /* needed to workaround some HW bugs where we get all 0s */
+	unsigned bad_edid_counter;
 };
 
 /**
@@ -1038,7 +1039,7 @@  extern int drm_add_modes_noedid(struct drm_connector *connector,
 				int hdisplay, int vdisplay);
 
 extern int drm_edid_header_is_valid(const u8 *raw_edid);
-extern bool drm_edid_block_valid(u8 *raw_edid, int block);
+extern bool drm_edid_block_valid(u8 *raw_edid, int block, bool print_bad_edid);
 extern bool drm_edid_is_valid(struct edid *edid);
 struct drm_display_mode *drm_mode_find_dmt(struct drm_device *dev,
 					   int hsize, int vsize, int fresh,
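
For readers unfamiliar with the check that produces the "EDID checksum
is invalid" message: the loop in drm_edid_block_valid() adds up every
byte of the block in a u8 and treats a non-zero result as an error. A
standalone sketch of that arithmetic follows; the helper names and
EDID_BLOCK_LEN are made up for illustration and are not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EDID_BLOCK_LEN 128 /* same value as the kernel's EDID_LENGTH */

/* Sum of all bytes in the block, truncated to 8 bits; 0 means valid. */
static uint8_t edid_block_remainder(const uint8_t *block)
{
	uint8_t csum = 0;
	size_t i;

	assert(block != NULL);
	for (i = 0; i < EDID_BLOCK_LEN; i++)
		csum += block[i];
	return csum;
}

/* Set the last byte so that the whole block sums to 0 modulo 256. */
static void edid_block_fix_checksum(uint8_t *block)
{
	block[EDID_BLOCK_LEN - 1] = 0;
	block[EDID_BLOCK_LEN - 1] =
		(uint8_t)(0x100 - edid_block_remainder(block));
}
```

A single flipped bit anywhere in the block leaves a non-zero remainder,
which is the value the DRM_ERROR() message reports.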