WARNING: CPU: 0 PID: 3634 at drivers/gpu/drm/drm_irq.c:1141 drm_wait_one_vblank
diff mbox

Message ID 20150630115707.GA4578@dhcp22.suse.cz
State New
Headers show

Commit Message

Michal Hocko June 30, 2015, 11:57 a.m. UTC
Hi,
I am getting the following warning when I switch to the text console
from X. I do not know when this has started because I have noticed
that only now (in 4.1 kernel). I can try some older kernels if this is
useful.

I have tried to instrument drm_wait_one_vblank and dump drm_vblank_count
before the wait_event and the value when it returns (see the diff below)
and it seems to be increasing with new switching to the text console:
$ dmesg | grep XXX
[    6.531908] XXX: last:36
[    6.545852] XXX: drm_vblank_count:37
[    9.038658] XXX: last:186
[    9.051332] XXX: drm_vblank_count:187
[    9.051424] XXX: last:187
[    9.068036] XXX: drm_vblank_count:188
[   16.962193] XXX: last:660
[   16.968683] XXX: drm_vblank_count:661
[   30.592874] XXX: last:1476
[   30.598656] XXX: drm_vblank_count:1477
[   30.598768] XXX: last:1477
[   30.615339] XXX: drm_vblank_count:1478
[   37.313338] XXX: last:1879
[   37.330102] XXX: drm_vblank_count:1880
[   39.726809] XXX: last:2023
[   39.735265] XXX: drm_vblank_count:2024
[   39.735375] XXX: last:2024
[   39.752094] XXX: drm_vblank_count:2025
[   60.842401] XXX: last:3287
[   60.848437] XXX: drm_vblank_count:3288
[   62.149546] XXX: last:3365
[   62.151277] XXX: drm_vblank_count:3366
[   62.151411] XXX: last:3366
[   62.249376] XXX: drm_vblank_count:3366
[   92.198305] XXX: last:5160
[   92.297091] XXX: drm_vblank_count:5160
[   93.822331] XXX: last:5253
[   93.922153] XXX: drm_vblank_count:5253
[   93.922424] XXX: last:5254
[   94.022213] XXX: drm_vblank_count:5254
[  100.877802] XXX: last:5665
[  100.974740] XXX: drm_vblank_count:5665

I am sorry about the lack of information in this report but I am really
not sure what might be helpful. I can only tell that I haven't observed
anything wrong going on after the warning so it might be harmless.

Let me know what kind of information might be helpful.

[ 7322.444556] ------------[ cut here ]------------
[ 7322.444569] WARNING: CPU: 0 PID: 3634 at drivers/gpu/drm/drm_irq.c:1141 drm_wait_one_vblank+0x144/0x16f [drm]()
[ 7322.444570] vblank wait timed out on crtc 0
[ 7322.444586] Modules linked in: i915 fbcon cfbfillrect bitblit softcursor cfbimgblt font i2c_algo_bit cfbcopyarea drm_kms_helper drm fb fbdev binfmt_misc snd_hda_codec_hdmi uvcvideo videobuf2_vmalloc videobuf2_memops snd_hda_codec_idt i2c_i801 snd_hda_codec_generic snd_hda_intel snd_hda_controller videobuf2_core arc4 snd_hda_codec snd_hda_core snd_pcm_oss v4l2_common snd_mixer_oss videodev iwldvm media i2c_core mac80211 video iwlwifi backlight snd_pcm sdhci_pci sdhci mmc_core cfg80211 snd_timer snd
[ 7322.444588] CPU: 0 PID: 3634 Comm: Xorg Not tainted 4.1.0 #587
[ 7322.444589] Hardware name: Dell Inc. Latitude E6320/09PHH9, BIOS A08 10/18/2011
[ 7322.444591]  0000000000000009 ffff8800c5bb7888 ffffffff8151b709 0000000080000000
[ 7322.444592]  ffff8800c5bb78d8 ffff8800c5bb78c8 ffffffff8104550a ffff8800c5bb7958
[ 7322.444593]  ffffffffa02d414e 0000000000000000 0000000000000000 ffff8800c58d7000
[ 7322.444594] Call Trace:
[ 7322.444599]  [<ffffffff8151b709>] dump_stack+0x4f/0x7b
[ 7322.444601]  [<ffffffff8104550a>] warn_slowpath_common+0xa1/0xbb
[ 7322.444608]  [<ffffffffa02d414e>] ? drm_wait_one_vblank+0x144/0x16f [drm]
[ 7322.444609]  [<ffffffff8104556a>] warn_slowpath_fmt+0x46/0x48
[ 7322.444611]  [<ffffffff81075b7d>] ? finish_wait+0x59/0x62
[ 7322.444617]  [<ffffffffa02d414e>] drm_wait_one_vblank+0x144/0x16f [drm]
[ 7322.444618]  [<ffffffff81075c9d>] ? wait_woken+0x76/0x76
[ 7322.444624]  [<ffffffffa02d4196>] drm_crtc_wait_one_vblank+0x1d/0x21 [drm]
[ 7322.444628]  [<ffffffffa0331baa>] drm_plane_helper_commit+0x1b3/0x240 [drm_kms_helper]
[ 7322.444631]  [<ffffffffa0331cfe>] drm_plane_helper_update+0xc7/0xd6 [drm_kms_helper]
[ 7322.444652]  [<ffffffffa03cd9a5>] intel_crtc_set_config+0x95e/0xc4f [i915]
[ 7322.444661]  [<ffffffffa02db020>] drm_mode_set_config_internal+0x5c/0xe8 [drm]
[ 7322.444666]  [<ffffffffa0338ae3>] drm_fb_helper_pan_display+0xa2/0xd8 [drm_kms_helper]
[ 7322.444668]  [<ffffffffa02be312>] fb_pan_display+0xee/0x131 [fb]
[ 7322.444670]  [<ffffffffa0353254>] bit_update_start+0x20/0x43 [bitblit]
[ 7322.444672]  [<ffffffffa035fa24>] fbcon_switch+0x3b7/0x438 [fbcon]
[ 7322.444674]  [<ffffffff812f7f0d>] redraw_screen+0x112/0x1e3
[ 7322.444676]  [<ffffffff812f0711>] complete_change_console+0x3e/0xc7
[ 7322.444678]  [<ffffffff812f1684>] vt_ioctl+0xeea/0x117f
[ 7322.444680]  [<ffffffff812e7538>] tty_ioctl+0xa01/0xa74
[ 7322.444682]  [<ffffffff81065154>] ? preempt_count_sub+0xc6/0xd3
[ 7322.444684]  [<ffffffff81156204>] do_vfs_ioctl+0x377/0x425
[ 7322.444685]  [<ffffffff8115e2a9>] ? __fget+0x70/0x7b
[ 7322.444686]  [<ffffffff811562f6>] SyS_ioctl+0x44/0x63
[ 7322.444688]  [<ffffffff81520197>] system_call_fastpath+0x12/0x6a
[ 7322.444689] ---[ end trace 9d3b554e7f553db3 ]---
[ 7326.066994] ------------[ cut here ]------------

Debugging patch
--

Comments

Daniel Vetter June 30, 2015, 2:32 p.m. UTC | #1
On Tue, Jun 30, 2015 at 01:57:07PM +0200, Michal Hocko wrote:
> Hi,
> I am getting the following warning when I switch to the text console
> from X. I do not know when this has started because I have noticed
> that only now (in 4.1 kernel). I can try some older kernels if this is
> useful.
> 
> I have tried to instrument drm_wait_one_vblank and dump drm_vblank_count
> before the wait_event and the value when it returns (see the diff below)
> and it seems to be increasing with new switching to the text console:
> $ dmesg | grep XXX
> [    6.531908] XXX: last:36
> [    6.545852] XXX: drm_vblank_count:37
> [    9.038658] XXX: last:186
> [    9.051332] XXX: drm_vblank_count:187
> [    9.051424] XXX: last:187
> [    9.068036] XXX: drm_vblank_count:188
> [   16.962193] XXX: last:660
> [   16.968683] XXX: drm_vblank_count:661
> [   30.592874] XXX: last:1476
> [   30.598656] XXX: drm_vblank_count:1477
> [   30.598768] XXX: last:1477
> [   30.615339] XXX: drm_vblank_count:1478
> [   37.313338] XXX: last:1879
> [   37.330102] XXX: drm_vblank_count:1880
> [   39.726809] XXX: last:2023
> [   39.735265] XXX: drm_vblank_count:2024
> [   39.735375] XXX: last:2024
> [   39.752094] XXX: drm_vblank_count:2025
> [   60.842401] XXX: last:3287
> [   60.848437] XXX: drm_vblank_count:3288
> [   62.149546] XXX: last:3365
> [   62.151277] XXX: drm_vblank_count:3366
> [   62.151411] XXX: last:3366
> [   62.249376] XXX: drm_vblank_count:3366
> [   92.198305] XXX: last:5160
> [   92.297091] XXX: drm_vblank_count:5160
> [   93.822331] XXX: last:5253
> [   93.922153] XXX: drm_vblank_count:5253
> [   93.922424] XXX: last:5254
> [   94.022213] XXX: drm_vblank_count:5254
> [  100.877802] XXX: last:5665
> [  100.974740] XXX: drm_vblank_count:5665
> 
> I am sorry about the lack of information in this report but I am really
> not sure what might be helpful. I can only tell that I haven't observed
> anything wrong going on after the warning so it might be harmless.
> 
> Let me know what kind of information might be helpful.

Looks like the vblank is actually running, just the wakeup somehow doesn't
happen in time. What machine is this (lspci -nn)? Also please boot with
drm.debug=0xe and grab dmesg after this happens for the first time.
logfile will be large, you might need to increase the buffer size with
log_buf_len or grab
it from the hd.

Also what happens when you increase the timeout to 1000 (just to make sure
it's not just a silly delay somewhere). Also have you any special features
like psr, fbc or something similar enabled?

Really surprising since we have all the checks in place that should ensure
that the vblank stuff is indeed up&running.
-Daniel

> 
> [ 7322.444556] ------------[ cut here ]------------
> [ 7322.444569] WARNING: CPU: 0 PID: 3634 at drivers/gpu/drm/drm_irq.c:1141 drm_wait_one_vblank+0x144/0x16f [drm]()
> [ 7322.444570] vblank wait timed out on crtc 0
> [ 7322.444586] Modules linked in: i915 fbcon cfbfillrect bitblit softcursor cfbimgblt font i2c_algo_bit cfbcopyarea drm_kms_helper drm fb fbdev binfmt_misc snd_hda_codec_hdmi uvcvideo videobuf2_vmalloc videobuf2_memops snd_hda_codec_idt i2c_i801 snd_hda_codec_generic snd_hda_intel snd_hda_controller videobuf2_core arc4 snd_hda_codec snd_hda_core snd_pcm_oss v4l2_common snd_mixer_oss videodev iwldvm media i2c_core mac80211 video iwlwifi backlight snd_pcm sdhci_pci sdhci mmc_core cfg80211 snd_timer snd
> [ 7322.444588] CPU: 0 PID: 3634 Comm: Xorg Not tainted 4.1.0 #587
> [ 7322.444589] Hardware name: Dell Inc. Latitude E6320/09PHH9, BIOS A08 10/18/2011
> [ 7322.444591]  0000000000000009 ffff8800c5bb7888 ffffffff8151b709 0000000080000000
> [ 7322.444592]  ffff8800c5bb78d8 ffff8800c5bb78c8 ffffffff8104550a ffff8800c5bb7958
> [ 7322.444593]  ffffffffa02d414e 0000000000000000 0000000000000000 ffff8800c58d7000
> [ 7322.444594] Call Trace:
> [ 7322.444599]  [<ffffffff8151b709>] dump_stack+0x4f/0x7b
> [ 7322.444601]  [<ffffffff8104550a>] warn_slowpath_common+0xa1/0xbb
> [ 7322.444608]  [<ffffffffa02d414e>] ? drm_wait_one_vblank+0x144/0x16f [drm]
> [ 7322.444609]  [<ffffffff8104556a>] warn_slowpath_fmt+0x46/0x48
> [ 7322.444611]  [<ffffffff81075b7d>] ? finish_wait+0x59/0x62
> [ 7322.444617]  [<ffffffffa02d414e>] drm_wait_one_vblank+0x144/0x16f [drm]
> [ 7322.444618]  [<ffffffff81075c9d>] ? wait_woken+0x76/0x76
> [ 7322.444624]  [<ffffffffa02d4196>] drm_crtc_wait_one_vblank+0x1d/0x21 [drm]
> [ 7322.444628]  [<ffffffffa0331baa>] drm_plane_helper_commit+0x1b3/0x240 [drm_kms_helper]
> [ 7322.444631]  [<ffffffffa0331cfe>] drm_plane_helper_update+0xc7/0xd6 [drm_kms_helper]
> [ 7322.444652]  [<ffffffffa03cd9a5>] intel_crtc_set_config+0x95e/0xc4f [i915]
> [ 7322.444661]  [<ffffffffa02db020>] drm_mode_set_config_internal+0x5c/0xe8 [drm]
> [ 7322.444666]  [<ffffffffa0338ae3>] drm_fb_helper_pan_display+0xa2/0xd8 [drm_kms_helper]
> [ 7322.444668]  [<ffffffffa02be312>] fb_pan_display+0xee/0x131 [fb]
> [ 7322.444670]  [<ffffffffa0353254>] bit_update_start+0x20/0x43 [bitblit]
> [ 7322.444672]  [<ffffffffa035fa24>] fbcon_switch+0x3b7/0x438 [fbcon]
> [ 7322.444674]  [<ffffffff812f7f0d>] redraw_screen+0x112/0x1e3
> [ 7322.444676]  [<ffffffff812f0711>] complete_change_console+0x3e/0xc7
> [ 7322.444678]  [<ffffffff812f1684>] vt_ioctl+0xeea/0x117f
> [ 7322.444680]  [<ffffffff812e7538>] tty_ioctl+0xa01/0xa74
> [ 7322.444682]  [<ffffffff81065154>] ? preempt_count_sub+0xc6/0xd3
> [ 7322.444684]  [<ffffffff81156204>] do_vfs_ioctl+0x377/0x425
> [ 7322.444685]  [<ffffffff8115e2a9>] ? __fget+0x70/0x7b
> [ 7322.444686]  [<ffffffff811562f6>] SyS_ioctl+0x44/0x63
> [ 7322.444688]  [<ffffffff81520197>] system_call_fastpath+0x12/0x6a
> [ 7322.444689] ---[ end trace 9d3b554e7f553db3 ]---
> [ 7326.066994] ------------[ cut here ]------------
> 
> Debugging patch
> --
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index af9662e58272..446f4ce02a9d 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -1126,17 +1126,19 @@ EXPORT_SYMBOL(drm_crtc_vblank_put);
>  void drm_wait_one_vblank(struct drm_device *dev, int crtc)
>  {
>  	int ret;
> -	u32 last;
> +	u32 last, l;
>  
>  	ret = drm_vblank_get(dev, crtc);
>  	if (WARN(ret, "vblank not available on crtc %i, ret=%i\n", crtc, ret))
>  		return;
>  
>  	last = drm_vblank_count(dev, crtc);
> +	pr_info("XXX: last:%u\n", last);
>  
>  	ret = wait_event_timeout(dev->vblank[crtc].queue,
> -				 last != drm_vblank_count(dev, crtc),
> +				 last != (l = drm_vblank_count(dev, crtc)),
>  				 msecs_to_jiffies(100));
> +	pr_info("XXX: drm_vblank_count:%u\n", l);
>  
>  	WARN(ret == 0, "vblank wait timed out on crtc %i\n", crtc);
>  
> -- 
> Michal Hocko
> SUSE Labs
Michal Hocko June 30, 2015, 4:07 p.m. UTC | #2
On Tue 30-06-15 16:32:49, Daniel Vetter wrote:
[...]
> Looks like the vblank is actually running, just the wakeup somehow doesn't
> happen in time. What machine is this (lspci -nn)?

00:00.0 Host bridge [0600]: Intel Corporation 2nd Generation Core Processor Family DRAM Controller [8086:0104] (rev 09)
00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0126] (rev 09)
00:16.0 Communication controller [0780]: Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 [8086:1c3a] (rev 04)
00:16.3 Serial controller [0700]: Intel Corporation 6 Series/C200 Series Chipset Family KT Controller [8086:1c3d] (rev 04)
00:19.0 Ethernet controller [0200]: Intel Corporation 82579LM Gigabit Network Connection [8086:1502] (rev 04)
00:1a.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 [8086:1c2d] (rev 04)
00:1b.0 Audio device [0403]: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller [8086:1c20] (rev 04)
00:1c.0 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 [8086:1c10] (rev b4)
00:1c.1 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 2 [8086:1c12] (rev b4)
00:1c.2 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 3 [8086:1c14] (rev b4)
00:1c.3 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 4 [8086:1c16] (rev b4)
00:1c.5 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 6 [8086:1c1a] (rev b4)
00:1d.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 [8086:1c26] (rev 04)
00:1f.0 ISA bridge [0601]: Intel Corporation QM67 Express Chipset Family LPC Controller [8086:1c4f] (rev 04)
00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller [8086:1c03] (rev 04)
00:1f.3 SMBus [0c05]: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller [8086:1c22] (rev 04)
02:00.0 Network controller [0280]: Intel Corporation Centrino Ultimate-N 6300 [8086:422b] (rev 35)
0a:00.0 SD Host controller [0805]: O2 Micro, Inc. OZ600FJ0/OZ900FJ0/OZ600FJS SD/MMC Card Reader Controller [1217:8221] (rev 05)

> Also please boot with
> drm.debug=0xe and grab dmesg after this happens for the first time.
> logfile will be large, you might need to increase the buffer size with
> log_buf_len or grab it from the hd.

See attached.
 
> Also what happens when you increase the timeout to 1000 (just to make sure
> it's not just a silly delay somewhere).

Still happening even when the timeout was increased to 1000. Curiously
the display is much darker with this change and I cannot make it
brighter by function keys...

> Also have you any special features like psr, fbc or something similar
> enabled?

I am not aware of anything like that. How do I check it?

Patch
diff mbox

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index af9662e58272..446f4ce02a9d 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -1126,17 +1126,19 @@  EXPORT_SYMBOL(drm_crtc_vblank_put);
 void drm_wait_one_vblank(struct drm_device *dev, int crtc)
 {
 	int ret;
-	u32 last;
+	u32 last, l;
 
 	ret = drm_vblank_get(dev, crtc);
 	if (WARN(ret, "vblank not available on crtc %i, ret=%i\n", crtc, ret))
 		return;
 
 	last = drm_vblank_count(dev, crtc);
+	pr_info("XXX: last:%u\n", last);
 
 	ret = wait_event_timeout(dev->vblank[crtc].queue,
-				 last != drm_vblank_count(dev, crtc),
+				 last != (l = drm_vblank_count(dev, crtc)),
 				 msecs_to_jiffies(100));
+	pr_info("XXX: drm_vblank_count:%u\n", l);
 
 	WARN(ret == 0, "vblank wait timed out on crtc %i\n", crtc);