Message ID | 20240718104551.575912-1-tzimmermann@suse.de (mailing list archive) |
---|---|
Series | drm/mgag200: Implement VBLANK support |
My system threw out a bunch of stack traces while booting v6.12-rc1 and hung. First of these looks like this: [ 33.639799] fbcon: mgag200drmfb (fb0) is primary device [ 33.651573] ixgbe 0000:03:00.0: Intel(R) 10 Gigabit Network Connection [ 33.652092] ixgbe 0000:03:00.1: enabling device (0100 -> 0102) [ 33.818328] ------------[ cut here ]------------ [ 33.818362] [CRTC:34:crtc-0] vblank wait timed out [ 33.818422] WARNING: CPU: 44 PID: 1815 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x245/0x250 [drm_kms_helper] [ 33.818447] Modules linked in: crct10dif_pclmul mgag200(+) crc32_pclmul i2c_algo_bit crc32c_intel drm_shmem_helper ghash_clmulni_intel sha512_ssse3 drm_kms_helper sha256_ssse3 sha1_ssse3 ixgbe(+) mpt3sas mdio drm raid_class dca scsi_transport_sas wmi fuse [ 33.818475] CPU: 44 PID: 1815 Comm: systemd-udevd Not tainted 6.10.0-rc1+ #168 [ 33.818478] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016 [ 33.818481] RIP: 0010:drm_atomic_helper_wait_for_vblanks.part.0+0x245/0x250 [drm_kms_helper] [ 33.818490] Code: 00 48 8d 7b 08 e8 2b 7e 61 da 45 85 ff 0f 85 d3 fe ff ff 49 8b 56 20 41 8b b6 d8 00 00 00 48 c7 c7 38 52 ba c0 e8 8b 53 59 da <0f> 0b e9 b5 fe ff ff 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 [ 33.818493] RSP: 0018:ffffbf61a3faf690 EFLAGS: 00010282 [ 33.818496] RAX: 0000000000000026 RBX: ffff99be04ad3028 RCX: 0000000000000000 [ 33.818499] RDX: 0000000000000002 RSI: ffffffff9c9fd7c8 RDI: 00000000ffffffff [ 33.818501] RBP: ffff99be08a76c00 R08: 0000000000000001 R09: 0000000000000000 [ 33.818503] R10: 0000000000000001 R11: ffff99f1011fffe8 R12: 0000000000000000 [ 33.818504] R13: 0000000000000000 R14: ffff99be0bcf93f8 R15: 0000000000000000 [ 33.818506] FS: 00007fbe18e7db40(0000) GS:ffff99ca61c00000(0000) knlGS:0000000000000000 [ 33.818509] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 33.818510] CR2: 000055b77636c1f8 CR3: 000000000e486004 CR4: 
00000000003706f0 [ 33.818513] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 33.818514] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 33.818516] Call Trace: [ 33.818519] <TASK> [ 33.818521] ? __warn+0x8b/0x190 [ 33.818535] ? drm_atomic_helper_wait_for_vblanks.part.0+0x245/0x250 [drm_kms_helper] [ 33.818545] ? report_bug+0x1c3/0x1d0 [ 33.818559] ? handle_bug+0x42/0x70 [ 33.818571] ? exc_invalid_op+0x14/0x70 [ 33.818575] ? asm_exc_invalid_op+0x16/0x20 [ 33.818589] ? drm_atomic_helper_wait_for_vblanks.part.0+0x245/0x250 [drm_kms_helper] [ 33.818602] ? __pfx_autoremove_wake_function+0x10/0x10 [ 33.818614] drm_atomic_helper_commit_tail+0x71/0x80 [drm_kms_helper] [ 33.818625] mgag200_mode_config_helper_atomic_commit_tail+0x28/0x40 [mgag200] [ 33.818633] commit_tail+0x94/0x130 [drm_kms_helper] [ 33.818644] drm_atomic_helper_commit+0x13e/0x170 [drm_kms_helper] [ 33.818655] drm_atomic_commit+0x97/0xb0 [drm] [ 33.818717] ? __pfx___drm_printfn_info+0x10/0x10 [drm] [ 33.818750] drm_client_modeset_commit_atomic+0x207/0x250 [drm] [ 33.818783] drm_client_modeset_commit_locked+0x5b/0x190 [drm] [ 33.818807] drm_client_modeset_commit+0x24/0x50 [drm] [ 33.818829] __drm_fb_helper_restore_fbdev_mode_unlocked+0x92/0xc0 [drm_kms_helper] [ 33.818841] drm_fb_helper_set_par+0x2e/0x40 [drm_kms_helper] [ 33.818850] fbcon_init+0x2a8/0x560 [ 33.818860] visual_init+0xc4/0x120 [ 33.818867] do_bind_con_driver.isra.0+0x1a1/0x3d0 [ 33.818875] do_take_over_console+0x10b/0x1a0 [ 33.818880] do_fbcon_takeover+0x5c/0xc0 [ 33.818883] fbcon_fb_registered+0x49/0x70 [ 33.818886] register_framebuffer+0x190/0x250 [ 33.818896] __drm_fb_helper_initial_config_and_unlock+0x345/0x590 [drm_kms_helper] [ 33.818906] ? 
drm_client_register+0x33/0xc0 [drm] [ 33.818934] drm_fbdev_shmem_client_hotplug+0x6c/0xc0 [drm_shmem_helper] [ 33.818939] drm_client_register+0x7b/0xc0 [drm] [ 33.818963] mgag200_pci_probe+0x90/0x180 [mgag200] [ 33.818970] local_pci_probe+0x46/0xa0 [ 33.818978] pci_device_probe+0xb5/0x230 [ 33.818986] really_probe+0xd9/0x380 [ 33.818993] __driver_probe_device+0x78/0x150 [ 33.818997] driver_probe_device+0x1e/0x90 [ 33.819000] __driver_attach+0xd6/0x1d0 [ 33.819003] ? __pfx___driver_attach+0x10/0x10 [ 33.819005] bus_for_each_dev+0x66/0xa0 [ 33.819012] bus_add_driver+0x111/0x240 [ 33.819018] driver_register+0x5c/0x120 [ 33.819021] ? __pfx_mgag200_pci_driver_init+0x10/0x10 [mgag200] [ 33.819026] do_one_initcall+0x62/0x3a0 [ 33.819035] ? kmalloc_trace_noprof+0x2a0/0x340 [ 33.819048] do_init_module+0x64/0x240 [ 33.819058] init_module_from_file+0x7a/0xa0 [ 33.819072] idempotent_init_module+0x15a/0x210 [ 33.819079] ? __startup_64+0x70/0x3f0 [ 33.819086] __x64_sys_finit_module+0x5a/0xb0 [ 33.819092] do_syscall_64+0x73/0x190 [ 33.819098] entry_SYSCALL_64_after_hwframe+0x76/0x7e I peeked at changes to this driver between v6.11 and v6.12-rc1 and saw this set: $ git log --oneline linus ^v6.11 -- drivers/gpu/drm/mgag200 219b45d023ed drm/mgag200: Remove BMC output 0f9ff361ad82 drm/mgag200: vga-bmc: Control BMC scanout from encoder 9d09cac47de5 drm/mgag200: vga-bmc: Control CRTC VIDRST flag from encoder dc06efbb7934 drm/mgag200: vga-bmc: Transparently handle BMC f5510726608f drm/mgag200: Add VGA-BMC output 6c9e14ee9f51 drm/mgag200: Fix VBLANK interrupt handling d5070c9b2944 drm/mgag200: Implement struct drm_crtc_funcs.get_vblank_timestamp 89c6ea2006e2 drm/mgag200: Add vblank support 5cd522b5331b drm/mgag200: Add dedicted variable for <linecomp> field d6460bd52c27 drm/mgag200: Add dedicated variables for blanking fields e8f834b55962 drm/mgag200: Use adjusted mode values for CRTCs b345b3542d66 drm/mgag200: Align register field names with documentation 754c9129b949 drm/mgag200: Use 
hexadecimal register indeces
3ac9384061b2 drm/mgag200: Rename BMC vidrst names
7bb97cf91588 drm/mgag200: Remove vidrst callbacks from struct mgag200_device_funcs
cd3a2e8b0a03 drm/mgag200: Only set VIDRST bits in CRTC modesetting

I tried a mini-bisect across these changes and found the system boots normally with:

5cd522b5331b drm/mgag200: Add dedicted variable for <linecomp> field

and fails with:

89c6ea2006e2 drm/mgag200: Add vblank support

I do see that there is a subsequent "Fix VBLANK" patch, but it appears that whatever it fixed didn't help on my system.

-Tony
Hi Am 02.10.24 um 00:41 schrieb Tony Luck: > My system threw out a bunch of stack traces while booting > v6.12-rc1 and hung. Thanks for the bug report. Can you provide the output of 'sudo lspci -vvv' for the graphics device? Best regards Thomas > > First of these looks like this: > > [ 33.639799] fbcon: mgag200drmfb (fb0) is primary device > [ 33.651573] ixgbe 0000:03:00.0: Intel(R) 10 Gigabit Network Connection > [ 33.652092] ixgbe 0000:03:00.1: enabling device (0100 -> 0102) > [ 33.818328] ------------[ cut here ]------------ > [ 33.818362] [CRTC:34:crtc-0] vblank wait timed out > [ 33.818422] WARNING: CPU: 44 PID: 1815 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x245/0x250 [drm_kms_helper] > [ 33.818447] Modules linked in: crct10dif_pclmul mgag200(+) crc32_pclmul i2c_algo_bit crc32c_intel drm_shmem_helper ghash_clmulni_intel sha512_ssse3 drm_kms_helper sha256_ssse3 sha1_ssse3 ixgbe(+) mpt3sas mdio drm raid_class dca scsi_transport_sas wmi fuse > [ 33.818475] CPU: 44 PID: 1815 Comm: systemd-udevd Not tainted 6.10.0-rc1+ #168 > [ 33.818478] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016 > [ 33.818481] RIP: 0010:drm_atomic_helper_wait_for_vblanks.part.0+0x245/0x250 [drm_kms_helper] > [ 33.818490] Code: 00 48 8d 7b 08 e8 2b 7e 61 da 45 85 ff 0f 85 d3 fe ff ff 49 8b 56 20 41 8b b6 d8 00 00 00 48 c7 c7 38 52 ba c0 e8 8b 53 59 da <0f> 0b e9 b5 fe ff ff 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 > [ 33.818493] RSP: 0018:ffffbf61a3faf690 EFLAGS: 00010282 > [ 33.818496] RAX: 0000000000000026 RBX: ffff99be04ad3028 RCX: 0000000000000000 > [ 33.818499] RDX: 0000000000000002 RSI: ffffffff9c9fd7c8 RDI: 00000000ffffffff > [ 33.818501] RBP: ffff99be08a76c00 R08: 0000000000000001 R09: 0000000000000000 > [ 33.818503] R10: 0000000000000001 R11: ffff99f1011fffe8 R12: 0000000000000000 > [ 33.818504] R13: 0000000000000000 R14: ffff99be0bcf93f8 R15: 0000000000000000 > [ 33.818506] FS: 
00007fbe18e7db40(0000) GS:ffff99ca61c00000(0000) knlGS:0000000000000000 > [ 33.818509] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 33.818510] CR2: 000055b77636c1f8 CR3: 000000000e486004 CR4: 00000000003706f0 > [ 33.818513] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 33.818514] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 33.818516] Call Trace: > [ 33.818519] <TASK> > [ 33.818521] ? __warn+0x8b/0x190 > [ 33.818535] ? drm_atomic_helper_wait_for_vblanks.part.0+0x245/0x250 [drm_kms_helper] > [ 33.818545] ? report_bug+0x1c3/0x1d0 > [ 33.818559] ? handle_bug+0x42/0x70 > [ 33.818571] ? exc_invalid_op+0x14/0x70 > [ 33.818575] ? asm_exc_invalid_op+0x16/0x20 > [ 33.818589] ? drm_atomic_helper_wait_for_vblanks.part.0+0x245/0x250 [drm_kms_helper] > [ 33.818602] ? __pfx_autoremove_wake_function+0x10/0x10 > [ 33.818614] drm_atomic_helper_commit_tail+0x71/0x80 [drm_kms_helper] > [ 33.818625] mgag200_mode_config_helper_atomic_commit_tail+0x28/0x40 [mgag200] > [ 33.818633] commit_tail+0x94/0x130 [drm_kms_helper] > [ 33.818644] drm_atomic_helper_commit+0x13e/0x170 [drm_kms_helper] > [ 33.818655] drm_atomic_commit+0x97/0xb0 [drm] > [ 33.818717] ? 
__pfx___drm_printfn_info+0x10/0x10 [drm] > [ 33.818750] drm_client_modeset_commit_atomic+0x207/0x250 [drm] > [ 33.818783] drm_client_modeset_commit_locked+0x5b/0x190 [drm] > [ 33.818807] drm_client_modeset_commit+0x24/0x50 [drm] > [ 33.818829] __drm_fb_helper_restore_fbdev_mode_unlocked+0x92/0xc0 [drm_kms_helper] > [ 33.818841] drm_fb_helper_set_par+0x2e/0x40 [drm_kms_helper] > [ 33.818850] fbcon_init+0x2a8/0x560 > [ 33.818860] visual_init+0xc4/0x120 > [ 33.818867] do_bind_con_driver.isra.0+0x1a1/0x3d0 > [ 33.818875] do_take_over_console+0x10b/0x1a0 > [ 33.818880] do_fbcon_takeover+0x5c/0xc0 > [ 33.818883] fbcon_fb_registered+0x49/0x70 > [ 33.818886] register_framebuffer+0x190/0x250 > [ 33.818896] __drm_fb_helper_initial_config_and_unlock+0x345/0x590 [drm_kms_helper] > [ 33.818906] ? drm_client_register+0x33/0xc0 [drm] > [ 33.818934] drm_fbdev_shmem_client_hotplug+0x6c/0xc0 [drm_shmem_helper] > [ 33.818939] drm_client_register+0x7b/0xc0 [drm] > [ 33.818963] mgag200_pci_probe+0x90/0x180 [mgag200] > [ 33.818970] local_pci_probe+0x46/0xa0 > [ 33.818978] pci_device_probe+0xb5/0x230 > [ 33.818986] really_probe+0xd9/0x380 > [ 33.818993] __driver_probe_device+0x78/0x150 > [ 33.818997] driver_probe_device+0x1e/0x90 > [ 33.819000] __driver_attach+0xd6/0x1d0 > [ 33.819003] ? __pfx___driver_attach+0x10/0x10 > [ 33.819005] bus_for_each_dev+0x66/0xa0 > [ 33.819012] bus_add_driver+0x111/0x240 > [ 33.819018] driver_register+0x5c/0x120 > [ 33.819021] ? __pfx_mgag200_pci_driver_init+0x10/0x10 [mgag200] > [ 33.819026] do_one_initcall+0x62/0x3a0 > [ 33.819035] ? kmalloc_trace_noprof+0x2a0/0x340 > [ 33.819048] do_init_module+0x64/0x240 > [ 33.819058] init_module_from_file+0x7a/0xa0 > [ 33.819072] idempotent_init_module+0x15a/0x210 > [ 33.819079] ? 
__startup_64+0x70/0x3f0 > [ 33.819086] __x64_sys_finit_module+0x5a/0xb0 > [ 33.819092] do_syscall_64+0x73/0x190 > [ 33.819098] entry_SYSCALL_64_after_hwframe+0x76/0x7e > > I peeked at changes to this driver between v6.11 and v6.12-rc1 and saw > this set: > > $ git log --oneline linus ^v6.11 -- drivers/gpu/drm/mgag200 > 219b45d023ed drm/mgag200: Remove BMC output > 0f9ff361ad82 drm/mgag200: vga-bmc: Control BMC scanout from encoder > 9d09cac47de5 drm/mgag200: vga-bmc: Control CRTC VIDRST flag from encoder > dc06efbb7934 drm/mgag200: vga-bmc: Transparently handle BMC > f5510726608f drm/mgag200: Add VGA-BMC output > 6c9e14ee9f51 drm/mgag200: Fix VBLANK interrupt handling > d5070c9b2944 drm/mgag200: Implement struct drm_crtc_funcs.get_vblank_timestamp > 89c6ea2006e2 drm/mgag200: Add vblank support > 5cd522b5331b drm/mgag200: Add dedicted variable for <linecomp> field > d6460bd52c27 drm/mgag200: Add dedicated variables for blanking fields > e8f834b55962 drm/mgag200: Use adjusted mode values for CRTCs > b345b3542d66 drm/mgag200: Align register field names with documentation > 754c9129b949 drm/mgag200: Use hexadecimal register indeces > 3ac9384061b2 drm/mgag200: Rename BMC vidrst names > 7bb97cf91588 drm/mgag200: Remove vidrst callbacks from struct mgag200_device_funcs > cd3a2e8b0a03 drm/mgag200: Only set VIDRST bits in CRTC modesetting > > I tried a mini-bisct across these changes and found the system boots > normally with: > > 5cd522b5331b drm/mgag200: Add dedicted variable for <linecomp> field > > and fails with: > > 89c6ea2006e2 drm/mgag200: Add vblank support > > I do see that there is a subsequent "Fix VBLANK" patch, but it appears > that whatever it fixed didn't help on my system. > > -Tony
> Thanks for the bug report. Can you provide the output of 'sudo lspci
> -vvv' for the graphics device?

Thomas,

Sure. Here's the output (run on the v6.11.0 kernel)

$ sudo lspci -vvv -s 0000:08:00.0
08:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200e [Pilot] ServerEngines (SEP1) (rev 05) (prog-if 00 [VGA controller])
        Subsystem: Intel Corporation Device 0103
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at 90000000 (32-bit, prefetchable) [size=16M]
        Region 1: Memory at 91800000 (32-bit, non-prefetchable) [size=16K]
        Region 2: Memory at 91000000 (32-bit, non-prefetchable) [size=8M]
        Expansion ROM at 91810000 [disabled] [size=64K]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [e4] Express (v1) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <64ns
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [54] MSI: Enable- Count=1/1 Maskable- 64bit-
                Address: 00000000 Data: 0000
        Kernel driver in use: mgag200
        Kernel modules: mgag200

-Tony
Hi

On 02.10.24 at 18:15, Luck, Tony wrote:
>> Thanks for the bug report. Can you provide the output of 'sudo lspci
>> -vvv' for the graphics device?
> Thomas,
>
> Sure. Here's the output (run on the v6.11.0 kernel)

Thanks. It doesn't look much different from other systems. IRQ is also assigned.

Attached is a patch that fixes a possible off-by-one error in the register settings. This would affect the bug you're reporting. If possible, please apply the patch to your 6.12-rc1, test and report the result.

Best regards
Thomas

>
> $ sudo lspci -vvv -s 0000:08:00.0
> 08:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200e [Pilot] ServerEngines (SEP1) (rev 05) (prog-if 00 [VGA controller])
>         Subsystem: Intel Corporation Device 0103
>         Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Interrupt: pin A routed to IRQ 16
>         Region 0: Memory at 90000000 (32-bit, prefetchable) [size=16M]
>         Region 1: Memory at 91800000 (32-bit, non-prefetchable) [size=16K]
>         Region 2: Memory at 91000000 (32-bit, non-prefetchable) [size=8M]
>         Expansion ROM at 91810000 [disabled] [size=64K]
>         Capabilities: [dc] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>         Capabilities: [e4] Express (v1) Legacy Endpoint, MSI 00
>                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>                 DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
>                         MaxPayload 128 bytes, MaxReadReq 128 bytes
>                 DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
>                 LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <64ns
>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                 LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
>                         TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>         Capabilities: [54] MSI: Enable- Count=1/1 Maskable- 64bit-
>                 Address: 00000000 Data: 0000
>         Kernel driver in use: mgag200
>         Kernel modules: mgag200
>
> -Tony
On Fri, Oct 04, 2024 at 11:17:02AM +0200, Thomas Zimmermann wrote:
> Hi
>
> On 02.10.24 at 18:15, Luck, Tony wrote:
> >> Thanks for the bug report. Can you provide the output of 'sudo lspci
> >> -vvv' for the graphics device?
> > Thomas,
> >
> > Sure. Here's the output (run on the v6.11.0 kernel)
>
> Thanks. It doesn't look much different from other systems. IRQ is also
> assigned.
>
> Attached is a patch that fixes a possible off-by-one error in the
> register settings. This would affect the bug you're reporting. If
> possible, please apply the patch to your 6.12-rc1, test and report the
> result.

Didn't one of these weird variants have some bug where the CRTC startadd was not working? Is this one of those?

That to me sounds like maybe linecomp has internally been tied to be always active somehow. Perhaps that would also prevent it from generating the interrupt...

Anyways, sounds like someone should just double check whether the status bit ever gets asserted or not. If yes, then the problem must be with interrupt delivery, otherwise the problem is that the internal interrupt is never even generated. In the latter case you could try using the vsync interrupt instead.
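The check suggested above boils down to a two-way decision, which can be sketched as follows. This is only an illustration of the reasoning, not mgag200 code: `poll_status_bit()`, `fake_read_bit()`, `classify_vblank_failure()` and the enum are hypothetical names, and the callback stands in for whatever register read a real test would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of the "is the status bit ever asserted?" check. */
enum vblank_diag {
	VBLANK_OK,               /* status bit seen and the IRQ handler ran */
	VBLANK_DELIVERY_PROBLEM, /* status bit seen, but the handler never ran */
	VBLANK_NOT_GENERATED,    /* status bit never seen: no internal interrupt */
};

/*
 * Poll a status bit up to max_polls times. read_bit is a caller-supplied
 * (hypothetical) accessor that returns true once the bit is asserted.
 */
static bool poll_status_bit(bool (*read_bit)(void *ctx), void *ctx,
			    unsigned int max_polls)
{
	assert(read_bit != NULL);

	for (unsigned int i = 0; i < max_polls; i++) {
		if (read_bit(ctx))
			return true;
	}
	return false;
}

/* Stand-in accessor for experiments: asserts after *ctx further polls. */
static bool fake_read_bit(void *ctx)
{
	unsigned int *polls_left = ctx;

	if (*polls_left == 0)
		return true;
	(*polls_left)--;
	return false;
}

/* Map the two observations onto the cases described in the message above. */
static enum vblank_diag classify_vblank_failure(bool status_bit_seen,
						bool irq_handler_ran)
{
	if (!status_bit_seen)
		return VBLANK_NOT_GENERATED;    /* candidate for the vsync interrupt */
	if (!irq_handler_ran)
		return VBLANK_DELIVERY_PROBLEM; /* look at interrupt routing */
	return VBLANK_OK;
}
```

With the status bit observed but no handler invocation, the sketch points at interrupt delivery; with the bit never observed, it points at the device not generating the interrupt at all.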
Hi

thanks for your help.

On 04.10.24 at 12:01, Ville Syrjälä wrote:
> On Fri, Oct 04, 2024 at 11:17:02AM +0200, Thomas Zimmermann wrote:
>> Hi
>>
>> On 02.10.24 at 18:15, Luck, Tony wrote:
>>>> Thanks for the bug report. Can you provide the output of 'sudo lspci
>>>> -vvv' for the graphics device?
>>> Thomas,
>>>
>>> Sure. Here's the output (run on the v6.11.0 kernel)
>> Thanks. It doesn't look much different from other systems. IRQ is also
>> assigned.
>>
>> Attached is a patch that fixes a possible off-by-one error in the
>> register settings. This would affect the bug you're reporting. If
>> possible, please apply the patch to your 6.12-rc1, test and report the
>> result.
> Didn't one of these weird variants have some bug where the
> CRTC startadd was not working? Is this one of those?

Yeah, but it seems unrelated.

> That to me sounds like maybe linecomp has internally been
> tied to be always active somehow. Perhaps that would
> also prevent it from generating the interrupt...

Linecomp is usually set to vtotal and that disables the IRQ. When set to vblank_start/vdisplay_end, it acts like a vblank IRQ. But the other Matrox drivers I saw (fbdev, Xorg-video-matrox) set the value minus 1, while mgag200 doesn't. So there really is an off-by-one error.

>
> Anyways, sounds like someone should just double check whether
> the status bit ever gets asserted or not. If yes, then the
> problem must be with interrupt delivery, otherwise the
> problem is that the internal interrupt is never even
> generated. In the latter case you could try using the
> vsync interrupt instead.

I didn't want to go into full debugging while there's a low-hanging fix to try first. I'll probably take that patch anyway even if it doesn't fix the reported bug.

Wrt. vsync: isn't that way too late for vblank events? Does DRM give any timing guarantees? (It doesn't AFAIK.) Or does it just mean that a vblank has happened at some point in the past?

Best regards
Thomas

>
On Fri, Oct 04, 2024 at 01:03:21PM +0200, Thomas Zimmermann wrote:
> Hi
>
> thanks for your help.
>
>
> On 04.10.24 at 12:01, Ville Syrjälä wrote:
> > On Fri, Oct 04, 2024 at 11:17:02AM +0200, Thomas Zimmermann wrote:
> >> Hi
> >>
> >> On 02.10.24 at 18:15, Luck, Tony wrote:
> >>>> Thanks for the bug report. Can you provide the output of 'sudo lspci
> >>>> -vvv' for the graphics device?
> >>> Thomas,
> >>>
> >>> Sure. Here's the output (run on the v6.11.0 kernel)
> >> Thanks. It doesn't look much different from other systems. IRQ is also
> >> assigned.
> >>
> >> Attached is a patch that fixes a possible off-by-one error in the
> >> register settings. This would affect the bug you're reporting. If
> >> possible, please apply the patch to your 6.12-rc1, test and report the
> >> result.
> > Didn't one of these weird variants have some bug where the
> > CRTC startadd was not working? Is this one of those?
>
> Yeah, but it seems unrelated.
>
> > That to me sounds like maybe linecomp has internally been
> > tied to be always active somehow. Perhaps that would
> > also prevent it from generating the interrupt...
>
> Linecomp is usually set to vtotal and that disables the IRQ. When set to
> vblank_start/vdisplay_end, it acts like a vblank IRQ. But the other
> Matrox drivers I saw (fbdev, Xorg-video-matrox) set the value minus 1, while
> mgag200 doesn't. So there really is an off-by-one error.

For the purposes of the interrupt it shouldn't matter at all what the linecomp value is, as long as it's between 0 and vtotal. The patch seemed to just care about vblkstr which doesn't seem relevant to me.

>
> >
> > Anyways, sounds like someone should just double check whether
> > the status bit ever gets asserted or not. If yes, then the
> > problem must be with interrupt delivery, otherwise the
> > problem is that the internal interrupt is never even
> > generated. In the latter case you could try using the
> > vsync interrupt instead.
>
> I didn't want to go into full debugging while there's a low-hanging fix
> to try first. I'll probably take that patch anyway even if it doesn't
> fix the reported bug.
>
> Wrt. vsync: isn't that way too late for vblank events? Does DRM give any
> timing guarantees? (It doesn't AFAIK.) Or does it just mean that a
> vblank has happened at some point in the past?

It doesn't really matter when the interrupt gets signalled as long as it's after vblank start. And since the hardware doesn't even have double buffered registers and IIRC doesn't really care when you reprogram e.g. the start address, it should matter even less. Not that it looks like you even try to do any atomic updates from the vblank handler, so I guess you just want this for throttling purposes?
Hi

On 04.10.24 at 13:19, Ville Syrjälä wrote:
> On Fri, Oct 04, 2024 at 01:03:21PM +0200, Thomas Zimmermann wrote:
>> Hi
>>
>> thanks for your help.
>>
>>
>> On 04.10.24 at 12:01, Ville Syrjälä wrote:
>>> On Fri, Oct 04, 2024 at 11:17:02AM +0200, Thomas Zimmermann wrote:
>>>> Hi
>>>>
>>>> On 02.10.24 at 18:15, Luck, Tony wrote:
>>>>>> Thanks for the bug report. Can you provide the output of 'sudo lspci
>>>>>> -vvv' for the graphics device?
>>>>> Thomas,
>>>>>
>>>>> Sure. Here's the output (run on the v6.11.0 kernel)
>>>> Thanks. It doesn't look much different from other systems. IRQ is also
>>>> assigned.
>>>>
>>>> Attached is a patch that fixes a possible off-by-one error in the
>>>> register settings. This would affect the bug you're reporting. If
>>>> possible, please apply the patch to your 6.12-rc1, test and report the
>>>> result.
>>> Didn't one of these weird variants have some bug where the
>>> CRTC startadd was not working? Is this one of those?
>> Yeah, but it seems unrelated.
>>
>>> That to me sounds like maybe linecomp has internally been
>>> tied to be always active somehow. Perhaps that would
>>> also prevent it from generating the interrupt...
>> Linecomp is usually set to vtotal and that disables the IRQ. When set to
>> vblank_start/vdisplay_end, it acts like a vblank IRQ. But the other
>> Matrox drivers I saw (fbdev, Xorg-video-matrox) set the value minus 1, while
>> mgag200 doesn't. So there really is an off-by-one error.
> For the purposes of the interrupt it shouldn't matter
> at all what the linecomp value is, as long as it's
> between 0 and vtotal. The patch seemed to just care
> about vblkstr which doesn't seem relevant to me.

vblkstr is "vblank start" and equal to vdisplay_end. Then

    linecomp = vblkstr;

happens at some later point in the function. I've run into several mysterious vblank timeouts while making this patchset and they all seemed to be related to the exact values in these registers. So I'm not sure if linecomp really fires an interrupt if it happens too late after vdisplay_end/vblank_start. The official documentation is a bit confusing IIRC.

So my first step here is to make mgag200 behave like other existing drivers and see if that fixes the issue. Hence the off-by-one fix.

>
>>> Anyways, sounds like someone should just double check whether
>>> the status bit ever gets asserted or not. If yes, then the
>>> problem must be with interrupt delivery, otherwise the
>>> problem is that the internal interrupt is never even
>>> generated. In the latter case you could try using the
>>> vsync interrupt instead.
>> I didn't want to go into full debugging while there's a low-hanging fix
>> to try first. I'll probably take that patch anyway even if it doesn't
>> fix the reported bug.
>>
>> Wrt. vsync: isn't that way too late for vblank events? Does DRM give any
>> timing guarantees? (It doesn't AFAIK.) Or does it just mean that a
>> vblank has happened at some point in the past?
> It doesn't really matter when the interrupt gets signalled
> as long as it's after vblank start. And since the hardware
> doesn't even have double buffered registers and IIRC doesn't
> really care when you reprogram e.g. the start address, it should
> matter even less. Not that it looks like you even try to
> do any atomic updates from the vblank handler, so I guess
> you just want this for throttling purposes?

I see. VSYNC would likely work for that. Throttling is the main purpose.

Best regards
Thomas

>
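For readers following the off-by-one argument, here is a minimal sketch of the relationship described above: vblkstr is derived from the end of the visible area, and linecomp is then copied from vblkstr. The struct names, the function and the `minus_one` switch are hypothetical and only illustrate the arithmetic; this is not the attached patch, and register bit-field packing is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical vertical timings, in scanlines. */
struct crtc_vtiming {
	unsigned int vdisplay; /* number of visible lines (vdisplay_end) */
	unsigned int vtotal;   /* total lines per frame */
};

/* Hypothetical values that would be programmed into the device. */
struct vblank_regs {
	unsigned int vblkstr;  /* vblank start */
	unsigned int linecomp; /* line-compare value used for the IRQ */
};

/*
 * Compute vblkstr and linecomp. With minus_one set, the value is programmed
 * as "line count minus 1", which the thread says other Matrox drivers do;
 * without it, the raw value is used.
 */
static struct vblank_regs compute_vblank_regs(const struct crtc_vtiming *t,
					      bool minus_one)
{
	struct vblank_regs r;

	assert(t->vdisplay > 0 && t->vdisplay <= t->vtotal);

	r.vblkstr = minus_one ? t->vdisplay - 1 : t->vdisplay;
	r.linecomp = r.vblkstr; /* "linecomp = vblkstr;" as described above */
	return r;
}
```

For example, a mode with 768 visible lines would give linecomp 768 without the adjustment and 767 with it; whether that one-line difference decides if the interrupt fires on this hardware is exactly what the thread leaves open.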
Thomas, v6.12-rc1 plus your off-by-one patch is still broken. Console log from when things went off the rails: [ 32.126676] Console: switching to colour dummy device 80x25 [ 32.134887] mgag200 0000:08:00.0: vgaarb: deactivate vga console [ OK ] Started Show Plymouth Boot Screen. [ OK ] Started Forward Password R…[ 32.155183] mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15) s to Plymouth Di[ 32.157213] [drm] Initialized mgag200 1.0.0 for 0000:08:00.0 on minor 0 rectory Watch 32.167994] mpt2sas_cm0: request pool(0x00000000b4be1d72) - dma(0xf880000): depth(3200), frame_size(128), pool_size(400 kB) m. [ OK ] Reached target Path Units. [ OK ] Reached target Basic System. [ 32.190444] fbcon: mgag200drmfb (fb0) is primary device [ 32.224946] mpt2sas_cm0: sense pool(0x000000005610eff3) - dma(0x10100000): depth(2939), element_size(96), pool_size (275 kB) [ 32.225059] mpt2sas_cm0: reply pool(0x000000000f24e619) - dma(0x10180000): depth(3264), frame_size(128), pool_size(408 kB) [ 32.225073] mpt2sas_cm0: config page(0x00000000ba53d4ed) - dma(0xfea3000): size(512) [ 32.225076] mpt2sas_cm0: Allocated physical memory: size(7012 kB) [ 32.225078] mpt2sas_cm0: Current Controller Queue Depth(2936),Max Controller Queue Depth(3072) [ 32.225080] mpt2sas_cm0: Scatter Gather Elements per IO(128) [ 32.242578] ixgbe 0000:03:00.0: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0 [ 32.273473] mpt2sas_cm0: LSISAS2308: FWVersion(17.00.01.00), ChipRevision(0x05) [ 32.273486] mpt2sas_cm0: Intel(R) Controller: Subsystem ID: 0x3050 [ 32.273490] mpt2sas_cm0: Protocol=(Initiator), Capabilities=(Raid,TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ) [ 32.273693] scsi host6: Fusion MPT SAS Host [ 32.281337] mpt2sas_cm0: sending port enable !! 
[ 32.327525] ixgbe 0000:03:00.0: 16.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x4 link at 0000:00:03.2 (capable of 32.000 Gb/s with 5.0 GT/s PCIe x8 link) [ 32.349384] ------------[ cut here ]------------ [ 32.349467] [CRTC:34:crtc-0] vblank wait timed out [ 32.349549] WARNING: CPU: 164 PID: 1820 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] [ 32.349600] Modules linked in: crct10dif_pclmul crc32_pclmul crc32c_intel mgag200(+) ghash_clmulni_intel i2c_algo_bit sha512_ssse3 drm_shmem_helper drm_kms_helper sha256_ssse3 sha1_ssse3 ixgbe(+) mpt3sas mdio drm raid_class scsi_transport_sas dca wmi fuse [ 32.349676] CPU: 164 UID: 0 PID: 1820 Comm: systemd-udevd Not tainted 6.12.0-rc1+ #170 [ 32.349694] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016 [ 32.349696] RIP: 0010:drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] [ 32.349706] Code: 00 48 8d 7b 08 e8 61 96 36 e8 45 85 ff 0f 85 d3 fe ff ff 49 8b 56 20 41 8b b6 d8 00 00 00 48 c7 c7 b0 e0 e1 c0 e8 21 3d 2e e8 <0f> 0b e9 b5 fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 [ 32.349708] RSP: 0018:ffffa373a4097680 EFLAGS: 00010282 [ 32.349712] RAX: 0000000000000026 RBX: ffff94c556084028 RCX: 0000000000000000 [ 32.349715] RDX: 0000000000000002 RSI: ffffffffaaa3ec38 RDI: 00000000ffffffff [ 32.349717] RBP: ffff94c55a259e00 R08: 0000000000000001 R09: 0000000000000000 [ 32.349719] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 [ 32.349721] R13: 0000000000000000 R14: ffff94c55b1893f0 R15: 0000000000000000 [ 32.349723] FS: 00007fc95bdc2b40(0000) GS:ffff94d1b2800000(0000) knlGS:0000000000000000 [ 32.349726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 32.349728] CR2: 00007f906e3ca948 CR3: 000000000cd1c002 CR4: 00000000003706f0 [ 32.349730] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 32.349732] DR3: 0000000000000000 DR6: 
00000000fffe0ff0 DR7: 0000000000000400 [ 32.349734] Call Trace: [ 32.349736] <TASK> [ 32.349739] ? __warn+0x90/0x1a0 [ 32.349754] ? drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] [ 32.349764] ? report_bug+0x1c3/0x1d0 [ 32.349779] ? handle_bug+0x5b/0xa0 [ 32.349789] ? exc_invalid_op+0x14/0x70 [ 32.349793] ? asm_exc_invalid_op+0x16/0x20 [ 32.349811] ? drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] [ 32.349822] ? __pfx_autoremove_wake_function+0x10/0x10 [ 32.349833] drm_atomic_helper_commit_tail+0x71/0x80 [drm_kms_helper] [ 32.349843] mgag200_mode_config_helper_atomic_commit_tail+0x28/0x40 [mgag200] [ 32.349851] commit_tail+0x94/0x130 [drm_kms_helper] [ 32.349862] drm_atomic_helper_commit+0x13e/0x170 [drm_kms_helper] [ 32.349872] drm_atomic_commit+0x97/0xb0 [drm] [ 32.349923] ? __pfx___drm_printfn_info+0x10/0x10 [drm] [ 32.349955] drm_client_modeset_commit_atomic+0x207/0x250 [drm] [ 32.349991] drm_client_modeset_commit_locked+0x5b/0x190 [drm] [ 32.350015] drm_client_modeset_commit+0x24/0x50 [drm] [ 32.350038] __drm_fb_helper_restore_fbdev_mode_unlocked+0x95/0xd0 [drm_kms_helper] [ 32.350050] drm_fb_helper_set_par+0x2e/0x40 [drm_kms_helper] [ 32.350059] fbcon_init+0x2a8/0x560 [ 32.350070] visual_init+0xc4/0x120 [ 32.350078] do_bind_con_driver.isra.0+0x1a1/0x3d0 [ 32.350086] do_take_over_console+0x10b/0x1a0 [ 32.350092] do_fbcon_takeover+0x5c/0xc0 [ 32.350095] fbcon_fb_registered+0x49/0x70 [ 32.350098] do_register_framebuffer+0x184/0x230 [ 32.350109] register_framebuffer+0x20/0x40 [ 32.350112] __drm_fb_helper_initial_config_and_unlock+0x33e/0x590 [drm_kms_helper] [ 32.350122] ? 
drm_client_register+0x33/0xc0 [drm] [ 32.350154] drm_fbdev_shmem_client_hotplug+0x6c/0xc0 [drm_shmem_helper] [ 32.350160] drm_client_register+0x7b/0xc0 [drm] [ 32.350184] mgag200_pci_probe+0x90/0x180 [mgag200] [ 32.350191] local_pci_probe+0x46/0xa0 [ 32.350199] pci_device_probe+0xb5/0x220 [ 32.350206] really_probe+0xd9/0x380 [ 32.350214] __driver_probe_device+0x78/0x150 [ 32.350249] driver_probe_device+0x1e/0x90 [ 32.350254] __driver_attach+0xd6/0x1d0 [ 32.350258] ? __pfx___driver_attach+0x10/0x10 [ 32.350261] bus_for_each_dev+0x66/0xa0 [ 32.350267] bus_add_driver+0x111/0x240 [ 32.350272] driver_register+0x5c/0x120 [ 32.350280] ? __pfx_mgag200_pci_driver_init+0x10/0x10 [mgag200] [ 32.350285] do_one_initcall+0x62/0x3a0 [ 32.350299] ? __kmalloc_cache_noprof+0x240/0x300 [ 32.350315] do_init_module+0x64/0x240 [ 32.350329] init_module_from_file+0x7a/0xa0 [ 32.350341] idempotent_init_module+0x15f/0x260 [ 32.350353] __x64_sys_finit_module+0x5a/0xb0 [ 32.350358] do_syscall_64+0x73/0x190 [ 32.350364] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 32.350368] RIP: 0033:0x7fc95ca07e0d [ 32.350371] Code: c8 0c 00 0f 05 eb a9 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3b 80 0c 00 f7 d8 64 89 01 48 [ 32.350373] RSP: 002b:00007ffef10b2468 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 32.350377] RAX: ffffffffffffffda RBX: 0000562610eb0f80 RCX: 00007fc95ca07e0d [ 32.350379] RDX: 0000000000000000 RSI: 00007fc95cb6132c RDI: 0000000000000010 [ 32.350381] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000 [ 32.350384] R10: 0000000000000010 R11: 0000000000000246 R12: 00007fc95cb6132c [ 32.350386] R13: 0000562610ed00d0 R14: 0000000000000007 R15: 0000562610ed0380 [ 32.350397] </TASK> [ 32.350399] irq event stamp: 51727 [ 32.350402] hardirqs last enabled at (51733): [<ffffffffa918f854>] vprintk_emit+0x3d4/0x3e0 [ 32.350407] hardirqs last disabled at (51738): [<ffffffffa918f807>] 
vprintk_emit+0x387/0x3e0 [ 32.350409] softirqs last enabled at (51576): [<ffffffffa90e2891>] __irq_exit_rcu+0xa1/0x110 [ 32.350420] softirqs last disabled at (51569): [<ffffffffa90e2891>] __irq_exit_rcu+0xa1/0x110 [ 32.350423] ---[ end trace 0000000000000000 ]--- [ 32.350452] Console: switching to colour frame buffer device 128x48
Hi

On 04.10.24 at 12:01, Ville Syrjälä wrote:
> On Fri, Oct 04, 2024 at 11:17:02AM +0200, Thomas Zimmermann wrote:
>> Hi
>>
>> On 02.10.24 at 18:15, Luck, Tony wrote:
>>>> Thanks for the bug report. Can you provide the output of 'sudo lspci
>>>> -vvv' for the graphics device?
>>> Thomas,
>>>
>>> Sure. Here's the output (run on the v6.11.0 kernel)
>> Thanks. It doesn't look much different from other systems. IRQ is also
>> assigned.
>>
>> Attached is a patch that fixes a possible off-by-one error in the
>> register settings. This would affect the bug you're reporting. If
>> possible, please apply the patch to your 6.12-rc1, test and report the
>> result.
> Didn't one of these weird variants have some bug where the
> CRTC startadd was not working? Is this one of those?
> That to me sounds like maybe linecomp has internally been
> tied to be always active somehow. Perhaps that would
> also prevent it from generating the interrupt...

Impressive debugging skills! The broken chip has device id 0x0522
according to commit 21e74bf99596 ("drm/mgag200: Store HW_BUG_NO_STARTADD
flag in device info"). And that's the same type that Tony reported. [1]
I'm just not sure if it's worth special-casing the chip again or simply
reverting the vblank IRQs.

Best regards
Thomas

[1] https://admin.pci-ids.ucw.cz/read/PC/102b/0522

>
> Anyways, sounds like someone should just double check whether
> the status bit ever gets asserted or not. If yes, then the
> problem must be with interrupt delivery, otherwise the
> problem is that the internal interrupt is never even
> generated. In the latter case you could try using the
> vsync interrupt instead.
>
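For readers following the thread, the check Ville suggests can be written down as a small decision procedure. The sketch below is not mgag200 driver code and does not use any kernel API. The mask EXAMPLE_IRQ_PENDING_MASK, the enum, and the function diagnose_irq() are hypothetical names made up for illustration, and the bit position is an assumption. It only classifies what was observed: status-register values sampled while the interrupt was enabled, and how often the handler ran.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mask for the "interrupt pending" bit in a status register.
 * The value is an assumption for illustration; take the real one from the
 * hardware documentation or the driver's register header.
 */
#define EXAMPLE_IRQ_PENDING_MASK (UINT32_C(1) << 5)

enum irq_diagnosis {
	IRQ_WORKING,         /* the handler ran at least once */
	IRQ_DELIVERY_BROKEN, /* pending bit seen, but the handler never ran */
	IRQ_NEVER_GENERATED, /* pending bit never seen and handler never ran */
};

/*
 * Classify the failure from status-register values sampled while the
 * interrupt was enabled, plus the number of times the handler ran.
 */
static enum irq_diagnosis diagnose_irq(const uint32_t *samples, size_t count,
				       unsigned long handler_calls)
{
	bool asserted = false;

	assert(samples != NULL || count == 0);

	/* A running handler may clear the bit, so check the call count first. */
	if (handler_calls > 0)
		return IRQ_WORKING;

	for (size_t i = 0; i < count; i++) {
		if (samples[i] & EXAMPLE_IRQ_PENDING_MASK) {
			asserted = true;
			break;
		}
	}

	/* Bit asserted without a handler call points at interrupt delivery. */
	return asserted ? IRQ_DELIVERY_BROKEN : IRQ_NEVER_GENERATED;
}
```

In this sketch, IRQ_NEVER_GENERATED corresponds to the case where trying a different interrupt source, such as the vsync interrupt, would be the next step, while IRQ_DELIVERY_BROKEN points at the path between the device and the CPU.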
On Mon, Oct 07, 2024 at 03:37:40PM +0200, Thomas Zimmermann wrote:
> Hi
>
> On 04.10.24 at 12:01, Ville Syrjälä wrote:
> > On Fri, Oct 04, 2024 at 11:17:02AM +0200, Thomas Zimmermann wrote:
> >> Hi
> >>
> >> On 02.10.24 at 18:15, Luck, Tony wrote:
> >>>> Thanks for the bug report. Can you provide the output of 'sudo lspci
> >>>> -vvv' for the graphics device?
> >>> Thomas,
> >>>
> >>> Sure. Here's the output (run on the v6.11.0 kernel)
> >> Thanks. It doesn't look much different from other systems. IRQ is also
> >> assigned.
> >>
> >> Attached is a patch that fixes a possible off-by-one error in the
> >> register settings. This would affect the bug you're reporting. If
> >> possible, please apply the patch to your 6.12-rc1, test and report the
> >> result.
> > Didn't one of these weird variants have some bug where the
> > CRTC startadd was not working? Is this one of those?
> > That to me sounds like maybe linecomp has internally been
> > tied to be always active somehow. Perhaps that would
> > also prevent it from generating the interrupt...
>
> Impressive debugging skills! The broken chip has device id 0x0522
> according to commit 21e74bf99596 ("drm/mgag200: Store HW_BUG_NO_STARTADD
> flag in device info"). And that's the same type that Tony reported. [1]
> I'm just not sure if it's worth special-casing the chip again or simply
> reverting the vblank IRQs.

Heh. Though I'm not sure if my theory is quite right. It seems
I've been confused about linecomp all these years; I thought the split
screen effect affected both VGA and MGA modes (at least on the
older chips), but it looks like it never affected MGA mode. I tested it
here on a 2064w based card, which is almost as old as you can
go (I do have an older Athena based card somewhere as well
but didn't bother digging it up).
Hi

On 04.10.24 at 18:58, Luck, Tony wrote:
> Thomas,
>
> v6.12-rc1 plus your off-by-one patch is still broken.

Thanks for testing. Here's another patch to try Ville's suggestion. It
should disable HW vblank IRQs on your system. Could you please test it
and report on the results?

Best regards
Thomas

>
> Console log from when things went off the rails:
>
> [ 32.126676] Console: switching to colour dummy device 80x25
> [ 32.134887] mgag200 0000:08:00.0: vgaarb: deactivate vga console
> [ OK ] Started Show Plymouth Boot Screen.
> [ OK ] Started Forward Password R…[ 32.155183] mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
> s to Plymouth Di[ 32.157213] [drm] Initialized mgag200 1.0.0 for 0000:08:00.0 on minor 0
> rectory Watch 32.167994] mpt2sas_cm0: request pool(0x00000000b4be1d72) - dma(0xf880000): depth(3200), frame_size(128), pool_size(400 kB)
> m.
> [ OK ] Reached target Path Units.
> [ OK ] Reached target Basic System.
> [ 32.190444] fbcon: mgag200drmfb (fb0) is primary device
> [ 32.224946] mpt2sas_cm0: sense pool(0x000000005610eff3) - dma(0x10100000): depth(2939), element_size(96), pool_size (275 kB)
> [ 32.225059] mpt2sas_cm0: reply pool(0x000000000f24e619) - dma(0x10180000): depth(3264), frame_size(128), pool_size(408 kB)
> [ 32.225073] mpt2sas_cm0: config page(0x00000000ba53d4ed) - dma(0xfea3000): size(512)
> [ 32.225076] mpt2sas_cm0: Allocated physical memory: size(7012 kB)
> [ 32.225078] mpt2sas_cm0: Current Controller Queue Depth(2936),Max Controller Queue Depth(3072)
> [ 32.225080] mpt2sas_cm0: Scatter Gather Elements per IO(128)
> [ 32.242578] ixgbe 0000:03:00.0: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0
> [ 32.273473] mpt2sas_cm0: LSISAS2308: FWVersion(17.00.01.00), ChipRevision(0x05)
> [ 32.273486] mpt2sas_cm0: Intel(R) Controller: Subsystem ID: 0x3050
> [ 32.273490] mpt2sas_cm0: Protocol=(Initiator), Capabilities=(Raid,TLR,EEDP,Snapshot Buffer,Diag Trace
Buffer,Task Set Full,NCQ) > [ 32.273693] scsi host6: Fusion MPT SAS Host > [ 32.281337] mpt2sas_cm0: sending port enable !! > [ 32.327525] ixgbe 0000:03:00.0: 16.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x4 link at 0000:00:03.2 (capable of 32.000 Gb/s with 5.0 GT/s PCIe x8 link) > [ 32.349384] ------------[ cut here ]------------ > [ 32.349467] [CRTC:34:crtc-0] vblank wait timed out > [ 32.349549] WARNING: CPU: 164 PID: 1820 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] > [ 32.349600] Modules linked in: crct10dif_pclmul crc32_pclmul crc32c_intel mgag200(+) ghash_clmulni_intel i2c_algo_bit sha512_ssse3 drm_shmem_helper drm_kms_helper sha256_ssse3 sha1_ssse3 ixgbe(+) mpt3sas mdio drm raid_class scsi_transport_sas dca wmi fuse > [ 32.349676] CPU: 164 UID: 0 PID: 1820 Comm: systemd-udevd Not tainted 6.12.0-rc1+ #170 > [ 32.349694] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016 > [ 32.349696] RIP: 0010:drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] > [ 32.349706] Code: 00 48 8d 7b 08 e8 61 96 36 e8 45 85 ff 0f 85 d3 fe ff ff 49 8b 56 20 41 8b b6 d8 00 00 00 48 c7 c7 b0 e0 e1 c0 e8 21 3d 2e e8 <0f> 0b e9 b5 fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 > [ 32.349708] RSP: 0018:ffffa373a4097680 EFLAGS: 00010282 > [ 32.349712] RAX: 0000000000000026 RBX: ffff94c556084028 RCX: 0000000000000000 > [ 32.349715] RDX: 0000000000000002 RSI: ffffffffaaa3ec38 RDI: 00000000ffffffff > [ 32.349717] RBP: ffff94c55a259e00 R08: 0000000000000001 R09: 0000000000000000 > [ 32.349719] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 > [ 32.349721] R13: 0000000000000000 R14: ffff94c55b1893f0 R15: 0000000000000000 > [ 32.349723] FS: 00007fc95bdc2b40(0000) GS:ffff94d1b2800000(0000) knlGS:0000000000000000 > [ 32.349726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 32.349728] CR2: 00007f906e3ca948 
CR3: 000000000cd1c002 CR4: 00000000003706f0 > [ 32.349730] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 32.349732] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 32.349734] Call Trace: > [ 32.349736] <TASK> > [ 32.349739] ? __warn+0x90/0x1a0 > [ 32.349754] ? drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] > [ 32.349764] ? report_bug+0x1c3/0x1d0 > [ 32.349779] ? handle_bug+0x5b/0xa0 > [ 32.349789] ? exc_invalid_op+0x14/0x70 > [ 32.349793] ? asm_exc_invalid_op+0x16/0x20 > [ 32.349811] ? drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] > [ 32.349822] ? __pfx_autoremove_wake_function+0x10/0x10 > [ 32.349833] drm_atomic_helper_commit_tail+0x71/0x80 [drm_kms_helper] > [ 32.349843] mgag200_mode_config_helper_atomic_commit_tail+0x28/0x40 [mgag200] > [ 32.349851] commit_tail+0x94/0x130 [drm_kms_helper] > [ 32.349862] drm_atomic_helper_commit+0x13e/0x170 [drm_kms_helper] > [ 32.349872] drm_atomic_commit+0x97/0xb0 [drm] > [ 32.349923] ? __pfx___drm_printfn_info+0x10/0x10 [drm] > [ 32.349955] drm_client_modeset_commit_atomic+0x207/0x250 [drm] > [ 32.349991] drm_client_modeset_commit_locked+0x5b/0x190 [drm] > [ 32.350015] drm_client_modeset_commit+0x24/0x50 [drm] > [ 32.350038] __drm_fb_helper_restore_fbdev_mode_unlocked+0x95/0xd0 [drm_kms_helper] > [ 32.350050] drm_fb_helper_set_par+0x2e/0x40 [drm_kms_helper] > [ 32.350059] fbcon_init+0x2a8/0x560 > [ 32.350070] visual_init+0xc4/0x120 > [ 32.350078] do_bind_con_driver.isra.0+0x1a1/0x3d0 > [ 32.350086] do_take_over_console+0x10b/0x1a0 > [ 32.350092] do_fbcon_takeover+0x5c/0xc0 > [ 32.350095] fbcon_fb_registered+0x49/0x70 > [ 32.350098] do_register_framebuffer+0x184/0x230 > [ 32.350109] register_framebuffer+0x20/0x40 > [ 32.350112] __drm_fb_helper_initial_config_and_unlock+0x33e/0x590 [drm_kms_helper] > [ 32.350122] ? 
drm_client_register+0x33/0xc0 [drm] > [ 32.350154] drm_fbdev_shmem_client_hotplug+0x6c/0xc0 [drm_shmem_helper] > [ 32.350160] drm_client_register+0x7b/0xc0 [drm] > [ 32.350184] mgag200_pci_probe+0x90/0x180 [mgag200] > [ 32.350191] local_pci_probe+0x46/0xa0 > [ 32.350199] pci_device_probe+0xb5/0x220 > [ 32.350206] really_probe+0xd9/0x380 > [ 32.350214] __driver_probe_device+0x78/0x150 > [ 32.350249] driver_probe_device+0x1e/0x90 > [ 32.350254] __driver_attach+0xd6/0x1d0 > [ 32.350258] ? __pfx___driver_attach+0x10/0x10 > [ 32.350261] bus_for_each_dev+0x66/0xa0 > [ 32.350267] bus_add_driver+0x111/0x240 > [ 32.350272] driver_register+0x5c/0x120 > [ 32.350280] ? __pfx_mgag200_pci_driver_init+0x10/0x10 [mgag200] > [ 32.350285] do_one_initcall+0x62/0x3a0 > [ 32.350299] ? __kmalloc_cache_noprof+0x240/0x300 > [ 32.350315] do_init_module+0x64/0x240 > [ 32.350329] init_module_from_file+0x7a/0xa0 > [ 32.350341] idempotent_init_module+0x15f/0x260 > [ 32.350353] __x64_sys_finit_module+0x5a/0xb0 > [ 32.350358] do_syscall_64+0x73/0x190 > [ 32.350364] entry_SYSCALL_64_after_hwframe+0x76/0x7e > [ 32.350368] RIP: 0033:0x7fc95ca07e0d > [ 32.350371] Code: c8 0c 00 0f 05 eb a9 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3b 80 0c 00 f7 d8 64 89 01 48 > [ 32.350373] RSP: 002b:00007ffef10b2468 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 > [ 32.350377] RAX: ffffffffffffffda RBX: 0000562610eb0f80 RCX: 00007fc95ca07e0d > [ 32.350379] RDX: 0000000000000000 RSI: 00007fc95cb6132c RDI: 0000000000000010 > [ 32.350381] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000 > [ 32.350384] R10: 0000000000000010 R11: 0000000000000246 R12: 00007fc95cb6132c > [ 32.350386] R13: 0000562610ed00d0 R14: 0000000000000007 R15: 0000562610ed0380 > [ 32.350397] </TASK> > [ 32.350399] irq event stamp: 51727 > [ 32.350402] hardirqs last enabled at (51733): [<ffffffffa918f854>] vprintk_emit+0x3d4/0x3e0 > [ 
32.350407] hardirqs last disabled at (51738): [<ffffffffa918f807>] vprintk_emit+0x387/0x3e0 > [ 32.350409] softirqs last enabled at (51576): [<ffffffffa90e2891>] __irq_exit_rcu+0xa1/0x110 > [ 32.350420] softirqs last disabled at (51569): [<ffffffffa90e2891>] __irq_exit_rcu+0xa1/0x110 > [ 32.350423] ---[ end trace 0000000000000000 ]--- > [ 32.350452] Console: switching to colour frame buffer device 128x48
> Thanks for testing. Here's another patch to try Ville's suggestion. It
> should disable HW vblank IRQs on your system. Could you please test it
> and report on the results?

Thomas,

Thanks for continuing to work on this.

The output is different, but it still dies with vblank problems.

[ OK ] Started GNOME Display Manager.
[ 329.575813] mgag200 0000:08:00.0: [drm] *ERROR* flip_done timed out
[ 329.582889] mgag200 0000:08:00.0: [drm] *ERROR* [PLANE:32:plane-0] commit wait timed out
[ 329.719779] ------------[ cut here ]------------
[ 329.725174] [CRTC:34:crtc-0] vblank wait timed out
[ 329.730724] WARNING: CPU: 150 PID: 1402 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper]
[ 329.746264] Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp nf_conntrack_tftp bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set rfkill nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter sunrpc vfat fat intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common sb_edac iTCO_wdt intel_pmc_bxt iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp ipmi_ssif coretemp rapl intel_cstate joydev intel_uncore acpi_ipmi pcspkr mei_me i2c_i801 i2c_smbus ipmi_si lpc_ich mei ioatdma wmi ipmi_devintf ipmi_msghandler acpi_pad zram ip_tables crct10dif_pclmul crc32_pclmul mgag200 crc32c_intel i2c_algo_bit ghash_clmulni_intel drm_shmem_helper sha512_ssse3
[ 329.746604] drm_kms_helper sha256_ssse3 mpt3sas sha1_ssse3 ixgbe raid_class mdio drm scsi_transport_sas dca fuse
[ 329.858506] CPU: 150 UID: 0 PID: 1402 Comm: kworker/150:1 Tainted: G W 6.12.0-rc2+ #171
[ 329.869030] Tainted:
[W]=WARN [ 329.872357] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016 [ 329.883941] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] [ 329.891472] RIP: 0010:drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] [ 329.900937] Code: 00 48 8d 7b 08 e8 41 b7 38 d1 45 85 ff 0f 85 d3 fe ff ff 49 8b 56 20 41 8b b6 d8 00 00 00 48 c7 c7 b0 40 df c0 e8 21 61 30 d1 <0f> 0b e9 b5 fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 [ 329.921932] RSP: 0018:ffffbb9f23277c00 EFLAGS: 00010286 [ 329.927797] RAX: 0000000000000026 RBX: ffff9de18562e028 RCX: 0000000000000000 [ 329.935793] RDX: 0000000000000002 RSI: ffffffff93a00e78 RDI: 00000000ffffffff [ 329.943786] RBP: ffff9e13d910dc80 R08: 0000000000000000 R09: ffffbb9f23277ac0 [ 329.951778] R10: ffffbb9f23277ab8 R11: ffff9e33811fffe8 R12: 0000000000000000 [ 329.959784] R13: 0000000000000000 R14: ffff9de0ada653f0 R15: 0000000000000000 [ 329.967777] FS: 0000000000000000(0000) GS:ffff9e2032100000(0000) knlGS:0000000000000000 [ 329.976838] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 329.983280] CR2: 0000555ce9d0d030 CR3: 0000003eccc3a004 CR4: 00000000003706f0 [ 329.991273] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 329.999268] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 330.007262] Call Trace: [ 330.010011] <TASK> [ 330.012383] ? __warn+0x90/0x1a0 [ 330.016022] ? drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] [ 330.024803] ? report_bug+0x1c3/0x1d0 [ 330.028924] ? __irq_work_queue_local+0x48/0x130 [ 330.034116] ? handle_bug+0x5b/0xa0 [ 330.038043] ? exc_invalid_op+0x14/0x70 [ 330.042353] ? asm_exc_invalid_op+0x16/0x20 [ 330.047064] ? drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] [ 330.055851] ? 
__pfx_autoremove_wake_function+0x10/0x10 [ 330.061723] drm_atomic_helper_commit_tail+0x71/0x80 [drm_kms_helper] [ 330.068954] mgag200_mode_config_helper_atomic_commit_tail+0x28/0x40 [mgag200] [ 330.077057] commit_tail+0x94/0x130 [drm_kms_helper] [ 330.082642] drm_atomic_helper_commit+0x13e/0x170 [drm_kms_helper] [ 330.089597] drm_atomic_commit+0x97/0xb0 [drm] [ 330.094706] ? __pfx___drm_printfn_info+0x10/0x10 [drm] [ 330.100624] drm_atomic_helper_dirtyfb+0x185/0x250 [drm_kms_helper] [ 330.107672] drm_fbdev_shmem_helper_fb_dirty+0x4c/0xb0 [drm_shmem_helper] [ 330.115282] drm_fb_helper_damage_work+0x83/0x150 [drm_kms_helper] [ 330.122221] process_one_work+0x214/0x600 [ 330.126727] worker_thread+0x17f/0x320 [ 330.130932] ? __pfx_worker_thread+0x10/0x10 [ 330.135714] kthread+0xe0/0x110 [ 330.139245] ? __pfx_kthread+0x10/0x10 [ 330.143455] ret_from_fork+0x30/0x50 [ 330.147473] ? __pfx_kthread+0x10/0x10 [ 330.151683] ret_from_fork_asm+0x1a/0x30 [ 330.156104] </TASK> [ 330.158553] irq event stamp: 68963 [ 330.162368] hardirqs last enabled at (68975): [<ffffffff92183fae>] __up_console_sem+0x5e/0x70 [ 330.172011] hardirqs last disabled at (68986): [<ffffffff92183f93>] __up_console_sem+0x43/0x70 [ 330.181647] softirqs last enabled at (68850): [<ffffffff920dac91>] __irq_exit_rcu+0xa1/0x110 [ 330.191195] softirqs last disabled at (69007): [<ffffffff920dac91>] __irq_exit_rcu+0xa1/0x110 [ 330.200734] ---[ end trace 0000000000000000 ]--- [ 340.327342] mgag200 0000:08:00.0: [drm] *ERROR* flip_done timed out [ 340.334379] mgag200 0000:08:00.0: [drm] *ERROR* [CRTC:34:crtc-0] commit wait timed out [ 350.566891] mgag200 0000:08:00.0: [drm] *ERROR* flip_done timed out [ 350.573925] mgag200 0000:08:00.0: [drm] *ERROR* [PLANE:32:plane-0] commit wait timed out [ 350.710886] ------------[ cut here ]------------
Apologies. The trace below isn't the first place where things went wrong. I dug up the full serial log
and found some earlier mgag errors. The actual first one is:

[ OK ] Reached target Basic System.
[ 32.366479] fbcon: mgag200drmfb (fb0) is primary device
[ 32.405678] mpt2sas_cm0: sense pool(0x00000000dfa0f36f) - dma(0xf500000): depth(2939), element_size(96), pool_size (275 kB)
[ 32.405790] mpt2sas_cm0: reply pool(0x000000004919fe15) - dma(0xf580000): depth(3264), frame_size(128), pool_size(408 kB)
[ 32.405804] mpt2sas_cm0: config page(0x00000000ac9398d5) - dma(0xf2e4000): size(512)
[ 32.405806] mpt2sas_cm0: Allocated physical memory: size(7012 kB)
[ 32.405808] mpt2sas_cm0: Current Controller Queue Depth(2936),Max Controller Queue Depth(3072)
[ 32.405810] mpt2sas_cm0: Scatter Gather Elements per IO(128)
[ 32.436831] ixgbe 0000:03:00.0: 16.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x4 link at 0000:00:03.2 (capable of 32.000 Gb/s with 5.0 GT/s PCIe x8 link)
[ 32.454205] mpt2sas_cm0: LSISAS2308: FWVersion(17.00.01.00), ChipRevision(0x05)
[ 32.454218] mpt2sas_cm0: Intel(R) Controller: Subsystem ID: 0x3050
[ 32.454222] mpt2sas_cm0: Protocol=(Initiator), Capabilities=(Raid,TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
[ 32.454513] scsi host6: Fusion MPT SAS Host
[ 32.461960] mpt2sas_cm0: sending port enable !!
[ 32.520483] ------------[ cut here ]------------ [ 32.520517] [CRTC:34:crtc-0] vblank wait timed out [ 32.520582] WARNING: CPU: 114 PID: 1783 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] [ 32.520603] Modules linked in: crct10dif_pclmul crc32_pclmul mgag200(+) crc32c_intel i2c_algo_bit ghash_clmulni_intel drm_shmem_helper sha512_ssse3 drm_kms_helper sha256_ssse3 mpt3sas sha1_ssse3 ixgbe(+) raid_class mdio drm scsi_transport_sas dca fuse [ 32.520631] CPU: 114 UID: 0 PID: 1783 Comm: systemd-udevd Not tainted 6.12.0-rc2+ #171 [ 32.520634] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016 [ 32.520637] RIP: 0010:drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] [ 32.520648] Code: 00 48 8d 7b 08 e8 41 b7 38 d1 45 85 ff 0f 85 d3 fe ff ff 49 8b 56 20 41 8b b6 d8 00 00 00 48 c7 c7 b0 40 df c0 e8 21 61 30 d1 <0f> 0b e9 b5 fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 [ 32.520651] RSP: 0018:ffffbb9f23fc3680 EFLAGS: 00010282 [ 32.520655] RAX: 0000000000000026 RBX: ffff9de18562e028 RCX: 0000000000000000 [ 32.520657] RDX: 0000000000000002 RSI: ffffffff93a00e78 RDI: 00000000ffffffff [ 32.520659] RBP: ffff9de18540eb40 R08: 0000000000000001 R09: 0000000000000000 [ 32.520662] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 [ 32.520664] R13: 0000000000000000 R14: ffff9de0ada653f0 R15: 0000000000000000 [ 32.520667] FS: 00007f64988d9b40(0000) GS:ffff9de187900000(0000) knlGS:0000000000000000 [ 32.520669] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 32.520671] CR2: 000055daa91a4b40 CR3: 000000000c13c003 CR4: 00000000003706f0 [ 32.520674] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 32.520675] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 32.520677] Call Trace: [ 32.520680] <TASK> [ 32.520682] ? __warn+0x90/0x1a0 [ 32.520693] ? 
drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] [ 32.520703] ? report_bug+0x1c3/0x1d0 [ 32.520716] ? handle_bug+0x5b/0xa0 [ 32.520724] ? exc_invalid_op+0x14/0x70 [ 32.520727] ? asm_exc_invalid_op+0x16/0x20 [ 32.520741] ? drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] [ 32.520753] ? __pfx_autoremove_wake_function+0x10/0x10 [ 32.520766] drm_atomic_helper_commit_tail+0x71/0x80 [drm_kms_helper] [ 32.520776] mgag200_mode_config_helper_atomic_commit_tail+0x28/0x40 [mgag200] [ 32.520784] commit_tail+0x94/0x130 [drm_kms_helper] [ 32.520796] drm_atomic_helper_commit+0x13e/0x170 [drm_kms_helper] [ 32.520807] drm_atomic_commit+0x97/0xb0 [drm] [ 32.520850] ? __pfx___drm_printfn_info+0x10/0x10 [drm] [ 32.520881] drm_client_modeset_commit_atomic+0x207/0x250 [drm] [ 32.520918] drm_client_modeset_commit_locked+0x5b/0x190 [drm] [ 32.520945] drm_client_modeset_commit+0x24/0x50 [drm] [ 32.520970] __drm_fb_helper_restore_fbdev_mode_unlocked+0x95/0xd0 [drm_kms_helper] [ 32.520982] drm_fb_helper_set_par+0x2e/0x40 [drm_kms_helper] [ 32.520992] fbcon_init+0x2a8/0x560 [ 32.521005] visual_init+0xc4/0x120 [ 32.521013] do_bind_con_driver.isra.0+0x1a1/0x3d0 [ 32.521020] do_take_over_console+0x10b/0x1a0 [ 32.521026] do_fbcon_takeover+0x5c/0xc0 [ 32.521028] fbcon_fb_registered+0x49/0x70 [ 32.521032] do_register_framebuffer+0x184/0x230 [ 32.521041] register_framebuffer+0x20/0x40 [ 32.521044] __drm_fb_helper_initial_config_and_unlock+0x33e/0x590 [drm_kms_helper] [ 32.521054] ? drm_client_register+0x33/0xc0 [drm] [ 32.521084] drm_fbdev_shmem_client_hotplug+0x6c/0xc0 [drm_shmem_helper] [ 32.521090] drm_client_register+0x7b/0xc0 [drm] [ 32.521116] mgag200_pci_probe+0x90/0x180 [mgag200] [ 32.521124] local_pci_probe+0x46/0xa0 [ 32.521131] pci_device_probe+0xb5/0x220 [ 32.521138] really_probe+0xd9/0x380 [ 32.521146] __driver_probe_device+0x78/0x150 [ 32.521151] driver_probe_device+0x1e/0x90 [ 32.521155] __driver_attach+0xd6/0x1d0 [ 32.521159] ? 
__pfx___driver_attach+0x10/0x10 [ 32.521162] bus_for_each_dev+0x66/0xa0 [ 32.521167] bus_add_driver+0x111/0x240 [ 32.521173] driver_register+0x5c/0x120 [ 32.521176] ? __pfx_mgag200_pci_driver_init+0x10/0x10 [mgag200] [ 32.521182] do_one_initcall+0x62/0x3a0 [ 32.521189] ? __kmalloc_cache_noprof+0x240/0x300 [ 32.521202] do_init_module+0x64/0x240 [ 32.521213] init_module_from_file+0x7a/0xa0 [ 32.521226] idempotent_init_module+0x15f/0x260 [ 32.521240] __x64_sys_finit_module+0x5a/0xb0 [ 32.521245] do_syscall_64+0x73/0x190 [ 32.521260] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 32.521265] RIP: 0033:0x7f649951ee0d [ 32.521271] Code: c8 0c 00 0f 05 eb a9 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3b 80 0c 00 f7 d8 64 89 01 48 [ 32.521273] RSP: 002b:00007ffc84905b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 32.521278] RAX: ffffffffffffffda RBX: 000055daa9188020 RCX: 00007f649951ee0d [ 32.521280] RDX: 0000000000000000 RSI: 00007f649967832c RDI: 0000000000000010 [ 32.521282] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000 [ 32.521284] R10: 0000000000000010 R11: 0000000000000246 R12: 00007f649967832c [ 32.521286] R13: 000055daa9189330 R14: 0000000000000007 R15: 000055daa91adb10 [ 32.521298] </TASK> [ 32.521300] irq event stamp: 52913 [ 32.521301] hardirqs last enabled at (52919): [<ffffffff92187784>] vprintk_emit+0x3d4/0x3e0 [ 32.521313] hardirqs last disabled at (52924): [<ffffffff92187737>] vprintk_emit+0x387/0x3e0 [ 32.521317] softirqs last enabled at (52274): [<ffffffff920dac91>] __irq_exit_rcu+0xa1/0x110 [ 32.521326] softirqs last disabled at (52267): [<ffffffff920dac91>] __irq_exit_rcu+0xa1/0x110 [ 32.521329] ---[ end trace 0000000000000000 ]--- -----Original Message----- From: Luck, Tony Sent: Thursday, October 10, 2024 9:07 AM To: Thomas Zimmermann <tzimmermann@suse.de> Cc: jfalempe@redhat.com; airlied@redhat.com; sam@ravnborg.org; 
emil.l.velikov@gmail.com; maarten.lankhorst@linux.intel.com; mripard@kernel.org; airlied@gmail.com; daniel@ffwll.ch; dri-devel@lists.freedesktop.org Subject: RE: [PATCH v5 0/7] drm/mgag200: Implement VBLANK support > Thanks for testing. Here's another patch to try Ville's suggestion. It > should disable HW vblank IRQs on your system. Could you please test it > and report on the results? Thomas, Thanks for keeping working on this. Output is different, but still dies with vblank problems.
Hi

On 10.10.24 at 20:12, Luck, Tony wrote:
> Apologies. The trace below isn't the first place where things went wrong. I dug up the full serial log
> and found some earlier mgag errors. Actual first one is:

I have to apologize, as the patch I sent was incorrect. The if condition was inverted. Here's a fixed patch for you to test.

Best regards
Thomas

>
> [ OK ] Reached target Basic System.
> [ 32.366479] fbcon: mgag200drmfb (fb0) is primary device
> [ 32.405678] mpt2sas_cm0: sense pool(0x00000000dfa0f36f) - dma(0xf500000): depth(2939), element_size(96), pool_size (275 kB)
> [ 32.405790] mpt2sas_cm0: reply pool(0x000000004919fe15) - dma(0xf580000): depth(3264), frame_size(128), pool_size(408 kB)
> [ 32.405804] mpt2sas_cm0: config page(0x00000000ac9398d5) - dma(0xf2e4000): size(512)
> [ 32.405806] mpt2sas_cm0: Allocated physical memory: size(7012 kB)
> [ 32.405808] mpt2sas_cm0: Current Controller Queue Depth(2936),Max Controller Queue Depth(3072)
> [ 32.405810] mpt2sas_cm0: Scatter Gather Elements per IO(128)
> [ 32.436831] ixgbe 0000:03:00.0: 16.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x4 link at 0000:00:03.2 (capable of 32.000 Gb/s with 5.0 GT/s PCIe x8 link)
> [ 32.454205] mpt2sas_cm0: LSISAS2308: FWVersion(17.00.01.00), ChipRevision(0x05)
> [ 32.454218] mpt2sas_cm0: Intel(R) Controller: Subsystem ID: 0x3050
> [ 32.454222] mpt2sas_cm0: Protocol=(Initiator), Capabilities=(Raid,TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
> [ 32.454513] scsi host6: Fusion MPT SAS Host
> [ 32.461960] mpt2sas_cm0: sending port enable !!
> [ 32.520483] ------------[ cut here ]------------ > [ 32.520517] [CRTC:34:crtc-0] vblank wait timed out > [ 32.520582] WARNING: CPU: 114 PID: 1783 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] > [ 32.520603] Modules linked in: crct10dif_pclmul crc32_pclmul mgag200(+) crc32c_intel i2c_algo_bit ghash_clmulni_intel drm_shmem_helper sha512_ssse3 drm_kms_helper sha256_ssse3 mpt3sas sha1_ssse3 ixgbe(+) raid_class mdio drm scsi_transport_sas dca fuse > [ 32.520631] CPU: 114 UID: 0 PID: 1783 Comm: systemd-udevd Not tainted 6.12.0-rc2+ #171 > [ 32.520634] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016 > [ 32.520637] RIP: 0010:drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] > [ 32.520648] Code: 00 48 8d 7b 08 e8 41 b7 38 d1 45 85 ff 0f 85 d3 fe ff ff 49 8b 56 20 41 8b b6 d8 00 00 00 48 c7 c7 b0 40 df c0 e8 21 61 30 d1 <0f> 0b e9 b5 fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 > [ 32.520651] RSP: 0018:ffffbb9f23fc3680 EFLAGS: 00010282 > [ 32.520655] RAX: 0000000000000026 RBX: ffff9de18562e028 RCX: 0000000000000000 > [ 32.520657] RDX: 0000000000000002 RSI: ffffffff93a00e78 RDI: 00000000ffffffff > [ 32.520659] RBP: ffff9de18540eb40 R08: 0000000000000001 R09: 0000000000000000 > [ 32.520662] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 > [ 32.520664] R13: 0000000000000000 R14: ffff9de0ada653f0 R15: 0000000000000000 > [ 32.520667] FS: 00007f64988d9b40(0000) GS:ffff9de187900000(0000) knlGS:0000000000000000 > [ 32.520669] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 32.520671] CR2: 000055daa91a4b40 CR3: 000000000c13c003 CR4: 00000000003706f0 > [ 32.520674] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 32.520675] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 32.520677] Call Trace: > [ 32.520680] <TASK> > [ 32.520682] ? 
__warn+0x90/0x1a0 > [ 32.520693] ? drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] > [ 32.520703] ? report_bug+0x1c3/0x1d0 > [ 32.520716] ? handle_bug+0x5b/0xa0 > [ 32.520724] ? exc_invalid_op+0x14/0x70 > [ 32.520727] ? asm_exc_invalid_op+0x16/0x20 > [ 32.520741] ? drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] > [ 32.520753] ? __pfx_autoremove_wake_function+0x10/0x10 > [ 32.520766] drm_atomic_helper_commit_tail+0x71/0x80 [drm_kms_helper] > [ 32.520776] mgag200_mode_config_helper_atomic_commit_tail+0x28/0x40 [mgag200] > [ 32.520784] commit_tail+0x94/0x130 [drm_kms_helper] > [ 32.520796] drm_atomic_helper_commit+0x13e/0x170 [drm_kms_helper] > [ 32.520807] drm_atomic_commit+0x97/0xb0 [drm] > [ 32.520850] ? __pfx___drm_printfn_info+0x10/0x10 [drm] > [ 32.520881] drm_client_modeset_commit_atomic+0x207/0x250 [drm] > [ 32.520918] drm_client_modeset_commit_locked+0x5b/0x190 [drm] > [ 32.520945] drm_client_modeset_commit+0x24/0x50 [drm] > [ 32.520970] __drm_fb_helper_restore_fbdev_mode_unlocked+0x95/0xd0 [drm_kms_helper] > [ 32.520982] drm_fb_helper_set_par+0x2e/0x40 [drm_kms_helper] > [ 32.520992] fbcon_init+0x2a8/0x560 > [ 32.521005] visual_init+0xc4/0x120 > [ 32.521013] do_bind_con_driver.isra.0+0x1a1/0x3d0 > [ 32.521020] do_take_over_console+0x10b/0x1a0 > [ 32.521026] do_fbcon_takeover+0x5c/0xc0 > [ 32.521028] fbcon_fb_registered+0x49/0x70 > [ 32.521032] do_register_framebuffer+0x184/0x230 > [ 32.521041] register_framebuffer+0x20/0x40 > [ 32.521044] __drm_fb_helper_initial_config_and_unlock+0x33e/0x590 [drm_kms_helper] > [ 32.521054] ? 
drm_client_register+0x33/0xc0 [drm] > [ 32.521084] drm_fbdev_shmem_client_hotplug+0x6c/0xc0 [drm_shmem_helper] > [ 32.521090] drm_client_register+0x7b/0xc0 [drm] > [ 32.521116] mgag200_pci_probe+0x90/0x180 [mgag200] > [ 32.521124] local_pci_probe+0x46/0xa0 > [ 32.521131] pci_device_probe+0xb5/0x220 > [ 32.521138] really_probe+0xd9/0x380 > [ 32.521146] __driver_probe_device+0x78/0x150 > [ 32.521151] driver_probe_device+0x1e/0x90 > [ 32.521155] __driver_attach+0xd6/0x1d0 > [ 32.521159] ? __pfx___driver_attach+0x10/0x10 > [ 32.521162] bus_for_each_dev+0x66/0xa0 > [ 32.521167] bus_add_driver+0x111/0x240 > [ 32.521173] driver_register+0x5c/0x120 > [ 32.521176] ? __pfx_mgag200_pci_driver_init+0x10/0x10 [mgag200] > [ 32.521182] do_one_initcall+0x62/0x3a0 > [ 32.521189] ? __kmalloc_cache_noprof+0x240/0x300 > [ 32.521202] do_init_module+0x64/0x240 > [ 32.521213] init_module_from_file+0x7a/0xa0 > [ 32.521226] idempotent_init_module+0x15f/0x260 > [ 32.521240] __x64_sys_finit_module+0x5a/0xb0 > [ 32.521245] do_syscall_64+0x73/0x190 > [ 32.521260] entry_SYSCALL_64_after_hwframe+0x76/0x7e > [ 32.521265] RIP: 0033:0x7f649951ee0d > [ 32.521271] Code: c8 0c 00 0f 05 eb a9 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3b 80 0c 00 f7 d8 64 89 01 48 > [ 32.521273] RSP: 002b:00007ffc84905b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 > [ 32.521278] RAX: ffffffffffffffda RBX: 000055daa9188020 RCX: 00007f649951ee0d > [ 32.521280] RDX: 0000000000000000 RSI: 00007f649967832c RDI: 0000000000000010 > [ 32.521282] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000 > [ 32.521284] R10: 0000000000000010 R11: 0000000000000246 R12: 00007f649967832c > [ 32.521286] R13: 000055daa9189330 R14: 0000000000000007 R15: 000055daa91adb10 > [ 32.521298] </TASK> > [ 32.521300] irq event stamp: 52913 > [ 32.521301] hardirqs last enabled at (52919): [<ffffffff92187784>] vprintk_emit+0x3d4/0x3e0 > [ 
32.521313] hardirqs last disabled at (52924): [<ffffffff92187737>] vprintk_emit+0x387/0x3e0 > [ 32.521317] softirqs last enabled at (52274): [<ffffffff920dac91>] __irq_exit_rcu+0xa1/0x110 > [ 32.521326] softirqs last disabled at (52267): [<ffffffff920dac91>] __irq_exit_rcu+0xa1/0x110 > [ 32.521329] ---[ end trace 0000000000000000 ]--- > > -----Original Message----- > From: Luck, Tony > Sent: Thursday, October 10, 2024 9:07 AM > To: Thomas Zimmermann <tzimmermann@suse.de> > Cc: jfalempe@redhat.com; airlied@redhat.com; sam@ravnborg.org; emil.l.velikov@gmail.com; maarten.lankhorst@linux.intel.com; mripard@kernel.org; airlied@gmail.com; daniel@ffwll.ch; dri-devel@lists.freedesktop.org > Subject: RE: [PATCH v5 0/7] drm/mgag200: Implement VBLANK support > >> Thanks for testing. Here's another patch to try Ville's suggestion. It >> should disable HW vblank IRQs on your system. Could you please test it >> and report on the results? > Thomas, > > Thanks for keeping working on this. Output is different, but still dies with vblank problems. > > [ OK ] Started GNOME Display Manager. 
> [ 329.575813] mgag200 0000:08:00.0: [drm] *ERROR* flip_done timed out > [ 329.582889] mgag200 0000:08:00.0: [drm] *ERROR* [PLANE:32:plane-0] commit wait timed out > [ 329.719779] ------------[ cut here ]------------ > [ 329.725174] [CRTC:34:crtc-0] vblank wait timed out > [ 329.730724] WARNING: CPU: 150 PID: 1402 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] > [ 329.746264] Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp nf_conntrack_tftp bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set rfkill nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter sunrpc vfat fat intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common sb_edac iTCO_wdt intel_pmc_bxt iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp ipmi_ssif coretemp rapl intel_cstate joydev intel_uncore acpi_ipmi pcspkr mei_me i2c_i801 i2c_smbus ipmi_si lpc_ich mei ioatdma wmi ipmi_devintf ipmi_msghandler acpi_pad zram ip_tables crct10dif_pclmul crc32_pclmul mgag200 crc32c_intel i2c_algo_bit ghash_clmulni_intel drm_shmem_helper sha512_ssse3 > [ 329.746604] drm_kms_helper sha256_ssse3 mpt3sas sha1_ssse3 ixgbe raid_class mdio drm scsi_transport_sas dca fuse > [ 329.858506] CPU: 150 UID: 0 PID: 1402 Comm: kworker/150:1 Tainted: G W 6.12.0-rc2+ #171 > [ 329.869030] Tainted: [W]=WARN > [ 329.872357] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016 > [ 329.883941] Workqueue: events drm_fb_helper_damage_work [drm_kms_helper] > [ 329.891472] RIP: 0010:drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 
[drm_kms_helper] > [ 329.900937] Code: 00 48 8d 7b 08 e8 41 b7 38 d1 45 85 ff 0f 85 d3 fe ff ff 49 8b 56 20 41 8b b6 d8 00 00 00 48 c7 c7 b0 40 df c0 e8 21 61 30 d1 <0f> 0b e9 b5 fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 > [ 329.921932] RSP: 0018:ffffbb9f23277c00 EFLAGS: 00010286 > [ 329.927797] RAX: 0000000000000026 RBX: ffff9de18562e028 RCX: 0000000000000000 > [ 329.935793] RDX: 0000000000000002 RSI: ffffffff93a00e78 RDI: 00000000ffffffff > [ 329.943786] RBP: ffff9e13d910dc80 R08: 0000000000000000 R09: ffffbb9f23277ac0 > [ 329.951778] R10: ffffbb9f23277ab8 R11: ffff9e33811fffe8 R12: 0000000000000000 > [ 329.959784] R13: 0000000000000000 R14: ffff9de0ada653f0 R15: 0000000000000000 > [ 329.967777] FS: 0000000000000000(0000) GS:ffff9e2032100000(0000) knlGS:0000000000000000 > [ 329.976838] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 329.983280] CR2: 0000555ce9d0d030 CR3: 0000003eccc3a004 CR4: 00000000003706f0 > [ 329.991273] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 329.999268] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 330.007262] Call Trace: > [ 330.010011] <TASK> > [ 330.012383] ? __warn+0x90/0x1a0 > [ 330.016022] ? drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] > [ 330.024803] ? report_bug+0x1c3/0x1d0 > [ 330.028924] ? __irq_work_queue_local+0x48/0x130 > [ 330.034116] ? handle_bug+0x5b/0xa0 > [ 330.038043] ? exc_invalid_op+0x14/0x70 > [ 330.042353] ? asm_exc_invalid_op+0x16/0x20 > [ 330.047064] ? drm_atomic_helper_wait_for_vblanks.part.0+0x24f/0x260 [drm_kms_helper] > [ 330.055851] ? 
__pfx_autoremove_wake_function+0x10/0x10 > [ 330.061723] drm_atomic_helper_commit_tail+0x71/0x80 [drm_kms_helper] > [ 330.068954] mgag200_mode_config_helper_atomic_commit_tail+0x28/0x40 [mgag200] > [ 330.077057] commit_tail+0x94/0x130 [drm_kms_helper] > [ 330.082642] drm_atomic_helper_commit+0x13e/0x170 [drm_kms_helper] > [ 330.089597] drm_atomic_commit+0x97/0xb0 [drm] > [ 330.094706] ? __pfx___drm_printfn_info+0x10/0x10 [drm] > [ 330.100624] drm_atomic_helper_dirtyfb+0x185/0x250 [drm_kms_helper] > [ 330.107672] drm_fbdev_shmem_helper_fb_dirty+0x4c/0xb0 [drm_shmem_helper] > [ 330.115282] drm_fb_helper_damage_work+0x83/0x150 [drm_kms_helper] > [ 330.122221] process_one_work+0x214/0x600 > [ 330.126727] worker_thread+0x17f/0x320 > [ 330.130932] ? __pfx_worker_thread+0x10/0x10 > [ 330.135714] kthread+0xe0/0x110 > [ 330.139245] ? __pfx_kthread+0x10/0x10 > [ 330.143455] ret_from_fork+0x30/0x50 > [ 330.147473] ? __pfx_kthread+0x10/0x10 > [ 330.151683] ret_from_fork_asm+0x1a/0x30 > [ 330.156104] </TASK> > [ 330.158553] irq event stamp: 68963 > [ 330.162368] hardirqs last enabled at (68975): [<ffffffff92183fae>] __up_console_sem+0x5e/0x70 > [ 330.172011] hardirqs last disabled at (68986): [<ffffffff92183f93>] __up_console_sem+0x43/0x70 > [ 330.181647] softirqs last enabled at (68850): [<ffffffff920dac91>] __irq_exit_rcu+0xa1/0x110 > [ 330.191195] softirqs last disabled at (69007): [<ffffffff920dac91>] __irq_exit_rcu+0xa1/0x110 > [ 330.200734] ---[ end trace 0000000000000000 ]--- > [ 340.327342] mgag200 0000:08:00.0: [drm] *ERROR* flip_done timed out > [ 340.334379] mgag200 0000:08:00.0: [drm] *ERROR* [CRTC:34:crtc-0] commit wait timed out > [ 350.566891] mgag200 0000:08:00.0: [drm] *ERROR* flip_done timed out > [ 350.573925] mgag200 0000:08:00.0: [drm] *ERROR* [PLANE:32:plane-0] commit wait timed out > [ 350.710886] ------------[ cut here ]------------
On 02/10/2024 00:41, Tony Luck wrote:
> My system threw out a bunch of stack traces while booting
> v6.12-rc1 and hung.

Sorry for replying late, but when writing DMA support for mgag200, I had a few servers where IRQ wasn't working at all:
https://patchwork.freedesktop.org/series/117380/

Here are my notes from my testing:

hp-dl180 MGA G200e [102b:0522] (rev 02)
[ 20.627122] mgag200 0000:02:00.0: [drm] *ERROR* DMA transfer timed out

dell-pem520 G200eR2 [102b:0534]
[ 308.168976] mgag200 0000:1a:00.0: [drm] *ERROR* DMA transfer timed out

I don't have access to those machines anymore, but I suspect IRQ is either misconfigured or not connected on them.
Posted too soon. Some time (kernel timestamps say a few minutes) after the
successful boot the console spewed another stack dump and the machine hung.
-Tony
brk-bdx-01 login: [ 364.922549] ------------[ cut here ]------------
[ 364.927987] mgag200 0000:08:00.0: [drm] drm_WARN_ON(pipe >= dev->num_crtcs)
[ 364.928157] WARNING: CPU: 46 PID: 3556 at drivers/gpu/drm/drm_vblank.c:1347 drm_crtc_vblank_off+0x250/0x270 [drm]
[ 364.947651] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_nat_tftp nf_conntrack_tftp bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set rfkill nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter sunrpc vfat fat intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp iTCO_wdt intel_pmc_bxt iTCO_vendor_support rapl ipmi_ssif intel_cstate intel_uncore acpi_ipmi joydev pcspkr ipmi_si i2c_i801 mei_me ipmi_devintf i2c_smbus lpc_ich mei ioatdma wmi ipmi_msghandler acpi_pad zram ip_tables crct10dif_pclmul crc32_pclmul mgag200
[ 364.948006] crc32c_intel i2c_algo_bit ghash_clmulni_intel drm_shmem_helper sha512_ssse3 drm_kms_helper sha256_ssse3 sha1_ssse3 mpt3sas ixgbe raid_class mdio drm scsi_transport_sas dca fuse
[ 365.066964] CPU: 46 UID: 42 PID: 3556 Comm: gnome-shell Tainted: G W 6.12.0-rc2+ #171
[ 365.077283] Tainted: [W]=WARN
[ 365.080617] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016
[ 365.092189] RIP: 0010:drm_crtc_vblank_off+0x250/0x270 [drm]
[ 365.098473] Code: 4c 8b 67 50 4d 85 e4 75 03 4c 8b 27 e8 e9 be 01 d6 48 c7 c1 78 9b b1 c0 4c 89 e2 48 c7 c7 1e d6 b1 c0 48 89 c6 e8 f0 8f 60 d5 <0f> 0b 48 83 c4 20 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 66
[ 365.119464] RSP: 0018:ffffbd1ca3d87b20 EFLAGS: 00010282
[ 365.125316] RAX: 000000000000003f RBX: ffff9ddf0a498000 RCX: 0000000000000000
[ 365.133297] RDX: 0000000000000002 RSI: ffffffff97a00e78 RDI: 00000000ffffffff
[ 365.141283] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffbd1ca3d879e0
[ 365.149274] R10: ffffbd1ca3d879d8 R11: ffff9e12011fffe8 R12: ffff9df257758df0
[ 365.157266] R13: ffff9ddf0a4993f0 R14: ffff9ddf05087a00 R15: ffffffffc0b726c0
[ 365.165244] FS: 00007f2ae47fad80(0000) GS:ffff9deb61d00000(0000) knlGS:0000000000000000
[ 365.174299] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 365.180736] CR2: 00000be199403000 CR3: 00000000127a8001 CR4: 00000000003706f0
[ 365.188726] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 365.196718] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 365.204709] Call Trace:
[ 365.207462] <TASK>
[ 365.209829] ? __warn+0x90/0x1a0
[ 365.213469] ? drm_crtc_vblank_off+0x250/0x270 [drm]
[ 365.219104] ? report_bug+0x1c3/0x1d0
[ 365.223226] ? handle_bug+0x5b/0xa0
[ 365.227150] ? exc_invalid_op+0x14/0x70
[ 365.231455] ? asm_exc_invalid_op+0x16/0x20
[ 365.236159] ? drm_crtc_vblank_off+0x250/0x270 [drm]
[ 365.241762] ? _raw_spin_unlock_irq+0x24/0x50
[ 365.246653] ? lockdep_hardirqs_on+0x7b/0x100
[ 365.251549] mgag200_crtc_helper_atomic_disable+0xf/0x160 [mgag200]
[ 365.258576] disable_outputs+0x246/0x360 [drm_kms_helper]
[ 365.264671] drm_atomic_helper_commit_tail+0x1a/0x80 [drm_kms_helper]
[ 365.271896] mgag200_mode_config_helper_atomic_commit_tail+0x28/0x40 [mgag200]
[ 365.279998] commit_tail+0x94/0x130 [drm_kms_helper]
[ 365.285578] drm_atomic_helper_commit+0x13e/0x170 [drm_kms_helper]
[ 365.292513] drm_atomic_commit+0x97/0xb0 [drm]
[ 365.297533] ? __pfx___drm_printfn_info+0x10/0x10 [drm]
[ 365.303439] drm_mode_atomic_ioctl+0x995/0xb80 [drm]
[ 365.309061] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
[ 365.315245] drm_ioctl_kernel+0x85/0xf0 [drm]
[ 365.320183] drm_ioctl+0x23a/0x450 [drm]
[ 365.324640] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm]
[ 365.330825] ? __pfx___fget_files+0xb/0x10
[ 365.335438] __x64_sys_ioctl+0x8a/0xc0
[ 365.339656] do_syscall_64+0x73/0x190
[ 365.343780] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 365.349445] RIP: 0033:0x7f2ae87280ab
[ 365.353462] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9d bd 0c 00 f7 d8 64 89 01 48
[ 365.374448] RSP: 002b:00007ffc89bc33c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 365.382925] RAX: ffffffffffffffda RBX: 00007ffc89bc3410 RCX: 00007f2ae87280ab
[ 365.390908] RDX: 00007ffc89bc3410 RSI: 00000000c03864bc RDI: 000000000000000b
[ 365.398904] RBP: 00000000c03864bc R08: 0000000000000002 R09: 0000000000000002
[ 365.406901] R10: 00007f2ae87f4a00 R11: 0000000000000246 R12: 0000564bfe3dcc80
[ 365.414889] R13: 000000000000000b R14: 0000564bfdead540 R15: 0000564bfdb6b5d0
[ 365.422914] </TASK>
[ 365.425379] irq event stamp: 1043639
[ 365.429393] hardirqs last enabled at (1043651): [<ffffffff96183fae>] __up_console_sem+0x5e/0x70
[ 365.439231] hardirqs last disabled at (1043662): [<ffffffff96183f93>] __up_console_sem+0x43/0x70
[ 365.449074] softirqs last enabled at (1043676): [<ffffffff960dac91>] __irq_exit_rcu+0xa1/0x110
[ 365.458818] softirqs last disabled at (1043671): [<ffffffff960dac91>] __irq_exit_rcu+0xa1/0x110
[ 365.468548] ---[ end trace 0000000000000000 ]---
-----Original Message-----
From: Luck, Tony
Sent: Friday, October 11, 2024 9:37 AM
To: Thomas Zimmermann <tzimmermann@suse.de>
Cc: jfalempe@redhat.com; airlied@redhat.com; sam@ravnborg.org; emil.l.velikov@gmail.com; maarten.lankhorst@linux.intel.com; mripard@kernel.org; airlied@gmail.com; daniel@ffwll.ch; dri-devel@lists.freedesktop.org
Subject: RE: [PATCH v5 0/7] drm/mgag200: Implement VBLANK support
Progress! My system now boots. But there's one WARN_ON dump along the way to the "login:" prompt.
Thanks
-Tony
---
[ 33.111505] Console: switching to colour dummy device 80x25
[ 33.119581] mgag200 0000:08:00.0: vgaarb: deactivate vga console
[ 33.139574] [drm] Initialized mgag200 1.0.0 for 0000:08:00.0 on minor 0
[ 33.157665] fbcon: mgag200drmfb (fb0) is primary device
[ 33.196490] ixgbe 0000:03:00.1: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0
[ 33.281367] ixgbe 0000:03:00.1: 16.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x4 link at 0000:00:03.2 (capable of 32.000 Gb/s with 5.0 GT/s PCIe x8 link)
[ 33.282519] ------------[ cut here ]------------
[ 33.282550] mgag200 0000:08:00.0: [drm] drm_WARN_ON(pipe >= dev->num_crtcs)
[ 33.282610] WARNING: CPU: 123 PID: 1774 at drivers/gpu/drm/drm_vblank.c:1488 drm_crtc_vblank_on_config+0x1b5/0x210 [drm]
[ 33.282687] Modules linked in: crct10dif_pclmul crc32_pclmul mgag200(+) crc32c_intel i2c_algo_bit ghash_clmulni_intel drm_shmem_helper sha512_ssse3 drm_kms_helper sha256_ssse3 sha1_ssse3 mpt3sas ixgbe(+) raid_class mdio drm scsi_transport_sas dca fuse
[ 33.282712] CPU: 123 UID: 0 PID: 1774 Comm: systemd-udevd Not tainted 6.12.0-rc2+ #171
[ 33.282716] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRBDXSD1.86B.0338.V01.1603162127 03/16/2016
[ 33.282718] RIP: 0010:drm_crtc_vblank_on_config+0x1b5/0x210 [drm]
[ 33.282743] Code: 4c 8b 67 50 4d 85 e4 75 03 4c 8b 27 e8 34 ce 01 d6 48 c7 c1 78 9b b1 c0 4c 89 e2 48 c7 c7 1e d6 b1 c0 48 89 c6 e8 3b 9f 60 d5 <0f> 0b 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48
[ 33.282745] RSP: 0018:ffffbd1ca3f8f660 EFLAGS: 00010282
[ 33.282749] RAX: 000000000000003f RBX: ffff9ddf0a498000 RCX: 0000000000000000
[ 33.282751] RDX: 0000000000000002 RSI: ffffffff97a00e78 RDI: 00000000ffffffff
[ 33.282753] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[ 33.282755] R10: 0000000000000001 R11: 0000000000000001 R12: ffff9df257758df0
[ 33.282757] R13: ffff9ddf0a4993f0 R14: ffffffffc0b726c0 R15: ffff9ddf05d33450
[ 33.282758] FS: 00007f66ab8e2b40(0000) GS:ffff9deb61f80000(0000) knlGS:0000000000000000
[ 33.282761] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 33.282763] CR2: 00007f66ab8c7c4b CR3: 000000000bc04003 CR4: 00000000003706f0
[ 33.282765] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 33.282766] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 33.282768] Call Trace:
[ 33.282771] <TASK>
[ 33.282773] ? __warn+0x90/0x1a0
[ 33.282785] ? drm_crtc_vblank_on_config+0x1b5/0x210 [drm]
[ 33.282808] ? report_bug+0x1c3/0x1d0
[ 33.282819] ? handle_bug+0x5b/0xa0
[ 33.282824] ? exc_invalid_op+0x14/0x70
[ 33.282827] ? asm_exc_invalid_op+0x16/0x20
[ 33.282839] ? drm_crtc_vblank_on_config+0x1b5/0x210 [drm]
[ 33.282862] ? mgag200_crtc_set_gamma_linear+0x17a/0x190 [mgag200]
[ 33.282868] ? mgag200_enable_display+0x13b/0x160 [mgag200]
[ 33.282876] drm_crtc_vblank_on+0x28/0x40 [drm]
[ 33.282898] drm_atomic_helper_commit_modeset_enables+0xa6/0x240 [drm_kms_helper]
[ 33.282920] drm_atomic_helper_commit_tail+0x50/0x80 [drm_kms_helper]
[ 33.282931] mgag200_mode_config_helper_atomic_commit_tail+0x28/0x40 [mgag200]
[ 33.282951] commit_tail+0x94/0x130 [drm_kms_helper]
[ 33.282963] drm_atomic_helper_commit+0x13e/0x170 [drm_kms_helper]
[ 33.282975] drm_atomic_commit+0x97/0xb0 [drm]
[ 33.282996] ? __pfx___drm_printfn_info+0x10/0x10 [drm]
[ 33.283027] drm_client_modeset_commit_atomic+0x207/0x250 [drm]
[ 33.283060] drm_client_modeset_commit_locked+0x5b/0x190 [drm]
[ 33.283086] drm_client_modeset_commit+0x24/0x50 [drm]
[ 33.283109] __drm_fb_helper_restore_fbdev_mode_unlocked+0x95/0xd0 [drm_kms_helper]
[ 33.283122] drm_fb_helper_set_par+0x2e/0x40 [drm_kms_helper]
[ 33.283132] fbcon_init+0x2a8/0x560
[ 33.283143] visual_init+0xc4/0x120
[ 33.283150] do_bind_con_driver.isra.0+0x1a1/0x3d0
[ 33.283158] do_take_over_console+0x10b/0x1a0
[ 33.283164] do_fbcon_takeover+0x5c/0xc0
[ 33.283167] fbcon_fb_registered+0x49/0x70
[ 33.283170] do_register_framebuffer+0x184/0x230
[ 33.283179] register_framebuffer+0x20/0x40
[ 33.283182] __drm_fb_helper_initial_config_and_unlock+0x33e/0x590 [drm_kms_helper]
[ 33.283193] ? drm_client_register+0x33/0xc0 [drm]
[ 33.283222] drm_fbdev_shmem_client_hotplug+0x6c/0xc0 [drm_shmem_helper]
[ 33.283228] drm_client_register+0x7b/0xc0 [drm]
[ 33.283254] mgag200_pci_probe+0x90/0x180 [mgag200]
[ 33.283262] local_pci_probe+0x46/0xa0
[ 33.283269] pci_device_probe+0xb5/0x220
[ 33.283277] really_probe+0xd9/0x380
[ 33.283288] __driver_probe_device+0x78/0x150
[ 33.283293] driver_probe_device+0x1e/0x90
[ 33.283297] __driver_attach+0xd6/0x1d0
[ 33.283301] ? __pfx___driver_attach+0x10/0x10
[ 33.283305] bus_for_each_dev+0x66/0xa0
[ 33.283311] bus_add_driver+0x111/0x240
[ 33.283317] driver_register+0x5c/0x120
[ 33.283320] ? __pfx_mgag200_pci_driver_init+0x10/0x10 [mgag200]
[ 33.283326] do_one_initcall+0x62/0x3a0
[ 33.283333] ? __kmalloc_cache_noprof+0x240/0x300
[ 33.283343] do_init_module+0x64/0x240
[ 33.283354] init_module_from_file+0x7a/0xa0
[ 33.283366] idempotent_init_module+0x15f/0x260
[ 33.283378] __x64_sys_finit_module+0x5a/0xb0
[ 33.283383] do_syscall_64+0x73/0x190
[ 33.283396] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 33.283399] RIP: 0033:0x7f66ac527e0d
[ 33.283403] Code: c8 0c 00 0f 05 eb a9 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3b 80 0c 00 f7 d8 64 89 01 48
[ 33.283406] RSP: 002b:00007ffff0c752b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 33.283410] RAX: ffffffffffffffda RBX: 0000557cd3b38d00 RCX: 00007f66ac527e0d
[ 33.283412] RDX: 0000000000000000 RSI: 00007f66ac68132c RDI: 0000000000000010
[ 33.283414] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000
[ 33.283416] R10: 0000000000000010 R11: 0000000000000246 R12: 00007f66ac68132c
[ 33.283418] R13: 0000557cd3b18eb0 R14: 0000000000000007 R15: 0000557cd3b38f80
[ 33.283429] </TASK>
[ 33.283431] irq event stamp: 45133
[ 33.283433] hardirqs last enabled at (45139): [<ffffffff96187784>] vprintk_emit+0x3d4/0x3e0
[ 33.283444] hardirqs last disabled at (45144): [<ffffffff96187737>] vprintk_emit+0x387/0x3e0
[ 33.283448] softirqs last enabled at (44822): [<ffffffff960dac91>] __irq_exit_rcu+0xa1/0x110
[ 33.283456] softirqs last disabled at (44817): [<ffffffff960dac91>] __irq_exit_rcu+0xa1/0x110
[ 33.283459] ---[ end trace 0000000000000000 ]---
[ 33.283494] Console: switching to colour frame buffer device 128x48
[ 33.379557] ixgbe 0000:03:00.1: MAC: 3, PHY: 0, PBA No: G36748-005
[ 33.399852] mgag200 0000:08:00.0: [drm] fb0: mgag200drmfb frame buffer device
Hi

On 11.10.24 at 18:44, Luck, Tony wrote:
> Posted too soon. Some time (kernel timestamps say a few minutes) after the
> successful boot the console spewed another stack dump and the machine hung.

This warning is OK for the quick workaround.

Attached is a full revert of the vblank support for you to test. If that undoes the bug, I'll post it for review to the list.

Best regards
Thomas

>
> -Tony
>
>
> brk-bdx-01 login: [ 364.922549] ------------[ cut here ]------------
> [ 364.927987] mgag200 0000:08:00.0: [drm] drm_WARN_ON(pipe >= dev->num_crtcs)
> [ 364.928157] WARNING: CPU: 46 PID: 3556 at drivers/gpu/drm/drm_vblank.c:1347 drm_crtc_vblank_off+0x250/0x270 [drm]
> [...]
> [ 365.468548] ---[ end trace 0000000000000000 ]---
>
>
> -----Original Message-----
> From: Luck, Tony
> Sent: Friday, October 11, 2024 9:37 AM
> To: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: jfalempe@redhat.com; airlied@redhat.com; sam@ravnborg.org; emil.l.velikov@gmail.com; maarten.lankhorst@linux.intel.com; mripard@kernel.org; airlied@gmail.com; daniel@ffwll.ch; dri-devel@lists.freedesktop.org
> Subject: RE: [PATCH v5 0/7] drm/mgag200: Implement VBLANK support
>
> Progress! My system now boots. But there's one WARN_ON dump along the way to the "login:" prompt.
>
> Thanks
>
> -Tony
>
> ---
>
> [ 33.111505] Console: switching to colour dummy device 80x25
> [ 33.119581] mgag200 0000:08:00.0: vgaarb: deactivate vga console
> [ 33.139574] [drm] Initialized mgag200 1.0.0 for 0000:08:00.0 on minor 0
> [ 33.157665] fbcon: mgag200drmfb (fb0) is primary device
> [ 33.196490] ixgbe 0000:03:00.1: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0
> [ 33.281367] ixgbe 0000:03:00.1: 16.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x4 link at 0000:00:03.2 (capable of 32.000 Gb/s with 5.0 GT/s PCIe x8 link)
> [ 33.282519] ------------[ cut here ]------------
> [ 33.282550] mgag200 0000:08:00.0: [drm] drm_WARN_ON(pipe >= dev->num_crtcs)
> [ 33.282610] WARNING: CPU: 123 PID: 1774 at drivers/gpu/drm/drm_vblank.c:1488 drm_crtc_vblank_on_config+0x1b5/0x210 [drm]
> [...]
> [ 33.283459] ---[ end trace 0000000000000000 ]---
> [ 33.283494] Console: switching to colour frame buffer device 128x48
> [ 33.379557] ixgbe 0000:03:00.1: MAC: 3, PHY: 0, PBA No: G36748-005
> [ 33.399852] mgag200 0000:08:00.0: [drm] fb0: mgag200drmfb frame buffer device
> Attached is a full revert of the vblank support for you to test. If that
> undoes the bug, I'll post it for review to the list.

Thomas.

I applied that to v6.12-rc3. Builds cleanly.

System boots with no warnings.

MGAG device is present:

$ dmesg | grep mgag
[ 31.277259] mgag200 0000:08:00.0: vgaarb: deactivate vga console
[ 31.298138] [drm] Initialized mgag200 1.0.0 for 0000:08:00.0 on minor 0
[ 31.324081] fbcon: mgag200drmfb (fb0) is primary device
[ 31.414494] mgag200 0000:08:00.0: [drm] fb0: mgag200drmfb frame buffer device

VGA console working.

Thanks. Please apply my tags:

Reported-by: Tony Luck <tony.luck@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>

-Tony
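The check described above (no warnings in the log, driver lines present) could be scripted roughly as follows. This is only a sketch, not something used in this thread: the function name `summarize_dmesg` and both regular expressions are illustrative assumptions, written against the log format visible in the traces quoted here. It reads saved `dmesg` text and returns the `WARNING:` summary lines it finds, plus the driver lines that lie outside any `cut here` / `end trace` block.

```python
import re

# Banner the kernel prints at the start of a WARN report, as seen in this thread.
CUT_HERE = re.compile(r"-+\[ cut here \]-+")
# Summary line: "WARNING: CPU: N PID: M at <file>:<line> <symbol+offset/size>".
WARNING = re.compile(r"WARNING: CPU: \d+ PID: \d+ at (\S+):(\d+) (\S+)")


def summarize_dmesg(text, driver="mgag200"):
    """Return (warnings, driver_lines) for a saved dmesg dump.

    warnings     -- list of (source_file, line_number, symbol) tuples
    driver_lines -- lines mentioning `driver` outside of WARN reports
    """
    warnings = []
    driver_lines = []
    in_report = False
    for line in text.splitlines():
        if CUT_HERE.search(line):
            in_report = True
            continue
        if "end trace" in line:
            in_report = False
            continue
        match = WARNING.search(line)
        if match:
            warnings.append((match.group(1), int(match.group(2)), match.group(3)))
            continue
        # Skip the module list and call trace inside a report.
        if not in_report and driver in line:
            driver_lines.append(line.strip())
    return warnings, driver_lines


if __name__ == "__main__":
    # Sample taken from the clean boot log quoted above.
    clean_boot = """\
[ 31.277259] mgag200 0000:08:00.0: vgaarb: deactivate vga console
[ 31.298138] [drm] Initialized mgag200 1.0.0 for 0000:08:00.0 on minor 0
[ 31.324081] fbcon: mgag200drmfb (fb0) is primary device
[ 31.414494] mgag200 0000:08:00.0: [drm] fb0: mgag200drmfb frame buffer device
"""
    found_warnings, found_lines = summarize_dmesg(clean_boot)
    print("warnings:", found_warnings)
    print("driver lines:", len(found_lines))
```

Feeding it the output of `dmesg` saved to a file would give a quick pass/fail signal for a test boot; the matching is deliberately loose and would need adjusting for other log formats.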
Hi

On 14.10.24 at 19:14, Luck, Tony wrote:
>> Attached is a full revert of the vblank support for you to test. If that
>> undoes the bug, I'll post it for review to the list.
> Thomas.
>
> I applied that to v6.12-rc3. Builds cleanly.
>
> System boots with no warnings.
>
> MGAG device is present:
>
> $ dmesg | grep mgag
> [ 31.277259] mgag200 0000:08:00.0: vgaarb: deactivate vga console
> [ 31.298138] [drm] Initialized mgag200 1.0.0 for 0000:08:00.0 on minor 0
> [ 31.324081] fbcon: mgag200drmfb (fb0) is primary device
> [ 31.414494] mgag200 0000:08:00.0: [drm] fb0: mgag200drmfb frame buffer device
>
> VGA console working.
>
> Thanks. Please apply my tags:
>
> Reported-by: Tony Luck <tony.luck@intel.com>
> Tested-by: Tony Luck <tony.luck@intel.com>

Thanks a lot for helping. The revert is at

https://lore.kernel.org/dri-devel/20241015063932.8620-1-tzimmermann@suse.de/T/#u

Best regards
Thomas

>
> -Tony
>
> Thanks a lot for helping. The revert is at
>
> https://lore.kernel.org/dri-devel/20241015063932.8620-1-tzimmermann@suse.de/T/#u

Thomas,

Final closure. That patch was pulled by Linus into v6.12-rc4. I just built and booted with no problems.

Thanks

-Tony