Message ID | ebbcd56ac883d3c3d3024d368fab63d26e02637a@lausen.nl (mailing list archive) |
---|---|
State | New, archived |
Series | Revert "drm/msm/dp: Remove INIT_SETUP delay" |

Abhinav Kumar <quic_abhinavk@quicinc.com> writes:
> On 5/7/2023 7:15 PM, Bjorn Andersson wrote:
>> When booting with the cable connected on my X13s, 100 is long enough for
>> my display to time out and require me to disconnect and reconnect the
>> cable again.
>>
>> Do we have any idea of why the reduction to 0 is causing an issue when
>> using the internal HPD?
>>
>> Regards,
>> Bjorn
> Yes, we do know why this is causing an issue. The cleaner patch for this
> will be posted this week.

Great!

> There is no need to add the 100ms delay back yet.
>
> thanks for posting this but NAK on this patch till we post the fix this
> week.
>
> Appreciate a bit of patience till then.

This regression is already part of the 6.3 stable release series. Will
the new patch qualify for inclusion in 6.3.y? Or will it be part of 6.4
and this revert should go into 6.3.y?

Even with this revert, there are additional regressions in 6.3 causing
dpu errors and blank external display upon suspending and resuming the
system while an external display is connected. Will your new patch also
fix these regressions?

[ 275.025497] [drm:dpu_encoder_phys_vid_wait_for_commit_done:488] [dpu error]vblank timeout
[ 275.025514] [drm:dpu_kms_wait_for_commit_done:510] [dpu error]wait for commit done returned -110
[ 275.064141] [drm:dpu_encoder_frame_done_timeout:2382] [dpu error]enc33 frame done timeout

followed by a kernel panic if any modification to the display settings
is done, such as disabling the external display:

[ 341.631287] Hardware name: Google Lazor (rev3 - 8) (DT)
[ 341.631290] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 341.631296] pc : do_raw_spin_unlock+0xb8/0xc4
[ 341.631310] lr : do_raw_spin_unlock+0x78/0xc4
[ 341.631315] sp : ffffffc01100b880
[ 341.631317] x29: ffffffc01100b880 x28: 0000000000000028 x27: 0000000000000038
[ 341.631326] x26: ffffff808c89e180 x25: ffffffef33e39920 x24: 0000000000000000
[ 341.631333] x23: ffffffef33e3ca0c x22: 0000000000000002 x21: ffffff808345ded8
[ 341.631339] x20: ffffff808345ded0 x19: 000000000000001e x18: 0000000000000000
[ 341.631345] x17: 0048000000000460 x16: 0441043b04600438 x15: 04380000089807d0
[ 341.631351] x14: 07b0089807800780 x13: 0000000000000068 x12: 0000000000000001
[ 341.631357] x11: ffffffef3413bb76 x10: 0000000000000bb0 x9 : ffffffef33e3d6bc
[ 341.631363] x8 : ffffff808c89ed90 x7 : ffffff80b1c9f738 x6 : 0000000000000001
[ 341.631370] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffffff808345def0
[ 341.631375] x2 : 00000000dead4ead x1 : 0000000000000003 x0 : 0000000000000000
[ 341.631383] Kernel panic - not syncing: Asynchronous SError Interrupt
[ 341.631386] CPU: 3 PID: 1520 Comm: kwin_wayland Not tainted 6.3.0-stb-cbq+ #2
[ 341.631390] Hardware name: Google Lazor (rev3 - 8) (DT)
[ 341.631393] Call trace:
[ 341.631395] dump_backtrace+0xc8/0x104
[ 341.631402] show_stack+0x20/0x30
[ 341.631407] dump_stack_lvl+0x48/0x60
[ 341.631414] dump_stack+0x18/0x24
[ 341.631419] panic+0x130/0x2fc
[ 341.631425] nmi_panic+0x54/0x78
[ 341.631428] arm64_serror_panic+0x74/0x80
[ 341.631434] arm64_is_fatal_ras_serror+0x6c/0x8c
[ 341.631439] do_serror+0x48/0x60
[ 341.631444] el1h_64_error_handler+0x30/0x48
[ 341.631450] el1h_64_error+0x68/0x6c
[ 341.631455] do_raw_spin_unlock+0xb8/0xc4
[ 341.631460] _raw_spin_unlock_irq+0x18/0x38
[ 341.631466] __wait_for_common+0xb8/0x154
[ 341.631472] wait_for_completion_timeout+0x28/0x34
[ 341.631477] dp_ctrl_push_idle+0x3c/0x88
[ 341.631483] dp_bridge_disable+0x20/0x2c
[ 341.631488] drm_atomic_bridge_chain_disable+0x8c/0xb8
[ 341.631495] drm_atomic_helper_commit_modeset_disables+0x198/0x450
[ 341.631501] msm_atomic_commit_tail+0x1c8/0x36c
[ 341.631507] commit_tail+0x80/0x108
[ 341.631512] drm_atomic_helper_commit+0x114/0x118
[ 341.631516] drm_atomic_commit+0xb4/0xe0
[ 341.631522] drm_mode_atomic_ioctl+0x6b0/0x890
[ 341.631527] drm_ioctl_kernel+0xe4/0x164
[ 341.631534] drm_ioctl+0x35c/0x3bc
[ 341.631539] vfs_ioctl+0x30/0x50
[ 341.631547] __arm64_sys_ioctl+0x80/0xb4
[ 341.631552] invoke_syscall+0x84/0x11c
[ 341.631558] el0_svc_common.constprop.0+0xc0/0xec
[ 341.631563] do_el0_svc+0x94/0xa4
[ 341.631567] el0_svc+0x2c/0x54
[ 341.631570] el0t_64_sync_handler+0x94/0x100
[ 341.631575] el0t_64_sync+0x194/0x198
[ 341.631580] SMP: stopping secondary CPUs
[ 341.831615] Kernel Offset: 0x2f2b200000 from 0xffffffc008000000
[ 341.831618] PHYS_OFFSET: 0x80000000
[ 341.831620] CPU features: 0x400000,61500506,3200720b
[ 341.831623] Memory Limit: none

On 5/8/2023 4:30 AM, Dmitry Baryshkov wrote:
> On 08/05/2023 14:02, Leonard Lausen wrote:
>> Abhinav Kumar <quic_abhinavk@quicinc.com> writes:
>>> On 5/7/2023 7:15 PM, Bjorn Andersson wrote:
>>>> When booting with the cable connected on my X13s, 100 is long enough
>>>> for
>>>> my display to time out and require me to disconnect and reconnect the
>>>> cable again.
>>>>
>>>> Do we have any idea of why the reduction to 0 is causing an issue when
>>>> using the internal HPD?
>>>>
>>>> Regards,
>>>> Bjorn
>>> Yes, we do know why this is causing an issue. The cleaner patch for this
>>> will be posted this week.
>>
>> Great!
>>
>>> There is no need to add the 100ms delay back yet.
>>>
>>> thanks for posting this but NAK on this patch till we post the fix this
>>> week.
>>>
>>> Appreciate a bit of patience till then.
>>
>> This regression is already part of the 6.3 stable release series. Will
>> the new patch qualify for inclusion in 6.3.y? Or will it be part of 6.4
>> and this revert should go into 6.3.y?
>
> This is a tough situation, as landing a revert will break x13s, as noted
> by Bjorn. Given that the workaround is known at this moment, I would
> like to wait for the patch from Abhinav to appear, then we can decide
> which of the fixes should go to the stable kernel.
>
>>
>> Even with this revert, there are additional regressions in 6.3 causing
>> dpu errors and blank external display upon suspending and resuming the
>> system while an external display is connected. Will your new patch also
>> fix these regressions?
>>
>> [ 275.025497] [drm:dpu_encoder_phys_vid_wait_for_commit_done:488]
>> [dpu error]vblank timeout
>> [ 275.025514] [drm:dpu_kms_wait_for_commit_done:510] [dpu error]wait
>> for commit done returned -110
>> [ 275.064141] [drm:dpu_encoder_frame_done_timeout:2382] [dpu
>> error]enc33 frame done timeout
>>
>> followed by a kernel panic if any modification to the display settings
>> is done, such as disabling the external display:
>
> Interesting crash, thank you for the report.
>

This is a different crash but the root-cause of both the issues is the
bridge hpd_enable/disable series.

https://patchwork.freedesktop.org/patch/514414/

This is breaking the sequence and logic of internal hpd as per my
discussion with kuogee.

We are analyzing the issue and the fix internally first and once we
figure out all the details will post it.

>>
>> [ 341.631287] Hardware name: Google Lazor (rev3 - 8) (DT)
>> [ 341.631290] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS
>> BTYPE=--)
>> [ 341.631296] pc : do_raw_spin_unlock+0xb8/0xc4
>> [ 341.631310] lr : do_raw_spin_unlock+0x78/0xc4
>> [ 341.631315] sp : ffffffc01100b880
>> [ 341.631317] x29: ffffffc01100b880 x28: 0000000000000028 x27:
>> 0000000000000038
>> [ 341.631326] x26: ffffff808c89e180 x25: ffffffef33e39920 x24:
>> 0000000000000000
>> [ 341.631333] x23: ffffffef33e3ca0c x22: 0000000000000002 x21:
>> ffffff808345ded8
>> [ 341.631339] x20: ffffff808345ded0 x19: 000000000000001e x18:
>> 0000000000000000
>> [ 341.631345] x17: 0048000000000460 x16: 0441043b04600438 x15:
>> 04380000089807d0
>> [ 341.631351] x14: 07b0089807800780 x13: 0000000000000068 x12:
>> 0000000000000001
>> [ 341.631357] x11: ffffffef3413bb76 x10: 0000000000000bb0 x9 :
>> ffffffef33e3d6bc
>> [ 341.631363] x8 : ffffff808c89ed90 x7 : ffffff80b1c9f738 x6 :
>> 0000000000000001
>> [ 341.631370] x5 : 0000000000000000 x4 : 0000000000000000 x3 :
>> ffffff808345def0
>> [ 341.631375] x2 : 00000000dead4ead x1 : 0000000000000003 x0 :
>> 0000000000000000
>> [ 341.631383] Kernel panic - not syncing: Asynchronous SError Interrupt
>> [ 341.631386] CPU: 3 PID: 1520 Comm: kwin_wayland Not tainted
>> 6.3.0-stb-cbq+ #2
>> [ 341.631390] Hardware name: Google Lazor (rev3 - 8) (DT)
>> [ 341.631393] Call trace:
>> [ 341.631395] dump_backtrace+0xc8/0x104
>> [ 341.631402] show_stack+0x20/0x30
>> [ 341.631407] dump_stack_lvl+0x48/0x60
>> [ 341.631414] dump_stack+0x18/0x24
>> [ 341.631419] panic+0x130/0x2fc
>> [ 341.631425] nmi_panic+0x54/0x78
>> [ 341.631428] arm64_serror_panic+0x74/0x80
>> [ 341.631434] arm64_is_fatal_ras_serror+0x6c/0x8c
>> [ 341.631439] do_serror+0x48/0x60
>> [ 341.631444] el1h_64_error_handler+0x30/0x48
>> [ 341.631450] el1h_64_error+0x68/0x6c
>> [ 341.631455] do_raw_spin_unlock+0xb8/0xc4
>> [ 341.631460] _raw_spin_unlock_irq+0x18/0x38
>> [ 341.631466] __wait_for_common+0xb8/0x154
>> [ 341.631472] wait_for_completion_timeout+0x28/0x34
>> [ 341.631477] dp_ctrl_push_idle+0x3c/0x88
>> [ 341.631483] dp_bridge_disable+0x20/0x2c
>> [ 341.631488] drm_atomic_bridge_chain_disable+0x8c/0xb8
>> [ 341.631495] drm_atomic_helper_commit_modeset_disables+0x198/0x450
>> [ 341.631501] msm_atomic_commit_tail+0x1c8/0x36c
>> [ 341.631507] commit_tail+0x80/0x108
>> [ 341.631512] drm_atomic_helper_commit+0x114/0x118
>> [ 341.631516] drm_atomic_commit+0xb4/0xe0
>> [ 341.631522] drm_mode_atomic_ioctl+0x6b0/0x890
>> [ 341.631527] drm_ioctl_kernel+0xe4/0x164
>> [ 341.631534] drm_ioctl+0x35c/0x3bc
>> [ 341.631539] vfs_ioctl+0x30/0x50
>> [ 341.631547] __arm64_sys_ioctl+0x80/0xb4
>> [ 341.631552] invoke_syscall+0x84/0x11c
>> [ 341.631558] el0_svc_common.constprop.0+0xc0/0xec
>> [ 341.631563] do_el0_svc+0x94/0xa4
>> [ 341.631567] el0_svc+0x2c/0x54
>> [ 341.631570] el0t_64_sync_handler+0x94/0x100
>> [ 341.631575] el0t_64_sync+0x194/0x198
>> [ 341.631580] SMP: stopping secondary CPUs
>> [ 341.831615] Kernel Offset: 0x2f2b200000 from 0xffffffc008000000
>> [ 341.831618] PHYS_OFFSET: 0x80000000
>> [ 341.831620] CPU features: 0x400000,61500506,3200720b
>> [ 341.831623] Memory Limit: none
>

Abhinav Kumar <quic_abhinavk@quicinc.com> writes:
>>>> There is no need to add the 100ms delay back yet.
>>>>
>>>> thanks for posting this but NAK on this patch till we post the fix this
>>>> week.
>>>>
>>>> Appreciate a bit of patience till then.
>>>
>>> This regression is already part of the 6.3 stable release series. Will
>>> the new patch qualify for inclusion in 6.3.y? Or will it be part of 6.4
>>> and this revert should go into 6.3.y?
>>
>> This is a tough situation, as landing a revert will break x13s, as noted
>> by Bjorn. Given that the workaround is known at this moment, I would
>> like to wait for the patch from Abhinav to appear, then we can decide
>> which of the fixes should go to the stable kernel.

I wasn't able to find new patches, though may have missed them. Is there
a decision yet how to proceed with this regression? 6.2 now being EOL
may make this a good moment to decide on the next steps.

>>> [ 275.025497] [drm:dpu_encoder_phys_vid_wait_for_commit_done:488]
>>> [dpu error]vblank timeout
>>> [ 275.025514] [drm:dpu_kms_wait_for_commit_done:510] [dpu error]wait
>>> for commit done returned -110
>>> [ 275.064141] [drm:dpu_encoder_frame_done_timeout:2382] [dpu
>>> error]enc33 frame done timeout
>
> This is a different crash but the root-cause of both the issues is the
> bridge hpd_enable/disable series.
>
> https://patchwork.freedesktop.org/patch/514414/
>
> This is breaking the sequence and logic of internal hpd as per my
> discussion with kuogee.
>
> We are analyzing the issue and the fix internally first and once we
> figure out all the details will post it.

Thank you!

Hi Leonard

On 5/22/2023 7:39 PM, Leonard Lausen wrote:
> Abhinav Kumar <quic_abhinavk@quicinc.com> writes:
>>>>> There is no need to add the 100ms delay back yet.
>>>>>
>>>>> thanks for posting this but NAK on this patch till we post the fix this
>>>>> week.
>>>>>
>>>>> Appreciate a bit of patience till then.
>>>>
>>>> This regression is already part of the 6.3 stable release series. Will
>>>> the new patch qualify for inclusion in 6.3.y? Or will it be part of 6.4
>>>> and this revert should go into 6.3.y?
>>>
>>> This is a tough situation, as landing a revert will break x13s, as noted
>>> by Bjorn. Given that the workaround is known at this moment, I would
>>> like to wait for the patch from Abhinav to appear, then we can decide
>>> which of the fixes should go to the stable kernel.
>
> I wasn't able to find new patches, though may have missed them. Is there
> a decision yet how to proceed with this regression? 6.2 now being EOL
> may make this a good moment to decide on the next steps.
>

Yes, the new patch to fix this issue is here

https://patchwork.freedesktop.org/patch/538601/?series=118148&rev=3

Apologies if you were not CCed on this, if a next version is CCed,
will ask kuogee to cc you.

Meanwhile, will be great if you can verify if it works for you and
provide Tested-by tags.

>>>> [ 275.025497] [drm:dpu_encoder_phys_vid_wait_for_commit_done:488]
>>>> [dpu error]vblank timeout
>>>> [ 275.025514] [drm:dpu_kms_wait_for_commit_done:510] [dpu error]wait
>>>> for commit done returned -110
>>>> [ 275.064141] [drm:dpu_encoder_frame_done_timeout:2382] [dpu
>>>> error]enc33 frame done timeout
>>
>> This is a different crash but the root-cause of both the issues is the
>> bridge hpd_enable/disable series.
>>
>> https://patchwork.freedesktop.org/patch/514414/
>>
>> This is breaking the sequence and logic of internal hpd as per my
>> discussion with kuogee.
>>
>> We are analyzing the issue and the fix internally first and once we
>> figure out all the details will post it.
>
> Thank you!

On 5/23/2023 11:56 AM, Abhinav Kumar wrote:
> Hi Leonard
>
> On 5/22/2023 7:39 PM, Leonard Lausen wrote:
>> Abhinav Kumar <quic_abhinavk@quicinc.com> writes:
>>>>>> There is no need to add the 100ms delay back yet.
>>>>>>
>>>>>> thanks for posting this but NAK on this patch till we post the
>>>>>> fix this
>>>>>> week.
>>>>>>
>>>>>> Appreciate a bit of patience till then.
>>>>>
>>>>> This regression is already part of the 6.3 stable release series.
>>>>> Will
>>>>> the new patch qualify for inclusion in 6.3.y? Or will it be part
>>>>> of 6.4
>>>>> and this revert should go into 6.3.y?
>>>>
>>>> This is a tough situation, as landing a revert will break x13s, as
>>>> noted
>>>> by Bjorn. Given that the workaround is known at this moment, I would
>>>> like to wait for the patch from Abhinav to appear, then we can decide
>>>> which of the fixes should go to the stable kernel.
>>
>> I wasn't able to find new patches, though may have missed them. Is there
>> a decision yet how to proceed with this regression? 6.2 now being EOL
>> may make this a good moment to decide on the next steps.
>>
>
> Yes, the new patch to fix this issue is here
>
> https://patchwork.freedesktop.org/patch/538601/?series=118148&rev=3
>
> Apologies if you were not CCed on this, if a next version is CCed,
> will ask kuogee to cc you.
>
> Meanwhile, will be great if you can verify if it works for you and
> provide Tested-by tags.

Hi Leonard,

I had cc you with v5 patches.

Would you please verify it.

Thanks,

>
>>>>> [ 275.025497] [drm:dpu_encoder_phys_vid_wait_for_commit_done:488]
>>>>> [dpu error]vblank timeout
>>>>> [ 275.025514] [drm:dpu_kms_wait_for_commit_done:510] [dpu error]wait
>>>>> for commit done returned -110
>>>>> [ 275.064141] [drm:dpu_encoder_frame_done_timeout:2382] [dpu
>>>>> error]enc33 frame done timeout
>>>
>>> This is a different crash but the root-cause of both the issues is the
>>> bridge hpd_enable/disable series.
>>>
>>> https://patchwork.freedesktop.org/patch/514414/
>>>
>>> This is breaking the sequence and logic of internal hpd as per my
>>> discussion with kuogee.
>>>
>>> We are analyzing the issue and the fix internally first and once we
>>> figure out all the details will post it.
>>
>> Thank you!

>>>>>> [ 275.025497] [drm:dpu_encoder_phys_vid_wait_for_commit_done:488]
>>>>>> [dpu error]vblank timeout
>>>>>> [ 275.025514] [drm:dpu_kms_wait_for_commit_done:510] [dpu error]wait
>>>>>> for commit done returned -110
>>>>>> [ 275.064141] [drm:dpu_encoder_frame_done_timeout:2382] [dpu
>>>>>> error]enc33 frame done timeout
>>>>
>>>> This is a different crash but the root-cause of both the issues is the
>>>> bridge hpd_enable/disable series.
>>>>
>>>> https://patchwork.freedesktop.org/patch/514414/
>>
>> Yes, the new patch to fix this issue is here
>>
>> https://patchwork.freedesktop.org/patch/538601/?series=118148&rev=3
>>
>> Apologies if you were not CCed on this, if a next version is CCed,
>> will ask kuogee to cc you.
>>
>> Meanwhile, will be great if you can verify if it works for you and
>> provide Tested-by tags.
>
> Hi Leonard,
>
> I had cc you with v5 patches.
>
> Would you please verify it.

Hi Kuogee,

thank you. Verified the v6 patch fixes the regression when ported to
6.3.3. One non-fatal issue remains: Suspending and resuming the system
while USB-C DP monitor is connected triggers an error, though the system
recovers within a second without the need to unplug the cable.

[drm:drm_mode_config_helper_resume] *ERROR* Failed to resume (-107)

dmesg snippet related to the suspend below

[ 194.066321] PM: suspend entry (deep)
[ 194.178793] Filesystems sync: 0.108 seconds
[ 194.184142] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/qcom/sc7180-trogdor/modem-nolte/qdsp6sw.mbn" pid=3380 cmdline=""
[ 194.196934] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/qcom/sc7180-trogdor/modem-nolte/mba.mbn" pid=3387 cmdline=""
[ 194.197320] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/regulatory.db-debian" pid=3390 cmdline=""
[ 194.204128] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/qcom/venus-5.4/venus.mbn" pid=3380 cmdline=""
[ 194.204808] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/qca/crbtfw32.tlv" pid=3380 cmdline=""
[ 194.205058] LoadPin: firmware pinning-ignored obj="/usr/lib/firmware/qca/crnv32.bin" pid=3380 cmdline=""
[ 194.253591] Freezing user space processes
[ 194.263621] Freezing user space processes completed (elapsed 0.005 seconds)
[ 194.270816] OOM killer disabled.
[ 194.274165] Freezing remaining freezable tasks
[ 194.281253] Freezing remaining freezable tasks completed (elapsed 0.002 seconds)
[ 194.288866] printk: Suspending console(s) (use no_console_suspend to debug)
[ 194.494479] Disabling non-boot CPUs ...
[ 194.497569] psci: CPU1 killed (polled 1 ms)
[ 194.501844] psci: CPU2 killed (polled 1 ms)
[ 194.506311] psci: CPU3 killed (polled 1 ms)
[ 194.510237] psci: CPU4 killed (polled 1 ms)
[ 194.512854] psci: CPU5 killed (polled 1 ms)
[ 194.516076] psci: CPU6 killed (polled 1 ms)
[ 194.518397] psci: CPU7 killed (polled 0 ms)
[ 194.520706] Enabling non-boot CPUs ...
[ 194.521595] Detected VIPT I-cache on CPU1
[ 194.521664] cacheinfo: Unable to detect cache hierarchy for CPU 1
[ 194.521678] GICv3: CPU1: found redistributor 100 region 0:0x0000000017a80000
[ 194.521743] CPU1: Booted secondary processor 0x0000000100 [0x51df805e]
[ 194.522829] CPU1 is up
[ 194.523646] Detected VIPT I-cache on CPU2
[ 194.523701] cacheinfo: Unable to detect cache hierarchy for CPU 2
[ 194.523716] GICv3: CPU2: found redistributor 200 region 0:0x0000000017aa0000
[ 194.523775] CPU2: Booted secondary processor 0x0000000200 [0x51df805e]
[ 194.524809] CPU2 is up
[ 194.525537] Detected VIPT I-cache on CPU3
[ 194.525592] cacheinfo: Unable to detect cache hierarchy for CPU 3
[ 194.525611] GICv3: CPU3: found redistributor 300 region 0:0x0000000017ac0000
[ 194.525668] CPU3: Booted secondary processor 0x0000000300 [0x51df805e]
[ 194.526674] CPU3 is up
[ 194.527486] Detected VIPT I-cache on CPU4
[ 194.527535] cacheinfo: Unable to detect cache hierarchy for CPU 4
[ 194.527556] GICv3: CPU4: found redistributor 400 region 0:0x0000000017ae0000
[ 194.527612] CPU4: Booted secondary processor 0x0000000400 [0x51df805e]
[ 194.528836] CPU4 is up
[ 194.529553] Detected VIPT I-cache on CPU5
[ 194.529601] cacheinfo: Unable to detect cache hierarchy for CPU 5
[ 194.529623] GICv3: CPU5: found redistributor 500 region 0:0x0000000017b00000
[ 194.529675] CPU5: Booted secondary processor 0x0000000500 [0x51df805e]
[ 194.530986] CPU5 is up
[ 194.532280] Detected PIPT I-cache on CPU6
[ 194.532307] cacheinfo: Unable to detect cache hierarchy for CPU 6
[ 194.532322] GICv3: CPU6: found redistributor 600 region 0:0x0000000017b20000
[ 194.532358] CPU6: Booted secondary processor 0x0000000600 [0x51ff804f]
[ 194.534434] CPU6 is up
[ 194.535408] Detected PIPT I-cache on CPU7
[ 194.535445] cacheinfo: Unable to detect cache hierarchy for CPU 7
[ 194.535463] GICv3: CPU7: found redistributor 700 region 0:0x0000000017b40000
[ 194.535505] CPU7: Booted secondary processor 0x0000000700 [0x51ff804f]
[ 194.536281] CPU7 is up
[ 195.285023] onboard-usb-hub 1-1: reset high-speed USB device number 2 using xhci-hcd
[ 195.541240] onboard-usb-hub 2-1: reset SuperSpeed USB device number 2 using xhci-hcd
[ 195.796915] usb 1-1.4: reset high-speed USB device number 22 using xhci-hcd
[ 195.972952] usb 2-1.4: reset SuperSpeed USB device number 10 using xhci-hcd
[ 196.278492] usb 1-1.4.4: reset high-speed USB device number 24 using xhci-hcd
[ 196.468996] usb 1-1.4.2: reset high-speed USB device number 26 using xhci-hcd
[ 197.055717] usb 2-1.4.2: reset SuperSpeed USB device number 11 using xhci-hcd
[ 197.845110] usb 2-1.4.4: reset SuperSpeed USB device number 12 using xhci-hcd
[ 198.235191] [drm:drm_mode_config_helper_resume] *ERROR* Failed to resume (-107)
[ 198.528638] OOM killer enabled.
[ 198.531866] Restarting tasks ...
[ 198.531994] usb 1-1.4.4.1: USB disconnect, device number 27
[ 198.532223] usb 1-1.4.3: USB disconnect, device number 23
[ 198.532509] usb 1-1.4.2.1: USB disconnect, device number 29
[ 198.534805] r8152-cfgselector 2-1.4.4.2: USB disconnect, device number 13
[ 198.535444] done.
[ 198.535536] usb 1-1.1: USB disconnect, device number 15
[ 198.567811] random: crng reseeded on system resumption
[ 198.583431] PM: suspend exit

On 5/24/2023 5:58 AM, Leonard Lausen wrote:
>>>>>>> [ 275.025497] [drm:dpu_encoder_phys_vid_wait_for_commit_done:488]
>>>>>>> [dpu error]vblank timeout
>>>>>>> [ 275.025514] [drm:dpu_kms_wait_for_commit_done:510] [dpu error]wait
>>>>>>> for commit done returned -110
>>>>>>> [ 275.064141] [drm:dpu_encoder_frame_done_timeout:2382] [dpu
>>>>>>> error]enc33 frame done timeout
>>>>> This is a different crash but the root-cause of both the issues is the
>>>>> bridge hpd_enable/disable series.
>>>>>
>>>>> https://patchwork.freedesktop.org/patch/514414/
>>> Yes, the new patch to fix this issue is here
>>>
>>> https://patchwork.freedesktop.org/patch/538601/?series=118148&rev=3
>>>
>>> Apologies if you were not CCed on this, if a next version is CCed,
>>> will ask kuogee to cc you.
>>>
>>> Meanwhile, will be great if you can verify if it works for you and
>>> provide Tested-by tags.
>> Hi Leonard,
>>
>> I had cc you with v5 patches.
>>
>> Would you please verify it.
> Hi Kuogee,
>
> thank you. Verified the v6 patch fixes the regression when ported to
> 6.3.3. One non-fatal issue remains: Suspending and resuming the system
> while USB-C DP monitor is connected triggers an error, though the system
> recovers within a second without the need to unplug the cable.
>
> [drm:drm_mode_config_helper_resume] *ERROR* Failed to resume (-107)
>
>
> dmesg snippet related to the suspend below
>
>
> [ 197.845110] usb 2-1.4.4: reset SuperSpeed USB device number 12 using xhci-hcd
> [ 198.235191] [drm:drm_mode_config_helper_resume] *ERROR* Failed to resume (-107)

Hi Leonard,

I did not see this problem at my setup (Kodiak) during suspend/resume.

Will investigate more on Trogdor device.

Thanks,

> [ 198.528638] OOM killer enabled.
> [ 198.531866] Restarting tasks ...
> [ 198.531994] usb 1-1.4.4.1: USB disconnect, device number 27
> [ 198.532223] usb 1-1.4.3: USB disconnect, device number 23
> [ 198.532509] usb 1-1.4.2.1: USB disconnect, device number 29
> [ 198.534805] r8152-cfgselector 2-1.4.4.2: USB disconnect, device number 13
> [ 198.535444] done.
> [ 198.535536] usb 1-1.1: USB disconnect, device number 15
> [ 198.567811] random: crng reseeded on system resumption
> [ 198.583431] PM: suspend exit

On 5/25/2023 10:57 AM, Kuogee Hsieh wrote:
>
> On 5/24/2023 5:58 AM, Leonard Lausen wrote:
>>>>>>>> [ 275.025497] [drm:dpu_encoder_phys_vid_wait_for_commit_done:488]
>>>>>>>> [dpu error]vblank timeout
>>>>>>>> [ 275.025514] [drm:dpu_kms_wait_for_commit_done:510] [dpu
>>>>>>>> error]wait
>>>>>>>> for commit done returned -110
>>>>>>>> [ 275.064141] [drm:dpu_encoder_frame_done_timeout:2382] [dpu
>>>>>>>> error]enc33 frame done timeout
>>>>>> This is a different crash but the root-cause of both the issues is
>>>>>> the
>>>>>> bridge hpd_enable/disable series.
>>>>>>
>>>>>> https://patchwork.freedesktop.org/patch/514414/
>>>> Yes, the new patch to fix this issue is here
>>>>
>>>> https://patchwork.freedesktop.org/patch/538601/?series=118148&rev=3
>>>>
>>>> Apologies if you were not CCed on this, if a next version is CCed,
>>>> will ask kuogee to cc you.
>>>>
>>>> Meanwhile, will be great if you can verify if it works for you and
>>>> provide Tested-by tags.
>>> Hi Leonard,
>>>
>>> I had cc you with v5 patches.
>>>
>>> Would you please verify it.
>> Hi Kuogee,
>>
>> thank you. Verified the v6 patch fixes the regression when ported to
>> 6.3.3. One non-fatal issue remains: Suspending and resuming the system
>> while USB-C DP monitor is connected triggers an error, though the system
>> recovers within a second without the need to unplug the cable.
>>
>> [drm:drm_mode_config_helper_resume] *ERROR* Failed to resume (-107)
>>
>>
>> dmesg snippet related to the suspend below
>>
>>
>> [ 197.845110] usb 2-1.4.4: reset SuperSpeed USB device number 12
>> using xhci-hcd
>> [ 198.235191] [drm:drm_mode_config_helper_resume] *ERROR* Failed to
>> resume (-107)
>
> Hi Leonard,
>
> I did not see this problem at my setup (Kodiak) during suspend/resume.
>
> Will investigate more on Trogdor device.
>
> Thanks,
>

Hi Leonard

Feel free to open a bug for this and assign to me, we can check this and
ask more info if needed on that bug.

Thanks

Abhinav

>
>> [ 198.528638] OOM killer enabled.
>> [ 198.531866] Restarting tasks ...
>> [ 198.531994] usb 1-1.4.4.1: USB disconnect, device number 27
>> [ 198.532223] usb 1-1.4.3: USB disconnect, device number 23
>> [ 198.532509] usb 1-1.4.2.1: USB disconnect, device number 29
>> [ 198.534805] r8152-cfgselector 2-1.4.4.2: USB disconnect, device
>> number 13
>> [ 198.535444] done.
>> [ 198.535536] usb 1-1.1: USB disconnect, device number 15
>> [ 198.567811] random: crng reseeded on system resumption
>> [ 198.583431] PM: suspend exit

Hi Leonard

On 5/24/2023 5:58 AM, Leonard Lausen wrote:
>>>>>>> [ 275.025497] [drm:dpu_encoder_phys_vid_wait_for_commit_done:488]
>>>>>>> [dpu error]vblank timeout
>>>>>>> [ 275.025514] [drm:dpu_kms_wait_for_commit_done:510] [dpu error]wait
>>>>>>> for commit done returned -110
>>>>>>> [ 275.064141] [drm:dpu_encoder_frame_done_timeout:2382] [dpu
>>>>>>> error]enc33 frame done timeout
>>>>>
>>>>> This is a different crash but the root-cause of both the issues is the
>>>>> bridge hpd_enable/disable series.
>>>>>
>>>>> https://patchwork.freedesktop.org/patch/514414/
>>>
>>> Yes, the new patch to fix this issue is here
>>>
>>> https://patchwork.freedesktop.org/patch/538601/?series=118148&rev=3
>>>
>>> Apologies if you were not CCed on this, if a next version is CCed,
>>> will ask kuogee to cc you.
>>>
>>> Meanwhile, will be great if you can verify if it works for you and
>>> provide Tested-by tags.
>>
>> Hi Leonard,
>>
>> I had cc you with v5 patches.
>>
>> Would you please verify it.
>
> Hi Kuogee,
>
> thank you. Verified the v6 patch fixes the regression when ported to
> 6.3.3. One non-fatal issue remains: Suspending and resuming the system
> while USB-C DP monitor is connected triggers an error, though the system
> recovers within a second without the need to unplug the cable.
>
> [drm:drm_mode_config_helper_resume] *ERROR* Failed to resume (-107)
>

We are not able to recreate this on sc7280 chromebooks , will need to
check on sc7180.

This does not seem directly related to any of the hotplug changes though
so needs to be checked separately.

So please feel free to raise a gitlab bug for this and assign to me.

Hi Abhinav,

June 1, 2023 at 3:20 PM, "Abhinav Kumar" <quic_abhinavk@quicinc.com> wrote:
> >
> > [drm:drm_mode_config_helper_resume] *ERROR* Failed to resume (-107)
> >
> > We are not able to recreate this on sc7280 chromebooks , will need to check on sc7180. This does not seem directly related to any of the hotplug changes though so needs to be checked separately. So please feel free to raise a gitlab bug for this and assign to me.

Thank you for checking with sc7280. I created
https://gitlab.freedesktop.org/drm/msm/-/issues/25 and CCed you. I've
also verified that the error persists with v6.4.0-rc4 + Kuogee's patch
(just in case you may have tested on sc7280 with 6.4).

> > https://patchwork.freedesktop.org/patch/538601/?series=118148&rev=3
> > Apologies if you were not CCed on this, if a next version is CCed,
> > will ask kuogee to cc you.
> > Meanwhile, will be great if you can verify if it works for you and
> > provide Tested-by tags.

I see Bjorn also tested the patch. As it fixes a serious USB-C DP
regression which broke USB-C DP completely on lazor for v6.3, can it be
included in upcoming 6.3.y release?

Thank you
Leonard

> > https://patchwork.freedesktop.org/patch/538601/?series=118148&rev=3
> > Apologies if you were not CCed on this, if a next version is CCed,
> >
> > will ask kuogee to cc you.
> >
> > Meanwhile, will be great if you can verify if it works for you and
> >
> > provide Tested-by tags.
> >
> > I see Bjorn also tested the patch. As it fixes a serious USB-C DP regression which broke USB-C DP completely on lazor for v6.3, can it be included in upcoming 6.3.y release?

Kuogee's fix has since been committed to drm-tip on 2023-06-08 as
a8e981ac2d0eb9dd53a4c173e29ca0c99c88abe2. Since it fixes a serious
regression in 6.3 and 6.4 kernels, can we include it for the stable
releases?

Thank you
Leonard

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index bde1a7ce442f..db9783ffd5cf 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -1506,7 +1506,7 @@ void msm_dp_irq_postinstall(struct msm_dp *dp_display)
 	dp = container_of(dp_display, struct dp_display_private, dp_display);
 
 	if (!dp_display->is_edp)
-		dp_add_event(dp, EV_HPD_INIT_SETUP, 0, 0);
+		dp_add_event(dp, EV_HPD_INIT_SETUP, 0, 100);
 }
 
 bool msm_dp_wide_bus_available(const struct msm_dp *dp_display)
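
Note on the numeric return values quoted in the logs of this thread: the -110 in "wait for commit done returned -110" and the -107 in "Failed to resume (-107)" match the generic Linux errno numbers for ETIMEDOUT and ENOTCONN, since kernel functions conventionally return negated errno codes. The lookup below is only a minimal sketch for reading such logs. The helper name `kernel_ret_name` is hypothetical and not part of the msm driver, and the numbers are hard-coded on the assumption of the generic Linux errno table (other platforms number these codes differently).

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the msm/dp driver: map the two
 * negative return values seen in this thread's logs to their errno
 * names. The numbers assume the generic Linux errno table.
 */
static const char *kernel_ret_name(int ret)
{
	switch (ret) {
	case -110:
		return "ETIMEDOUT";	/* "Connection timed out" */
	case -107:
		return "ENOTCONN";	/* "Transport endpoint is not connected" */
	default:
		return NULL;		/* value not covered by this sketch */
	}
}
```

Under these assumptions, `kernel_ret_name(-110)` yields "ETIMEDOUT", which is consistent with the "vblank timeout" and "frame done timeout" messages logged around it, and `kernel_ret_name(-107)` yields "ENOTCONN" for the resume error.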