Message ID | 20250311195624.22420-1-ville.syrjala@linux.intel.com (mailing list archive) |
---|---|
Series | drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk |
On Tue, Mar 11, 2025 at 11:15:53PM -0000, Patchwork wrote:
> == Series Details ==
>
> Series: drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
> URL : https://patchwork.freedesktop.org/series/139097/
> State : failure
>
> == Summary ==
>
> #### Possible regressions ####
> * igt@kms_addfb_basic@too-high:
> - fi-kbl-8809g: NOTRUN -> [FAIL][6] +3 other tests fail
> [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_139097v2/fi-kbl-8809g/igt@kms_addfb_basic@too-high.html

A bunch of stuff seems to have broken in CI:
- something is now loading amdgpu when we didn't want it loaded
- the full dmesg has been lost so I can't even find out when amdgpu got loaded
Hi,

> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville Syrjälä
> Sent: Wednesday, 12 March 2025 11.53
> To: intel-gfx@lists.freedesktop.org
> Cc: I915-ci-infra@lists.freedesktop.org
> Subject: Re: ✗ i915.CI.BAT: failure for drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
>
> On Tue, Mar 11, 2025 at 11:15:53PM -0000, Patchwork wrote:
> > == Series Details ==
> >
> > Series: drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
> > URL : https://patchwork.freedesktop.org/series/139097/
> > State : failure
> >
> > == Summary ==
> >
> > #### Possible regressions ####
> > * igt@kms_addfb_basic@too-high:
> > - fi-kbl-8809g: NOTRUN -> [FAIL][6] +3 other tests fail
> > [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_139097v2/fi-kbl-8809g/igt@kms_addfb_basic@too-high.html
>
> A bunch of stuff seems to have broken in CI:
> - something is now loading amdgpu when we didn't want it loaded

On boot I see
<6>[ 0.000000] Command line: BOOT_IMAGE=/boot/drm_intel root=/dev/nvme0n1p2 rootwait fsck.repair=yes nmi_watchdog=panic,auto panic=5 softdog.soft_panic=5 log_buf_len=1M trace_clock=global xe.force_probe=* i915.force_probe=* drm.debug=0xe modprobe.blacklist=xe,i915,ast modprobe.blacklist=amdgpu ro

Is that not enough?

> - the full dmesg has been lost so I can't even find out when amdgpu got loaded

CI team, can you get all logs transferred ?
On digging internally I see from dmesg (start from that file)

<7>[ 39.365629] [IGT] i915_module_load: executing
<7>[ 39.373992] [IGT] i915_module_load: starting subtest load
<7>[ 39.376091] [IGT] i915_module_load: finished subtest load, SKIP
<7>[ 39.376197] [IGT] i915_module_load: exiting, ret=77
<7>[ 39.551743] [IGT] core_auth: executing
<6>[ 42.196892] [drm] amdgpu kernel modesetting enabled.
<7>[ 42.197065] [drm:amdgpu_acpi_detect [amdgpu]] No matching acpi device found for AMD3000
<6>[ 42.198069] amdgpu: Virtual CRAT table created for CPU
<6>[ 42.198933] amdgpu: Topology: Add CPU node
<6>[ 42.200595] amdgpu 0000:01:00.0: enabling device (0006 -> 0007)
<6>[ 42.201352] [drm] initializing kernel modesetting (VEGAM 0x1002:0x694C 0x8086:0x2073 0xC0).
<6>[ 42.201418] [drm] register mmio base: 0xDB500000
<6>[ 42.201420] [drm] register mmio size: 262144
<6>[ 42.202307] amdgpu 0000:01:00.0: amdgpu: detected ip block number 0 <vi_common>
<6>[ 42.202311] amdgpu 0000:01:00.0: amdgpu: detected ip block number 1 <gmc_v8_0>
<6>[ 42.202314] amdgpu 0000:01:00.0: amdgpu: detected ip block number 2 <tonga_ih>
<6>[ 42.202316] amdgpu 0000:01:00.0: amdgpu: detected ip block number 3 <gfx_v8_0>
<6>[ 42.202318] amdgpu 0000:01:00.0: amdgpu: detected ip block number 4 <sdma_v3_0>
<6>[ 42.202321] amdgpu 0000:01:00.0: amdgpu: detected ip block number 5 <powerplay>
<6>[ 42.202323] amdgpu 0000:01:00.0: amdgpu: detected ip block number 6 <dm>
<6>[ 42.202325] amdgpu 0000:01:00.0: amdgpu: detected ip block number 7 <uvd_v6_0>
<6>[ 42.202327] amdgpu 0000:01:00.0: amdgpu: detected ip block number 8 <vce_v3_0>
<6>[ 42.202427] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from VFCT
<6>[ 42.202449] amdgpu: ATOM BIOS: 408435.180301.04s
<6>[ 42.228348] [drm] UVD is enabled in VM mode
<6>[ 42.228353] [drm] UVD ENC is enabled in VM mode
<6>[ 42.228356] [drm] VCE enabled in VM mode
<6>[ 42.228734] amdgpu 0000:01:00.0: vgaarb: deactivate vga console

> --
> Ville Syrjälä
> Intel
Hi,

> -----Original Message-----
> From: Saarinen, Jani
> Sent: Wednesday, 12 March 2025 12.06
> To: Ville Syrjälä <ville.syrjala@linux.intel.com>; intel-gfx@lists.freedesktop.org; I915-ci-infra@lists.freedesktop.org
> Subject: RE: ✗ i915.CI.BAT: failure for drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
>
> Hi,
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville Syrjälä
> > Sent: Wednesday, 12 March 2025 11.53
> > To: intel-gfx@lists.freedesktop.org
> > Cc: I915-ci-infra@lists.freedesktop.org
> > Subject: Re: ✗ i915.CI.BAT: failure for drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
> >
> > On Tue, Mar 11, 2025 at 11:15:53PM -0000, Patchwork wrote:
> > > == Series Details ==
> > >
> > > Series: drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
> > > URL : https://patchwork.freedesktop.org/series/139097/
> > > State : failure
> > >
> > > == Summary ==
> > >
> > > #### Possible regressions ####
> > > * igt@kms_addfb_basic@too-high:
> > > - fi-kbl-8809g: NOTRUN -> [FAIL][6] +3 other tests fail
> > > [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_139097v2/fi-kbl-8809g/igt@kms_addfb_basic@too-high.html
> >
> > A bunch of stuff seems to have broken in CI:
> > - something is now loading amdgpu when we didn't want it loaded
> On boot I see
> <6>[ 0.000000] Command line: BOOT_IMAGE=/boot/drm_intel root=/dev/nvme0n1p2 rootwait fsck.repair=yes nmi_watchdog=panic,auto panic=5 softdog.soft_panic=5 log_buf_len=1M trace_clock=global xe.force_probe=* i915.force_probe=* drm.debug=0xe modprobe.blacklist=xe,i915,ast modprobe.blacklist=amdgpu ro
>
> Is that not enough?
>
> > - the full dmesg has been lost so I can't even find out when amdgpu got loaded
> CI team, can you get all logs transferred ?

From runner log also some data : https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_139097v2/fi-kbl-8809g/igt_runner0.txt

> On digging internally I see from dmesg (start from that file)
>
> <7>[ 39.365629] [IGT] i915_module_load: executing
> <7>[ 39.373992] [IGT] i915_module_load: starting subtest load
> <7>[ 39.376091] [IGT] i915_module_load: finished subtest load, SKIP
> <7>[ 39.376197] [IGT] i915_module_load: exiting, ret=77
> <7>[ 39.551743] [IGT] core_auth: executing
> <6>[ 42.196892] [drm] amdgpu kernel modesetting enabled.
> <7>[ 42.197065] [drm:amdgpu_acpi_detect [amdgpu]] No matching acpi device found for AMD3000
> <6>[ 42.198069] amdgpu: Virtual CRAT table created for CPU
> <6>[ 42.198933] amdgpu: Topology: Add CPU node
> <6>[ 42.200595] amdgpu 0000:01:00.0: enabling device (0006 -> 0007)
> <6>[ 42.201352] [drm] initializing kernel modesetting (VEGAM 0x1002:0x694C 0x8086:0x2073 0xC0).
> <6>[ 42.201418] [drm] register mmio base: 0xDB500000
> <6>[ 42.201420] [drm] register mmio size: 262144
> <6>[ 42.202307] amdgpu 0000:01:00.0: amdgpu: detected ip block number 0 <vi_common>
> <6>[ 42.202311] amdgpu 0000:01:00.0: amdgpu: detected ip block number 1 <gmc_v8_0>
> <6>[ 42.202314] amdgpu 0000:01:00.0: amdgpu: detected ip block number 2 <tonga_ih>
> <6>[ 42.202316] amdgpu 0000:01:00.0: amdgpu: detected ip block number 3 <gfx_v8_0>
> <6>[ 42.202318] amdgpu 0000:01:00.0: amdgpu: detected ip block number 4 <sdma_v3_0>
> <6>[ 42.202321] amdgpu 0000:01:00.0: amdgpu: detected ip block number 5 <powerplay>
> <6>[ 42.202323] amdgpu 0000:01:00.0: amdgpu: detected ip block number 6 <dm>
> <6>[ 42.202325] amdgpu 0000:01:00.0: amdgpu: detected ip block number 7 <uvd_v6_0>
> <6>[ 42.202327] amdgpu 0000:01:00.0: amdgpu: detected ip block number 8 <vce_v3_0>
> <6>[ 42.202427] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from VFCT
> <6>[ 42.202449] amdgpu: ATOM BIOS: 408435.180301.04s
> <6>[ 42.228348] [drm] UVD is enabled in VM mode
> <6>[ 42.228353] [drm] UVD ENC is enabled in VM mode
> <6>[ 42.228356] [drm] VCE enabled in VM mode
> <6>[ 42.228734] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
>
> > --
> > Ville Syrjälä
> > Intel
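The question of when amdgpu got loaded can be answered mechanically once a dmesg is available. The following is a minimal Python sketch, not part of the CI tooling: the helper name `first_mention` and the `SAMPLE` list are hypothetical, and the line format is assumed to match the excerpt quoted in this thread (`<level>[ timestamp] message`). It reports the timestamp of the first line mentioning a string and the last IGT test that had started executing before it.

```python
import re

# Matches lines such as "<7>[ 39.551743] [IGT] core_auth: executing".
LINE_RE = re.compile(r"<\d+>\[\s*(\d+\.\d+)\]\s+(.*)")
IGT_RE = re.compile(r"\[IGT\] (\S+): executing")


def first_mention(dmesg_lines, needle="amdgpu"):
    """Return (timestamp, last_igt_test) for the first line containing needle.

    last_igt_test is the most recent test that logged "executing" before
    that line, or None if no test had started. Returns None if needle
    never appears.
    """
    last_test = None
    for line in dmesg_lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a timestamped dmesg line
        ts, msg = float(m.group(1)), m.group(2)
        igt = IGT_RE.match(msg)
        if igt:
            last_test = igt.group(1)
        elif needle in msg:
            return ts, last_test
    return None


# Hypothetical input: a few lines copied from the excerpt in this thread.
SAMPLE = [
    "<7>[ 39.365629] [IGT] i915_module_load: executing",
    "<7>[ 39.376197] [IGT] i915_module_load: exiting, ret=77",
    "<7>[ 39.551743] [IGT] core_auth: executing",
    "<6>[ 42.196892] [drm] amdgpu kernel modesetting enabled.",
]

print(first_mention(SAMPLE))
```

On the quoted excerpt this points at core_auth as the test running when amdgpu first logs, which only narrows down the time window; it does not say what triggered the module load.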
Hi, and one more

> -----Original Message-----
> From: Saarinen, Jani
> Sent: Wednesday, 12 March 2025 12.08
> To: Ville Syrjälä <ville.syrjala@linux.intel.com>; intel-gfx@lists.freedesktop.org; I915-ci-infra@lists.freedesktop.org
> Subject: RE: ✗ i915.CI.BAT: failure for drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
>
> Hi,
>
> > -----Original Message-----
> > From: Saarinen, Jani
> > Sent: Wednesday, 12 March 2025 12.06
> > To: Ville Syrjälä <ville.syrjala@linux.intel.com>; intel-gfx@lists.freedesktop.org; I915-ci-infra@lists.freedesktop.org
> > Subject: RE: ✗ i915.CI.BAT: failure for drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
> >
> > Hi,
> > > -----Original Message-----
> > > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville Syrjälä
> > > Sent: Wednesday, 12 March 2025 11.53
> > > To: intel-gfx@lists.freedesktop.org
> > > Cc: I915-ci-infra@lists.freedesktop.org
> > > Subject: Re: ✗ i915.CI.BAT: failure for drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
> > >
> > > On Tue, Mar 11, 2025 at 11:15:53PM -0000, Patchwork wrote:
> > > > == Series Details ==
> > > >
> > > > Series: drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
> > > > URL : https://patchwork.freedesktop.org/series/139097/
> > > > State : failure
> > > >
> > > > == Summary ==
> > > >
> > > > #### Possible regressions ####
> > > > * igt@kms_addfb_basic@too-high:
> > > > - fi-kbl-8809g: NOTRUN -> [FAIL][6] +3 other tests fail
> > > > [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_139097v2/fi-kbl-8809g/igt@kms_addfb_basic@too-high.html
> > >
> > > A bunch of stuff seems to have broken in CI:
> > > - something is now loading amdgpu when we didn't want it loaded
> > On boot I see
> > <6>[ 0.000000] Command line: BOOT_IMAGE=/boot/drm_intel root=/dev/nvme0n1p2 rootwait fsck.repair=yes nmi_watchdog=panic,auto panic=5 softdog.soft_panic=5 log_buf_len=1M trace_clock=global xe.force_probe=* i915.force_probe=* drm.debug=0xe modprobe.blacklist=xe,i915,ast modprobe.blacklist=amdgpu ro
> >
> > Is that not enough?
> >
> > > - the full dmesg has been lost so I can't even find out when amdgpu got loaded
> > CI team, can you get all logs transferred ?
> From runner log also some data : https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_139097v2/fi-kbl-8809g/igt_runner0.txt
>

Should this fix the behavior https://patchwork.freedesktop.org/series/146170/ as we started not to blacklist snd_hda_intel at CI_DRM_16263 (deploy script change).

Br,
Jani

>
> > On digging internally I see from dmesg (start from that file)
> >
> > <7>[ 39.365629] [IGT] i915_module_load: executing
> > <7>[ 39.373992] [IGT] i915_module_load: starting subtest load
> > <7>[ 39.376091] [IGT] i915_module_load: finished subtest load, SKIP
> > <7>[ 39.376197] [IGT] i915_module_load: exiting, ret=77
> > <7>[ 39.551743] [IGT] core_auth: executing
> > <6>[ 42.196892] [drm] amdgpu kernel modesetting enabled.
> > <7>[ 42.197065] [drm:amdgpu_acpi_detect [amdgpu]] No matching acpi device found for AMD3000
> > <6>[ 42.198069] amdgpu: Virtual CRAT table created for CPU
> > <6>[ 42.198933] amdgpu: Topology: Add CPU node
> > <6>[ 42.200595] amdgpu 0000:01:00.0: enabling device (0006 -> 0007)
> > <6>[ 42.201352] [drm] initializing kernel modesetting (VEGAM 0x1002:0x694C 0x8086:0x2073 0xC0).
> > <6>[ 42.201418] [drm] register mmio base: 0xDB500000
> > <6>[ 42.201420] [drm] register mmio size: 262144
> > <6>[ 42.202307] amdgpu 0000:01:00.0: amdgpu: detected ip block number 0 <vi_common>
> > <6>[ 42.202311] amdgpu 0000:01:00.0: amdgpu: detected ip block number 1 <gmc_v8_0>
> > <6>[ 42.202314] amdgpu 0000:01:00.0: amdgpu: detected ip block number 2 <tonga_ih>
> > <6>[ 42.202316] amdgpu 0000:01:00.0: amdgpu: detected ip block number 3 <gfx_v8_0>
> > <6>[ 42.202318] amdgpu 0000:01:00.0: amdgpu: detected ip block number 4 <sdma_v3_0>
> > <6>[ 42.202321] amdgpu 0000:01:00.0: amdgpu: detected ip block number 5 <powerplay>
> > <6>[ 42.202323] amdgpu 0000:01:00.0: amdgpu: detected ip block number 6 <dm>
> > <6>[ 42.202325] amdgpu 0000:01:00.0: amdgpu: detected ip block number 7 <uvd_v6_0>
> > <6>[ 42.202327] amdgpu 0000:01:00.0: amdgpu: detected ip block number 8 <vce_v3_0>
> > <6>[ 42.202427] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from VFCT
> > <6>[ 42.202449] amdgpu: ATOM BIOS: 408435.180301.04s
> > <6>[ 42.228348] [drm] UVD is enabled in VM mode
> > <6>[ 42.228353] [drm] UVD ENC is enabled in VM mode
> > <6>[ 42.228356] [drm] VCE enabled in VM mode
> > <6>[ 42.228734] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
> >
> > > --
> > > Ville Syrjälä
> > > Intel
On Wed, 2025-03-12 at 10:05 +0000, Saarinen, Jani wrote:
> Hi,
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville Syrjälä
> > Sent: Wednesday, 12 March 2025 11.53
> > To: intel-gfx@lists.freedesktop.org
> > Cc: I915-ci-infra@lists.freedesktop.org
> > Subject: Re: ✗ i915.CI.BAT: failure for drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
> >
> > On Tue, Mar 11, 2025 at 11:15:53PM -0000, Patchwork wrote:
> > > == Series Details ==
> > >
> > > Series: drm/i915/pm: Clean up the hibernate vs. PCI D3 quirk (rev2)
> > > URL : https://patchwork.freedesktop.org/series/139097/
> > > State : failure
> > >
> > > == Summary ==
> > >
> > > #### Possible regressions ####
> > > * igt@kms_addfb_basic@too-high:
> > > - fi-kbl-8809g: NOTRUN -> [FAIL][6] +3 other tests fail
> > > [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_139097v2/fi-kbl-8809g/igt@kms_addfb_basic@too-high.html
> >
> > A bunch of stuff seems to have broken in CI:
> > - something is now loading amdgpu when we didn't want it loaded
> On boot I see
> <6>[ 0.000000] Command line: BOOT_IMAGE=/boot/drm_intel root=/dev/nvme0n1p2 rootwait fsck.repair=yes nmi_watchdog=panic,auto panic=5 softdog.soft_panic=5 log_buf_len=1M trace_clock=global xe.force_probe=* i915.force_probe=* drm.debug=0xe modprobe.blacklist=xe,i915,ast modprobe.blacklist=amdgpu ro
>
> Is that not enough?

It looks like removing the snd_hda_intel blacklist causes this, see:

testrunner@fi-kbl-8809g:~$ lspci -v -s "01:00.1"
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Polaris 22 HDMI Audio
	Subsystem: Intel Corporation Polaris 22 HDMI Audio
	Flags: bus master, fast devsel, latency 0, IRQ 163, IOMMU group 1
	Memory at db560000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel

+Lucas, should we revert that?

>
> > - the full dmesg has been lost so I can't even find out when amdgpu got loaded
> CI team, can you get all logs transferred ?
> On digging internally I see from dmesg (start from that file)
>
> <7>[ 39.365629] [IGT] i915_module_load: executing
> <7>[ 39.373992] [IGT] i915_module_load: starting subtest load
> <7>[ 39.376091] [IGT] i915_module_load: finished subtest load, SKIP
> <7>[ 39.376197] [IGT] i915_module_load: exiting, ret=77
> <7>[ 39.551743] [IGT] core_auth: executing
> <6>[ 42.196892] [drm] amdgpu kernel modesetting enabled.
> <7>[ 42.197065] [drm:amdgpu_acpi_detect [amdgpu]] No matching acpi device found for AMD3000
> <6>[ 42.198069] amdgpu: Virtual CRAT table created for CPU
> <6>[ 42.198933] amdgpu: Topology: Add CPU node
> <6>[ 42.200595] amdgpu 0000:01:00.0: enabling device (0006 -> 0007)
> <6>[ 42.201352] [drm] initializing kernel modesetting (VEGAM 0x1002:0x694C 0x8086:0x2073 0xC0).
> <6>[ 42.201418] [drm] register mmio base: 0xDB500000
> <6>[ 42.201420] [drm] register mmio size: 262144
> <6>[ 42.202307] amdgpu 0000:01:00.0: amdgpu: detected ip block number 0 <vi_common>
> <6>[ 42.202311] amdgpu 0000:01:00.0: amdgpu: detected ip block number 1 <gmc_v8_0>
> <6>[ 42.202314] amdgpu 0000:01:00.0: amdgpu: detected ip block number 2 <tonga_ih>
> <6>[ 42.202316] amdgpu 0000:01:00.0: amdgpu: detected ip block number 3 <gfx_v8_0>
> <6>[ 42.202318] amdgpu 0000:01:00.0: amdgpu: detected ip block number 4 <sdma_v3_0>
> <6>[ 42.202321] amdgpu 0000:01:00.0: amdgpu: detected ip block number 5 <powerplay>
> <6>[ 42.202323] amdgpu 0000:01:00.0: amdgpu: detected ip block number 6 <dm>
> <6>[ 42.202325] amdgpu 0000:01:00.0: amdgpu: detected ip block number 7 <uvd_v6_0>
> <6>[ 42.202327] amdgpu 0000:01:00.0: amdgpu: detected ip block number 8 <vce_v3_0>
> <6>[ 42.202427] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from VFCT
> <6>[ 42.202449] amdgpu: ATOM BIOS: 408435.180301.04s
> <6>[ 42.228348] [drm] UVD is enabled in VM mode
> <6>[ 42.228353] [drm] UVD ENC is enabled in VM mode
> <6>[ 42.228356] [drm] VCE enabled in VM mode
> <6>[ 42.228734] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
>
> > --
> > Ville Syrjälä
> > Intel
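To check quickly which modules a given boot actually blacklists, the command line quoted in this thread can be parsed directly. This is a minimal Python sketch under stated assumptions: `blacklisted_modules` is a hypothetical helper, and it assumes every `modprobe.blacklist=` occurrence on the command line is honoured, which is not verified here against the module loader's real behaviour.

```python
def blacklisted_modules(cmdline):
    """Collect module names from all modprobe.blacklist= parameters.

    Assumption: repeated modprobe.blacklist= parameters are merged.
    """
    mods = set()
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "modprobe.blacklist" and sep:
            # Each parameter carries a comma-separated list of module names.
            mods.update(name for name in value.split(",") if name)
    return mods


# Command line copied from the boot log quoted in this thread.
CMDLINE = (
    "BOOT_IMAGE=/boot/drm_intel root=/dev/nvme0n1p2 rootwait fsck.repair=yes "
    "nmi_watchdog=panic,auto panic=5 softdog.soft_panic=5 log_buf_len=1M "
    "trace_clock=global xe.force_probe=* i915.force_probe=* drm.debug=0xe "
    "modprobe.blacklist=xe,i915,ast modprobe.blacklist=amdgpu ro"
)

mods = blacklisted_modules(CMDLINE)
for wanted in ("amdgpu", "snd_hda_intel"):
    print(wanted, "blacklisted" if wanted in mods else "NOT blacklisted")
```

For the quoted command line, amdgpu is listed and snd_hda_intel is not, which matches the observation that the audio function at 01:00.1 is bound to snd_hda_intel.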
From: Ville Syrjälä <ville.syrjala@linux.intel.com>

Attempt to make i915 rely more on the standard pci pm code instead of hand rolling a bunch of pci_save_state()+pci_set_power_state() stuff in the driver.

v2:
Drop the core pci changes for now since I couldn't get any real answers to them
Drop some redundant pci_*() calls from the pm paths

Ville Syrjälä (6):
  drm/i915/pm: Simplify pm hook documentation
  drm/i915/pm: Hoist pci_save_state()+pci_set_power_state() to the end of pm _late() hook
  drm/i915/pm: Move the hibernate+D3 quirk stuff into noirq() pm hooks
  drm/i915/pm: Do pci_restore_state() in switcheroo resume hook
  drm/i915/pm: Allow drivers/pci to manage our pci state normally
  drm/i915/pm: Drop redundant pci stuff from suspend/resume paths

 drivers/gpu/drm/i915/i915_driver.c | 133 +++++++++++++++--------------
 1 file changed, 69 insertions(+), 64 deletions(-)