Message ID | 20240623142137.448898081@linutronix.de (mailing list archive) |
---|---|
Series | genirq, irqchip: Convert ARM MSI handling to per device MSI domains |
On Sun, Jun 23, 2024 at 05:18:31PM +0200, Thomas Gleixner wrote:
> This is version 4 of the series to convert ARM MSI handling over to
> per device MSI domains. Version 3 can be found here:
>
>   https://lore.kernel.org/lkml/20240614102403.13610-1-shivamurthy.shastri@linutronix.de
>
> The conversion aims to replace the existing platform MSI mechanism and
> enables ARM to support the future PCI/IMS mechanism.
>
> The infrastructure to replace the platform MSI mechanism is already
> upstream and in use by RISC-V and has been tested on various ARM platforms
> during the V2 development.
>
> Changes vs. V3:
>
>   - Fix the conversion of the GIC V3 MBI driver - Marc
>
>   - Dropped a few stray MSI_FLAG_PCI_MSI_MASK_PARENT flags
>
>   - Dropped the trivial cleanup patches as they have been merged
>
>   - Picked up tags
>
> The series is only lightly tested due to lack of hardware, so we rely on
> the people who have access to affected machines to help with testing.
>
> If there are no major objections raised or testing fallout reported, I'm
> aiming this series for the next merge window.
>
> The series is also available from git:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git devmsi-arm-v4

Running this thru kernelCI has some failures on x86 QEMU boots[1].
Here's the backtrace:

<1>[ 2.199948] BUG: kernel NULL pointer dereference, address: 0000000000000000
<1>[ 2.199948] #PF: supervisor instruction fetch in kernel mode
<1>[ 2.199948] #PF: error_code(0x0010) - not-present page
<6>[ 2.199948] PGD 0 P4D 0
<4>[ 2.199948] Oops: Oops: 0010 [#1] PREEMPT SMP NOPTI
<4>[ 2.199948] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.10.0-rc3 #1
<4>[ 2.199948] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
<4>[ 2.199948] RIP: 0010:0x0
<4>[ 2.199948] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
<4>[ 2.199948] RSP: 0018:ffffa7ac80013a90 EFLAGS: 00000002
<4>[ 2.199948] RAX: 0000000000000000 RBX: ffffa4050333d600 RCX: 0000000000000000
<4>[ 2.199948] RDX: ffffa4050333d430 RSI: 0000000000000001 RDI: ffffa40502ff3100
<4>[ 2.199948] RBP: ffffa4050333d600 R08: ffffa405032f1c00 R09: 0000000000000000
<4>[ 2.199948] R10: 0000000000000246 R11: ffffa405032f1d80 R12: ffffa405032f1d80
<4>[ 2.199948] R13: 0000000000000001 R14: 0000000000000000 R15: ffffa4050333d760
<4>[ 2.199948] FS: 0000000000000000(0000) GS:ffffa4053e400000(0000) knlGS:0000000000000000
<4>[ 2.199948] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 2.199948] CR2: ffffffffffffffd6 CR3: 000000002a22e000 CR4: 00000000000006f0
<4>[ 2.199948] Call Trace:
<4>[ 2.199948] <TASK>
<4>[ 2.199948] ? __die+0x1f/0x70
<4>[ 2.199948] ? page_fault_oops+0x155/0x440
<4>[ 2.199948] ? ondemand_readahead+0x2c0/0x370
<4>[ 2.199948] ? bitmap_find_next_zero_area_off+0x7b/0x90
<4>[ 2.199948] ? exc_page_fault+0x69/0x150
<4>[ 2.199948] ? asm_exc_page_fault+0x26/0x30
<4>[ 2.199948] pci_irq_unmask_msix+0x53/0x60
<4>[ 2.199948] irq_enable+0x32/0x80
<4>[ 2.199948] __irq_startup+0x51/0x70
<4>[ 2.199948] irq_startup+0x62/0x120
<4>[ 2.199948] __setup_irq+0x326/0x730
<4>[ 2.199948] ? __pfx_vp_config_changed+0x10/0x10
<4>[ 2.199948] request_threaded_irq+0x10b/0x180
<4>[ 2.199948] vp_find_vqs_msix+0x16b/0x470
<4>[ 2.199948] vp_find_vqs+0x34/0x1a0
<4>[ 2.199948] vp_modern_find_vqs+0x16/0x60
<4>[ 2.199948] init_vqs+0x3ee/0x690
<4>[ 2.199948] virtnet_probe+0x50c/0xd10
<4>[ 2.199948] virtio_dev_probe+0x1dd/0x2b0
<4>[ 2.199948] really_probe+0xbc/0x2b0
<4>[ 2.199948] __driver_probe_device+0x6e/0x120
<4>[ 2.199948] driver_probe_device+0x19/0xe0
<4>[ 2.199948] __driver_attach+0x85/0x180
<4>[ 2.199948] ? __pfx___driver_attach+0x10/0x10
<4>[ 2.199948] bus_for_each_dev+0x76/0xd0
<4>[ 2.199948] bus_add_driver+0xe3/0x210
<4>[ 2.199948] driver_register+0x5b/0x110
<4>[ 2.199948] ? __pfx_virtio_net_driver_init+0x10/0x10
<4>[ 2.199948] virtio_net_driver_init+0x8b/0xb0
<4>[ 2.199948] ? __pfx_virtio_net_driver_init+0x10/0x10
<4>[ 2.199948] do_one_initcall+0x43/0x210
<4>[ 2.199948] kernel_init_freeable+0x19b/0x2d0
<4>[ 2.199948] ? __pfx_kernel_init+0x10/0x10
<4>[ 2.199948] kernel_init+0x15/0x1c0
<4>[ 2.199948] ret_from_fork+0x2f/0x50
<4>[ 2.199948] ? __pfx_kernel_init+0x10/0x10
<4>[ 2.199948] ret_from_fork_asm+0x1a/0x30
<4>[ 2.199948] </TASK>
<4>[ 2.199948] Modules linked in:
<4>[ 2.199948] CR2: 0000000000000000
<4>[ 2.199948] ---[ end trace 0000000000000000 ]---

Rob

[1] https://linux.kernelci.org/test/job/robh/branch/for-kernelci/kernel/v6.10-rc3-21-gd27f9f4a2dd80/plan/baseline/
On Tue, Jun 25 2024 at 13:46, Rob Herring wrote:
> Running this thru kernelCI has some failures on x86 QEMU boots[1].

oops

> <4>[ 2.199948] pci_irq_unmask_msix+0x53/0x60
> <4>[ 2.199948] irq_enable+0x32/0x80

I'm sure that I fixed that before. Updated branch:

  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git devmsi-arm-v4-1

Thanks,

        tglx
On Sun, Jun 23, 2024 at 05:18:31PM +0200, Thomas Gleixner wrote:
> This is version 4 of the series to convert ARM MSI handling over to
> per device MSI domains.

> The conversion aims to replace the existing platform MSI mechanism and
> enables ARM to support the future PCI/IMS mechanism.

> The series is only lightly tested due to lack of hardware, so we rely on
> the people who have access to affected machines to help with testing.
>
> If there are no major objections raised or testing fallout reported, I'm
> aiming this series for the next merge window.

This series only showed up in linux-next last Friday and broke interrupt
handling on Qualcomm platforms like sc8280xp (e.g. Lenovo ThinkPad X13s)
and x1e80100 that use the GIC ITS for PCIe MSIs.

I've applied the series (21 commits from linux-next) on top of 6.10 and
can confirm that the breakage is caused by commits:

    3d1c927c08fc ("irqchip/gic-v3-its: Switch platform MSI to MSI parent")
    233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]")

Applying the series up until the change before 3d1c927c08fc unbreaks the
wifi on one machine:

    ath11k_pci 0006:01:00.0: failed to enable msi: -22
    ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -22

and backing up until the commit before 233db05bc37f makes the NVMe come
up again during boot on another.

I have not tried to debug this further.

Johan
On Mon, 15 Jul 2024 12:18:47 +0100,
Johan Hovold <johan@kernel.org> wrote:
>
> On Sun, Jun 23, 2024 at 05:18:31PM +0200, Thomas Gleixner wrote:
> > This is version 4 of the series to convert ARM MSI handling over to
> > per device MSI domains.
>
> > The conversion aims to replace the existing platform MSI mechanism and
> > enables ARM to support the future PCI/IMS mechanism.
>
> > The series is only lightly tested due to lack of hardware, so we rely on
> > the people who have access to affected machines to help with testing.
> >
> > If there are no major objections raised or testing fallout reported, I'm
> > aiming this series for the next merge window.
>
> This series only showed up in linux-next last Friday and broke interrupt
> handling on Qualcomm platforms like sc8280xp (e.g. Lenovo ThinkPad X13s)
> and x1e80100 that use the GIC ITS for PCIe MSIs.
>
> I've applied the series (21 commits from linux-next) on top of 6.10 and
> can confirm that the breakage is caused by commits:
>
>     3d1c927c08fc ("irqchip/gic-v3-its: Switch platform MSI to MSI parent")
>     233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]")
>
> Applying the series up until the change before 3d1c927c08fc unbreaks the
> wifi on one machine:
>
>     ath11k_pci 0006:01:00.0: failed to enable msi: -22
>     ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -22
>
> and backing up until the commit before 233db05bc37f makes the NVMe come
> up again during boot on another.
>
> I have not tried to debug this further.

I need a few things from you though, because you're not giving much to
help you (and I'm travelling, which doesn't help).

Can you at least investigate what in ath11k_pci_alloc_msi() causes the
wifi driver to be upset? Does it normally use a single MSI vector or
MSI-X? How about your nVME device?

It would also help if you could define the DEBUG symbol at the very
top of irq-gic-v3-its.c and report the debug information that the ITS
driver dumps.

Thanks,

        M.
On Mon, Jul 15 2024 at 13:18, Johan Hovold wrote:
> I've applied the series (21 commits from linux-next) on top of 6.10 and
> can confirm that the breakage is caused by commits:
>
>     3d1c927c08fc ("irqchip/gic-v3-its: Switch platform MSI to MSI parent")
>     233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]")
>
> Applying the series up until the change before 3d1c927c08fc unbreaks the
> wifi on one machine:
>
>     ath11k_pci 0006:01:00.0: failed to enable msi: -22
>     ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -22

3d1c927c08fc converts the platform MSI stuff over which is unrelated to
PCI/MSI. I'm confused how this affects PCI/MSI of the WIFI card.

> and backing up until the commit before 233db05bc37f makes the NVMe come
> up again during boot on another.

So that undoes the PCI/MSI change. Hrm.

> I have not tried to debug this further.

Any hint would be appreciated.

Thanks,

        tglx
On Mon, Jul 15, 2024 at 01:58:13PM +0100, Marc Zyngier wrote:
> On Mon, 15 Jul 2024 12:18:47 +0100,
> Johan Hovold <johan@kernel.org> wrote:
> > On Sun, Jun 23, 2024 at 05:18:31PM +0200, Thomas Gleixner wrote:
> > > This is version 4 of the series to convert ARM MSI handling over to
> > > per device MSI domains.

> > This series only showed up in linux-next last Friday and broke interrupt
> > handling on Qualcomm platforms like sc8280xp (e.g. Lenovo ThinkPad X13s)
> > and x1e80100 that use the GIC ITS for PCIe MSIs.
> >
> > I've applied the series (21 commits from linux-next) on top of 6.10 and
> > can confirm that the breakage is caused by commits:
> >
> >     3d1c927c08fc ("irqchip/gic-v3-its: Switch platform MSI to MSI parent")
> >     233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]")
> >
> > Applying the series up until the change before 3d1c927c08fc unbreaks the
> > wifi on one machine:
> >
> >     ath11k_pci 0006:01:00.0: failed to enable msi: -22
> >     ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -22
> >
> > and backing up until the commit before 233db05bc37f makes the NVMe come
> > up again during boot on another.
> >
> > I have not tried to debug this further.
>
> I need a few things from you though, because you're not giving much to
> help you (and I'm travelling, which doesn't help).

Yeah, this was just an early heads up.

> Can you at least investigate what in ath11k_pci_alloc_msi() causes the
> wifi driver to be upset? Does it normally use a single MSI vector or
> MSI-X? How about your nVME device?

It uses multiple vectors, but now it falls back to trying to allocate a
single one and even that fails with -ENOSPC:

    ath11k_pci 0006:01:00.0: ath11k_pci_alloc_msi - requesting one vector failed: -28

Similar for the NVMe, it uses multiple vectors normally, but now only
the AER interrupts appears to be allocated for each controller and there
is a GICv3 interrupt for the NVMe:

    208:     0   0   0   0   0   0   0   0  ITS-PCI-MSI-0006:00:00.0    0 Edge   PCIe PME, aerdrv
    212:     0   0   0   0   0   0   0   0  ITS-PCI-MSI-0004:00:00.0    0 Edge   PCIe PME, aerdrv
    214:   161   0   0   0   0   0   0   0  GICv3                     562 Level  nvme0q0, nvme0q1
    215:     0   0   0   0   0   0   0   0  ITS-PCI-MSI-0002:00:00.0    0 Edge   PCIe PME, aerdrv

Next boot, after disabling PCIe controller async probing, it's an MSI-X?!:

    201:     0   0   0   0   0   0   0   0  ITS-PCI-MSI-0006:00:00.0    0 Edge   PCIe PME, aerdrv
    203:     0   0   0   0   0   0   0   0  ITS-PCI-MSI-0004:00:00.0    0 Edge   PCIe PME, aerdrv
    205:     0   0   0   0   0   0   0   0  ITS-PCI-MSI-0002:00:00.0    0 Edge   PCIe PME, aerdrv
    206:     0   0   0   0   0   0   0   0  ITS-PCI-MSIX-0002:01:00.0   0 Edge   nvme0q0

This time ath11k vector allocation succeeded, but the driver times out
eventually:

    [ 8.984619] ath11k_pci 0006:01:00.0: MSI vectors: 32
    [ 29.690841] ath11k_pci 0006:01:00.0: failed to power up mhi: -110
    [ 29.697136] ath11k_pci 0006:01:00.0: failed to start mhi: -110
    [ 29.703153] ath11k_pci 0006:01:00.0: failed to power up :-110
    [ 29.732144] ath11k_pci 0006:01:00.0: failed to create soc core: -110
    [ 29.738694] ath11k_pci 0006:01:00.0: failed to init core: -110
    [ 32.841758] ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -110

> It would also help if you could define the DEBUG symbol at the very
> top of irq-gic-v3-its.c and report the debug information that the ITS
> driver dumps.

See below (with synchronous probing of the pcie controllers).
Johan [ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0 [ 0.000000] GICv3: 960 SPIs implemented [ 0.000000] GICv3: 0 Extended SPIs implemented [ 0.000000] Root IRQ handler: gic_handle_irq [ 0.000000] GICv3: GICv3 features: 16 PPIs [ 0.000000] GICv3: CPU0: found redistributor 0 region 0:0x0000000017a60000 [ 0.000000] ITS [mem 0x17a40000-0x17a5ffff] [ 0.000000] ITS@0x0000000017a40000: allocated 8192 Devices @100100000 (indirect, esz 8, psz 64K, shr 1) [ 0.000000] ITS@0x0000000017a40000: allocated 32768 Interrupt Collections @100110000 (flat, esz 2, psz 64K, shr 1) [ 0.000000] GICv3: using LPI property table @0x0000000100120000 [ 0.000000] ITS: Allocator initialized for 57344 LPIs [ 0.000000] GICv3: CPU0: using allocated LPI pending table @0x0000000100130000 [ 0.010428] GICv3: CPU1: found redistributor 100 region 0:0x0000000017a80000 [ 0.010438] GICv3: CPU1: using allocated LPI pending table @0x0000000100140000 [ 0.010477] CPU1: Booted secondary processor 0x0000000100 [0x410fd4b0] [ 0.011496] Detected PIPT I-cache on CPU2 [ 0.011535] GICv3: CPU2: found redistributor 200 region 0:0x0000000017aa0000 [ 0.011545] GICv3: CPU2: using allocated LPI pending table @0x0000000100150000 [ 0.011576] CPU2: Booted secondary processor 0x0000000200 [0x410fd4b0] [ 0.012593] Detected PIPT I-cache on CPU3 [ 0.012631] GICv3: CPU3: found redistributor 300 region 0:0x0000000017ac0000 [ 0.012641] GICv3: CPU3: using allocated LPI pending table @0x0000000100160000 [ 0.012671] CPU3: Booted secondary processor 0x0000000300 [0x410fd4b0] [ 0.015590] Detected PIPT I-cache on CPU4 [ 0.015637] GICv3: CPU4: found redistributor 400 region 0:0x0000000017ae0000 [ 0.015647] GICv3: CPU4: using allocated LPI pending table @0x0000000100170000 [ 0.015675] CPU4: Booted secondary processor 0x0000000400 [0x410fd4c0] [ 0.016698] Detected PIPT I-cache on CPU5 [ 0.016733] GICv3: CPU5: found redistributor 500 region 0:0x0000000017b00000 [ 0.016742] GICv3: CPU5: using allocated LPI pending table 
@0x0000000100180000 [ 0.016772] CPU5: Booted secondary processor 0x0000000500 [0x410fd4c0] [ 0.020807] Detected PIPT I-cache on CPU6 [ 0.020841] GICv3: CPU6: found redistributor 600 region 0:0x0000000017b20000 [ 0.020851] GICv3: CPU6: using allocated LPI pending table @0x0000000100190000 [ 0.020879] CPU6: Booted secondary processor 0x0000000600 [0x410fd4c0] [ 0.021878] Detected PIPT I-cache on CPU7 [ 0.021914] GICv3: CPU7: found redistributor 700 region 0:0x0000000017b40000 [ 0.021922] GICv3: CPU7: using allocated LPI pending table @0x00000001001a0000 [ 0.021952] CPU7: Booted secondary processor 0x0000000700 [0x410fd4c0] [ 8.358586] qcom-pcie 1c00000.pcie: host bridge /soc@0/pcie@1c00000 ranges: [ 8.365787] qcom-pcie 1c00000.pcie: IO 0x0030200000..0x00302fffff -> 0x0000000000 [ 8.381670] qcom-pcie 1c00000.pcie: MEM 0x0030300000..0x0031ffffff -> 0x0030300000 [ 8.507519] qcom-pcie 1c00000.pcie: iATU: unroll T, 8 ob, 8 ib, align 4K, limit 1024G [ 8.603797] qcom-pcie 1c00000.pcie: PCIe Gen.2 x1 link up [ 8.610023] qcom-pcie 1c00000.pcie: PCI host bridge to bus 0006:00 [ 8.616805] pci_bus 0006:00: root bus resource [bus 00-ff] [ 8.622872] pci_bus 0006:00: root bus resource [io 0x0000-0xfffff] [ 8.629844] pci_bus 0006:00: root bus resource [mem 0x30300000-0x31ffffff] [ 8.636981] pci 0006:00:00.0: [17cb:010e] type 01 class 0x060400 PCIe Root Port [ 8.655493] pci 0006:00:00.0: BAR 0 [mem 0x00000000-0x00000fff] [ 8.672909] pci 0006:00:00.0: PCI bridge to [bus 01-ff] [ 8.688721] pci 0006:00:00.0: bridge window [io 0x0000-0x0fff] [ 8.703805] pci 0006:00:00.0: bridge window [mem 0x00000000-0x000fffff] [ 8.719789] pci 0006:00:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref] [ 8.736680] pci 0006:00:00.0: PME# supported from D0 D3hot D3cold [ 8.745548] pci 0006:01:00.0: [17cb:1103] type 00 class 0x028000 PCIe Endpoint [ 8.745646] pci 0006:01:00.0: BAR 0 [mem 0x00000000-0x001fffff 64bit] [ 8.746274] pci 0006:01:00.0: PME# supported from D0 D3hot D3cold [ 8.746442] pci 
0006:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0006:00:00.0 (capable of 7.876 Gb/s with 8.0 GT/s PCIe x1 link) [ 8.836195] pci 0006:00:00.0: bridge window [mem 0x30400000-0x305fffff]: assigned [ 8.853287] pci 0006:00:00.0: BAR 0 [mem 0x30300000-0x30300fff]: assigned [ 8.870163] pci 0006:01:00.0: BAR 0 [mem 0x30400000-0x305fffff 64bit]: assigned [ 8.887617] pci 0006:00:00.0: PCI bridge to [bus 01-ff] [ 8.902850] pci 0006:00:00.0: bridge window [mem 0x30400000-0x305fffff] [ 8.933586] ITS: alloc 8192:32 [ 8.933599] ITT 32 entries, 5 bits [ 8.951573] ID:0 pID:8192 vID:201 [ 8.951585] ID:1 pID:8193 vID:202 [ 8.951591] ID:2 pID:8194 vID:203 [ 8.951597] ID:3 pID:8195 vID:204 [ 8.951603] ID:4 pID:8196 vID:205 [ 8.951609] ID:5 pID:8197 vID:206 [ 8.951615] ID:6 pID:8198 vID:207 [ 8.951621] ID:7 pID:8199 vID:208 [ 8.951627] ID:8 pID:8200 vID:209 [ 8.951633] ID:9 pID:8201 vID:210 [ 8.951639] ID:10 pID:8202 vID:211 [ 8.951645] ID:11 pID:8203 vID:212 [ 8.951650] ID:12 pID:8204 vID:213 [ 8.951656] ID:13 pID:8205 vID:214 [ 8.951662] ID:14 pID:8206 vID:215 [ 8.951667] ID:15 pID:8207 vID:216 [ 8.951673] ID:16 pID:8208 vID:217 [ 8.951679] ID:17 pID:8209 vID:218 [ 8.951685] ID:18 pID:8210 vID:219 [ 8.951691] ID:19 pID:8211 vID:220 [ 8.951696] ID:20 pID:8212 vID:221 [ 8.951702] ID:21 pID:8213 vID:222 [ 8.951708] ID:22 pID:8214 vID:223 [ 8.951714] ID:23 pID:8215 vID:224 [ 8.951720] ID:24 pID:8216 vID:225 [ 8.951725] ID:25 pID:8217 vID:226 [ 8.951772] ID:26 pID:8218 vID:227 [ 8.951778] ID:27 pID:8219 vID:228 [ 8.951784] ID:28 pID:8220 vID:229 [ 8.951790] ID:29 pID:8221 vID:230 [ 8.951796] ID:30 pID:8222 vID:231 [ 8.951802] ID:31 pID:8223 vID:232 [ 8.951919] IRQ201 -> 0-7 CPU0 [ 8.951940] IRQ202 -> 0-7 CPU1 [ 8.951952] IRQ203 -> 0-7 CPU2 [ 8.951963] IRQ204 -> 0-7 CPU3 [ 8.951975] IRQ205 -> 0-7 CPU4 [ 8.951987] IRQ206 -> 0-7 CPU5 [ 8.951998] IRQ207 -> 0-7 CPU6 [ 8.952010] IRQ208 -> 0-7 CPU7 [ 8.952022] IRQ209 -> 0-7 CPU0 [ 8.952033] IRQ210 -> 0-7 
CPU1 [ 8.952045] IRQ211 -> 0-7 CPU2 [ 8.952056] IRQ212 -> 0-7 CPU3 [ 8.952068] IRQ213 -> 0-7 CPU4 [ 8.952079] IRQ214 -> 0-7 CPU5 [ 8.952091] IRQ215 -> 0-7 CPU6 [ 8.952103] IRQ216 -> 0-7 CPU7 [ 8.952115] IRQ217 -> 0-7 CPU0 [ 8.952126] IRQ218 -> 0-7 CPU1 [ 8.952138] IRQ219 -> 0-7 CPU2 [ 8.952150] IRQ220 -> 0-7 CPU3 [ 8.952162] IRQ221 -> 0-7 CPU4 [ 8.952174] IRQ222 -> 0-7 CPU5 [ 8.952185] IRQ223 -> 0-7 CPU6 [ 8.952197] IRQ224 -> 0-7 CPU7 [ 8.952209] IRQ225 -> 0-7 CPU0 [ 8.952220] IRQ226 -> 0-7 CPU1 [ 8.952232] IRQ227 -> 0-7 CPU2 [ 8.952244] IRQ228 -> 0-7 CPU3 [ 8.952255] IRQ229 -> 0-7 CPU4 [ 8.952267] IRQ230 -> 0-7 CPU5 [ 8.952278] IRQ231 -> 0-7 CPU6 [ 8.952290] IRQ232 -> 0-7 CPU7 [ 8.954072] ITS: alloc 8192:32 [ 8.954081] ITT 32 entries, 5 bits [ 8.954128] ID:0 pID:8192 vID:201 [ 8.954137] IRQ201 -> 0-7 CPU0 [ 8.954328] IRQ201 -> 0-7 CPU0 [ 8.954357] pcieport 0006:00:00.0: PME: Signaling with IRQ 201 [ 8.960980] pcieport 0006:00:00.0: AER: enabled with IRQ 201 [ 8.967607] ath11k_pci 0006:01:00.0: BAR 0 [mem 0x30400000-0x305fffff 64bit]: assigned [ 8.976146] ath11k_pci 0006:01:00.0: enabling device (0000 -> 0002) [ 8.983071] ITS: alloc 8224:32 [ 8.983080] ITT 32 entries, 5 bits [ 8.983842] ID:0 pID:8224 vID:202 [ 8.983849] ID:1 pID:8225 vID:203 [ 8.983855] ID:2 pID:8226 vID:204 [ 8.983861] ID:3 pID:8227 vID:205 [ 8.983867] ID:4 pID:8228 vID:206 [ 8.983873] ID:5 pID:8229 vID:207 [ 8.983878] ID:6 pID:8230 vID:208 [ 8.983884] ID:7 pID:8231 vID:209 [ 8.983890] ID:8 pID:8232 vID:210 [ 8.983895] ID:9 pID:8233 vID:211 [ 8.983901] ID:10 pID:8234 vID:212 [ 8.983907] ID:11 pID:8235 vID:213 [ 8.983913] ID:12 pID:8236 vID:214 [ 8.983919] ID:13 pID:8237 vID:215 [ 8.983925] ID:14 pID:8238 vID:216 [ 8.983931] ID:15 pID:8239 vID:217 [ 8.983937] ID:16 pID:8240 vID:218 [ 8.983942] ID:17 pID:8241 vID:219 [ 8.983948] ID:18 pID:8242 vID:220 [ 8.983954] ID:19 pID:8243 vID:221 [ 8.983960] ID:20 pID:8244 vID:222 [ 8.983965] ID:21 pID:8245 vID:223 [ 8.983971] ID:22 pID:8246 vID:224 [ 
8.983977] ID:23 pID:8247 vID:225 [ 8.983983] ID:24 pID:8248 vID:226 [ 8.983989] ID:25 pID:8249 vID:227 [ 8.983995] ID:26 pID:8250 vID:228 [ 8.984000] ID:27 pID:8251 vID:229 [ 8.984006] ID:28 pID:8252 vID:230 [ 8.984012] ID:29 pID:8253 vID:231 [ 8.984018] ID:30 pID:8254 vID:232 [ 8.984024] ID:31 pID:8255 vID:233 [ 8.984102] IRQ202 -> 0-7 CPU1 [ 8.984148] IRQ203 -> 0-7 CPU2 [ 8.984160] IRQ204 -> 0-7 CPU3 [ 8.984172] IRQ205 -> 0-7 CPU4 [ 8.984184] IRQ206 -> 0-7 CPU5 [ 8.984196] IRQ207 -> 0-7 CPU6 [ 8.984208] IRQ208 -> 0-7 CPU7 [ 8.984220] IRQ209 -> 0-7 CPU0 [ 8.984231] IRQ210 -> 0-7 CPU1 [ 8.984243] IRQ211 -> 0-7 CPU2 [ 8.984255] IRQ212 -> 0-7 CPU3 [ 8.984267] IRQ213 -> 0-7 CPU4 [ 8.984279] IRQ214 -> 0-7 CPU5 [ 8.984291] IRQ215 -> 0-7 CPU6 [ 8.984303] IRQ216 -> 0-7 CPU7 [ 8.984315] IRQ217 -> 0-7 CPU0 [ 8.984326] IRQ218 -> 0-7 CPU1 [ 8.984338] IRQ219 -> 0-7 CPU2 [ 8.984350] IRQ220 -> 0-7 CPU3 [ 8.984362] IRQ221 -> 0-7 CPU4 [ 8.984373] IRQ222 -> 0-7 CPU5 [ 8.984385] IRQ223 -> 0-7 CPU6 [ 8.984398] IRQ224 -> 0-7 CPU7 [ 8.984409] IRQ225 -> 0-7 CPU0 [ 8.984422] IRQ226 -> 0-7 CPU1 [ 8.984434] IRQ227 -> 0-7 CPU2 [ 8.984445] IRQ228 -> 0-7 CPU3 [ 8.984457] IRQ229 -> 0-7 CPU4 [ 8.984469] IRQ230 -> 0-7 CPU5 [ 8.984481] IRQ231 -> 0-7 CPU6 [ 8.984492] IRQ232 -> 0-7 CPU7 [ 8.984504] IRQ233 -> 0-7 CPU0 [ 8.984619] ath11k_pci 0006:01:00.0: MSI vectors: 32 [ 8.990070] ath11k_pci 0006:01:00.0: wcn6855 hw2.0 [ 8.998289] IRQ202 -> 0-7 CPU1 [ 8.998348] IRQ203 -> 0-7 CPU2 [ 8.998376] IRQ204 -> 0-7 CPU3 [ 9.001890] IRQ205 -> 0-7 CPU4 [ 9.001923] IRQ206 -> 0-7 CPU5 [ 9.001953] IRQ207 -> 0-7 CPU6 [ 9.001977] IRQ208 -> 0-7 CPU7 [ 9.002003] IRQ209 -> 0-7 CPU0 [ 9.002031] IRQ210 -> 0-7 CPU1 [ 9.002055] IRQ211 -> 0-7 CPU2 [ 9.002117] IRQ216 -> 0-7 CPU7 [ 9.002168] IRQ217 -> 0-7 CPU0 [ 9.002210] IRQ218 -> 0-7 CPU1 [ 9.002257] IRQ220 -> 0-7 CPU3 [ 9.002296] IRQ221 -> 0-7 CPU4 [ 9.002337] IRQ222 -> 0-7 CPU5 [ 9.002381] IRQ223 -> 0-7 CPU6 [ 9.002421] IRQ224 -> 0-7 CPU7 [ 9.002460] IRQ225 -> 0-7 CPU0 [ 
9.002499] IRQ226 -> 0-7 CPU1 [ 9.162382] mhi mhi0: Requested to power ON [ 9.167114] mhi mhi0: Power on setup success [ 29.680356] mhi mhi0: Device link is not accessible [ 29.685437] mhi mhi0: MHI did not enter READY state [ 29.690841] ath11k_pci 0006:01:00.0: failed to power up mhi: -110 [ 29.697136] ath11k_pci 0006:01:00.0: failed to start mhi: -110 [ 29.703153] ath11k_pci 0006:01:00.0: failed to power up :-110 [ 29.732144] ath11k_pci 0006:01:00.0: failed to create soc core: -110 [ 29.738694] ath11k_pci 0006:01:00.0: failed to init core: -110 [ 32.841758] ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -110 [ 32.852799] qcom-pcie 1c10000.pcie: supply vdda not found, using dummy regulator [ 32.860924] qcom-pcie 1c10000.pcie: host bridge /soc@0/pcie@1c10000 ranges: [ 32.868157] qcom-pcie 1c10000.pcie: IO 0x0034200000..0x00342fffff -> 0x0000000000 [ 32.876428] qcom-pcie 1c10000.pcie: MEM 0x0034300000..0x0035ffffff -> 0x0034300000 [ 33.001705] qcom-pcie 1c10000.pcie: iATU: unroll T, 8 ob, 8 ib, align 4K, limit 1024G [ 33.111456] qcom-pcie 1c10000.pcie: PCIe Gen.3 x2 link up [ 33.117554] qcom-pcie 1c10000.pcie: PCI host bridge to bus 0004:00 [ 33.124000] pci_bus 0004:00: root bus resource [bus 00-ff] [ 33.129745] pci_bus 0004:00: root bus resource [io 0x100000-0x1fffff] (bus address [0x0000-0xfffff]) [ 33.139324] pci_bus 0004:00: root bus resource [mem 0x34300000-0x35ffffff] [ 33.146525] pci 0004:00:00.0: [17cb:010e] type 01 class 0x060400 PCIe Root Port [ 33.154167] pci 0004:00:00.0: BAR 0 [mem 0x00000000-0x00000fff] [ 33.160373] pci 0004:00:00.0: PCI bridge to [bus 01-ff] [ 33.165804] pci 0004:00:00.0: bridge window [io 0x100000-0x100fff] [ 33.172482] pci 0004:00:00.0: bridge window [mem 0x00000000-0x000fffff] [ 33.179515] pci 0004:00:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref] [ 33.187622] pci 0004:00:00.0: PME# supported from D0 D3hot D3cold [ 33.195555] pci 0004:01:00.0: [17cb:0306] type 00 class 0xff0000 PCIe Endpoint [ 
33.203462] pci 0004:01:00.0: BAR 0 [mem 0x00000000-0x00000fff 64bit] [ 33.210163] pci 0004:01:00.0: BAR 2 [mem 0x00000000-0x00000fff 64bit] [ 33.217379] pci 0004:01:00.0: PME# supported from D0 D3hot D3cold [ 33.223825] pci 0004:01:00.0: 15.752 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x2 link at 0004:00:00.0 (capable of 31.506 Gb/s with 16.0 GT/s PCIe x2 link) [ 33.251876] pci 0004:00:00.0: bridge window [mem 0x34300000-0x343fffff]: assigned [ 33.259599] pci 0004:00:00.0: BAR 0 [mem 0x34400000-0x34400fff]: assigned [ 33.266621] pci 0004:01:00.0: BAR 0 [mem 0x34300000-0x34300fff 64bit]: assigned [ 33.274186] pci 0004:01:00.0: BAR 2 [mem 0x34301000-0x34301fff 64bit]: assigned [ 33.281748] pci 0004:00:00.0: PCI bridge to [bus 01-ff] [ 33.287133] pci 0004:00:00.0: bridge window [mem 0x34300000-0x343fffff] [ 33.294322] Reusing ITT for devID 0 [ 33.296005] Reusing ITT for devID 0 [ 33.296053] ID:1 pID:8193 vID:203 [ 33.296066] IRQ203 -> 0-7 CPU1 [ 33.296176] IRQ203 -> 0-7 CPU1 [ 33.296240] pcieport 0004:00:00.0: PME: Signaling with IRQ 203 [ 33.302538] pcieport 0004:00:00.0: AER: enabled with IRQ 203 [ 33.308587] mhi-pci-generic 0004:01:00.0: MHI PCI device found: foxconn-sdx55 [ 33.315945] mhi-pci-generic 0004:01:00.0: BAR 0 [mem 0x34300000-0x34300fff 64bit]: assigned [ 33.324583] mhi-pci-generic 0004:01:00.0: enabling device (0000 -> 0002) [ 33.331610] ITS: alloc 8224:8 [ 33.331619] ITT 8 entries, 3 bits [ 33.331750] ID:0 pID:8224 vID:204 [ 33.331756] ID:1 pID:8225 vID:205 [ 33.331762] ID:2 pID:8226 vID:206 [ 33.331769] ID:3 pID:8227 vID:207 [ 33.331774] ID:4 pID:8228 vID:208 [ 33.331791] IRQ204 -> 0-7 CPU2 [ 33.331837] IRQ205 -> 0-7 CPU3 [ 33.331848] IRQ206 -> 0-7 CPU4 [ 33.331860] IRQ207 -> 0-7 CPU5 [ 33.331872] IRQ208 -> 0-7 CPU6 [ 33.332711] IRQ204 -> 0-7 CPU2 [ 33.333016] IRQ205 -> 0-7 CPU3 [ 33.333042] IRQ206 -> 0-7 CPU4 [ 33.333066] IRQ207 -> 0-7 CPU5 [ 33.333090] IRQ208 -> 0-7 CPU6 [ 33.335976] mhi mhi0: Requested to power ON [ 33.340327] mhi 
mhi0: Power on setup success [ 54.242353] mhi-pci-generic 0004:01:00.0: failed to power up MHI controller [ 54.251547] mhi-pci-generic 0004:01:00.0: probe with driver mhi-pci-generic failed with error -110 [ 54.262662] qcom-pcie 1c20000.pcie: supply vdda not found, using dummy regulator [ 54.270794] qcom-pcie 1c20000.pcie: host bridge /soc@0/pcie@1c20000 ranges: [ 54.278042] qcom-pcie 1c20000.pcie: IO 0x003c200000..0x003c2fffff -> 0x0000000000 [ 54.286340] qcom-pcie 1c20000.pcie: MEM 0x003c300000..0x003dffffff -> 0x003c300000 [ 54.409356] qcom-pcie 1c20000.pcie: iATU: unroll T, 8 ob, 8 ib, align 4K, limit 1024G [ 54.519604] qcom-pcie 1c20000.pcie: PCIe Gen.3 x4 link up [ 54.525609] qcom-pcie 1c20000.pcie: PCI host bridge to bus 0002:00 [ 54.532017] pci_bus 0002:00: root bus resource [bus 00-ff] [ 54.537732] pci_bus 0002:00: root bus resource [io 0x200000-0x2fffff] (bus address [0x0000-0xfffff]) [ 54.547830] pci_bus 0002:00: root bus resource [mem 0x3c300000-0x3dffffff] [ 54.555523] pci 0002:00:00.0: [17cb:010e] type 01 class 0x060400 PCIe Root Port [ 54.563629] pci 0002:00:00.0: BAR 0 [mem 0x00000000-0x00000fff] [ 54.570244] pci 0002:00:00.0: PCI bridge to [bus 01-ff] [ 54.576099] pci 0002:00:00.0: bridge window [io 0x200000-0x200fff] [ 54.583121] pci 0002:00:00.0: bridge window [mem 0x00000000-0x000fffff] [ 54.590473] pci 0002:00:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref] [ 54.598841] pci 0002:00:00.0: PME# supported from D0 D3hot D3cold [ 54.606657] pci 0002:01:00.0: [1e0f:0001] type 00 class 0x010802 PCIe Endpoint [ 54.614458] pci 0002:01:00.0: BAR 0 [mem 0x00000000-0x00003fff 64bit] [ 54.621900] pci 0002:01:00.0: PME# supported from D0 D3hot [ 54.635232] sd 0:0:0:0: [sda] Starting disk [ 54.641117] pci 0002:00:00.0: bridge window [mem 0x3c300000-0x3c3fffff]: assigned [ 54.649086] pci 0002:00:00.0: BAR 0 [mem 0x3c400000-0x3c400fff]: assigned [ 54.656299] pci 0002:01:00.0: BAR 0 [mem 0x3c300000-0x3c303fff 64bit]: assigned [ 54.664083] pci 
0002:00:00.0: PCI bridge to [bus 01-ff] [ 54.669688] pci 0002:00:00.0: bridge window [mem 0x3c300000-0x3c3fffff] [ 54.677113] Reusing ITT for devID 0 [ 54.678960] Reusing ITT for devID 0 [ 54.678994] ID:2 pID:8194 vID:205 [ 54.679005] IRQ205 -> 0-7 CPU2 [ 54.679103] IRQ205 -> 0-7 CPU2 [ 54.679123] pcieport 0002:00:00.0: PME: Signaling with IRQ 205 [ 54.685994] pcieport 0002:00:00.0: AER: enabled with IRQ 205 [ 54.693042] nvme nvme0: pci function 0002:01:00.0 [ 54.698150] nvme 0002:01:00.0: enabling device (0000 -> 0002) [ 54.704457] Reusing ITT for devID 100 [ 54.704500] ID:0 pID:8224 vID:206 [ 54.704509] IRQ206 -> 0-7 CPU3 [ 54.706919] IRQ206 -> 0-7 CPU3 [ 115.695904] nvme nvme0: I/O tag 0 (1000) QID 0 timeout, completion polled [ 177.135829] nvme nvme0: I/O tag 1 (1001) QID 0 timeout, completion polled [ 238.575830] nvme nvme0: I/O tag 2 (1002) QID 0 timeout, completion polled [ 300.023834] nvme nvme0: I/O tag 3 (1003) QID 0 timeout, completion polled [ 300.055992] nvme nvme0: allocated 61 MiB host memory buffer.
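The negative numbers at the end of the driver messages quoted in this thread are kernel error codes (Johan names -28 as -ENOSPC above). A minimal sketch for turning them into symbolic names, assuming the generic Linux errno numbering; the `ERRNO_NAMES` table and the `decode` helper are illustrative names and not part of any driver or tool mentioned here:

```python
import re

# Generic Linux errno values for the three codes that appear in this thread.
# Assumption: asm-generic numbering; no other codes are listed.
ERRNO_NAMES = {22: "EINVAL", 28: "ENOSPC", 110: "ETIMEDOUT"}


def decode(message):
    """Return the errno name for a message ending in '-N', or None."""
    # Only a code at the very end of the line is considered, so PCI
    # addresses and timestamps earlier in the line are ignored.
    match = re.search(r"-(\d+)\s*$", message)
    if match is None:
        return None
    return ERRNO_NAMES.get(int(match.group(1)))


if __name__ == "__main__":
    for line in (
        "ath11k_pci 0006:01:00.0: failed to enable msi: -22",
        "ath11k_pci 0006:01:00.0: ath11k_pci_alloc_msi - requesting one vector failed: -28",
        "ath11k_pci 0006:01:00.0: failed to power up :-110",
    ):
        print(decode(line), "<-", line)
```

Lines that do not end in a negative number, such as "MSI vectors: 32", yield `None`.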
On Mon, 15 Jul 2024 15:10:01 +0100,
Johan Hovold <johan@kernel.org> wrote:
>
> On Mon, Jul 15, 2024 at 01:58:13PM +0100, Marc Zyngier wrote:
> > On Mon, 15 Jul 2024 12:18:47 +0100,
> > Johan Hovold <johan@kernel.org> wrote:
> > > On Sun, Jun 23, 2024 at 05:18:31PM +0200, Thomas Gleixner wrote:
> > > > This is version 4 of the series to convert ARM MSI handling over to
> > > > per device MSI domains.
>
> > > This series only showed up in linux-next last Friday and broke interrupt
> > > handling on Qualcomm platforms like sc8280xp (e.g. Lenovo ThinkPad X13s)
> > > and x1e80100 that use the GIC ITS for PCIe MSIs.
> > >
> > > I've applied the series (21 commits from linux-next) on top of 6.10 and
> > > can confirm that the breakage is caused by commits:
> > >
> > > 3d1c927c08fc ("irqchip/gic-v3-its: Switch platform MSI to MSI parent")
> > > 233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]")
> > >
> > > Applying the series up until the change before 3d1c927c08fc unbreaks the
> > > wifi on one machine:
> > >
> > > ath11k_pci 0006:01:00.0: failed to enable msi: -22
> > > ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -22
> > >
> > > and backing up until the commit before 233db05bc37f makes the NVMe come
> > > up again during boot on another.
> > >
> > > I have not tried to debug this further.
> >
> > I need a few things from you though, because you're not giving much to
> > help you (and I'm travelling, which doesn't help).
>
> Yeah, this was just an early heads up.
>
> > Can you at least investigate what in ath11k_pci_alloc_msi() causes the
> > wifi driver to be upset? Does it normally use a single MSI vector or
> > MSI-X? How about your nVME device?
>
> It uses multiple vectors, but now it falls back to trying to allocate a
> single one and even that fails with -ENOSPC:
>
> ath11k_pci 0006:01:00.0: ath11k_pci_alloc_msi - requesting one vector failed: -28
>
> Similar for the NVMe, it uses multiple vectors normally, but now only
> the AER interrupts appears to be allocated for each controller and there
> is a GICv3 interrupt for the NVMe:
>
> 208: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0006:00:00.0 0 Edge PCIe PME, aerdrv
> 212: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0004:00:00.0 0 Edge PCIe PME, aerdrv
> 214: 161 0 0 0 0 0 0 0 GICv3 562 Level nvme0q0, nvme0q1
> 215: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0002:00:00.0 0 Edge PCIe PME, aerdrv
>

That's an indication of the driver having failed its MSI allocation
and gone back to INTx signalling.

> Next boot, after disabling PCIe controller async probing, it's an MSI-X?!:
>
> 201: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0006:00:00.0 0 Edge PCIe PME, aerdrv
> 203: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0004:00:00.0 0 Edge PCIe PME, aerdrv
> 205: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0002:00:00.0 0 Edge PCIe PME, aerdrv
> 206: 0 0 0 0 0 0 0 0 ITS-PCI-MSIX-0002:01:00.0 0 Edge nvme0q0
>

So is this issue actually tied to the async probing? Does it always
work if you disable it?

> This time ath11k vector allocation succeeded, but the driver times out
> eventually:
>
> [ 8.984619] ath11k_pci 0006:01:00.0: MSI vectors: 32
> [ 29.690841] ath11k_pci 0006:01:00.0: failed to power up mhi: -110
> [ 29.697136] ath11k_pci 0006:01:00.0: failed to start mhi: -110
> [ 29.703153] ath11k_pci 0006:01:00.0: failed to power up :-110
> [ 29.732144] ath11k_pci 0006:01:00.0: failed to create soc core: -110
> [ 29.738694] ath11k_pci 0006:01:00.0: failed to init core: -110
> [ 32.841758] ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -110
>
> > It would also help if you could define the DEBUG symbol at the very
> > top of irq-gic-v3-its.c and report the debug information that the ITS
> > driver dumps.
>
> See below (with synchronous probing of the pcie controllers).

I don't see much going wrong there, and the ITS driver correctly
dishes out interrupts. I'll take the current -next for a ride on my
own HW and see what happens.

M.
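The reading of the two interrupt tables above (an NVMe queue sitting on a level-triggered GICv3 line means the driver fell back to INTx, while an `ITS-PCI-MSIX-` chip name means MSI-X was allocated) can be applied mechanically. The following is a minimal Python sketch, not a kernel tool: the function names and the labels `msi-x`, `msi` and `wired` are made up for illustration, and the chip-name patterns are taken only from the rows quoted in this thread, so other platforms may print different names.

```python
import re

def classify_irq_line(line):
    """Classify one /proc/interrupts row by the chip name it shows.

    The labels ('msi-x', 'msi', 'wired', 'unknown') are this sketch's own.
    """
    if re.search(r"\bITS-PCI-MSIX-", line):
        return "msi-x"
    if re.search(r"\bITS-PCI-MSI-", line):
        return "msi"
    if re.search(r"\bGICv3\s+\d+\s+Level\b", line):
        # Level-triggered wired GIC interrupt, e.g. a PCI INTx fallback.
        return "wired"
    return "unknown"

def signalling_for(table, action):
    """Return the set of classifications for rows whose handler list names `action`."""
    return {classify_irq_line(row) for row in table.splitlines() if action in row}

# Rows copied from the two boots quoted above (column spacing collapsed).
FIRST_BOOT = """\
208: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0006:00:00.0 0 Edge PCIe PME, aerdrv
214: 161 0 0 0 0 0 0 0 GICv3 562 Level nvme0q0, nvme0q1
"""
SECOND_BOOT = """\
201: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0006:00:00.0 0 Edge PCIe PME, aerdrv
206: 0 0 0 0 0 0 0 0 ITS-PCI-MSIX-0002:01:00.0 0 Edge nvme0q0
"""

if __name__ == "__main__":
    print(signalling_for(FIRST_BOOT, "nvme0q0"))
    print(signalling_for(SECOND_BOOT, "nvme0q0"))
```

Running the same check on `/proc/interrupts` from every boot would make it easier to see how often the NVMe lands on INTx versus MSI-X.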
On Tue, Jul 16, 2024 at 11:30:05AM +0100, Marc Zyngier wrote: > On Mon, 15 Jul 2024 15:10:01 +0100, > Johan Hovold <johan@kernel.org> wrote: > > On Mon, Jul 15, 2024 at 01:58:13PM +0100, Marc Zyngier wrote: > > > On Mon, 15 Jul 2024 12:18:47 +0100, > > > Johan Hovold <johan@kernel.org> wrote: > > > > On Sun, Jun 23, 2024 at 05:18:31PM +0200, Thomas Gleixner wrote: > > > > > This is version 4 of the series to convert ARM MSI handling over to > > > > > per device MSI domains. > > > > > > This series only showed up in linux-next last Friday and broke interrupt > > > > handling on Qualcomm platforms like sc8280xp (e.g. Lenovo ThinkPad X13s) > > > > and x1e80100 that use the GIC ITS for PCIe MSIs. > > > > > > > > I've applied the series (21 commits from linux-next) on top of 6.10 and > > > > can confirm that the breakage is caused by commits: > > > > > > > > 3d1c927c08fc ("irqchip/gic-v3-its: Switch platform MSI to MSI parent") > > > > 233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]") > > > > > > > > Applying the series up until the change before 3d1c927c08fc unbreaks the > > > > wifi on one machine: > > > > > > > > ath11k_pci 0006:01:00.0: failed to enable msi: -22 > > > > ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -22 Correction, this doesn't fix the wifi, but I'm not seeing these errors with the commit before cc23d1dfc959 as the ath11k driver doesn't get this far (or doesn't probe at all). > > > > and backing up until the commit before 233db05bc37f makes the NVMe come > > > > up again during boot on another. > > > > > > > > I have not tried to debug this further. > > > > > > I need a few things from you though, because you're not giving much to > > > help you (and I'm travelling, which doesn't help). > > > > Yeah, this was just an early heads up. > > > > > Can you at least investigate what in ath11k_pci_alloc_msi() causes the > > > wifi driver to be upset? Does it normally use a single MSI vector or > > > MSI-X? 
How about your nVME device? > > > > It uses multiple vectors, but now it falls back to trying to allocate a > > single one and even that fails with -ENOSPC: > > > > ath11k_pci 0006:01:00.0: ath11k_pci_alloc_msi - requesting one vector failed: -28 > > > > Similar for the NVMe, it uses multiple vectors normally, but now only > > the AER interrupts appears to be allocated for each controller and there > > is a GICv3 interrupt for the NVMe: > > > > 208: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0006:00:00.0 0 Edge PCIe PME, aerdrv > > 212: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0004:00:00.0 0 Edge PCIe PME, aerdrv > > 214: 161 0 0 0 0 0 0 0 GICv3 562 Level nvme0q0, nvme0q1 > > 215: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0002:00:00.0 0 Edge PCIe PME, aerdrv > > > > That's an indication of the driver having failed its MSI allocation > and gone back to INTx signalling. > > > Next boot, after disabling PCIe controller async probing, it's an MSI-X?!: > > > > 201: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0006:00:00.0 0 Edge PCIe PME, aerdrv > > 203: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0004:00:00.0 0 Edge PCIe PME, aerdrv > > 205: 0 0 0 0 0 0 0 0 ITS-PCI-MSI-0002:00:00.0 0 Edge PCIe PME, aerdrv > > 206: 0 0 0 0 0 0 0 0 ITS-PCI-MSIX-0002:01:00.0 0 Edge nvme0q0 > > > > So is this issue actually tied to the async probing? Does it always > work if you disable it? There seem to multiple issues here. With the full series applied and normal async (i.e. parallel) probing of the PCIe controllers I sometimes see allocation failing with -ENOSPC (e.g. the above ath11k errors). This seems to indicate broken locking somewhere. With synchronous probing, allocation always seems to succeed but the ath11k (and modem) drivers time out as no interrupts are received. The NVMe driver sometimes falls back to INTx signalling and can access the drive, but often end up with an MSIX (?!) allocation and then fails to probe: [ 132.084740] nvme nvme0: I/O tag 17 (1011) QID 0 timeout, completion polled Johan
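The failures quoted in this exchange are reported as bare negative numbers (-22, -28, -110). As a reading aid, here is a small Python sketch that appends the symbolic name to such log lines. The table holds only the three codes that occur in this thread, using the generic Linux errno numbering; the helper name `annotate_error` is hypothetical.

```python
import re

# Generic Linux errno values for the three codes seen in this thread.
LINUX_ERRNO = {22: "EINVAL", 28: "ENOSPC", 110: "ETIMEDOUT"}

def annotate_error(line):
    """Append the symbolic errno name to a log line that ends in '-<number>'."""
    match = re.search(r"-(\d+)\s*$", line)
    if not match:
        return line
    name = LINUX_ERRNO.get(int(match.group(1)))
    return f"{line} ({name})" if name else line

if __name__ == "__main__":
    for entry in (
        "ath11k_pci 0006:01:00.0: ath11k_pci_alloc_msi - requesting one vector failed: -28",
        "ath11k_pci 0006:01:00.0: failed to enable msi: -22",
        "ath11k_pci 0006:01:00.0: failed to power up mhi: -110",
    ):
        print(annotate_error(entry))
```

Read this way, the async-probing case fails at allocation time (ENOSPC from the vector request, surfacing as EINVAL from probe), while the synchronous case allocates vectors and then times out (ETIMEDOUT), which matches the description of two different symptoms.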
[Dropping shivamurthy.shastri@linutronix.de who is now bouncing...]

On Tue, 16 Jul 2024 15:53:28 +0100,
Johan Hovold <johan@kernel.org> wrote:
>
> On Tue, Jul 16, 2024 at 11:30:05AM +0100, Marc Zyngier wrote:
> > On Mon, 15 Jul 2024 15:10:01 +0100,
> > Johan Hovold <johan@kernel.org> wrote:
> > > On Mon, Jul 15, 2024 at 01:58:13PM +0100, Marc Zyngier wrote:
> > > > On Mon, 15 Jul 2024 12:18:47 +0100,
> > > > Johan Hovold <johan@kernel.org> wrote:
> > > > > On Sun, Jun 23, 2024 at 05:18:31PM +0200, Thomas Gleixner wrote:
> > > > > > This is version 4 of the series to convert ARM MSI handling over to
> > > > > > per device MSI domains.
> > >
> > > > > This series only showed up in linux-next last Friday and broke interrupt
> > > > > handling on Qualcomm platforms like sc8280xp (e.g. Lenovo ThinkPad X13s)
> > > > > and x1e80100 that use the GIC ITS for PCIe MSIs.
> > > > >
> > > > > I've applied the series (21 commits from linux-next) on top of 6.10 and
> > > > > can confirm that the breakage is caused by commits:
> > > > >
> > > > > 3d1c927c08fc ("irqchip/gic-v3-its: Switch platform MSI to MSI parent")
> > > > > 233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]")
> > > > >
> > > > > Applying the series up until the change before 3d1c927c08fc unbreaks the
> > > > > wifi on one machine:
> > > > >
> > > > > ath11k_pci 0006:01:00.0: failed to enable msi: -22
> > > > > ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -22
>
> Correction, this doesn't fix the wifi, but I'm not seeing these errors
> with the commit before cc23d1dfc959 as the ath11k driver doesn't get
> this far (or doesn't probe at all).

I think we need to track one thing at a time. The wifi and nvme
problems seem subtly different... Which is the exact commit that
breaks nvme on your machine?

[...]

> > So is this issue actually tied to the async probing? Does it always
> > work if you disable it?
>
> There seem to multiple issues here.
>
> With the full series applied and normal async (i.e. parallel) probing of
> the PCIe controllers I sometimes see allocation failing with -ENOSPC
> (e.g. the above ath11k errors). This seems to indicate broken locking
> somewhere.

Your log doesn't support this theory. At least not from an ITS
perspective, as it keeps dishing out INTIDs (and it is very hard to
run out of IRQs with the ITS).

>
> With synchronous probing, allocation always seems to succeed but the
> ath11k (and modem) drivers time out as no interrupts are received.
>
> The NVMe driver sometimes falls back to INTx signalling and can access
> the drive, but often end up with an MSIX (?!) allocation and then fails
> to probe:
>
> [ 132.084740] nvme nvme0: I/O tag 17 (1011) QID 0 timeout, completion polled

So one of my test boxes (ThunderX) fails this exact way, while another
(Synquacer) is pretty happy. Still trying to understand the difference
in behaviour.

How do you enforce synchronous probing?

M.
On Tue, Jul 16, 2024 at 07:21:39PM +0100, Marc Zyngier wrote: > On Tue, 16 Jul 2024 15:53:28 +0100, > Johan Hovold <johan@kernel.org> wrote: > > On Tue, Jul 16, 2024 at 11:30:05AM +0100, Marc Zyngier wrote: > > > On Mon, 15 Jul 2024 15:10:01 +0100, > > > Johan Hovold <johan@kernel.org> wrote: > > > > On Mon, Jul 15, 2024 at 01:58:13PM +0100, Marc Zyngier wrote: > > > > > On Mon, 15 Jul 2024 12:18:47 +0100, > > > > > Johan Hovold <johan@kernel.org> wrote: > > > > > > This series only showed up in linux-next last Friday and broke interrupt > > > > > > handling on Qualcomm platforms like sc8280xp (e.g. Lenovo ThinkPad X13s) > > > > > > and x1e80100 that use the GIC ITS for PCIe MSIs. > > > > > > > > > > > > I've applied the series (21 commits from linux-next) on top of 6.10 and > > > > > > can confirm that the breakage is caused by commits: > > > > > > > > > > > > 3d1c927c08fc ("irqchip/gic-v3-its: Switch platform MSI to MSI parent") > > > > > > 233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]") > > > > > > > > > > > > Applying the series up until the change before 3d1c927c08fc unbreaks the > > > > > > wifi on one machine: > > > > > > > > > > > > ath11k_pci 0006:01:00.0: failed to enable msi: -22 > > > > > > ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -22 > > > > Correction, this doesn't fix the wifi, but I'm not seeing these errors > > with the commit before cc23d1dfc959 as the ath11k driver doesn't get [ This was supposed to say 3d1c927c08fc, which is the mainline hash, sorry. ] > > this far (or doesn't probe at all). > > I think we need to track one thing at a time. The wifi and nvme > problems seem subtly different... Which is the exact commit that > breaks nvme on your machine? Yeah, forget about 3d1c927c08fc for now, which may have been a red herring since we're also appear to be dealing with some sort of race and (some) symptoms keep changing from boot to boot. 
The only thing that for certain is that the series breaks MSI and that the NVMe breaks with commit 233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]"). > > > So is this issue actually tied to the async probing? Does it always > > > work if you disable it? > > > > There seem to multiple issues here. > > > > With the full series applied and normal async (i.e. parallel) probing of > > the PCIe controllers I sometimes see allocation failing with -ENOSPC > > (e.g. the above ath11k errors). This seems to indicate broken locking > > somewhere. > > Your log doesn't support this theory. At least not from an ITS > perspective, as it keeps dishing out INTIDs (and it is very hard to > run out of IRQs with the ITS). The log I shared was with synchronous probing which takes parallel allocation out of the equation (and gives more readable logs) so that is expected. See below for a log with normal async probing that may give some more insight into the race as well (i.e. when ath11k allocation fails with -ENOSPC.) > > With synchronous probing, allocation always seems to succeed but the > > ath11k (and modem) drivers time out as no interrupts are received. > > > > The NVMe driver sometimes falls back to INTx signalling and can access > > the drive, but often end up with an MSIX (?!) allocation and then fails > > to probe: > > > > [ 132.084740] nvme nvme0: I/O tag 17 (1011) QID 0 timeout, completion polled > > So one of my test boxes (ThunderX) fails this exact way, while another > (Synquacer) is pretty happy. Still trying to understand the difference > in behaviour. > > How do you enforce synchronous probing? I believe there is a kernel parameter for this (e.g. 
module.async_probe), but I just disable async probing for the Qualcomm PCIe driver I'm using: --- a/drivers/pci/controller/dwc/pcie-qcom.c +++ b/drivers/pci/controller/dwc/pcie-qcom.c @@ -1684,7 +1684,7 @@ static struct platform_driver qcom_pcie_driver = { .name = "qcom-pcie", .of_match_table = qcom_pcie_match, .pm = &qcom_pcie_pm_ops, - .probe_type = PROBE_PREFER_ASYNCHRONOUS, + //.probe_type = PROBE_PREFER_ASYNCHRONOUS, }, }; Johan [ 8.323957] qcom-pcie 1c00000.pcie: host bridge /soc@0/pcie@1c00000 ranges: [ 8.334800] qcom-pcie 1c00000.pcie: IO 0x0030200000..0x00302fffff -> 0x0000000000 [ 8.348124] qcom-pcie 1c00000.pcie: MEM 0x0030300000..0x0031ffffff -> 0x0030300000 [ 8.378334] qcom-pcie 1c10000.pcie: host bridge /soc@0/pcie@1c10000 ranges: [ 8.378632] qcom-pcie 1c20000.pcie: host bridge /soc@0/pcie@1c20000 ranges: [ 8.378654] qcom-pcie 1c20000.pcie: IO 0x003c200000..0x003c2fffff -> 0x0000000000 [ 8.378666] qcom-pcie 1c20000.pcie: MEM 0x003c300000..0x003dffffff -> 0x003c300000 [ 8.391084] qcom-pcie 1c10000.pcie: IO 0x0034200000..0x00342fffff -> 0x0000000000 [ 8.419252] qcom-pcie 1c10000.pcie: MEM 0x0034300000..0x0035ffffff -> 0x0034300000 [ 8.477255] qcom-pcie 1c00000.pcie: iATU: unroll T, 8 ob, 8 ib, align 4K, limit 1024G [ 8.497259] qcom-pcie 1c20000.pcie: iATU: unroll T, 8 ob, 8 ib, align 4K, limit 1024G [ 8.537258] qcom-pcie 1c10000.pcie: iATU: unroll T, 8 ob, 8 ib, align 4K, limit 1024G [ 8.583746] qcom-pcie 1c00000.pcie: PCIe Gen.2 x1 link up [ 8.590079] qcom-pcie 1c00000.pcie: PCI host bridge to bus 0006:00 [ 8.596838] pci_bus 0006:00: root bus resource [bus 00-ff] [ 8.602874] pci_bus 0006:00: root bus resource [io 0x0000-0xfffff] [ 8.603809] qcom-pcie 1c20000.pcie: PCIe Gen.3 x4 link up [ 8.609322] pci_bus 0006:00: root bus resource [mem 0x30300000-0x31ffffff] [ 8.609393] pci 0006:00:00.0: [17cb:010e] type 01 class 0x060400 PCIe Root Port [ 8.615040] qcom-pcie 1c20000.pcie: PCI host bridge to bus 0002:00 [ 8.621951] pci 0006:00:00.0: BAR 0 [mem 
0x00000000-0x00000fff] [ 8.629452] pci_bus 0002:00: root bus resource [bus 00-ff] [ 8.629706] pci_bus 0002:00: root bus resource [io 0x100000-0x1fffff] (bus address [0x0000-0xfffff]) [ 8.635822] pci 0006:00:00.0: PCI bridge to [bus 01-ff] [ 8.641903] pci_bus 0002:00: root bus resource [mem 0x3c300000-0x3dffffff] [ 8.643728] qcom-pcie 1c10000.pcie: PCIe Gen.3 x2 link up [ 8.643851] qcom-pcie 1c10000.pcie: PCI host bridge to bus 0004:00 [ 8.643854] pci_bus 0004:00: root bus resource [bus 00-ff] [ 8.643857] pci_bus 0004:00: root bus resource [io 0x200000-0x2fffff] (bus address [0x0000-0xfffff]) [ 8.643859] pci_bus 0004:00: root bus resource [mem 0x34300000-0x35ffffff] [ 8.643873] pci 0004:00:00.0: [17cb:010e] type 01 class 0x060400 PCIe Root Port [ 8.643881] pci 0004:00:00.0: BAR 0 [mem 0x00000000-0x00000fff] [ 8.643890] pci 0004:00:00.0: PCI bridge to [bus 01-ff] [ 8.643894] pci 0004:00:00.0: bridge window [io 0x200000-0x200fff] [ 8.643897] pci 0004:00:00.0: bridge window [mem 0x00000000-0x000fffff] [ 8.643903] pci 0004:00:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref] [ 8.643982] pci 0004:00:00.0: PME# supported from D0 D3hot D3cold [ 8.644933] pci 0004:01:00.0: [17cb:0306] type 00 class 0xff0000 PCIe Endpoint [ 8.645012] pci 0004:01:00.0: BAR 0 [mem 0x00000000-0x00000fff 64bit] [ 8.645063] pci 0004:01:00.0: BAR 2 [mem 0x00000000-0x00000fff 64bit] [ 8.645614] pci 0004:01:00.0: PME# supported from D0 D3hot D3cold [ 8.645768] pci 0004:01:00.0: 15.752 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x2 link at 0004:00:00.0 (capable of 31.506 Gb/s with 16.0 GT/s PCIe x2 link) [ 8.647523] pci 0006:00:00.0: bridge window [io 0x0000-0x0fff] [ 8.659851] pci 0004:00:00.0: bridge window [mem 0x34300000-0x343fffff]: assigned [ 8.659862] pci 0002:00:00.0: [17cb:010e] type 01 class 0x060400 PCIe Root Port [ 8.659873] pci 0002:00:00.0: BAR 0 [mem 0x00000000-0x00000fff] [ 8.659883] pci 0002:00:00.0: PCI bridge to [bus 01-ff] [ 8.659889] pci 0002:00:00.0: bridge 
window [io 0x100000-0x100fff] [ 8.659893] pci 0002:00:00.0: bridge window [mem 0x00000000-0x000fffff] [ 8.659900] pci 0002:00:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref] [ 8.659962] pci 0002:00:00.0: PME# supported from D0 D3hot D3cold [ 8.661170] pci 0002:01:00.0: [1e0f:0001] type 00 class 0x010802 PCIe Endpoint [ 8.661259] pci 0002:01:00.0: BAR 0 [mem 0x00000000-0x00003fff 64bit] [ 8.661825] pci 0002:01:00.0: PME# supported from D0 D3hot [ 8.662365] pci 0006:00:00.0: bridge window [mem 0x00000000-0x000fffff] [ 8.669410] pci 0004:00:00.0: BAR 0 [mem 0x34400000-0x34400fff]: assigned [ 8.671873] pci 0002:00:00.0: bridge window [mem 0x3c300000-0x3c3fffff]: assigned [ 8.671879] pci 0002:00:00.0: BAR 0 [mem 0x3c400000-0x3c400fff]: assigned [ 8.671887] pci 0002:01:00.0: BAR 0 [mem 0x3c300000-0x3c303fff 64bit]: assigned [ 8.671931] pci 0002:00:00.0: PCI bridge to [bus 01-ff] [ 8.671936] pci 0002:00:00.0: bridge window [mem 0x3c300000-0x3c3fffff] [ 8.672719] ITS: alloc 8192:32 [ 8.672737] ITT 32 entries, 5 bits [ 8.674420] ID:0 pID:8192 vID:196 [ 8.674436] ID:1 pID:8193 vID:197 [ 8.674444] ID:2 pID:8194 vID:198 [ 8.674452] ID:3 pID:8195 vID:199 [ 8.674461] ID:4 pID:8196 vID:200 [ 8.674469] ID:5 pID:8197 vID:201 [ 8.674476] ID:6 pID:8198 vID:202 [ 8.674485] ID:7 pID:8199 vID:203 [ 8.674493] ID:8 pID:8200 vID:204 [ 8.674501] ID:9 pID:8201 vID:205 [ 8.674508] ID:10 pID:8202 vID:206 [ 8.674517] ID:11 pID:8203 vID:207 [ 8.674525] ID:12 pID:8204 vID:208 [ 8.674532] ID:13 pID:8205 vID:209 [ 8.674540] ID:14 pID:8206 vID:210 [ 8.674548] ID:15 pID:8207 vID:211 [ 8.674556] ID:16 pID:8208 vID:212 [ 8.674564] ID:17 pID:8209 vID:213 [ 8.674572] ID:18 pID:8210 vID:214 [ 8.674580] ID:19 pID:8211 vID:215 [ 8.674588] ID:20 pID:8212 vID:216 [ 8.674596] ID:21 pID:8213 vID:217 [ 8.674604] ID:22 pID:8214 vID:218 [ 8.674612] ID:23 pID:8215 vID:219 [ 8.674620] ID:24 pID:8216 vID:220 [ 8.674628] ID:25 pID:8217 vID:221 [ 8.674636] ID:26 pID:8218 vID:222 [ 8.674643] ID:27 pID:8219 
vID:223 [ 8.674651] ID:28 pID:8220 vID:224 [ 8.674659] ID:29 pID:8221 vID:225 [ 8.674667] ID:30 pID:8222 vID:226 [ 8.674675] ID:31 pID:8223 vID:227 [ 8.674824] IRQ196 -> 0-7 CPU0 [ 8.674850] IRQ197 -> 0-7 CPU1 [ 8.674864] IRQ198 -> 0-7 CPU2 [ 8.674878] IRQ199 -> 0-7 CPU3 [ 8.674891] IRQ200 -> 0-7 CPU4 [ 8.674905] IRQ201 -> 0-7 CPU5 [ 8.674918] IRQ202 -> 0-7 CPU6 [ 8.674932] IRQ203 -> 0-7 CPU7 [ 8.674945] IRQ204 -> 0-7 CPU0 [ 8.674951] pci 0006:00:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref] [ 8.675005] pci 0006:00:00.0: PME# supported from D0 D3hot D3cold [ 8.675887] pci 0006:01:00.0: [17cb:1103] type 00 class 0x028000 PCIe Endpoint [ 8.675983] pci 0006:01:00.0: BAR 0 [mem 0x00000000-0x001fffff 64bit] [ 8.676613] pci 0006:01:00.0: PME# supported from D0 D3hot D3cold [ 8.676779] pci 0006:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0006:00:00.0 (capable of 7.876 Gb/s with 8.0 GT/s PCIe x1 link) [ 8.681292] pci 0004:01:00.0: BAR 0 [mem 0x34300000-0x34300fff 64bit]: assigned [ 8.681332] pci 0004:01:00.0: BAR 2 [mem 0x34301000-0x34301fff 64bit]: assigned [ 8.686968] IRQ205 -> 0-7 CPU1 [ 8.691823] pci 0006:00:00.0: bridge window [mem 0x30400000-0x305fffff]: assigned [ 8.691825] pci 0006:00:00.0: BAR 0 [mem 0x30300000-0x30300fff]: assigned [ 8.691829] pci 0006:01:00.0: BAR 0 [mem 0x30400000-0x305fffff 64bit]: assigned [ 8.691877] pci 0006:00:00.0: PCI bridge to [bus 01-ff] [ 8.691880] pci 0006:00:00.0: bridge window [mem 0x30400000-0x305fffff] [ 8.692011] Reusing ITT for devID 0 [ 8.693668] Reusing ITT for devID 0 [ 8.693871] pcieport 0006:00:00.0: PME: Signaling with IRQ 228 [ 8.694116] pcieport 0006:00:00.0: AER: enabled with IRQ 228 [ 8.696453] pci 0004:00:00.0: PCI bridge to [bus 01-ff] [ 8.703760] IRQ206 -> 0-7 CPU2 [ 8.710986] pci 0004:00:00.0: bridge window [mem 0x34300000-0x343fffff] [ 8.711136] Reusing ITT for devID 0 [ 8.717093] IRQ207 -> 0-7 CPU3 [ 8.723889] Reusing ITT for devID 0 [ 8.729600] IRQ208 -> 0-7 
CPU4 [ 8.736507] pcieport 0004:00:00.0: PME: Signaling with IRQ 229 [ 8.744261] IRQ209 -> 0-7 CPU5 [ 8.750757] pcieport 0004:00:00.0: AER: enabled with IRQ 229 [ 8.758038] IRQ210 -> 0-7 CPU6 [ 9.071793] IRQ211 -> 0-7 CPU7 [ 9.071807] IRQ212 -> 0-7 CPU0 [ 9.071819] IRQ213 -> 0-7 CPU1 [ 9.071831] IRQ214 -> 0-7 CPU2 [ 9.071842] IRQ215 -> 0-7 CPU3 [ 9.071852] IRQ216 -> 0-7 CPU4 [ 9.071863] IRQ217 -> 0-7 CPU5 [ 9.071875] IRQ218 -> 0-7 CPU6 [ 9.071886] IRQ219 -> 0-7 CPU7 [ 9.071897] IRQ220 -> 0-7 CPU0 [ 9.071907] IRQ221 -> 0-7 CPU1 [ 9.071920] IRQ222 -> 0-7 CPU2 [ 9.071930] IRQ223 -> 0-7 CPU3 [ 9.071941] IRQ224 -> 0-7 CPU4 [ 9.071952] IRQ225 -> 0-7 CPU5 [ 9.071962] IRQ226 -> 0-7 CPU6 [ 9.071973] IRQ227 -> 0-7 CPU7 [ 9.073568] Reusing ITT for devID 0 [ 9.073607] ID:0 pID:8192 vID:196 [ 9.073618] IRQ196 -> 0-7 CPU0 [ 9.073717] IRQ196 -> 0-7 CPU0 [ 9.073737] pcieport 0002:00:00.0: PME: Signaling with IRQ 196 [ 9.086532] pcieport 0002:00:00.0: AER: enabled with IRQ 196 [ 9.102057] mhi-pci-generic 0004:01:00.0: MHI PCI device found: foxconn-sdx55 [ 9.109830] mhi-pci-generic 0004:01:00.0: BAR 0 [mem 0x34300000-0x34300fff 64bit]: assigned [ 9.119027] mhi-pci-generic 0004:01:00.0: enabling device (0000 -> 0002) [ 9.127271] ITS: alloc 8224:8 [ 9.141500] ITT 8 entries, 3 bits [ 9.144502] ID:0 pID:8224 vID:198 [ 9.144597] ID:1 pID:8225 vID:199 [ 9.144605] ID:2 pID:8226 vID:200 [ 9.144612] ID:3 pID:8227 vID:201 [ 9.144619] ID:4 pID:8228 vID:202 [ 9.144689] IRQ198 -> 0-7 CPU1 [ 9.144888] IRQ199 -> 0-7 CPU2 [ 9.144901] IRQ200 -> 0-7 CPU3 [ 9.144914] IRQ201 -> 0-7 CPU4 [ 9.144927] IRQ202 -> 0-7 CPU5 [ 9.151264] IRQ198 -> 0-7 CPU1 [ 9.151479] IRQ199 -> 0-7 CPU2 [ 9.151673] IRQ200 -> 0-7 CPU3 [ 9.151849] IRQ201 -> 0-7 CPU4 [ 9.152056] IRQ202 -> 0-7 CPU5 [ 9.159972] mhi mhi0: Requested to power ON [ 9.165275] mhi mhi0: Power on setup success [ 9.279951] ath11k_pci 0006:01:00.0: BAR 0 [mem 0x30400000-0x305fffff 64bit]: assigned [ 9.288208] ath11k_pci 0006:01:00.0: enabling device (0000 -> 
0002) [ 9.301708] nvme nvme0: pci function 0002:01:00.0 [ 9.307052] Reusing ITT for devID 100 [ 9.315457] nvme 0002:01:00.0: enabling device (0000 -> 0002) [ 9.326554] Reusing ITT for devID 100 [ 9.336332] ath11k_pci 0006:01:00.0: ath11k_pci_alloc_msi - requesting one vector failed: -28 [ 9.344362] Reusing ITT for devID 100 [ 9.351639] ath11k_pci 0006:01:00.0: failed to enable msi: -22 [ 9.351866] ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -22 [ 9.360327] Reusing ITT for devID 100 [ 9.654429] nvme nvme0: allocated 61 MiB host memory buffer. [ 9.814664] Reusing ITT for devID 100 [ 9.815000] Reusing ITT for devID 100 [ 9.815553] Reusing ITT for devID 100 [ 9.843417] nvme nvme0: 1/0/0 default/read/poll queues [ 9.875782] nvme0n1: p1 [ 29.666877] mhi-pci-generic 0004:01:00.0: failed to power up MHI controller [ 29.681492] mhi-pci-generic 0004:01:00.0: probe with driver mhi-pci-generic failed with error -110
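With DEBUG defined, the ITS driver output in the log above interleaves allocation lines with `Reusing ITT for devID ...` messages from several devices, which is hard to follow by eye. A minimal Python sketch for tallying those messages per device ID is shown here. It assumes only the message format visible in this log; the helper name `itt_reuse_counts` is hypothetical, and the ID is kept as the string the kernel printed rather than interpreted.

```python
import re
from collections import Counter

# Message format as it appears in the ITS debug output quoted in this thread.
REUSE_RE = re.compile(r"Reusing ITT for devID ([0-9a-fA-F]+)")

def itt_reuse_counts(log):
    """Count 'Reusing ITT for devID <id>' messages, keyed by the printed ID string."""
    return Counter(REUSE_RE.findall(log))

# A short excerpt from the async-probing log above.
EXCERPT = """\
[ 8.692011] Reusing ITT for devID 0
[ 8.693668] Reusing ITT for devID 0
[ 8.693871] pcieport 0006:00:00.0: PME: Signaling with IRQ 228
[ 9.307052] Reusing ITT for devID 100
[ 9.315457] nvme 0002:01:00.0: enabling device (0000 -> 0002)
[ 9.326554] Reusing ITT for devID 100
"""

if __name__ == "__main__":
    for dev_id, count in sorted(itt_reuse_counts(EXCERPT).items()):
        print(f"devID {dev_id}: {count} reuse message(s)")
```

Comparing the tally from a boot with the series applied against one with it reverted gives a quick first check before reading the full trace.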
On Wed, 17 Jul 2024 08:23:39 +0100, Johan Hovold <johan@kernel.org> wrote: > > On Tue, Jul 16, 2024 at 07:21:39PM +0100, Marc Zyngier wrote: > > On Tue, 16 Jul 2024 15:53:28 +0100, > > Johan Hovold <johan@kernel.org> wrote: > > > On Tue, Jul 16, 2024 at 11:30:05AM +0100, Marc Zyngier wrote: > > > > On Mon, 15 Jul 2024 15:10:01 +0100, > > > > Johan Hovold <johan@kernel.org> wrote: > > > > > On Mon, Jul 15, 2024 at 01:58:13PM +0100, Marc Zyngier wrote: > > > > > > On Mon, 15 Jul 2024 12:18:47 +0100, > > > > > > Johan Hovold <johan@kernel.org> wrote: > > > > > > > > This series only showed up in linux-next last Friday and broke interrupt > > > > > > > handling on Qualcomm platforms like sc8280xp (e.g. Lenovo ThinkPad X13s) > > > > > > > and x1e80100 that use the GIC ITS for PCIe MSIs. > > > > > > > > > > > > > > I've applied the series (21 commits from linux-next) on top of 6.10 and > > > > > > > can confirm that the breakage is caused by commits: > > > > > > > > > > > > > > 3d1c927c08fc ("irqchip/gic-v3-its: Switch platform MSI to MSI parent") > > > > > > > 233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]") > > > > > > > > > > > > > > Applying the series up until the change before 3d1c927c08fc unbreaks the > > > > > > > wifi on one machine: > > > > > > > > > > > > > > ath11k_pci 0006:01:00.0: failed to enable msi: -22 > > > > > > > ath11k_pci 0006:01:00.0: probe with driver ath11k_pci failed with error -22 > > > > > > Correction, this doesn't fix the wifi, but I'm not seeing these errors > > > with the commit before cc23d1dfc959 as the ath11k driver doesn't get > > [ This was supposed to say 3d1c927c08fc, which is the mainline hash, > sorry. ] > > > > this far (or doesn't probe at all). > > > > I think we need to track one thing at a time. The wifi and nvme > > problems seem subtly different... Which is the exact commit that > > breaks nvme on your machine? 
> > Yeah, forget about 3d1c927c08fc for now, which may have been a red > herring since we're also appear to be dealing with some sort of race and > (some) symptoms keep changing from boot to boot. The only thing that for > certain is that the series breaks MSI and that the NVMe breaks with > commit 233db05bc37f ("irqchip/gic-v3-its: Provide MSI parent for > PCI/MSI[-X]"). > > > > > So is this issue actually tied to the async probing? Does it always > > > > work if you disable it? > > > > > > There seem to multiple issues here. > > > > > > With the full series applied and normal async (i.e. parallel) probing of > > > the PCIe controllers I sometimes see allocation failing with -ENOSPC > > > (e.g. the above ath11k errors). This seems to indicate broken locking > > > somewhere. > > > > Your log doesn't support this theory. At least not from an ITS > > perspective, as it keeps dishing out INTIDs (and it is very hard to > > run out of IRQs with the ITS). > > The log I shared was with synchronous probing which takes parallel > allocation out of the equation (and gives more readable logs) so that is > expected. See below for a log with normal async probing that may give > some more insight into the race as well (i.e. when ath11k allocation > fails with -ENOSPC.) Huh, this log is actually pointing at something very ugly. Not a race, but some horrible ID confusion. See below. > > > > With synchronous probing, allocation always seems to succeed but the > > > ath11k (and modem) drivers time out as no interrupts are received. > > > > > > The NVMe driver sometimes falls back to INTx signalling and can access > > > the drive, but often end up with an MSIX (?!) allocation and then fails > > > to probe: > > > > > > [ 132.084740] nvme nvme0: I/O tag 17 (1011) QID 0 timeout, completion polled > > > > So one of my test boxes (ThunderX) fails this exact way, while another > > (Synquacer) is pretty happy. Still trying to understand the difference > > in behaviour. 
> > > > How do you enforce synchronous probing? > > I believe there is a kernel parameter for this (e.g. > module.async_probe), but I just disable async probing for the Qualcomm > PCIe driver I'm using: I had tried this module parameter, but it didn't change anything on my end. > > --- a/drivers/pci/controller/dwc/pcie-qcom.c > +++ b/drivers/pci/controller/dwc/pcie-qcom.c > @@ -1684,7 +1684,7 @@ static struct platform_driver qcom_pcie_driver = { > .name = "qcom-pcie", > .of_match_table = qcom_pcie_match, > .pm = &qcom_pcie_pm_ops, > - .probe_type = PROBE_PREFER_ASYNCHRONOUS, > + //.probe_type = PROBE_PREFER_ASYNCHRONOUS, > }, > }; I'll have a look whether the TX1 PCIe driver uses this. It's positively ancient, so I wouldn't bet that it has been touched significantly in the past 5 years. [...] > [ 8.692011] Reusing ITT for devID 0 > [ 8.693668] Reusing ITT for devID 0 This is really odd. It indicates that you have several devices sharing the same DeviceID, which I seriously doubt it is the case in a laptop. Do you have any non-transparent bridge here? lspci would help. > [ 8.693871] pcieport 0006:00:00.0: PME: Signaling with IRQ 228 > [ 8.694116] pcieport 0006:00:00.0: AER: enabled with IRQ 228 > [ 8.696453] pci 0004:00:00.0: PCI bridge to [bus 01-ff] > [ 8.703760] IRQ206 -> 0-7 CPU2 > [ 8.710986] pci 0004:00:00.0: bridge window [mem 0x34300000-0x343fffff] > [ 8.711136] Reusing ITT for devID 0 Where is the bus number gone? 
> [ 8.717093] IRQ207 -> 0-7 CPU3 > [ 8.723889] Reusing ITT for devID 0 > [ 8.729600] IRQ208 -> 0-7 CPU4 > [ 8.736507] pcieport 0004:00:00.0: PME: Signaling with IRQ 229 > [ 8.744261] IRQ209 -> 0-7 CPU5 > [ 8.750757] pcieport 0004:00:00.0: AER: enabled with IRQ 229 > [ 8.758038] IRQ210 -> 0-7 CPU6 > [ 9.071793] IRQ211 -> 0-7 CPU7 > [ 9.071807] IRQ212 -> 0-7 CPU0 > [ 9.071819] IRQ213 -> 0-7 CPU1 > [ 9.071831] IRQ214 -> 0-7 CPU2 > [ 9.071842] IRQ215 -> 0-7 CPU3 > [ 9.071852] IRQ216 -> 0-7 CPU4 > [ 9.071863] IRQ217 -> 0-7 CPU5 > [ 9.071875] IRQ218 -> 0-7 CPU6 > [ 9.071886] IRQ219 -> 0-7 CPU7 > [ 9.071897] IRQ220 -> 0-7 CPU0 > [ 9.071907] IRQ221 -> 0-7 CPU1 > [ 9.071920] IRQ222 -> 0-7 CPU2 > [ 9.071930] IRQ223 -> 0-7 CPU3 > [ 9.071941] IRQ224 -> 0-7 CPU4 > [ 9.071952] IRQ225 -> 0-7 CPU5 > [ 9.071962] IRQ226 -> 0-7 CPU6 > [ 9.071973] IRQ227 -> 0-7 CPU7 > [ 9.073568] Reusing ITT for devID 0 > [ 9.073607] ID:0 pID:8192 vID:196 > [ 9.073618] IRQ196 -> 0-7 CPU0 > [ 9.073717] IRQ196 -> 0-7 CPU0 > [ 9.073737] pcieport 0002:00:00.0: PME: Signaling with IRQ 196 > [ 9.086532] pcieport 0002:00:00.0: AER: enabled with IRQ 196 > [ 9.102057] mhi-pci-generic 0004:01:00.0: MHI PCI device found: foxconn-sdx55 > [ 9.109830] mhi-pci-generic 0004:01:00.0: BAR 0 [mem 0x34300000-0x34300fff 64bit]: assigned > [ 9.119027] mhi-pci-generic 0004:01:00.0: enabling device (0000 -> 0002) > [ 9.127271] ITS: alloc 8224:8 > [ 9.141500] ITT 8 entries, 3 bits > [ 9.144502] ID:0 pID:8224 vID:198 > [ 9.144597] ID:1 pID:8225 vID:199 > [ 9.144605] ID:2 pID:8226 vID:200 > [ 9.144612] ID:3 pID:8227 vID:201 > [ 9.144619] ID:4 pID:8228 vID:202 > [ 9.144689] IRQ198 -> 0-7 CPU1 > [ 9.144888] IRQ199 -> 0-7 CPU2 > [ 9.144901] IRQ200 -> 0-7 CPU3 > [ 9.144914] IRQ201 -> 0-7 CPU4 > [ 9.144927] IRQ202 -> 0-7 CPU5 > [ 9.151264] IRQ198 -> 0-7 CPU1 > [ 9.151479] IRQ199 -> 0-7 CPU2 > [ 9.151673] IRQ200 -> 0-7 CPU3 > [ 9.151849] IRQ201 -> 0-7 CPU4 > [ 9.152056] IRQ202 -> 0-7 CPU5 > [ 9.159972] mhi mhi0: Requested to power ON 
> [ 9.165275] mhi mhi0: Power on setup success > [ 9.279951] ath11k_pci 0006:01:00.0: BAR 0 [mem 0x30400000-0x305fffff 64bit]: assigned > [ 9.288208] ath11k_pci 0006:01:00.0: enabling device (0000 -> 0002) > [ 9.301708] nvme nvme0: pci function 0002:01:00.0 > [ 9.307052] Reusing ITT for devID 100 > [ 9.315457] nvme 0002:01:00.0: enabling device (0000 -> 0002) This is device 0002:01:00.0... > [ 9.326554] Reusing ITT for devID 100 ... seen as device 0000:01:00.0. WTF??? > [ 9.336332] ath11k_pci 0006:01:00.0: ath11k_pci_alloc_msi - requesting one vector failed: -28 I'm starting to suspect that the new code doesn't carry all the required bits for the DevID, and that we end-up trying to allocated interrupts from the pool allocated to another device, which can never be a good thing, and would explain why everything dies a painful death. Can you run the same trace with the whole thing reverted? I think we're on something here. Thanks, M.
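The suspicion above can be illustrated with the standard PCI requester ID layout (bus in bits 15:8, device in bits 7:3, function in bits 2:0), which does not include the PCI segment. The Python sketch below is an illustration under stated assumptions, not the ITS driver's logic: it assumes the debug message prints the DeviceID in hexadecimal, and it assumes the per-controller part of the DeviceID is being lost, which is a hypothesis at this point in the thread. The helper names are hypothetical; the device addresses are the ones that appear in the logs.

```python
def parse_bdf(addr):
    """Split 'SSSS:BB:DD.F' into (segment, bus, device, function) integers."""
    segment, bus, devfn = addr.split(":")
    device, function = devfn.split(".")
    return int(segment, 16), int(bus, 16), int(device, 16), int(function, 16)

def requester_id(addr):
    """16-bit PCI requester ID: bus[15:8] | device[7:3] | function[2:0].

    The segment (PCI domain) is deliberately ignored, to show what is left
    when nothing distinguishes one host controller from another.
    """
    _segment, bus, device, function = parse_bdf(addr)
    return (bus << 8) | (device << 3) | function

def group_by_requester_id(addrs):
    """Map each requester ID to the list of device addresses that share it."""
    groups = {}
    for addr in addrs:
        groups.setdefault(requester_id(addr), []).append(addr)
    return groups

# Root ports and endpoints seen in the logs in this thread.
DEVICES = [
    "0002:00:00.0", "0002:01:00.0",
    "0004:00:00.0", "0004:01:00.0",
    "0006:00:00.0", "0006:01:00.0",
]

if __name__ == "__main__":
    for rid, addrs in sorted(group_by_requester_id(DEVICES).items()):
        print(f"devID {rid:x}: {', '.join(addrs)}")
```

Under these assumptions the three root ports collapse onto ID 0 and the three endpoints (NVMe, modem, wifi) onto ID 100, the same two values that appear in the `Reusing ITT for devID` messages.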
On Wed, Jul 17, 2024 at 01:54:40PM +0100, Marc Zyngier wrote:
> On Wed, 17 Jul 2024 08:23:39 +0100,
> Johan Hovold <johan@kernel.org> wrote:

> > I believe there is a kernel parameter for this (e.g.
> > module.async_probe), but I just disable async probing for the Qualcomm
> > PCIe driver I'm using:
>
> I had tried this module parameter, but it didn't change anything on my
> end.

> I'll have a look whether the TX1 PCIe driver uses this. It's
> positively ancient, so I wouldn't bet that it has been touched
> significantly in the past 5 years.

Perhaps async probing just changes the symptoms; the NVMe and wifi
don't work in either case.

> > [ 8.692011] Reusing ITT for devID 0
> > [ 8.693668] Reusing ITT for devID 0
>
> This is really odd. It indicates that you have several devices sharing
> the same DeviceID, which I seriously doubt it is the case in a
> laptop. Do you have any non-transparent bridge here? lspci would help.

Yeah, and these messages do not show up without the series (see log
below). They are there in the previous synchronous log however.

0002:00:00.0 PCI bridge: Qualcomm Technologies, Inc SC8280XP PCI Express Root Port
0002:01:00.0 Non-Volatile memory controller: KIOXIA Corporation NVMe SSD Controller BG4 (DRAM-less)
0004:00:00.0 PCI bridge: Qualcomm Technologies, Inc SC8280XP PCI Express Root Port
0004:01:00.0 Unassigned class [ff00]: Qualcomm Technologies, Inc SDX55 [Snapdragon X55 5G]
0006:00:00.0 PCI bridge: Qualcomm Technologies, Inc SC8280XP PCI Express Root Port
0006:01:00.0 Network controller: Qualcomm Technologies, Inc QCNFA765 Wireless Network Adapter (rev 01)

> I'm starting to suspect that the new code doesn't carry all the
> required bits for the DevID, and that we end-up trying to allocated
> interrupts from the pool allocated to another device, which can never
> be a good thing, and would explain why everything dies a painful
> death.
>
> Can you run the same trace with the whole thing reverted? I think
> we're on something here.
See below, using normal asynchronous probing like the previous log. Johan [ 8.129424] qcom-pcie 1c10000.pcie: host bridge /soc@0/pcie@1c10000 ranges: [ 8.136886] qcom-pcie 1c10000.pcie: IO 0x0034200000..0x00342fffff -> 0x0000000000 [ 8.145351] qcom-pcie 1c00000.pcie: host bridge /soc@0/pcie@1c00000 ranges: [ 8.145372] qcom-pcie 1c10000.pcie: MEM 0x0034300000..0x0035ffffff -> 0x0034300000 [ 8.146042] qcom-pcie 1c20000.pcie: host bridge /soc@0/pcie@1c20000 ranges: [ 8.146063] qcom-pcie 1c20000.pcie: IO 0x003c200000..0x003c2fffff -> 0x0000000000 [ 8.146073] qcom-pcie 1c20000.pcie: MEM 0x003c300000..0x003dffffff -> 0x003c300000 [ 8.152546] qcom-pcie 1c00000.pcie: IO 0x0030200000..0x00302fffff -> 0x0000000000 [ 8.176372] qcom-pcie 1c00000.pcie: MEM 0x0030300000..0x0031ffffff -> 0x0030300000 [ 8.266560] qcom-pcie 1c20000.pcie: iATU: unroll T, 8 ob, 8 ib, align 4K, limit 1024G [ 8.298587] qcom-pcie 1c10000.pcie: iATU: unroll T, 8 ob, 8 ib, align 4K, limit 1024G [ 8.318753] qcom-pcie 1c00000.pcie: iATU: unroll T, 8 ob, 8 ib, align 4K, limit 1024G [ 8.377720] qcom-pcie 1c20000.pcie: PCIe Gen.3 x4 link up [ 8.384650] qcom-pcie 1c20000.pcie: PCI host bridge to bus 0002:00 [ 8.392099] pci_bus 0002:00: root bus resource [bus 00-ff] [ 8.398766] pci_bus 0002:00: root bus resource [io 0x100000-0x1fffff] (bus address [0x0000-0xfffff]) [ 8.405033] qcom-pcie 1c10000.pcie: PCIe Gen.3 x2 link up [ 8.408250] pci_bus 0002:00: root bus resource [mem 0x3c300000-0x3dffffff] [ 8.413899] qcom-pcie 1c10000.pcie: PCI host bridge to bus 0004:00 [ 8.420959] pci 0002:00:00.0: [17cb:010e] type 01 class 0x060400 PCIe Root Port [ 8.427201] pci_bus 0004:00: root bus resource [bus 00-ff] [ 8.427204] pci_bus 0004:00: root bus resource [io 0x0000-0xfffff] [ 8.427206] pci_bus 0004:00: root bus resource [mem 0x34300000-0x35ffffff] [ 8.427219] pci 0004:00:00.0: [17cb:010e] type 01 class 0x060400 PCIe Root Port [ 8.430158] qcom-pcie 1c00000.pcie: PCIe Gen.2 x1 link up [ 8.430263] qcom-pcie 1c00000.pcie: PCI 
host bridge to bus 0006:00 [ 8.430266] pci_bus 0006:00: root bus resource [bus 00-ff] [ 8.430269] pci_bus 0006:00: root bus resource [io 0x200000-0x2fffff] (bus address [0x0000-0xfffff]) [ 8.430271] pci_bus 0006:00: root bus resource [mem 0x30300000-0x31ffffff] [ 8.430285] pci 0006:00:00.0: [17cb:010e] type 01 class 0x060400 PCIe Root Port [ 8.430297] pci 0006:00:00.0: BAR 0 [mem 0x00000000-0x00000fff] [ 8.430307] pci 0006:00:00.0: PCI bridge to [bus 01-ff] [ 8.430313] pci 0006:00:00.0: bridge window [io 0x200000-0x200fff] [ 8.430317] pci 0006:00:00.0: bridge window [mem 0x00000000-0x000fffff] [ 8.430324] pci 0006:00:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref] [ 8.430414] pci 0006:00:00.0: PME# supported from D0 D3hot D3cold [ 8.431430] pci 0006:01:00.0: [17cb:1103] type 00 class 0x028000 PCIe Endpoint [ 8.431526] pci 0006:01:00.0: BAR 0 [mem 0x00000000-0x001fffff 64bit] [ 8.432154] pci 0006:01:00.0: PME# supported from D0 D3hot D3cold [ 8.432320] pci 0006:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at 0006:00:00.0 (capable of 7.876 Gb/s with 8.0 GT/s PCIe x1 link) [ 8.434723] pci 0002:00:00.0: BAR 0 [mem 0x00000000-0x00000fff] [ 8.440358] pci 0004:00:00.0: BAR 0 [mem 0x00000000-0x00000fff] [ 8.445157] pci 0006:00:00.0: bridge window [mem 0x30400000-0x305fffff]: assigned [ 8.445160] pci 0006:00:00.0: BAR 0 [mem 0x30300000-0x30300fff]: assigned [ 8.445163] pci 0006:01:00.0: BAR 0 [mem 0x30400000-0x305fffff 64bit]: assigned [ 8.445211] pci 0006:00:00.0: PCI bridge to [bus 01-ff] [ 8.445214] pci 0006:00:00.0: bridge window [mem 0x30400000-0x305fffff] [ 8.445526] ITS: alloc 8192:32 [ 8.445537] ITT 32 entries, 5 bits [ 8.446675] ID:0 pID:8192 vID:196 [ 8.446697] ID:1 pID:8193 vID:197 [ 8.446702] ID:2 pID:8194 vID:198 [ 8.446707] ID:3 pID:8195 vID:199 [ 8.446712] ID:4 pID:8196 vID:200 [ 8.446718] ID:5 pID:8197 vID:201 [ 8.446722] ID:6 pID:8198 vID:202 [ 8.446727] ID:7 pID:8199 vID:203 [ 8.446732] ID:8 pID:8200 vID:204 [ 
8.446738] ID:9 pID:8201 vID:205 [ 8.446743] ID:10 pID:8202 vID:206 [ 8.446748] ID:11 pID:8203 vID:207 [ 8.446753] ID:12 pID:8204 vID:208 [ 8.446758] ID:13 pID:8205 vID:209 [ 8.446763] ID:14 pID:8206 vID:210 [ 8.446768] ID:15 pID:8207 vID:211 [ 8.446773] ID:16 pID:8208 vID:212 [ 8.446777] ID:17 pID:8209 vID:213 [ 8.446783] ID:18 pID:8210 vID:214 [ 8.446788] ID:19 pID:8211 vID:215 [ 8.446805] pci 0002:00:00.0: PCI bridge to [bus 01-ff] [ 8.446812] pci 0002:00:00.0: bridge window [io 0x100000-0x100fff] [ 8.446817] pci 0002:00:00.0: bridge window [mem 0x00000000-0x000fffff] [ 8.446827] pci 0002:00:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref] [ 8.446899] pci 0002:00:00.0: PME# supported from D0 D3hot D3cold [ 8.448399] pci 0002:01:00.0: [1e0f:0001] type 00 class 0x010802 PCIe Endpoint [ 8.448489] pci 0002:01:00.0: BAR 0 [mem 0x00000000-0x00003fff 64bit] [ 8.449076] pci 0002:01:00.0: PME# supported from D0 D3hot [ 8.453855] pci 0004:00:00.0: PCI bridge to [bus 01-ff] [ 8.453860] pci 0004:00:00.0: bridge window [io 0x0000-0x0fff] [ 8.461133] pci 0002:00:00.0: bridge window [mem 0x3c300000-0x3c3fffff]: assigned [ 8.461137] pci 0002:00:00.0: BAR 0 [mem 0x3c400000-0x3c400fff]: assigned [ 8.461141] pci 0002:01:00.0: BAR 0 [mem 0x3c300000-0x3c303fff 64bit]: assigned [ 8.461182] pci 0002:00:00.0: PCI bridge to [bus 01-ff] [ 8.461185] pci 0002:00:00.0: bridge window [mem 0x3c300000-0x3c3fffff] [ 8.461378] ID:20 pID:8212 vID:216 [ 8.466916] pci 0004:00:00.0: bridge window [mem 0x00000000-0x000fffff] [ 8.473265] ID:21 pID:8213 vID:217 [ 8.478893] pci 0004:00:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref] [ 8.488351] ID:22 pID:8214 vID:218 [ 8.495446] pci 0004:00:00.0: PME# supported from D0 D3hot D3cold [ 8.502905] ID:23 pID:8215 vID:219 [ 8.509868] pci 0004:01:00.0: [17cb:0306] type 00 class 0xff0000 PCIe Endpoint [ 8.514345] ID:24 pID:8216 vID:220 [ 8.521029] pci 0004:01:00.0: BAR 0 [mem 0x00000000-0x00000fff 64bit] [ 8.527916] ID:25 pID:8217 vID:221 [ 
8.535900] pci 0004:01:00.0: BAR 2 [mem 0x00000000-0x00000fff 64bit] [ 8.542116] ID:26 pID:8218 vID:222 [ 8.550074] pci 0004:01:00.0: PME# supported from D0 D3hot D3cold [ 8.556138] ID:27 pID:8219 vID:223 [ 8.562538] pci 0004:01:00.0: 15.752 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x2 link at 0004:00:00.0 (capable of 31.506 Gb/s with 16.0 GT/s PCIe x2 link) [ 8.577637] ID:28 pID:8220 vID:224 [ 8.597112] pci 0004:00:00.0: bridge window [mem 0x34300000-0x343fffff]: assigned [ 8.597753] ID:29 pID:8221 vID:225 [ 8.604711] pci 0004:00:00.0: BAR 0 [mem 0x34400000-0x34400fff]: assigned [ 8.612214] ID:30 pID:8222 vID:226 [ 8.617572] pci 0004:01:00.0: BAR 0 [mem 0x34300000-0x34300fff 64bit]: assigned [ 8.624536] ID:31 pID:8223 vID:227 [ 8.624836] pci 0004:01:00.0: BAR 2 [mem 0x34301000-0x34301fff 64bit]: assigned [ 8.625174] IRQ196 -> 0-7 CPU0 [ 8.625221] ITS: alloc 8224:32 [ 8.625230] ITT 32 entries, 5 bits [ 8.625370] pci 0004:00:00.0: PCI bridge to [bus 01-ff] [ 8.625633] IRQ197 -> 0-7 CPU1 [ 8.625888] pci 0004:00:00.0: bridge window [mem 0x34300000-0x343fffff] [ 8.626014] ID:0 pID:8224 vID:229 [ 8.626020] ID:1 pID:8225 vID:230 [ 8.626025] ID:2 pID:8226 vID:231 [ 8.626031] ID:3 pID:8227 vID:232 [ 8.626036] ID:4 pID:8228 vID:233 [ 8.626041] ID:5 pID:8229 vID:234 [ 8.626046] ID:6 pID:8230 vID:235 [ 8.626051] ID:7 pID:8231 vID:236 [ 8.626056] ID:8 pID:8232 vID:237 [ 8.626061] ID:9 pID:8233 vID:238 [ 8.626066] ID:10 pID:8234 vID:239 [ 8.626071] ID:11 pID:8235 vID:240 [ 8.626076] ID:12 pID:8236 vID:241 [ 8.626081] ID:13 pID:8237 vID:242 [ 8.626086] ID:14 pID:8238 vID:243 [ 8.626092] ID:15 pID:8239 vID:244 [ 8.626097] ID:16 pID:8240 vID:245 [ 8.626102] ID:17 pID:8241 vID:246 [ 8.626107] ID:18 pID:8242 vID:247 [ 8.626112] ID:19 pID:8243 vID:248 [ 8.626117] ID:20 pID:8244 vID:249 [ 8.626122] ID:21 pID:8245 vID:250 [ 8.626127] ID:22 pID:8246 vID:251 [ 8.626132] ID:23 pID:8247 vID:252 [ 8.626137] ID:24 pID:8248 vID:253 [ 8.626143] ID:25 pID:8249 vID:254 [ 8.626148] 
ID:26 pID:8250 vID:255 [ 8.626153] ID:27 pID:8251 vID:256 [ 8.626158] ID:28 pID:8252 vID:257 [ 8.626166] IRQ198 -> 0-7 CPU2 [ 8.626177] IRQ199 -> 0-7 CPU3 [ 8.626188] IRQ200 -> 0-7 CPU4 [ 8.626199] IRQ201 -> 0-7 CPU5 [ 8.626210] IRQ202 -> 0-7 CPU6 [ 8.626221] IRQ203 -> 0-7 CPU7 [ 8.626232] IRQ204 -> 0-7 CPU0 [ 8.626243] IRQ205 -> 0-7 CPU1 [ 8.626254] IRQ206 -> 0-7 CPU2 [ 8.626264] IRQ207 -> 0-7 CPU3 [ 8.626275] IRQ208 -> 0-7 CPU4 [ 8.626286] IRQ209 -> 0-7 CPU5 [ 8.626297] IRQ210 -> 0-7 CPU6 [ 8.626308] IRQ211 -> 0-7 CPU7 [ 8.626319] IRQ212 -> 0-7 CPU0 [ 8.626330] IRQ213 -> 0-7 CPU1 [ 8.626341] IRQ214 -> 0-7 CPU2 [ 8.626352] IRQ215 -> 0-7 CPU3 [ 8.626363] IRQ216 -> 0-7 CPU4 [ 8.626374] IRQ217 -> 0-7 CPU5 [ 8.626385] IRQ218 -> 0-7 CPU6 [ 8.626396] IRQ219 -> 0-7 CPU7 [ 8.626407] IRQ220 -> 0-7 CPU0 [ 8.626418] IRQ221 -> 0-7 CPU1 [ 8.626429] IRQ222 -> 0-7 CPU2 [ 8.626704] ID:29 pID:8253 vID:258 [ 8.626965] IRQ223 -> 0-7 CPU3 [ 8.627214] ID:30 pID:8254 vID:259 [ 8.627467] IRQ224 -> 0-7 CPU4 [ 8.627722] ID:31 pID:8255 vID:260 [ 8.627977] IRQ225 -> 0-7 CPU5 [ 8.628312] IRQ229 -> 0-7 CPU5 [ 8.628372] ITS: alloc 8256:32 [ 8.628380] ITT 32 entries, 5 bits [ 8.628479] IRQ226 -> 0-7 CPU6 [ 8.628723] IRQ230 -> 0-7 CPU6 [ 8.628957] IRQ227 -> 0-7 CPU7 [ 8.629094] ID:0 pID:8256 vID:262 [ 8.629099] ID:1 pID:8257 vID:263 [ 8.629104] ID:2 pID:8258 vID:264 [ 8.629109] ID:3 pID:8259 vID:265 [ 8.629114] ID:4 pID:8260 vID:266 [ 8.629119] ID:5 pID:8261 vID:267 [ 8.629124] ID:6 pID:8262 vID:268 [ 8.629129] ID:7 pID:8263 vID:269 [ 8.629134] ID:8 pID:8264 vID:270 [ 8.629139] ID:9 pID:8265 vID:271 [ 8.629144] ID:10 pID:8266 vID:272 [ 8.629149] ID:11 pID:8267 vID:273 [ 8.629153] ID:12 pID:8268 vID:274 [ 8.629158] ID:13 pID:8269 vID:275 [ 8.629163] ID:14 pID:8270 vID:276 [ 8.629168] ID:15 pID:8271 vID:277 [ 8.629173] ID:16 pID:8272 vID:278 [ 8.629178] ID:17 pID:8273 vID:279 [ 8.629183] ID:18 pID:8274 vID:280 [ 8.629188] ID:19 pID:8275 vID:281 [ 8.629200] IRQ231 -> 0-7 CPU7 [ 8.629211] IRQ232 -> 
0-7 CPU0 [ 8.629222] IRQ233 -> 0-7 CPU1 [ 8.629233] IRQ234 -> 0-7 CPU2 [ 8.629244] IRQ235 -> 0-7 CPU3 [ 8.629255] IRQ236 -> 0-7 CPU4 [ 8.629266] IRQ237 -> 0-7 CPU7 [ 8.629277] IRQ238 -> 0-7 CPU0 [ 8.629287] IRQ239 -> 0-7 CPU1 [ 8.629298] IRQ240 -> 0-7 CPU2 [ 8.629309] IRQ241 -> 0-7 CPU3 [ 8.629319] IRQ242 -> 0-7 CPU4 [ 8.629336] IRQ243 -> 0-7 CPU5 [ 8.629346] IRQ244 -> 0-7 CPU6 [ 8.629357] IRQ245 -> 0-7 CPU7 [ 8.629368] IRQ246 -> 0-7 CPU0 [ 8.629379] IRQ247 -> 0-7 CPU1 [ 8.629390] IRQ248 -> 0-7 CPU2 [ 8.629401] IRQ249 -> 0-7 CPU3 [ 8.629411] IRQ250 -> 0-7 CPU4 [ 8.629422] IRQ251 -> 0-7 CPU5 [ 8.629433] IRQ252 -> 0-7 CPU6 [ 8.629670] ID:20 pID:8276 vID:282 [ 8.629908] IRQ253 -> 0-7 CPU0 [ 8.630134] ID:21 pID:8277 vID:283 [ 8.635511] IRQ254 -> 0-7 CPU1 [ 8.642115] ID:22 pID:8278 vID:284 [ 8.649085] IRQ255 -> 0-7 CPU2 [ 8.657029] ID:23 pID:8279 vID:285 [ 8.663285] IRQ256 -> 0-7 CPU3 [ 8.670689] ID:24 pID:8280 vID:286 [ 8.677302] IRQ257 -> 0-7 CPU4 [ 8.682925] ID:25 pID:8281 vID:287 [ 8.688293] IRQ258 -> 0-7 CPU5 [ 8.694547] ID:26 pID:8282 vID:288 [ 8.702234] IRQ259 -> 0-7 CPU6 [ 8.709197] ID:27 pID:8283 vID:289 [ 8.709204] ID:28 pID:8284 vID:290 [ 8.716722] IRQ260 -> 0-7 CPU7 [ 8.722081] ID:29 pID:8285 vID:291 [ 8.842813] ID:30 pID:8286 vID:292 [ 8.842818] ID:31 pID:8287 vID:293 [ 8.842966] IRQ262 -> 0-7 CPU0 [ 8.842982] IRQ263 -> 0-7 CPU1 [ 8.842993] IRQ264 -> 0-7 CPU2 [ 8.843004] IRQ265 -> 0-7 CPU3 [ 8.843016] IRQ266 -> 0-7 CPU4 [ 8.843028] IRQ267 -> 0-7 CPU5 [ 8.843040] IRQ268 -> 0-7 CPU6 [ 8.843051] IRQ269 -> 0-7 CPU7 [ 8.843063] IRQ270 -> 0-7 CPU0 [ 8.843075] IRQ271 -> 0-7 CPU1 [ 8.843087] IRQ272 -> 0-7 CPU2 [ 8.843098] IRQ273 -> 0-7 CPU3 [ 8.843110] IRQ274 -> 0-7 CPU4 [ 8.843122] IRQ275 -> 0-7 CPU5 [ 8.843133] IRQ276 -> 0-7 CPU6 [ 8.843145] IRQ277 -> 0-7 CPU7 [ 8.843157] IRQ278 -> 0-7 CPU0 [ 8.843168] IRQ279 -> 0-7 CPU1 [ 8.843180] IRQ280 -> 0-7 CPU2 [ 8.843192] IRQ281 -> 0-7 CPU3 [ 8.843203] IRQ282 -> 0-7 CPU4 [ 8.843215] IRQ283 -> 0-7 CPU5 [ 8.843227] IRQ284 
-> 0-7 CPU6 [ 8.843238] IRQ285 -> 0-7 CPU7 [ 8.843250] IRQ286 -> 0-7 CPU0 [ 8.843262] IRQ287 -> 0-7 CPU1 [ 8.843273] IRQ288 -> 0-7 CPU2 [ 8.843284] IRQ289 -> 0-7 CPU3 [ 8.843296] IRQ290 -> 0-7 CPU4 [ 8.843308] IRQ291 -> 0-7 CPU5 [ 8.843319] IRQ292 -> 0-7 CPU6 [ 8.843331] IRQ293 -> 0-7 CPU7 [ 8.844444] ITS: alloc 8192:1 [ 8.844455] ITT 1 entries, 0 bits [ 8.845389] ID:0 pID:8192 vID:196 [ 8.845395] ITS: alloc 8193:1 [ 8.845403] IRQ196 -> 0-7 CPU0 [ 8.845405] ITT 1 entries, 0 bits [ 8.845604] IRQ196 -> 0-7 CPU0 [ 8.845631] pcieport 0006:00:00.0: PME: Signaling with IRQ 196 [ 8.846380] ID:0 pID:8193 vID:197 [ 8.846414] ITS: alloc 8194:1 [ 8.846423] ITT 1 entries, 0 bits [ 8.857408] IRQ197 -> 0-7 CPU1 [ 8.857440] ID:0 pID:8194 vID:198 [ 8.857450] IRQ198 -> 0-7 CPU2 [ 8.857499] IRQ197 -> 0-7 CPU1 [ 8.857515] pcieport 0002:00:00.0: PME: Signaling with IRQ 197 [ 8.857529] IRQ198 -> 0-7 CPU2 [ 8.858291] pcieport 0006:00:00.0: AER: enabled with IRQ 196 [ 8.866563] pcieport 0002:00:00.0: AER: enabled with IRQ 197 [ 8.872342] pcieport 0004:00:00.0: PME: Signaling with IRQ 198 [ 8.885618] pcieport 0004:00:00.0: AER: enabled with IRQ 198 [ 8.909946] mhi-pci-generic 0004:01:00.0: MHI PCI device found: foxconn-sdx55 [ 8.914659] nvme nvme0: pci function 0002:01:00.0 [ 8.917541] mhi-pci-generic 0004:01:00.0: BAR 0 [mem 0x34300000-0x34300fff 64bit]: assigned [ 8.922185] nvme 0002:01:00.0: enabling device (0000 -> 0002) [ 8.930939] mhi-pci-generic 0004:01:00.0: enabling device (0000 -> 0002) [ 8.937318] ITS: alloc 8195:1 [ 8.944985] ITT 1 entries, 0 bits [ 8.945289] ITS: alloc 8196:8 [ 8.945303] ITT 8 entries, 3 bits [ 8.947818] ID:0 pID:8195 vID:201 [ 8.947910] IRQ201 -> 0-7 CPU3 [ 8.948702] ID:0 pID:8196 vID:202 [ 8.948720] ID:1 pID:8197 vID:203 [ 8.950480] IRQ201 -> 0-7 CPU3 [ 8.965330] ID:2 pID:8198 vID:204 [ 8.974909] ID:3 pID:8199 vID:205 [ 8.987215] ID:4 pID:8200 vID:206 [ 9.001562] IRQ202 -> 0-7 CPU4 [ 9.001759] IRQ203 -> 0-7 CPU5 [ 9.001771] IRQ204 -> 0-7 CPU6 [ 9.001849] 
IRQ205 -> 0-7 CPU7 [ 9.001862] IRQ206 -> 0-7 CPU0 [ 9.003223] IRQ202 -> 0-7 CPU4 [ 9.003449] IRQ203 -> 0-7 CPU5 [ 9.003638] IRQ204 -> 0-7 CPU6 [ 9.003836] IRQ205 -> 0-7 CPU7 [ 9.004007] IRQ206 -> 0-7 CPU0 [ 9.005127] mhi mhi0: Requested to power ON [ 9.009901] mhi mhi0: Power on setup success [ 9.015403] nvme nvme0: allocated 61 MiB host memory buffer. [ 9.169296] ITS: alloc 8204:16 [ 9.169319] ITT 16 entries, 4 bits [ 9.169492] ID:0 pID:8204 vID:201 [ 9.169516] IRQ201 -> 0-7 CPU3 [ 9.169620] ID:1 pID:8205 vID:211 [ 9.169633] IRQ211 -> 0-7 CPU0 [ 9.169702] ID:2 pID:8206 vID:212 [ 9.169713] IRQ212 -> 0-7 CPU1 [ 9.169904] ID:3 pID:8207 vID:213 [ 9.169917] IRQ213 -> 0-7 CPU2 [ 9.169982] ID:4 pID:8208 vID:214 [ 9.169993] IRQ214 -> 0-7 CPU3 [ 9.170070] ID:5 pID:8209 vID:215 [ 9.170082] IRQ215 -> 0-7 CPU4 [ 9.170143] ID:6 pID:8210 vID:216 [ 9.170155] IRQ216 -> 0-7 CPU5 [ 9.170221] ID:7 pID:8211 vID:217 [ 9.170232] IRQ217 -> 0-7 CPU6 [ 9.170294] ID:8 pID:8212 vID:218 [ 9.170319] IRQ218 -> 0-7 CPU7 [ 9.170460] IRQ201 -> 0-7 CPU3 [ 9.179969] IRQ211 -> 0 CPU0 [ 9.180329] IRQ212 -> 1 CPU1 [ 9.180663] IRQ213 -> 2 CPU2 [ 9.181001] IRQ214 -> 3 CPU3 [ 9.181355] IRQ215 -> 4 CPU4 [ 9.181702] IRQ216 -> 5 CPU5 [ 9.188542] IRQ217 -> 6 CPU6 [ 9.196576] IRQ218 -> 7 CPU7 [ 9.196623] nvme nvme0: 8/0/0 default/read/poll queues [ 9.206751] nvme0n1: p1 [ 9.278797] ath11k_pci 0006:01:00.0: BAR 0 [mem 0x30400000-0x305fffff 64bit]: assigned [ 9.294555] ath11k_pci 0006:01:00.0: enabling device (0000 -> 0002) [ 9.295634] wwan wwan0: port wwan0qcdm0 attached [ 9.296105] wwan wwan0: port wwan0mbim0 attached [ 9.296789] wwan wwan0: port wwan0at0 attached [ 9.304915] ITS: alloc 8220:32 [ 9.314316] ITT 32 entries, 5 bits [ 9.324270] ID:0 pID:8220 vID:262 [ 9.338759] ID:1 pID:8221 vID:263 [ 9.338765] ID:2 pID:8222 vID:264 [ 9.338770] ID:3 pID:8223 vID:265 [ 9.338775] ID:4 pID:8224 vID:266 [ 9.338779] ID:5 pID:8225 vID:267 [ 9.338784] ID:6 pID:8226 vID:268 [ 9.338789] ID:7 pID:8227 vID:269 [ 9.338794] 
ID:8 pID:8228 vID:270 [ 9.338798] ID:9 pID:8229 vID:271 [ 9.338803] ID:10 pID:8230 vID:272 [ 9.338808] ID:11 pID:8231 vID:273 [ 9.338812] ID:12 pID:8232 vID:274 [ 9.338817] ID:13 pID:8233 vID:275 [ 9.338821] ID:14 pID:8234 vID:276 [ 9.338826] ID:15 pID:8235 vID:277 [ 9.338831] ID:16 pID:8236 vID:278 [ 9.338836] ID:17 pID:8237 vID:279 [ 9.338841] ID:18 pID:8238 vID:280 [ 9.338845] ID:19 pID:8239 vID:281 [ 9.338850] ID:20 pID:8240 vID:282 [ 9.338855] ID:21 pID:8241 vID:283 [ 9.338859] ID:22 pID:8242 vID:284 [ 9.338864] ID:23 pID:8243 vID:285 [ 9.338868] ID:24 pID:8244 vID:286 [ 9.338873] ID:25 pID:8245 vID:287 [ 9.338877] ID:26 pID:8246 vID:288 [ 9.338882] ID:27 pID:8247 vID:289 [ 9.338887] ID:28 pID:8248 vID:290 [ 9.338891] ID:29 pID:8249 vID:291 [ 9.338896] ID:30 pID:8250 vID:292 [ 9.338900] ID:31 pID:8251 vID:293 [ 9.338980] IRQ262 -> 0-7 CPU1 [ 9.362613] IRQ263 -> 0-7 CPU2 [ 9.370142] IRQ264 -> 0-7 CPU3 [ 9.377656] IRQ265 -> 0-7 CPU4 [ 9.400274] IRQ266 -> 0-7 CPU5 [ 9.409009] IRQ267 -> 0-7 CPU6 [ 9.409021] IRQ268 -> 0-7 CPU7 [ 9.409033] IRQ269 -> 0-7 CPU0 [ 9.409044] IRQ270 -> 0-7 CPU1 [ 9.409056] IRQ271 -> 0-7 CPU2 [ 9.409067] IRQ272 -> 0-7 CPU3 [ 9.409078] IRQ273 -> 0-7 CPU4 [ 9.409089] IRQ274 -> 0-7 CPU5 [ 9.409100] IRQ275 -> 0-7 CPU6 [ 9.409111] IRQ276 -> 0-7 CPU7 [ 9.409123] IRQ277 -> 0-7 CPU0 [ 9.409134] IRQ278 -> 0-7 CPU1 [ 9.409145] IRQ279 -> 0-7 CPU2 [ 9.409157] IRQ280 -> 0-7 CPU3 [ 9.409168] IRQ281 -> 0-7 CPU4 [ 9.409179] IRQ282 -> 0-7 CPU5 [ 9.409190] IRQ283 -> 0-7 CPU6 [ 9.409201] IRQ284 -> 0-7 CPU7 [ 9.409213] IRQ285 -> 0-7 CPU0 [ 9.409224] IRQ286 -> 0-7 CPU1 [ 9.409235] IRQ287 -> 0-7 CPU2 [ 9.409247] IRQ288 -> 0-7 CPU3 [ 9.409258] IRQ289 -> 0-7 CPU4 [ 9.409270] IRQ290 -> 0-7 CPU5 [ 9.409281] IRQ291 -> 0-7 CPU6 [ 9.409292] IRQ292 -> 0-7 CPU7 [ 9.409303] IRQ293 -> 0-7 CPU0 [ 9.409438] ath11k_pci 0006:01:00.0: MSI vectors: 32 [ 9.426507] ath11k_pci 0006:01:00.0: wcn6855 hw2.0 [ 9.456885] IRQ262 -> 0-7 CPU1 [ 9.467067] IRQ263 -> 0-7 CPU2 [ 9.481466] 
IRQ264 -> 0-7 CPU3 [ 9.630594] IRQ265 -> 0-7 CPU4 [ 9.630629] IRQ266 -> 0-7 CPU5 [ 9.630655] IRQ267 -> 0-7 CPU6 [ 9.630682] IRQ268 -> 0-7 CPU7 [ 9.630709] IRQ269 -> 0-7 CPU0 [ 9.630735] IRQ270 -> 0-7 CPU1 [ 9.630764] IRQ271 -> 0-7 CPU2 [ 9.640971] IRQ276 -> 0-7 CPU7 [ 9.641039] IRQ277 -> 0-7 CPU0 [ 9.641088] IRQ278 -> 0-7 CPU1 [ 9.641138] IRQ280 -> 0-7 CPU3 [ 9.641182] IRQ281 -> 0-7 CPU4 [ 9.641227] IRQ282 -> 0-7 CPU5 [ 9.651400] IRQ283 -> 0-7 CPU6 [ 9.651442] IRQ284 -> 0-7 CPU7 [ 9.651490] IRQ285 -> 0-7 CPU0 [ 9.651534] IRQ286 -> 0-7 CPU1 [ 9.813900] mhi mhi1: Requested to power ON [ 9.818607] mhi mhi1: Power on setup success [ 10.017482] mhi mhi1: Wait for device to enter SBL or Mission mode [ 10.862765] ath11k_pci 0006:01:00.0: chip_id 0x2 chip_family 0xb board_id 0x8c soc_id 0x400c0200 [ 10.872101] ath11k_pci 0006:01:00.0: fw_version 0x1106196e fw_build_timestamp 2024-01-12 11:30 fw_build_id WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.37
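A note on reading the debug output above: every `ITS: alloc <base>:<count>` line is followed by an `ITT <count> entries, <n> bits` line, and the logged pairs (1 and 0, 8 and 3, 16 and 4, 32 and 5) are consistent with `<n>` being the smallest number of EventID bits that can index `<count>` entries. The sketch below reproduces only that relationship. It is not taken from the ITS driver or from the debug patch that produced the log, and the helper name `event_id_bits` is hypothetical.

```c
/*
 * Illustrative only: derive the "bits" value printed next to the ITT
 * entry count in the log above. event_id_bits() is a hypothetical
 * helper, not a kernel function.
 */
static unsigned int event_id_bits(unsigned int entries)
{
	unsigned int bits = 0;

	/* Smallest 'bits' such that 2^bits >= entries. */
	while ((1u << bits) < entries)
		bits++;

	return bits;
}
```

Calling `event_id_bits()` with 1, 8, 16 and 32 gives 0, 3, 4 and 5, matching the four entry counts that appear in the log.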
On Wed, 17 Jul 2024 14:38:59 +0100,
Johan Hovold <johan@kernel.org> wrote:
>
> On Wed, Jul 17, 2024 at 01:54:40PM +0100, Marc Zyngier wrote:
> > On Wed, 17 Jul 2024 08:23:39 +0100,
> > Johan Hovold <johan@kernel.org> wrote:
>
> > > I believe there is a kernel parameter for this (e.g.
> > > module.async_probe), but I just disable async probing for the Qualcomm
> > > PCIe driver I'm using:
> >
> > I had tried this module parameter, but it didn't change anything on my
> > end.
>
> > I'll have a look whether the TX1 PCIe driver uses this. It's
> > positively ancient, so I wouldn't bet that it has been touched
> > significantly in the past 5 years.
>
> Perhaps async probing just changes the symptoms, the NVMe and wifi
> doesn't work in either case.

Yeah, my impression is that this changes the order in which LPIs get
allocated, but the core symptom is the same.

>
> > > [ 8.692011] Reusing ITT for devID 0
> > > [ 8.693668] Reusing ITT for devID 0
> >
> > This is really odd. It indicates that you have several devices sharing
> > the same DeviceID, which I seriously doubt it is the case in a
> > laptop. Do you have any non-transparent bridge here? lspci would help.
>
> Yeah, and these messages do not show up without the series (see log
> below). They are there in the previous synchronous log however.
>
> 0002:00:00.0 PCI bridge: Qualcomm Technologies, Inc SC8280XP PCI Express Root Port
> 0002:01:00.0 Non-Volatile memory controller: KIOXIA Corporation NVMe SSD Controller BG4 (DRAM-less)
> 0004:00:00.0 PCI bridge: Qualcomm Technologies, Inc SC8280XP PCI Express Root Port
> 0004:01:00.0 Unassigned class [ff00]: Qualcomm Technologies, Inc SDX55 [Snapdragon X55 5G]
> 0006:00:00.0 PCI bridge: Qualcomm Technologies, Inc SC8280XP PCI Express Root Port
> 0006:01:00.0 Network controller: Qualcomm Technologies, Inc QCNFA765 Wireless Network Adapter (rev 01)

Right, this is a very straightforward setup, Design-crap-ware-style.
Nothing that would alias any device.
>
> > I'm starting to suspect that the new code doesn't carry all the
> > required bits for the DevID, and that we end-up trying to allocated
> > interrupts from the pool allocated to another device, which can never
> > be a good thing, and would explain why everything dies a painful
> > death.
> >
> > Can you run the same trace with the whole thing reverted? I think
> > we're on something here.
>
> See below, using normal asynchronous probing like the previous log.

And as expected, no aliasing showing up in this log. Somehow, we're
not able to distinguish between the different PCI domains anymore,
leading to all sorts of funnies.

For the record, I've added some extra debug in the its driver and ran
the result on TX1, old and new kernels.

Before this series:

[ 10.139806] nvme nvme0: pci function 0006:58:00.0
[ 10.158599] nvme 0006:58:00.0: devid = 35800

With this series:

[ 10.143729] nvme nvme0: pci function 0006:58:00.0
[ 10.181775] nvme 0006:58:00.0: devid = 5800

Clearly, we've lost something in the battle. I'll keep digging.

	M.
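For readers following the two `devid` values above: `5800` is what the bare PCI requester ID of `58:00.0` looks like (bus number in bits 15:8, device and function below it), while `35800` is that same value plus `0x30000`. One way to read the difference is that a per-host-bridge translation step, which adds an offset to the requester ID so that devices in different PCI domains get distinct ITS DeviceIDs, is not being applied with the series. The sketch below only illustrates that arithmetic. `struct rid_map`, `map_rid()` and the `0x30000` base are hypothetical, with the base chosen purely so the result matches the logged value; the thread does not say where the offset comes from on TX1.

```c
#include <stdint.h>

/* PCI requester ID: bus in bits 15:8, device in bits 7:3, function in bits 2:0. */
static uint32_t pci_rid(uint8_t bus, uint8_t dev, uint8_t fn)
{
	return ((uint32_t)bus << 8) | ((uint32_t)(dev & 0x1f) << 3) | (fn & 0x7);
}

/*
 * HYPOTHETICAL translation entry for one host bridge: requester IDs in
 * [rid_base, rid_base + length) map to out_base + (rid - rid_base).
 */
struct rid_map {
	uint32_t rid_base;
	uint32_t out_base;
	uint32_t length;
};

/* Apply the translation; IDs outside the window are returned unchanged. */
static uint32_t map_rid(const struct rid_map *m, uint32_t rid)
{
	if (rid < m->rid_base || rid - m->rid_base >= m->length)
		return rid;

	return rid - m->rid_base + m->out_base;
}
```

With an assumed entry of `{ .rid_base = 0, .out_base = 0x30000, .length = 0x10000 }`, `pci_rid(0x58, 0, 0)` is `0x5800` and `map_rid()` turns it into `0x35800`. Skipping the translation leaves `0x5800`, which would collide with any device at `58:00.0` in another domain.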
On Wed, 17 Jul 2024 14:38:59 +0100,
Johan Hovold <johan@kernel.org> wrote:
>
> On Wed, Jul 17, 2024 at 01:54:40PM +0100, Marc Zyngier wrote:
> > On Wed, 17 Jul 2024 08:23:39 +0100,
> > Johan Hovold <johan@kernel.org> wrote:
>
> > > [ 8.692011] Reusing ITT for devID 0
> > > [ 8.693668] Reusing ITT for devID 0
> >
> > This is really odd. It indicates that you have several devices sharing
> > the same DeviceID, which I seriously doubt it is the case in a
> > laptop. Do you have any non-transparent bridge here? lspci would help.
>
> Yeah, and these messages do not show up without the series (see log
> below). They are there in the previous synchronous log however.

I think I've finally nailed the sucker, and posted a potential fix[1].

It definitely restore my TX1 to a state that is no worse than normal,
so something must be less wrong there. I'm pretty sure that the
platform-msi equivalent is equally broken, but I don't have the energy
to verify/debug that tonight.

Thomas, feel free to squash this into your series or keep it as is, as
you prefer.

	M.

[1] https://lore.kernel.org/r/20240717195937.2240400-1-maz@kernel.org
On Wed, Jul 17, 2024 at 09:10:02PM +0100, Marc Zyngier wrote:

> I think I've finally nailed the sucker, and posted a potential fix[1].
>
> It definitely restore my TX1 to a state that is no worse than normal,
> so something must be less wrong there. I'm pretty sure that the
> platform-msi equivalent is equally broken, but I don't have the energy
> to verify/debug that tonight.

> [1] https://lore.kernel.org/r/20240717195937.2240400-1-maz@kernel.org

This seems to fix the regression here too, thanks!

201:   0 ...   0  ITS-PCI-MSI-0006:00:00.0    0 Edge  PCIe PME, aerdrv
202:   0       0  ITS-PCI-MSI-0006:01:00.0    0 Edge  bhi
203:   0       0  ITS-PCI-MSI-0006:01:00.0    1 Edge  mhi
204:   0       0  ITS-PCI-MSI-0006:01:00.0    2 Edge  mhi
205:   0       0  ITS-PCI-MSI-0006:01:00.0    3 Edge  ce0
206:   0       0  ITS-PCI-MSI-0006:01:00.0    4 Edge  ce1
207:   0       0  ITS-PCI-MSI-0006:01:00.0    5 Edge  ce2
208:   0       2  ITS-PCI-MSI-0006:01:00.0    6 Edge  ce3
209:   2       0  ITS-PCI-MSI-0006:01:00.0    7 Edge  ce5
210:   0       0  ITS-PCI-MSI-0006:01:00.0    8 Edge  ce7
211:   0       0  ITS-PCI-MSI-0006:01:00.0    9 Edge  ce8
216:   0       0  ITS-PCI-MSI-0006:01:00.0   14 Edge  DP_EXT_IRQ
217:   0       0  ITS-PCI-MSI-0006:01:00.0   15 Edge  DP_EXT_IRQ
218:   0       0  ITS-PCI-MSI-0006:01:00.0   16 Edge  DP_EXT_IRQ
220:   0       0  ITS-PCI-MSI-0006:01:00.0   18 Edge  DP_EXT_IRQ
221:   0       0  ITS-PCI-MSI-0006:01:00.0   19 Edge  DP_EXT_IRQ
222:   0       0  ITS-PCI-MSI-0006:01:00.0   20 Edge  DP_EXT_IRQ
223:   0       0  ITS-PCI-MSI-0006:01:00.0   21 Edge  DP_EXT_IRQ
224:   0       0  ITS-PCI-MSI-0006:01:00.0   22 Edge  DP_EXT_IRQ
225:   0       0  ITS-PCI-MSI-0006:01:00.0   23 Edge  DP_EXT_IRQ
226:   0       0  ITS-PCI-MSI-0006:01:00.0   24 Edge  DP_EXT_IRQ
235:   0       0  ITS-PCI-MSI-0004:00:00.0    0 Edge  PCIe PME, aerdrv
236:   0       0  ITS-PCI-MSI-0004:01:00.0    0 Edge  bhi
237:   0       0  ITS-PCI-MSI-0004:01:00.0    1 Edge  mhi
238:   0       0  ITS-PCI-MSI-0004:01:00.0    2 Edge  mhi
239:   0       0  ITS-PCI-MSI-0004:01:00.0    3 Edge  mhi
240:   0       0  ITS-PCI-MSI-0004:01:00.0    4 Edge  mhi
242:   0       0  ITS-PCI-MSI-0002:00:00.0    0 Edge  PCIe PME, aerdrv
243:  22       0  ITS-PCI-MSIX-0002:01:00.0   0 Edge  nvme0q0
244:   0       0  ITS-PCI-MSIX-0002:01:00.0   1 Edge  nvme0q1
245:   0       0  ITS-PCI-MSIX-0002:01:00.0   2 Edge  nvme0q2
246:   0       0  ITS-PCI-MSIX-0002:01:00.0   3 Edge  nvme0q3
247:   0       0  ITS-PCI-MSIX-0002:01:00.0   4 Edge  nvme0q4
248:   0       0  ITS-PCI-MSIX-0002:01:00.0   5 Edge  nvme0q5
249:   0       0  ITS-PCI-MSIX-0002:01:00.0   6 Edge  nvme0q6
250:   0       0  ITS-PCI-MSIX-0002:01:00.0   7 Edge  nvme0q7
251:   0       0  ITS-PCI-MSIX-0002:01:00.0   8 Edge  nvme0q8

Johan