Message ID | 3b6c8d973aedc48860640a9d75d20528336f1f3c.1742669372.git.lukas@wunner.de (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Krzysztof Wilczyński |
Headers | show |
Series | PCI/bwctrl: Fix NULL pointer deref on bus number exhaustion | expand |
Hello, > When BIOS neglects to assign bus numbers to PCI bridges, the kernel > attempts to correct that during PCI device enumeration. If it runs out > of bus numbers, no pci_bus is allocated and the "subordinate" pointer in > the bridge's pci_dev remains NULL. > > The PCIe bandwidth controller erroneously does not check for a NULL > subordinate pointer and dereferences it on probe. > > Bandwidth control of unusable devices below the bridge is of questionable > utility, so simply error out instead. This mirrors what PCIe hotplug does > since commit 62e4492c3063 ("PCI: Prevent NULL dereference during pciehp > probe"). > > The PCI core emits a message with KERN_INFO severity if it has run out of > bus numbers. PCIe hotplug emits an additional message with KERN_ERR > severity to inform the user that hotplug functionality is disabled at the > bridge. A similar message for bandwidth control does not seem merited, > given that its only purpose so far is to expose an up-to-date link speed > in sysfs and throttle the link speed on certain laptops with limited > Thermal Design Power. So error out silently. > > User-visible messages: > > pci 0000:16:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring > [...] > pci_bus 0000:45: busn_res: [bus 45-74] end is updated to 74 > pci 0000:16:02.0: devices behind bridge are unusable because [bus 45-74] cannot be assigned for them > [...] > pcieport 0000:16:02.0: pciehp: Hotplug bridge without secondary bus, ignoring > [...] > BUG: kernel NULL pointer dereference > RIP: pcie_update_link_speed > pcie_bwnotif_enable > pcie_bwnotif_probe > pcie_port_probe_service > really_probe Applied to bwctrl, thank you! Krzysztof
Hello, > The patch is working, no kernel crashes and shutting down after hotplugging the Thunderbolt 4 dock does not hang anymore, thanks! Thank you for testing! Appreciated! > I still see messages in the kernel log that 'devices behind bridge are unusable' because 'no bus can be assigned to them', 'Hotplug bridge without secondary bus, ignoring', etc. These all refer to the Thunderbolt 4 bridge. Adding "pci=hpbussize=0x33" to the kernel doesn't make a difference. Adding "pci=realloc,asssign-busses,hpbussize=0x33" actually does 'fix' the bus allocation failures (or at least I don't see any messages in the log anymore), but then amdgpu fails to initialize the IGP. We can fix any outstanding issues or drop this patch. I leave it to hot plug experts like Ilpo, Bjorn and Lukas to make the call here. > Anyway, all devices connected to the Thunderbolt 4 dock appear to work (USB, screens, ethernet) even despite these bus allocation failures, so I will just ignore them. > > Thanks again! Would you be happy to provide your "Tested-by:" tag? Krzysztof
Hi Krzysztof, > Would you be happy to provide your "Tested-by:" tag? You can use 'Tested-by: Wouter Bijlsma <wouter@wouterbijlsma.nl>' -Wouter On Sun, Mar 23, 2025, at 1:56 PM, Krzysztof Wilczyński wrote: > Hello, > > > The patch is working, no kernel crashes and shutting down after hotplugging the Thunderbolt 4 dock does not hang anymore, thanks! > > Thank you for testing! Appreciated! > > > I still see messages in the kernel log that 'devices behind bridge are unusable' because 'no bus can be assigned to them', 'Hotplug bridge without secondary bus, ignoring', etc. These all refer to the Thunderbolt 4 bridge. Adding "pci=hpbussize=0x33" to the kernel doesn't make a difference. Adding "pci=realloc,asssign-busses,hpbussize=0x33" actually does 'fix' the bus allocation failures (or at least I don't see any messages in the log anymore), but then amdgpu fails to initialize the IGP. > > We can fix any outstanding issues or drop this patch. I leave it to hot > plug experts like Ilpo, Bjorn and Lukas to make the call here. > > > Anyway, all devices connected to the Thunderbolt 4 dock appear to work (USB, screens, ethernet) even despite these bus allocation failures, so I will just ignore them. > > > > Thanks again! > > Would you be happy to provide your "Tested-by:" tag? > > Krzysztof >
Hello, > > Would you be happy to provide your "Tested-by:" tag? > > You can use 'Tested-by: Wouter Bijlsma <wouter@wouterbijlsma.nl>' Added to the other tags. Thank you! Krzysztof
diff --git a/drivers/pci/pcie/bwctrl.c b/drivers/pci/pcie/bwctrl.c index 0a5e7efbce2c..0e5807ae4083 100644 --- a/drivers/pci/pcie/bwctrl.c +++ b/drivers/pci/pcie/bwctrl.c @@ -294,6 +294,10 @@ static int pcie_bwnotif_probe(struct pcie_device *srv) struct pci_dev *port = srv->port; int ret; + /* Can happen if we run out of bus numbers during enumeration */ + if (!port->subordinate) + return -ENODEV; + struct pcie_bwctrl_data *data = devm_kzalloc(&srv->device, sizeof(*data), GFP_KERNEL); if (!data)
When BIOS neglects to assign bus numbers to PCI bridges, the kernel attempts to correct that during PCI device enumeration. If it runs out of bus numbers, no pci_bus is allocated and the "subordinate" pointer in the bridge's pci_dev remains NULL. The PCIe bandwidth controller erroneously does not check for a NULL subordinate pointer and dereferences it on probe. Bandwidth control of unusable devices below the bridge is of questionable utility, so simply error out instead. This mirrors what PCIe hotplug does since commit 62e4492c3063 ("PCI: Prevent NULL dereference during pciehp probe"). The PCI core emits a message with KERN_INFO severity if it has run out of bus numbers. PCIe hotplug emits an additional message with KERN_ERR severity to inform the user that hotplug functionality is disabled at the bridge. A similar message for bandwidth control does not seem merited, given that its only purpose so far is to expose an up-to-date link speed in sysfs and throttle the link speed on certain laptops with limited Thermal Design Power. So error out silently. User-visible messages: pci 0000:16:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring [...] pci_bus 0000:45: busn_res: [bus 45-74] end is updated to 74 pci 0000:16:02.0: devices behind bridge are unusable because [bus 45-74] cannot be assigned for them [...] pcieport 0000:16:02.0: pciehp: Hotplug bridge without secondary bus, ignoring [...] BUG: kernel NULL pointer dereference RIP: pcie_update_link_speed pcie_bwnotif_enable pcie_bwnotif_probe pcie_port_probe_service really_probe Fixes: 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller") Reported-by: Wouter Bijlsma <wouter@wouterbijlsma.nl> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219906 Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v6.13+ --- drivers/pci/pcie/bwctrl.c | 4 ++++ 1 file changed, 4 insertions(+)