Message ID | 8e4bcd4116fd94f592f2bf2749f168099c480ddf.1718707743.git.lukas@wunner.de (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Krzysztof Wilczyński |
Headers | show |
Series | PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal | expand |
On Tue, Jun 18, 2024 at 12:54:55PM +0200, Lukas Wunner wrote: > Keith reports a use-after-free when a DPC event occurs concurrently to > hot-removal of the same portion of the hierarchy: > > The dpc_handler() awaits readiness of the secondary bus below the > Downstream Port where the DPC event occurred. To do so, it polls the > config space of the first child device on the secondary bus. If that > child device is concurrently removed, accesses to its struct pci_dev > cause the kernel to oops. > > That's because pci_bridge_wait_for_secondary_bus() neglects to hold a > reference on the child device. Before v6.3, the function was only > called on resume from system sleep or on runtime resume. Holding a > reference wasn't necessary back then because the pciehp IRQ thread > could never run concurrently. (On resume from system sleep, IRQs are > not enabled until after the resume_noirq phase. And runtime resume is > always awaited before a PCI device is removed.) > > However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also > called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness > of secondary bus after reset"), which introduced that, failed to > appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a > reference on the child device because dpc_handler() and pciehp may > indeed run concurrently. The commit was backported to v5.10+ stable > kernels, so that's the oldest one affected. > > Add the missing reference acquisition. > > Abridged stack trace: > > BUG: unable to handle page fault for address: 00000000091400c0 > CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0 > RIP: pci_bus_read_config_dword+0x17/0x50 > pci_dev_wait() > pci_bridge_wait_for_secondary_bus() > dpc_reset_link() > pcie_do_recovery() > dpc_handler() > > Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset") > Reported-by: Keith Busch <kbusch@kernel.org> > Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/ > Tested-by: Keith Busch <kbusch@kernel.org> > Signed-off-by: Lukas Wunner <lukas@wunner.de> > Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
On Tue, Jun 18, 2024 at 12:54:55PM +0200, Lukas Wunner wrote: > However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also > called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness > of secondary bus after reset"), which introduced that, failed to > appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a > reference on the child device because dpc_handler() and pciehp may > indeed run concurrently. The commit was backported to v5.10+ stable > kernels, so that's the oldest one affected. Caution on applying this to 5.10 and 5.15 stable branches: they don't have the fancy "__free" cleanup you're using here. The newer active stables are okay, though. > int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type) > { > - struct pci_dev *child; > + struct pci_dev *child __free(pci_dev_put) = NULL; > int delay;
On Tue, Jun 18, 2024 at 10:12:32AM -0600, Keith Busch wrote: > On Tue, Jun 18, 2024 at 12:54:55PM +0200, Lukas Wunner wrote: > > However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also > > called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness > > of secondary bus after reset"), which introduced that, failed to > > appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a > > reference on the child device because dpc_handler() and pciehp may > > indeed run concurrently. The commit was backported to v5.10+ stable > > kernels, so that's the oldest one affected. > > Caution on applying this to 5.10 and 5.15 stable branches: they don't > have the fancy "__free" cleanup you're using here. The newer active > stables are okay, though. I'll let Greg & Sasha know when they start applying this to stable kernels that ced085ef369a is a prerequisite for v5.10-stable and v5.15-stable. I can rework the patch if they don't want to apply ced085ef369a to these older versions. Thanks, Lukas
On Tue, Jun 18, 2024 at 12:54:55PM +0200, Lukas Wunner wrote: > Add the missing reference acquisition. > > Abridged stack trace: > > BUG: unable to handle page fault for address: 00000000091400c0 > CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0 > RIP: pci_bus_read_config_dword+0x17/0x50 > pci_dev_wait() > pci_bridge_wait_for_secondary_bus() > dpc_reset_link() > pcie_do_recovery() > dpc_handler() > > Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset") > Reported-by: Keith Busch <kbusch@kernel.org> > Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/ > Tested-by: Keith Busch <kbusch@kernel.org> > Signed-off-by: Lukas Wunner <lukas@wunner.de> > Reviewed-by: Keith Busch <kbusch@kernel.org> > Cc: stable@vger.kernel.org # v5.10+ Hey, it's been some time, so just wanted to check on this patch status. It's still a good fix.
Hello, > Keith reports a use-after-free when a DPC event occurs concurrently to > hot-removal of the same portion of the hierarchy: > > The dpc_handler() awaits readiness of the secondary bus below the > Downstream Port where the DPC event occurred. To do so, it polls the > config space of the first child device on the secondary bus. If that > child device is concurrently removed, accesses to its struct pci_dev > cause the kernel to oops. > > That's because pci_bridge_wait_for_secondary_bus() neglects to hold a > reference on the child device. Before v6.3, the function was only > called on resume from system sleep or on runtime resume. Holding a > reference wasn't necessary back then because the pciehp IRQ thread > could never run concurrently. (On resume from system sleep, IRQs are > not enabled until after the resume_noirq phase. And runtime resume is > always awaited before a PCI device is removed.) > > However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also > called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness > of secondary bus after reset"), which introduced that, failed to > appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a > reference on the child device because dpc_handler() and pciehp may > indeed run concurrently. The commit was backported to v5.10+ stable > kernels, so that's the oldest one affected. > > Add the missing reference acquisition. > > Abridged stack trace: > > BUG: unable to handle page fault for address: 00000000091400c0 > CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0 > RIP: pci_bus_read_config_dword+0x17/0x50 > pci_dev_wait() > pci_bridge_wait_for_secondary_bus() > dpc_reset_link() > pcie_do_recovery() > dpc_handler() Applied to dpc, thank you! [1/1] PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal https://git.kernel.org/pci/pci/c/11a1f4bc4736 Krzysztof
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index d2c388761ba9..0e7cb507a5d6 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4753,7 +4753,7 @@ static int pci_bus_max_d3cold_delay(const struct pci_bus *bus) */ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type) { - struct pci_dev *child; + struct pci_dev *child __free(pci_dev_put) = NULL; int delay; if (pci_dev_is_disconnected(dev)) @@ -4782,8 +4782,8 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type) return 0; } - child = list_first_entry(&dev->subordinate->devices, struct pci_dev, - bus_list); + child = pci_dev_get(list_first_entry(&dev->subordinate->devices, + struct pci_dev, bus_list)); up_read(&pci_bus_sem); /*