Message ID | alpine.DEB.2.21.2209211921250.29493@angie.orcam.me.uk (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Series | [v3] PCI: Sanitise firmware BAR assignments behind a PCI-PCI bridge | expand |
On Wed, Sep 21, 2022 at 08:49:16PM +0100, Maciej W. Rozycki wrote: > Fix an issue with the Tyan Tomcat IV S1564D system, the BIOS of which > does not assign PCI buses beyond #2, where our resource reallocation > code preserves the reset default of an I/O BAR assignment outside its > upstream PCI-to-PCI bridge's I/O forwarding range: > > pci 0000:06:08.0: BAR 4: no space for [io size 0x0020] > pci 0000:06:08.0: BAR 4: trying firmware assignment [io 0xfce0-0xfcff] > pci 0000:06:08.0: BAR 4: assigned [io 0xfce0-0xfcff] > [...] > pci_bus 0000:06: resource 0 [io 0x2000-0x2fff] > > Consequently when the device driver tries to access 06:08.0 according to > its designated address range it pokes at an unassigned I/O location, > likely subtractively decoded by the southbridge and forwarded to ISA, > causing the driver to become confused and bail out: > > uhci_hcd 0000:06:08.0: host system error, PCI problems? > uhci_hcd 0000:06:08.0: host controller process error, something bad happened! > uhci_hcd 0000:06:08.0: host controller halted, very bad! > uhci_hcd 0000:06:08.0: HCRESET not completed yet! > uhci_hcd 0000:06:08.0: HC died; cleaning up > > if good luck happens or if bad luck does, an infinite flood of messages: > > uhci_hcd 0000:06:08.0: host system error, PCI problems? > uhci_hcd 0000:06:08.0: host controller process error, something bad happened! > > making the system virtually unusable. > > This is because we try to retain any BAR assignment the firmware may > have made here, which may be necessary for devices on the root bus with > some systems, but cannot work for devices that are behind a PCI-to-PCI > bridge where the BAR assignment is outside the upstream bridge's > forwarding range. > > Make sure then for a device behind a PCI-to-PCI bridge that any firmware > assignment is within the bridge's relevant forwarding window or do not > restore the assignment, fixing the system concerned as follows: > > pci 0000:06:08.0: BAR 4: no space for [io size 0x0020] > pci 0000:06:08.0: BAR 4: failed to assign [io 0xfce0-0xfcff] > [...] > pci 0000:06:08.0: BAR 4: assigned [io 0x2000-0x201f] > > and making device 06:08.0 work correctly. > > Cf. <https://bugzilla.kernel.org/show_bug.cgi?id=16263> > > Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> > Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203012338460.46819@angie.orcam.me.uk > Fixes: 58c84eda0756 ("PCI: fall back to original BIOS BAR addresses") > Cc: stable@vger.kernel.org # v2.6.35+ > --- > Hi Bjorn, > > I have trimmed the change description down as you requested and left the > change proper unmodified, as discussed in my earlier response. I think this is great. It shouldn't have taken me this long, so thanks for persevering. I think we can use pci_upstream_bridge() as below. Let me know if not. Here it is as I applied to pci/resource for v6.1: commit 0e3281839742 ("PCI: Sanitise firmware BAR assignments behind a PCI-PCI bridge") Author: Maciej W. Rozycki <macro@orcam.me.uk> Date: Wed Sep 21 20:49:16 2022 +0100 PCI: Sanitise firmware BAR assignments behind a PCI-PCI bridge When pci_assign_resource() is unable to assign resources to a BAR, it uses pci_revert_fw_address() to fall back to a firmware assignment (if any). Previously pci_revert_fw_address() assumed all addresses could reach the device, but this is not true if the device is below a bridge that only forwards addresses within its windows. This problem was observed on a Tyan Tomcat IV S1564D system where the BIOS did not assign valid addresses to several bridges and USB devices: pci 0000:00:11.0: PCI-to-PCIe bridge to [bus 01-ff] pci 0000:00:11.0: bridge window [io 0xe000-0xefff] pci 0000:01:00.0: PCIe Upstream Port to [bus 02-ff] pci 0000:01:00.0: bridge window [io 0x0000-0x0fff] # unreachable pci 0000:02:02.0: PCIe Downstream Port to [bus 05-ff] pci 0000:02:02.0: bridge window [io 0x0000-0x0fff] # unreachable pci 0000:05:00.0: PCIe-to-PCI bridge to [bus 06-ff] pci 0000:05:00.0: bridge window [io 0x0000-0x0fff] # unreachable pci 0000:06:08.0: USB UHCI 1.1 pci 0000:06:08.0: BAR 4: [io 0xfce0-0xfcff] # unreachable pci 0000:06:08.1: USB UHCI 1.1 pci 0000:06:08.1: BAR 4: [io 0xfce0-0xfcff] # unreachable pci 0000:06:08.0: can't claim BAR 4 [io 0xfce0-0xfcff]: no compatible bridge window pci 0000:06:08.1: can't claim BAR 4 [io 0xfce0-0xfcff]: no compatible bridge window During the first pass of assigning unassigned resources, there was not enough I/O space available, so we couldn't assign the 06:08.0 BAR and reverted to the firmware assignment (still unreachable). Reverting the 06:08.1 assignment failed because it conflicted with 06:08.0: pci 0000:00:11.0: bridge window [io 0xe000-0xefff] pci 0000:01:00.0: no space for bridge window [io size 0x2000] pci 0000:02:02.0: no space for bridge window [io size 0x1000] pci 0000:05:00.0: no space for bridge window [io size 0x1000] pci 0000:06:08.0: BAR 4: no space for [io size 0x0020] pci 0000:06:08.0: BAR 4: trying firmware assignment [io 0xfce0-0xfcff] pci 0000:06:08.1: BAR 4: no space for [io size 0x0020] pci 0000:06:08.1: BAR 4: trying firmware assignment [io 0xfce0-0xfcff] pci 0000:06:08.1: BAR 4: [io 0xfce0-0xfcff] conflicts with 0000:06:08.0 [io 0xfce0-0xfcff] A subsequent pass assigned valid bridge windows and a valid 06:08.1 BAR, but left the 06:08.0 BAR alone, so the UHCI device was still unusable: pci 0000:00:11.0: bridge window [io 0xe000-0xefff] released pci 0000:00:11.0: bridge window [io 0x1000-0x2fff] # reassigned pci 0000:01:00.0: bridge window [io 0x1000-0x2fff] # reassigned pci 0000:02:02.0: bridge window [io 0x2000-0x2fff] # reassigned pci 0000:05:00.0: bridge window [io 0x2000-0x2fff] # reassigned pci 0000:06:08.0: BAR 4: assigned [io 0xfce0-0xfcff] # left alone pci 0000:06:08.1: BAR 4: assigned [io 0x2000-0x201f] ... uhci_hcd 0000:06:08.0: host system error, PCI problems? uhci_hcd 0000:06:08.0: host controller process error, something bad happened! uhci_hcd 0000:06:08.0: host controller halted, very bad! uhci_hcd 0000:06:08.0: HCRESET not completed yet! uhci_hcd 0000:06:08.0: HC died; cleaning up If the address assigned by firmware is not reachable because it's not within upstream bridge windows, fail instead of assigning the unusable address from firmware. [bhelgaas: commit log, use pci_upstream_bridge()] Link: https://bugzilla.kernel.org/show_bug.cgi?id=16263 Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203012338460.46819@angie.orcam.me.uk Link: https://lore.kernel.org/r/alpine.DEB.2.21.2209211921250.29493@angie.orcam.me.uk Fixes: 58c84eda0756 ("PCI: fall back to original BIOS BAR addresses") Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v2.6.35+ diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c index 439ac5f5907a..b492e67c3d87 100644 --- a/drivers/pci/setup-res.c +++ b/drivers/pci/setup-res.c @@ -214,6 +214,17 @@ static int pci_revert_fw_address(struct resource *res, struct pci_dev *dev, root = pci_find_parent_resource(dev, res); if (!root) { + /* + * If dev is behind a bridge, accesses will only reach it + * if res is inside the relevant bridge window. + */ + if (pci_upstream_bridge(dev)) + return -ENXIO; + + /* + * On the root bus, assume the host bridge will forward + * everything. + */ if (res->flags & IORESOURCE_IO) root = &ioport_resource; else
On Wed, 21 Sep 2022, Bjorn Helgaas wrote: > > I have trimmed the change description down as you requested and left the > > change proper unmodified, as discussed in my earlier response. > > I think this is great. It shouldn't have taken me this long, so > thanks for persevering. No worries, owing to various distractions it took me way too long to iterate over this change too. > I think we can use pci_upstream_bridge() as below. Let me know if > not. Based on how the helper has been documented I think you are right. Also I have verified your branch with my hardware and it still works. > Here it is as I applied to pci/resource for v6.1: Thank you! I find your rewritten change description a pleasure to read. I guess at this stage we'll never find out what the exact configuration was that has lead to commit 351fc6d1a517 ("PCI: Fix starting basis for resource requests"). At least it does not stand in the way. What a mess it was with the firmware side of the earlier PCI systems even once they've sorted the teething problems of the hardware side! I still need to figure out how to get PCI interrupt routing to automagically work in the I/O APIC mode with this x86 machine and its broken MP-table which reports PCI interrupts as ISA interrupts, and then INTA only (at least our PIRQ router code now handles PCI-to-PCI bridges in the PIC mode). And it would have been so easy to get it right even with a fixed table (the BIOS seems to build the table dynamically for no good reason even though the wiring is fixed in hardware)! Maciej
Index: linux-macro/drivers/pci/setup-res.c =================================================================== --- linux-macro.orig/drivers/pci/setup-res.c +++ linux-macro/drivers/pci/setup-res.c @@ -212,9 +212,19 @@ static int pci_revert_fw_address(struct res->end = res->start + size - 1; res->flags &= ~IORESOURCE_UNSET; + /* + * If we're behind a P2P or CardBus bridge, make sure we're + * inside the relevant forwarding window, or otherwise the + * assignment must have been bogus and accesses intended for + * the range assigned would not reach the device anyway. + * On the root bus accept anything under the assumption the + * host bridge will let it through. + */ root = pci_find_parent_resource(dev, res); if (!root) { - if (res->flags & IORESOURCE_IO) + if (dev->bus->parent) + return -ENXIO; + else if (res->flags & IORESOURCE_IO) root = &ioport_resource; else root = &iomem_resource;
Fix an issue with the Tyan Tomcat IV S1564D system, the BIOS of which does not assign PCI buses beyond #2, where our resource reallocation code preserves the reset default of an I/O BAR assignment outside its upstream PCI-to-PCI bridge's I/O forwarding range: pci 0000:06:08.0: BAR 4: no space for [io size 0x0020] pci 0000:06:08.0: BAR 4: trying firmware assignment [io 0xfce0-0xfcff] pci 0000:06:08.0: BAR 4: assigned [io 0xfce0-0xfcff] [...] pci_bus 0000:06: resource 0 [io 0x2000-0x2fff] Consequently when the device driver tries to access 06:08.0 according to its designated address range it pokes at an unassigned I/O location, likely subtractively decoded by the southbridge and forwarded to ISA, causing the driver to become confused and bail out: uhci_hcd 0000:06:08.0: host system error, PCI problems? uhci_hcd 0000:06:08.0: host controller process error, something bad happened! uhci_hcd 0000:06:08.0: host controller halted, very bad! uhci_hcd 0000:06:08.0: HCRESET not completed yet! uhci_hcd 0000:06:08.0: HC died; cleaning up if good luck happens or if bad luck does, an infinite flood of messages: uhci_hcd 0000:06:08.0: host system error, PCI problems? uhci_hcd 0000:06:08.0: host controller process error, something bad happened! making the system virtually unusable. This is because we try to retain any BAR assignment the firmware may have made here, which may be necessary for devices on the root bus with some systems, but cannot work for devices that are behind a PCI-to-PCI bridge where the BAR assignment is outside the upstream bridge's forwarding range. Make sure then for a device behind a PCI-to-PCI bridge that any firmware assignment is within the bridge's relevant forwarding window or do not restore the assignment, fixing the system concerned as follows: pci 0000:06:08.0: BAR 4: no space for [io size 0x0020] pci 0000:06:08.0: BAR 4: failed to assign [io 0xfce0-0xfcff] [...] pci 0000:06:08.0: BAR 4: assigned [io 0x2000-0x201f] and making device 06:08.0 work correctly. Cf. <https://bugzilla.kernel.org/show_bug.cgi?id=16263> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203012338460.46819@angie.orcam.me.uk Fixes: 58c84eda0756 ("PCI: fall back to original BIOS BAR addresses") Cc: stable@vger.kernel.org # v2.6.35+ --- Hi Bjorn, I have trimmed the change description down as you requested and left the change proper unmodified, as discussed in my earlier response. Let me know if you have any other concerns with this fix. Maciej Changes from v2: - Change description trimmed and rephrased, link to a full bootstrap log earlier on in discussion added. Changes from v1: - Do restore firmware BAR assignments behind a PCI-PCI bridge, but only if within the bridge's forwarding window. - Update the change description and heading accordingly (was: PCI: Do not restore firmware BAR assignments behind a PCI-PCI bridge). --- drivers/pci/setup-res.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) linux-pci-setup-res-fw-address-nobridge.diff