Message ID | 20240507213125.804474-1-alex.williamson@redhat.com (mailing list archive) |
---|---|
State | New |
Delegated to: | Bjorn Helgaas |
Series | PCI: Release unused bridge resources during resize |
On Tue, 7 May 2024, Alex Williamson wrote:

> Resizing BARs can be blocked when a device in the bridge hierarchy
> itself consumes resources from the resized range. This scenario is
> common with Intel Arc DG2 GPUs where the following is a typical
> topology:
>
> +-[0000:5d]-+-00.0-[5e-61]----00.0-[5f-61]--+-01.0-[60]----00.0 Intel Corporation DG2 [Arc A380]
>                                             \-04.0-[61]----00.0 Intel Corporation DG2 Audio Controller
>
> Here the system BIOS has provided a large 64bit, prefetchable window:
>
> pci_bus 0000:5d: root bus resource [mem 0xb000000000-0xbfffffffff window]
>
> But only a small portion is programmed into the root port aperture:
>
> pci 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]
>
> The upstream port then provides the following aperture:
>
> pci 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
>
> With the missing range found to be consumed by the switch port itself:
>
> pci 0000:5e:00.0: BAR 0 [mem 0xbff0000000-0xbff07fffff 64bit pref]
>
> The downstream port above the GPU provides the same aperture as upstream:
>
> pci 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
>
> Which is entirely consumed by the GPU:
>
> pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]
>
> In summary, iomem reports the following:
>
> b000000000-bfffffffff : PCI Bus 0000:5d
>   bfe0000000-bff07fffff : PCI Bus 0000:5e
>     bfe0000000-bfefffffff : PCI Bus 0000:5f
>       bfe0000000-bfefffffff : PCI Bus 0000:60
>         bfe0000000-bfefffffff : 0000:60:00.0
>     bff0000000-bff07fffff : 0000:5e:00.0
>
> The GPU at 0000:60:00.0 supports a Resizable BAR:
>
> Capabilities: [420 v1] Physical Resizable BAR
>   BAR 2: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB 8GB
>
> However when attempting a resize we get -ENOSPC:
>
> pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> pcieport 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> pcieport 0000:5e:00.0: bridge window [mem size 0x200000000 64bit pref]: can't assign; no space
> pcieport 0000:5e:00.0: bridge window [mem size 0x200000000 64bit pref]: failed to assign
> pcieport 0000:5f:01.0: bridge window [mem size 0x200000000 64bit pref]: can't assign; no space
> pcieport 0000:5f:01.0: bridge window [mem size 0x200000000 64bit pref]: failed to assign
> pci 0000:60:00.0: BAR 2 [mem size 0x200000000 64bit pref]: can't assign; no space
> pci 0000:60:00.0: BAR 2 [mem size 0x200000000 64bit pref]: failed to assign
> pcieport 0000:5d:00.0: PCI bridge to [bus 5e-61]
> pcieport 0000:5d:00.0: bridge window [mem 0xb9000000-0xba0fffff]
> pcieport 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]
> pcieport 0000:5e:00.0: PCI bridge to [bus 5f-61]
> pcieport 0000:5e:00.0: bridge window [mem 0xb9000000-0xba0fffff]
> pcieport 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
> pcieport 0000:5f:01.0: PCI bridge to [bus 60]
> pcieport 0000:5f:01.0: bridge window [mem 0xb9000000-0xb9ffffff]
> pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
> pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: assigned
>
> In this example we need to resize all the way up to the root port
> aperture, but we refuse to change the root port aperture while resources
> are allocated for the upstream port BAR.
>
> The solution proposed here builds on the idea in commit 91fa127794ac
> ("PCI: Expose PCIe Resizable BAR support via sysfs") where the BAR can
> be resized while there is no driver attached. In this case, when there
> is no driver bound to the upstream switch port we'll release resources
> of the bridge which match the reallocation. Therefore we can achieve
> the below successful resize operation by unbinding 0000:5e:00.0 from the
> pcieport driver before invoking the resource2_resize interface on the
> GPU at 0000:60:00.0.
>
> pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> pci 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> pci 0000:5e:00.0: BAR 0 [mem 0xbff0000000-0xbff07fffff 64bit pref]: releasing
> pcieport 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]: releasing
> pcieport 0000:5d:00.0: bridge window [mem 0xb000000000-0xb2ffffffff 64bit pref]: assigned
> pci 0000:5e:00.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
> pci 0000:5e:00.0: BAR 0 [mem 0xb200000000-0xb2007fffff 64bit pref]: assigned
> pcieport 0000:5f:01.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
> pci 0000:60:00.0: BAR 2 [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
> pci 0000:5e:00.0: PCI bridge to [bus 5f-61]
> pci 0000:5e:00.0: bridge window [mem 0xb9000000-0xba0fffff]
> pci 0000:5e:00.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]
> pcieport 0000:5d:00.0: PCI bridge to [bus 5e-61]
> pcieport 0000:5d:00.0: bridge window [mem 0xb9000000-0xba0fffff]
> pcieport 0000:5d:00.0: bridge window [mem 0xb000000000-0xb2ffffffff 64bit pref]
> pci 0000:5e:00.0: PCI bridge to [bus 5f-61]
> pci 0000:5e:00.0: bridge window [mem 0xb9000000-0xba0fffff]
> pci 0000:5e:00.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]
> pcieport 0000:5f:01.0: PCI bridge to [bus 60]
> pcieport 0000:5f:01.0: bridge window [mem 0xb9000000-0xb9ffffff]
> pcieport 0000:5f:01.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]
>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Yes. This looks like another case where an already assigned resource
prevents some operation from succeeding.

> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 909e6a7c3cc3..15fc8e4e84c9 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -2226,6 +2226,26 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
>  }
>  EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
>
> +static void pci_release_resource_type(struct pci_dev *pdev, unsigned long type)
> +{
> +	int i;
> +
> +	if (!device_trylock(&pdev->dev))
> +		return;
> +
> +	if (pdev->dev.driver)

Isn't portdrv bound to bridges, so how does this end up working?

On Mon, 13 May 2024 16:46:09 +0300 (EEST)
Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> wrote:

> On Tue, 7 May 2024, Alex Williamson wrote:
>
> [...]
> > +static void pci_release_resource_type(struct pci_dev *pdev, unsigned long type)
> > +{
> > +	int i;
> > +
> > +	if (!device_trylock(&pdev->dev))
> > +		return;
> > +
> > +	if (pdev->dev.driver)
>
> Isn't portdrv bound to bridges, so how does this end up working?

The user will need to unbind the bridge from the driver, just like
they'd need to unbind the endpoint from a driver to resize a BAR
through sysfs. I'm not sure how else to avoid races with drivers
requesting resources other than to assert that there is no driver for
the device. Do you have an alternative suggestion? Thanks,

Alex

On Thu, 16 May 2024, Alex Williamson wrote:

> On Mon, 13 May 2024 16:46:09 +0300 (EEST)
> Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> wrote:
>
> > On Tue, 7 May 2024, Alex Williamson wrote:
> >
> > [...]
> > > +	if (!device_trylock(&pdev->dev))
> > > +		return;
> > > +
> > > +	if (pdev->dev.driver)
> >
> > Isn't portdrv bound to bridges, so how does this end up working?
>
> The user will need to unbind the bridge from the driver, just like
> they'd need to unbind the endpoint from a driver to resize a BAR
> through sysfs. I'm not sure how else to avoid races with drivers
> requesting resources other than to assert that there is no driver for
> the device. Do you have an alternative suggestion? Thanks,

Okay, understood. It just wasn't immediately obvious there was this
additional requirement related to unbinding the portdrv.

Ping. Any further thoughts on this approach or suggestions for
something different? Thanks,

Alex

On Tue, 7 May 2024 15:31:23 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> [...]
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  drivers/pci/setup-bus.c | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 909e6a7c3cc3..15fc8e4e84c9 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -2226,6 +2226,26 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
>  }
>  EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
>
> +static void pci_release_resource_type(struct pci_dev *pdev, unsigned long type)
> +{
> +	int i;
> +
> +	if (!device_trylock(&pdev->dev))
> +		return;
> +
> +	if (pdev->dev.driver)
> +		goto unlock;
> +
> +	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
> +		if (pci_resource_len(pdev, i) &&
> +		    !((pci_resource_flags(pdev, i) ^ type) & PCI_RES_TYPE_MASK))
> +			pci_release_resource(pdev, i);
> +	}
> +
> +unlock:
> +	device_unlock(&pdev->dev);
> +}
> +
>  int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
>  {
>  	struct pci_dev_resource *dev_res;
> @@ -2260,8 +2280,10 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
>
>  			pci_info(bridge, "%s %pR: releasing\n", res_name, res);
>
> -			if (res->parent)
> +			if (res->parent) {
>  				release_resource(res);
> +				pci_release_resource_type(bridge, type);
> +			}
>  			res->start = 0;
>  			res->end = 0;
>  			break;

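The BAR filter in the quoted pci_release_resource_type() relies on an XOR-and-mask comparison, which is easy to misread. The following is a userspace sketch only, not kernel code: the flag values and the PCI_RES_TYPE_MASK definition are reproduced from memory of include/linux/ioport.h and drivers/pci/setup-bus.c and should be treated as assumptions, and res_type_matches() is a hypothetical name used for illustration.

```c
#include <stdbool.h>

/* Assumed to mirror include/linux/ioport.h; copied here for illustration. */
#define IORESOURCE_IO        0x00000100UL
#define IORESOURCE_MEM       0x00000200UL
#define IORESOURCE_PREFETCH  0x00002000UL
#define IORESOURCE_MEM_64    0x00100000UL

/* Assumed to match the mask defined in drivers/pci/setup-bus.c. */
#define PCI_RES_TYPE_MASK \
	(IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH | IORESOURCE_MEM_64)

/*
 * Hypothetical helper: true when a BAR's flags agree with the requested
 * type on every bit covered by the mask. XOR leaves a 1 wherever the two
 * values differ, so a zero result after masking means "same type".
 * Bits outside the mask are ignored.
 */
static bool res_type_matches(unsigned long flags, unsigned long type)
{
	return !((flags ^ type) & PCI_RES_TYPE_MASK);
}
```

Under these assumptions, a 64-bit prefetchable memory BAR such as BAR 0 of 0000:5e:00.0 matches a 64-bit prefetchable request and would be released, while a plain 32-bit memory BAR would be left alone.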
On Tue, May 07, 2024 at 03:31:23PM -0600, Alex Williamson wrote:
> [...]
> The solution proposed here builds on the idea in commit 91fa127794ac
> ("PCI: Expose PCIe Resizable BAR support via sysfs") where the BAR can
> be resized while there is no driver attached. In this case, when there
> is no driver bound to the upstream switch port we'll release resources
> of the bridge which match the reallocation. Therefore we can achieve
> the below successful resize operation by unbinding 0000:5e:00.0 from the
> pcieport driver before invoking the resource2_resize interface on the
> GPU at 0000:60:00.0.

resource2_resize? Oh, I guess this is the sysfs interface
(resourceN_resize, which leads to pci_resize_resource()), and in this
case we're resizing BAR 2 to 8GB, so it must have been something like
this? (2 ^ (13+20) == 8G)

  # echo 0000:5f:01.0 > /sys/bus/pci/drivers/pcieport/unbind
  # echo 0000:5e:00.0 > /sys/bus/pci/drivers/pcieport/unbind
  # echo 0000:5d:00.0 > /sys/bus/pci/drivers/pcieport/unbind
  # echo 13 > /sys/bus/pci/devices/0000:60:00.0/resource2_resize

(Maybe we don't need 5d:00.0, since that looks like a Root Port and
doesn't have a BAR that needs to be released?)

And I guess we probably need to rebind pcieport afterwards so hotplug,
etc. will work again?

On Fri, 7 Jun 2024 17:33:20 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Tue, May 07, 2024 at 03:31:23PM -0600, Alex Williamson wrote:
> > Resizing BARs can be blocked when a device in the bridge hierarchy
> > itself consumes resources from the resized range. This scenario is
> > common with Intel Arc DG2 GPUs where the following is a typical
> > topology:
> >
> >  +-[0000:5d]-+-00.0-[5e-61]----00.0-[5f-61]--+-01.0-[60]----00.0  Intel Corporation DG2 [Arc A380]
> >                                              \-04.0-[61]----00.0  Intel Corporation DG2 Audio Controller
> >
> > Here the system BIOS has provided a large 64bit, prefetchable window:
> >
> >  pci_bus 0000:5d: root bus resource [mem 0xb000000000-0xbfffffffff window]
> >
> > But only a small portion is programmed into the root port aperture:
> >
> >  pci 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]
> >
> > The upstream port then provides the following aperture:
> >
> >  pci 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
> >
> > With the missing range found to be consumed by the switch port itself:
> >
> >  pci 0000:5e:00.0: BAR 0 [mem 0xbff0000000-0xbff07fffff 64bit pref]
> >
> > The downstream port above the GPU provides the same aperture as upstream:
> >
> >  pci 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
> >
> > Which is entirely consumed by the GPU:
> >
> >  pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]
> >
> > In summary, iomem reports the following:
> >
> >  b000000000-bfffffffff : PCI Bus 0000:5d
> >    bfe0000000-bff07fffff : PCI Bus 0000:5e
> >      bfe0000000-bfefffffff : PCI Bus 0000:5f
> >        bfe0000000-bfefffffff : PCI Bus 0000:60
> >          bfe0000000-bfefffffff : 0000:60:00.0
> >      bff0000000-bff07fffff : 0000:5e:00.0
> >
> > The GPU at 0000:60:00.0 supports a Resizable BAR:
> >
> >  Capabilities: [420 v1] Physical Resizable BAR
> >    BAR 2: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB 8GB
> >
> > However when attempting a resize we get -ENOSPC:
> >
> >  pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> >  pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> >  pcieport 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> >  pcieport 0000:5e:00.0: bridge window [mem size 0x200000000 64bit pref]: can't assign; no space
> >  pcieport 0000:5e:00.0: bridge window [mem size 0x200000000 64bit pref]: failed to assign
> >  pcieport 0000:5f:01.0: bridge window [mem size 0x200000000 64bit pref]: can't assign; no space
> >  pcieport 0000:5f:01.0: bridge window [mem size 0x200000000 64bit pref]: failed to assign
> >  pci 0000:60:00.0: BAR 2 [mem size 0x200000000 64bit pref]: can't assign; no space
> >  pci 0000:60:00.0: BAR 2 [mem size 0x200000000 64bit pref]: failed to assign
> >  pcieport 0000:5d:00.0: PCI bridge to [bus 5e-61]
> >  pcieport 0000:5d:00.0: bridge window [mem 0xb9000000-0xba0fffff]
> >  pcieport 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]
> >  pcieport 0000:5e:00.0: PCI bridge to [bus 5f-61]
> >  pcieport 0000:5e:00.0: bridge window [mem 0xb9000000-0xba0fffff]
> >  pcieport 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
> >  pcieport 0000:5f:01.0: PCI bridge to [bus 60]
> >  pcieport 0000:5f:01.0: bridge window [mem 0xb9000000-0xb9ffffff]
> >  pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
> >  pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: assigned
> >
> > In this example we need to resize all the way up to the root port
> > aperture, but we refuse to change the root port aperture while resources
> > are allocated for the upstream port BAR.
> >
> > The solution proposed here builds on the idea in commit 91fa127794ac
> > ("PCI: Expose PCIe Resizable BAR support via sysfs") where the BAR can
> > be resized while there is no driver attached. In this case, when there
> > is no driver bound to the upstream switch port we'll release resources
> > of the bridge which match the reallocation. Therefore we can achieve
> > the below successful resize operation by unbinding 0000:5e:00.0 from the
> > pcieport driver before invoking the resource2_resize interface on the
> > GPU at 0000:60:00.0.
>
> resource2_resize? Oh, I guess this is the sysfs interface
> (resourceN_resize, which leads to pci_resize_resource()), and in this
> case we're resizing BAR 2 to 8GB,

Exactly.

> so it must have been something like
> this? (2 ^ (13+20) == 8G)
>
>   # echo 0000:5f:01.0 > /sys/bus/pci/drivers/pcieport/unbind
>   # echo 0000:5e:00.0 > /sys/bus/pci/drivers/pcieport/unbind
>   # echo 0000:5d:00.0 > /sys/bus/pci/drivers/pcieport/unbind
>   # echo 13 > /sys/bus/pci/devices/0000:60:00.0/resource2_resize
>
> (Maybe we don't need 5d:00.0, since that looks like a Root Port and
> doesn't have a BAR that needs to be released?)

We don't actually need to unbind either the root port (5d:00.0) or the
downstream switch port (5f:01.0) since they don't consume any resources
from the aperture we need to resize.

For example, if this were an AMD GPU we'd have a similar PCIe switch
topology, but the upstream switch port does not expose a 64-bit
prefetchable BAR; only the GPU endpoint itself consumes resources from
that aperture. Therefore we'd only need to unbind the endpoint driver,
effect the resize, and rebind the endpoint driver. Ex (assuming the
same topology):

  # echo 0000:60:00.0 > /sys/bus/pci/devices/0000:60:00.0/driver/unbind
  # echo 13 > /sys/bus/pci/devices/0000:60:00.0/resource2_resize
  # echo 0000:60:00.0 > /sys/bus/pci/drivers_probe

The Intel GPU has made the unfortunate hardware decision to have the
upstream port consume resources from the same aperture as used by the
downstream resizable BAR, therefore the above steps fail with the
-ENOSPC error for an Intel Arc GPU. This proposal allows it to work as:

  # echo 0000:60:00.0 > /sys/bus/pci/devices/0000:60:00.0/driver/unbind
  # echo 0000:5e:00.0 > /sys/bus/pci/devices/0000:5e:00.0/driver/unbind
  # echo 13 > /sys/bus/pci/devices/0000:60:00.0/resource2_resize
  # echo 0000:5e:00.0 > /sys/bus/pci/drivers_probe
  # echo 0000:60:00.0 > /sys/bus/pci/drivers_probe

> And I guess we probably need to rebind pcieport afterwards so hotplug,
> etc will work again?

Yep. Thanks,

Alex

> >  pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> >  pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> >  pci 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> >  pci 0000:5e:00.0: BAR 0 [mem 0xbff0000000-0xbff07fffff 64bit pref]: releasing
> >  pcieport 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]: releasing
> >  pcieport 0000:5d:00.0: bridge window [mem 0xb000000000-0xb2ffffffff 64bit pref]: assigned
> >  pci 0000:5e:00.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
> >  pci 0000:5e:00.0: BAR 0 [mem 0xb200000000-0xb2007fffff 64bit pref]: assigned
> >  pcieport 0000:5f:01.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
> >  pci 0000:60:00.0: BAR 2 [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
> >  pci 0000:5e:00.0: PCI bridge to [bus 5f-61]
> >  pci 0000:5e:00.0: bridge window [mem 0xb9000000-0xba0fffff]
> >  pci 0000:5e:00.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]
> >  pcieport 0000:5d:00.0: PCI bridge to [bus 5e-61]
> >  pcieport 0000:5d:00.0: bridge window [mem 0xb9000000-0xba0fffff]
> >  pcieport 0000:5d:00.0: bridge window [mem 0xb000000000-0xb2ffffffff 64bit pref]
> >  pci 0000:5e:00.0: PCI bridge to [bus 5f-61]
> >  pci 0000:5e:00.0: bridge window [mem 0xb9000000-0xba0fffff]
> >  pci 0000:5e:00.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]
> >  pcieport 0000:5f:01.0: PCI bridge to [bus 60]
> >  pcieport 0000:5f:01.0: bridge window [mem 0xb9000000-0xb9ffffff]
> >  pcieport 0000:5f:01.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]
> >
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > ---
> >  drivers/pci/setup-bus.c | 24 +++++++++++++++++++++++-
> >  1 file changed, 23 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> > index 909e6a7c3cc3..15fc8e4e84c9 100644
> > --- a/drivers/pci/setup-bus.c
> > +++ b/drivers/pci/setup-bus.c
> > @@ -2226,6 +2226,26 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
> >  }
> >  EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
> >  
> > +static void pci_release_resource_type(struct pci_dev *pdev, unsigned long type)
> > +{
> > +	int i;
> > +
> > +	if (!device_trylock(&pdev->dev))
> > +		return;
> > +
> > +	if (pdev->dev.driver)
> > +		goto unlock;
> > +
> > +	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
> > +		if (pci_resource_len(pdev, i) &&
> > +		    !((pci_resource_flags(pdev, i) ^ type) & PCI_RES_TYPE_MASK))
> > +			pci_release_resource(pdev, i);
> > +	}
> > +
> > +unlock:
> > +	device_unlock(&pdev->dev);
> > +}
> > +
> >  int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
> >  {
> >  	struct pci_dev_resource *dev_res;
> > @@ -2260,8 +2280,10 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
> >  
> >  			pci_info(bridge, "%s %pR: releasing\n", res_name, res);
> >  
> > -			if (res->parent)
> > +			if (res->parent) {
> >  				release_resource(res);
> > +				pci_release_resource_type(bridge, type);
> > +			}
> >  			res->start = 0;
> >  			res->end = 0;
> >  			break;
> > -- 
> > 2.44.0
> >
>
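The value written to resource2_resize in the exchange above is a size index rather than a byte count; going by the (2 ^ (13+20) == 8G) arithmetic in the thread, index N corresponds to 2^(N+20) bytes. A minimal sketch of that conversion follows. The helper names rebar_index_to_bytes and rebar_bytes_to_index are hypothetical and are not part of the patch or of any kernel tooling.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of the patch.

# rebar_index_to_bytes N: print the BAR size in bytes for size index N,
# using the 2^(N+20) relation quoted in the thread.
rebar_index_to_bytes() {
    echo $(( 1 << ($1 + 20) ))
}

# rebar_bytes_to_index BYTES: print the smallest index whose size is at
# least BYTES (BYTES is expected to be a power of two, 1MB or larger).
rebar_bytes_to_index() {
    bytes=$1
    index=0
    while [ $(( (1 << (index + 20)) < bytes )) -eq 1 ]; do
        index=$(( index + 1 ))
    done
    echo "$index"
}

rebar_index_to_bytes 13                  # the 8GB request, 0x200000000 bytes
rebar_bytes_to_index $(( 0x10000000 ))   # the current 256MB BAR
```

With these helpers, the "mem size 0x200000000" seen in the failing log is the byte size for index 13, and the 256MB to 8GB sizes listed by lspci span indexes 8 through 13.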
On Fri, Jun 07, 2024 at 05:01:56PM -0600, Alex Williamson wrote:
> On Fri, 7 Jun 2024 17:33:20 -0500
> Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> > On Tue, May 07, 2024 at 03:31:23PM -0600, Alex Williamson wrote:
> > > Resizing BARs can be blocked when a device in the bridge hierarchy
> > > itself consumes resources from the resized range. This scenario is
> > > common with Intel Arc DG2 GPUs where the following is a typical
> > > topology:
> > >
> > >  +-[0000:5d]-+-00.0-[5e-61]----00.0-[5f-61]--+-01.0-[60]----00.0  Intel Corporation DG2 [Arc A380]
> > >                                              \-04.0-[61]----00.0  Intel Corporation DG2 Audio Controller
> ...
> > >  pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> > >  pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> > >  pci 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
> > >  pci 0000:5e:00.0: BAR 0 [mem 0xbff0000000-0xbff07fffff 64bit pref]: releasing
> > >  pcieport 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]: releasing
> > >  pcieport 0000:5d:00.0: bridge window [mem 0xb000000000-0xb2ffffffff 64bit pref]: assigned
> > >  pci 0000:5e:00.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
> > >  pci 0000:5e:00.0: BAR 0 [mem 0xb200000000-0xb2007fffff 64bit pref]: assigned

It didn't occur to me before, but don't we have a potential issue here
because we relocated 5e:00.0 BAR 0 from 0xbff0000000 to 0xb200000000
without notifying any code that might be using it?

I'm worried about vendor-specific perf/EDAC/etc drivers that can't
claim the bridge using pci_register_driver() because the pcieport
driver is in the way. I think some of those drivers go behind the
back of the driver model to locate their device and ioremap the BAR
directly.

Bjorn
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 909e6a7c3cc3..15fc8e4e84c9 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2226,6 +2226,26 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
 }
 EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
 
+static void pci_release_resource_type(struct pci_dev *pdev, unsigned long type)
+{
+	int i;
+
+	if (!device_trylock(&pdev->dev))
+		return;
+
+	if (pdev->dev.driver)
+		goto unlock;
+
+	for (i = 0; i < PCI_STD_NUM_BARS; i++) {
+		if (pci_resource_len(pdev, i) &&
+		    !((pci_resource_flags(pdev, i) ^ type) & PCI_RES_TYPE_MASK))
+			pci_release_resource(pdev, i);
+	}
+
+unlock:
+	device_unlock(&pdev->dev);
+}
+
 int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
 {
 	struct pci_dev_resource *dev_res;
@@ -2260,8 +2280,10 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
 
 			pci_info(bridge, "%s %pR: releasing\n", res_name, res);
 
-			if (res->parent)
+			if (res->parent) {
 				release_resource(res);
+				pci_release_resource_type(bridge, type);
+			}
 			res->start = 0;
 			res->end = 0;
 			break;
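The flag test in pci_release_resource_type() releases a BAR only when its type bits equal the requested type: XOR leaves a bit set wherever the two values differ, and the mask discards every bit that is not a type bit. The same comparison can be sketched with shell arithmetic. The numeric flag values below are assumptions recalled from include/linux/ioport.h and should be checked against the tree, and type_matches is a hypothetical name used only for this illustration.

```shell
#!/bin/sh
# Assumed flag values; verify against include/linux/ioport.h before relying on them.
IORESOURCE_IO=$(( 0x00000100 ))
IORESOURCE_MEM=$(( 0x00000200 ))
IORESOURCE_PREFETCH=$(( 0x00002000 ))
IORESOURCE_MEM_64=$(( 0x00100000 ))
PCI_RES_TYPE_MASK=$(( IORESOURCE_IO | IORESOURCE_MEM | IORESOURCE_PREFETCH | IORESOURCE_MEM_64 ))

# type_matches FLAGS TYPE: succeed when FLAGS and TYPE agree on every bit
# covered by PCI_RES_TYPE_MASK (hypothetical helper mirroring the patch's test).
type_matches() {
    [ $(( ($1 ^ $2) & PCI_RES_TYPE_MASK )) -eq 0 ]
}

# The resize in this thread reallocates "64bit pref" memory.
PREF64=$(( IORESOURCE_MEM | IORESOURCE_PREFETCH | IORESOURCE_MEM_64 ))

type_matches "$PREF64" "$PREF64" && echo "64bit pref BAR: would be released"
type_matches "$IORESOURCE_MEM" "$PREF64" || echo "non-prefetchable 32bit BAR: left alone"
```

Under those assumed values, a BAR like 5e:00.0 BAR 0 ("64bit pref") matches the reallocated type, while a plain 32-bit memory BAR on the same bridge would not be touched.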
Resizing BARs can be blocked when a device in the bridge hierarchy
itself consumes resources from the resized range. This scenario is
common with Intel Arc DG2 GPUs where the following is a typical
topology:

 +-[0000:5d]-+-00.0-[5e-61]----00.0-[5f-61]--+-01.0-[60]----00.0  Intel Corporation DG2 [Arc A380]
                                             \-04.0-[61]----00.0  Intel Corporation DG2 Audio Controller

Here the system BIOS has provided a large 64bit, prefetchable window:

 pci_bus 0000:5d: root bus resource [mem 0xb000000000-0xbfffffffff window]

But only a small portion is programmed into the root port aperture:

 pci 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]

The upstream port then provides the following aperture:

 pci 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]

With the missing range found to be consumed by the switch port itself:

 pci 0000:5e:00.0: BAR 0 [mem 0xbff0000000-0xbff07fffff 64bit pref]

The downstream port above the GPU provides the same aperture as upstream:

 pci 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]

Which is entirely consumed by the GPU:

 pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]

In summary, iomem reports the following:

 b000000000-bfffffffff : PCI Bus 0000:5d
   bfe0000000-bff07fffff : PCI Bus 0000:5e
     bfe0000000-bfefffffff : PCI Bus 0000:5f
       bfe0000000-bfefffffff : PCI Bus 0000:60
         bfe0000000-bfefffffff : 0000:60:00.0
     bff0000000-bff07fffff : 0000:5e:00.0

The GPU at 0000:60:00.0 supports a Resizable BAR:

 Capabilities: [420 v1] Physical Resizable BAR
   BAR 2: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB 8GB

However when attempting a resize we get -ENOSPC:

 pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
 pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
 pcieport 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
 pcieport 0000:5e:00.0: bridge window [mem size 0x200000000 64bit pref]: can't assign; no space
 pcieport 0000:5e:00.0: bridge window [mem size 0x200000000 64bit pref]: failed to assign
 pcieport 0000:5f:01.0: bridge window [mem size 0x200000000 64bit pref]: can't assign; no space
 pcieport 0000:5f:01.0: bridge window [mem size 0x200000000 64bit pref]: failed to assign
 pci 0000:60:00.0: BAR 2 [mem size 0x200000000 64bit pref]: can't assign; no space
 pci 0000:60:00.0: BAR 2 [mem size 0x200000000 64bit pref]: failed to assign
 pcieport 0000:5d:00.0: PCI bridge to [bus 5e-61]
 pcieport 0000:5d:00.0: bridge window [mem 0xb9000000-0xba0fffff]
 pcieport 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]
 pcieport 0000:5e:00.0: PCI bridge to [bus 5f-61]
 pcieport 0000:5e:00.0: bridge window [mem 0xb9000000-0xba0fffff]
 pcieport 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
 pcieport 0000:5f:01.0: PCI bridge to [bus 60]
 pcieport 0000:5f:01.0: bridge window [mem 0xb9000000-0xb9ffffff]
 pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]
 pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: assigned

In this example we need to resize all the way up to the root port
aperture, but we refuse to change the root port aperture while resources
are allocated for the upstream port BAR.

The solution proposed here builds on the idea in commit 91fa127794ac
("PCI: Expose PCIe Resizable BAR support via sysfs") where the BAR can
be resized while there is no driver attached. In this case, when there
is no driver bound to the upstream switch port we'll release resources
of the bridge which match the reallocation. Therefore we can achieve
the below successful resize operation by unbinding 0000:5e:00.0 from the
pcieport driver before invoking the resource2_resize interface on the
GPU at 0000:60:00.0.

 pci 0000:60:00.0: BAR 2 [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
 pcieport 0000:5f:01.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
 pci 0000:5e:00.0: bridge window [mem 0xbfe0000000-0xbfefffffff 64bit pref]: releasing
 pci 0000:5e:00.0: BAR 0 [mem 0xbff0000000-0xbff07fffff 64bit pref]: releasing
 pcieport 0000:5d:00.0: bridge window [mem 0xbfe0000000-0xbff07fffff 64bit pref]: releasing
 pcieport 0000:5d:00.0: bridge window [mem 0xb000000000-0xb2ffffffff 64bit pref]: assigned
 pci 0000:5e:00.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
 pci 0000:5e:00.0: BAR 0 [mem 0xb200000000-0xb2007fffff 64bit pref]: assigned
 pcieport 0000:5f:01.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
 pci 0000:60:00.0: BAR 2 [mem 0xb000000000-0xb1ffffffff 64bit pref]: assigned
 pci 0000:5e:00.0: PCI bridge to [bus 5f-61]
 pci 0000:5e:00.0: bridge window [mem 0xb9000000-0xba0fffff]
 pci 0000:5e:00.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]
 pcieport 0000:5d:00.0: PCI bridge to [bus 5e-61]
 pcieport 0000:5d:00.0: bridge window [mem 0xb9000000-0xba0fffff]
 pcieport 0000:5d:00.0: bridge window [mem 0xb000000000-0xb2ffffffff 64bit pref]
 pci 0000:5e:00.0: PCI bridge to [bus 5f-61]
 pci 0000:5e:00.0: bridge window [mem 0xb9000000-0xba0fffff]
 pci 0000:5e:00.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]
 pcieport 0000:5f:01.0: PCI bridge to [bus 60]
 pcieport 0000:5f:01.0: bridge window [mem 0xb9000000-0xb9ffffff]
 pcieport 0000:5f:01.0: bridge window [mem 0xb000000000-0xb1ffffffff 64bit pref]

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/setup-bus.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)
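The address ranges in the logs above already show why the first attempt fails and the second succeeds; a small arithmetic sketch makes the sizes explicit. It only computes range lengths from addresses printed in the commit message and does not model the kernel's allocator or its alignment rules. range_size is a hypothetical helper name.

```shell
#!/bin/sh
# range_size START END: print the byte length of an inclusive address range.
# Hypothetical helper for illustration; addresses are copied from the logs.
range_size() {
    echo $(( $2 - $1 + 1 ))
}

gpu_bar=$(range_size 0xb000000000 0xb1ffffffff)     # GPU BAR 2 after resize
switch_bar=$(range_size 0xb200000000 0xb2007fffff)  # 5e:00.0 BAR 0 after resize
old_root=$(range_size 0xbfe0000000 0xbff07fffff)    # root port window before
new_root=$(range_size 0xb000000000 0xb2ffffffff)    # root port window after

echo "GPU BAR 2:            $gpu_bar bytes"
echo "switch port BAR 0:    $switch_bar bytes"
echo "old root port window: $old_root bytes"
echo "new root port window: $new_root bytes"

# The resized GPU BAR plus the switch port BAR must fit below the root port.
needed=$(( gpu_bar + switch_bar ))
[ "$needed" -le "$old_root" ] || echo "old window too small: need $needed bytes"
[ "$needed" -le "$new_root" ] && echo "new window is large enough"
```

The old root port window is only 264MB (the 256MB GPU BAR plus the 8MB switch port BAR), so an 8GB BAR cannot fit unless the root port window itself is released and reassigned, which is what the patch permits once 5e:00.0 BAR 0 can be released too.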