
[v8,00/24] PCI: Allow BAR movement during boot and hotplug

Message ID 20200427182358.2067702-1-s.miroshnichenko@yadro.com (mailing list archive)

Message

Sergei Miroshnichenko April 27, 2020, 6:23 p.m. UTC
Currently PCI hotplug works on top of resources which are usually reserved
not by the kernel, but by BIOS, bootloader, firmware, etc. These resources
are gaps in the address space where BARs of new devices may fit, and extra
bus numbers per port, so bridges can be hot-added. This series addresses the
BAR problem: it teaches the kernel to redistribute BARs at run time, so
hotplug becomes predictable and cross-platform. A follow-up patchset will
propose a solution for bus numbers.

To arrange a space for BARs of new hotplugged devices, the kernel can pause
the drivers of working PCI devices and reshuffle the assigned BARs. When a
driver is un-paused by the kernel, it should ioremap() the new addresses of
its BARs.

Drivers indicate their support of the feature by implementing the new hooks
.rescan_prepare() and .rescan_done() in the struct pci_driver. If a driver
doesn't yet support the feature, BARs of its devices will be considered as
immovable and handled in the same way as resources with the PCI_FIXED flag:
they are guaranteed to remain untouched.
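
For illustration only, here is a minimal userspace sketch of the contract
described above. It is not code from this series: the mock_* types and
demo_* functions are hypothetical stand-ins for struct pci_dev, struct
pci_driver and a real driver, and the "mapping" is just a stored address
rather than a real ioremap().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
struct mock_pci_dev {
	unsigned long bar_start;	/* current bus address of BAR 0 */
	unsigned long mapped_at;	/* address the driver last mapped */
	bool paused;			/* true between the two hooks */
};

struct mock_pci_driver {
	void (*rescan_prepare)(struct mock_pci_dev *dev);	/* optional */
	void (*rescan_done)(struct mock_pci_dev *dev);		/* optional */
};

/* Assumption: a driver supports movable BARs only if it has both hooks. */
static bool mock_bars_movable(const struct mock_pci_driver *drv)
{
	return drv && drv->rescan_prepare && drv->rescan_done;
}

static void demo_rescan_prepare(struct mock_pci_dev *dev)
{
	dev->paused = true;	/* stop touching the BAR */
	dev->mapped_at = 0;	/* a real driver would iounmap() here */
}

static void demo_rescan_done(struct mock_pci_dev *dev)
{
	dev->mapped_at = dev->bar_start;	/* stands in for ioremap() */
	dev->paused = false;
}

/* Move BAR 0 to new_start if the driver allows it; true if it moved. */
static bool mock_move_bar(struct mock_pci_dev *dev,
			  const struct mock_pci_driver *drv,
			  unsigned long new_start)
{
	assert(dev != NULL);
	if (!mock_bars_movable(drv))
		return false;	/* treated as fixed: left untouched */
	drv->rescan_prepare(dev);
	dev->bar_start = new_start;
	drv->rescan_done(dev);
	return true;
}
```

In this model a driver without the hooks keeps its BAR where it was, while
a driver with both hooks ends up mapped at the new address once it is
un-paused.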

Tested on a number of x86_64 machines without any special kernel command
line arguments:
 - PC: i7-5930K + ASUS X99-A;
 - PC: i5-8500 + ASUS Z370-F;
 - Supermicro Super Server/H11SSL-i: AMD EPYC 7251;
 - HP ProLiant DL380 G5: Xeon X5460;
 - Dell Inspiron N5010: i5 M 480;
 - Dell Precision M6600: i7-2920XM.

Also tested on Power8 (Vesnin) and Power9 (Nicole) ppc64le machines, but
with an extra patchset; its next version will be sent upstream a bit later.

The first two patches of this series are independent bugfixes; neither is
directly related to the movable BARs feature, but without them the rest of
this series will not work as expected.

Patches 03-15 implement the essentials of the feature.

Patches 16-21 are performance improvements for movable BARs and pciehp.

Patch 22 enables the feature by default.

Patches 23-24 add movable BARs support to nvme and portdrv.

This patchset is part of our work on adding support for hotplugging
chains of chassis full of other bridges, NVME drives, SAS HBAs, GPUs, etc.
without special requirements such as a Hot-Plug Controller or reservation
of bus numbers or memory regions by firmware.

Added Stefan Roese and Andy Lavr to CC, thank you for trying this on your
hardware!

Added Christian König and Ard Biesheuvel to CC, because of the recent
"PCI: allow pci_resize_resource() to be used on devices on the root bus"
thread, which covers a similar problem.

Changes since v7:
 - Added some documentation;
 - Replaced every occurrence of the word "immovable" with "fixed";
 - Don't touch PNP, ACPI resources anymore;
 - Replaced double rescan with triple rescan:
   * first try every BAR;
   * if that failed, retry without BARs which weren't assigned before;
   * if that failed, retry without BARs of hotplugged devices;
 - Reassign BARs during boot only if the BIOS did not assign all requested
   BARs;
 - Fixed up PCIBIOS_MIN_MEM instead of ignoring it;
 - Now the feature auto-disables in presence of a transparent bridge;
 - Improved support of runtime PM;
 - Fixed issues with incorrectly released bridge windows;
 - Fixed calculating bridge window size.
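
As a rough illustration of the triple-rescan fallback listed above (not
the code from this series), the order of attempts could be sketched in
userspace like this. The bar_req type, the toy "sum of sizes fits in the
window" allocator and the function names are all hypothetical, and the
sketch assumes the third pass also keeps excluding the BARs dropped in the
second pass, which the changelog does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a BAR for this sketch. */
enum bar_kind { BAR_ASSIGNED_BEFORE, BAR_UNASSIGNED_BEFORE, BAR_HOTPLUGGED };

struct bar_req {
	unsigned long size;
	enum bar_kind kind;
};

/*
 * Toy allocator: a pass succeeds if the BARs it includes fit into the
 * window. Real assignment also has to honour alignment and fixed BARs.
 */
static int fits(const struct bar_req *bars, size_t n, unsigned long window,
		int skip_unassigned, int skip_hotplugged)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (skip_unassigned && bars[i].kind == BAR_UNASSIGNED_BEFORE)
			continue;
		if (skip_hotplugged && bars[i].kind == BAR_HOTPLUGGED)
			continue;
		total += bars[i].size;
	}
	return total <= window;
}

/* Returns the pass (1..3) that succeeded, or 0 if all three failed. */
static int rescan_with_fallback(const struct bar_req *bars, size_t n,
				unsigned long window)
{
	assert(bars != NULL || n == 0);
	if (fits(bars, n, window, 0, 0))
		return 1;	/* first try every BAR */
	if (fits(bars, n, window, 1, 0))
		return 2;	/* retry without previously unassigned BARs */
	if (fits(bars, n, window, 1, 1))
		return 3;	/* retry without BARs of hotplugged devices */
	return 0;
}
```

The point of the ordering is that working devices are the last to lose
anything: each later pass gives up on a less critical group of BARs.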
 
Changes since v6:
 - Added a fix for hotplug on AMD Epyc + Supermicro H11SSL-i by ignoring
   PCIBIOS_MIN_MEM;
 - Fixed a workaround which marks VGA BARs as immovable;
 - Fixed misleading "can't claim BAR ... no compatible bridge window" error
   messages;
 - Refactored the code, reduced the number of patches;
 - Excluded PowerPC-specific arch patches, they will be sent separately;
 - Disabled for PowerNV by default - waiting for the PCIPOCALYPSE patchset;
 - Fixed reports from the kbuild test robot.

Changes since v5:
 - Simplified the disable flag, now it is "pci=no_movable_buses";
 - More deliberate marking of the BARs as immovable;
 - Mark as immovable BARs which are used by unbound drivers;
 - Ignoring BAR assignment by non-kernel program components, so the kernel
   is able now to distribute BARs in optimal and predictable way;
 - Move here PowerNV-specific patches from the older "powerpc/powernv/pci:
   Make hotplug self-sufficient, independent of FW and DT" series;
 - Fix EEH cache rebuilding and PE allocation for PowerNV during rescan.

Changes since v4:
 - Feature is enabled by default (turned on by one of the latest patches);
 - Add pci_dev_movable_bars_supported(dev) instead of marking the immovable
   BARs with the IORESOURCE_PCI_FIXED flag;
 - Set up PCIe bridges during rescan via sysfs, so MPS settings are now
   configured not only during system boot or pcihp events;
 - Allow movement of switch's BARs if claimed by portdrv;
 - Update EEH address caches after rescan for powerpc;
 - Don't completely disable hot-added devices whose BARs can't be fit -
   just disable their BARs, so they are still visible in lspci etc;
 - Clearer names: fixed_range_hard -> immovable_range, fixed_range_soft ->
   realloc_range;
 - Drop the patch for pci_restore_config_space() - fixed by properly using
   the runtime PM.

Changes since v3:
 - Rebased to the upstream, so the patches apply cleanly again.

Changes since v2:
 - Fixed double-assignment of bridge windows;
 - Fixed assignment of fixed prefetched resources;
 - Fixed releasing of fixed resources;
 - Fixed a debug message;
 - Removed auto-enabling the movable BARs for x86 - let's rely on the
   "pcie_movable_bars=force" option for now;
 - Reordered the patches - bugfixes first.

Changes since v1:
 - Add a "pcie_movable_bars={ off | force }" command line argument;
 - Handle the IORESOURCE_PCI_FIXED flag properly;
 - Don't move BARs of devices which don't support the feature;
 - Guarantee that new hotplugged devices will not steal memory from working
   devices by ignoring the failing new devices with the new PCI_DEV_IGNORE
   flag;
 - Add rescan_prepare()+rescan_done() to the struct pci_driver instead of
   using the reset_prepare()+reset_done() from struct pci_error_handlers;
 - Add a bugfix of a race condition;
 - Fixed hotplug in a non-pre-enabled (by BIOS/firmware) bridge;
 - Fix the compatibility of the feature with pm_runtime and D3-state;
 - Hotplug events from pciehp also can move BARs;
 - Add support of the feature to the NVME driver.

Sergei Miroshnichenko (24):
  PCI: Fix race condition in pci_enable/disable_device()
  PCI: Ensure a bridge has I/O and MEM access for hot-added devices
  PCI: hotplug: Initial support of the movable BARs feature
  PCI: Add version of release_child_resources() aware of fixed BARs
  PCI: hotplug: Fix reassigning the released BARs
  PCI: hotplug: Recalculate every bridge window during rescan
  PCI: hotplug: Don't allow hot-added devices to steal resources
  PCI: Reassign BARs if BIOS/bootloader had assigned not all of them
  PCI: hotplug: Try to reassign movable BARs only once
  PCI: hotplug: Calculate fixed parts of bridge windows
  PCI: Include fixed BARs into the bus size calculating
  PCI: hotplug: movable BARs: Compute limits for relocated bridge
    windows
  PCI: Make sure bridge windows include their fixed BARs
  PCI: hotplug: Add support of fixed BARs to pci_assign_resource()
  PCI: hotplug: Sort fixed BARs before assignment
  x86/PCI/ACPI: Fix up PCIBIOS_MIN_MEM if value computed from e820 is
    invalid
  PCI: hotplug: Configure MPS after manual bus rescan
  PCI: hotplug: Don't disable the released bridge windows immediately
  PCI: pciehp: Trigger a domain rescan on hp events when enabled movable
    BARs
  PCI: Don't claim fixed BARs
  PCI: hotplug: Don't reserve bus space when enabled movable BARs
  PCI: hotplug: Enable the movable BARs feature by default
  PCI/portdrv: Declare support of movable BARs
  nvme-pci: Handle movable BARs

 Documentation/PCI/pci.rst                     |  55 +++
 .../admin-guide/kernel-parameters.txt         |   1 +
 arch/powerpc/platforms/powernv/pci.c          |   2 +
 arch/powerpc/platforms/pseries/setup.c        |   2 +
 arch/x86/pci/acpi.c                           |  15 +
 drivers/nvme/host/pci.c                       |  21 +-
 drivers/pci/bus.c                             |   2 +-
 drivers/pci/hotplug/pciehp_pci.c              |   5 +
 drivers/pci/iov.c                             |   2 +
 drivers/pci/pci.c                             |  33 +-
 drivers/pci/pci.h                             |  33 ++
 drivers/pci/pcie/portdrv_pci.c                |  11 +
 drivers/pci/probe.c                           | 399 +++++++++++++++++-
 drivers/pci/setup-bus.c                       | 301 ++++++++++---
 drivers/pci/setup-res.c                       |  75 +++-
 include/linux/pci.h                           |  20 +
 16 files changed, 905 insertions(+), 72 deletions(-)


base-commit: 6a8b55ed4056ea5559ebe4f6a4b247f627870d4c

Comments

Christian König April 28, 2020, 12:59 p.m. UTC | #1
Well that is a really nice surprise. Just FYI the situation with GPUs is 
essentially this:

a) The BAR to access video memory with the CPU is by default only 256MB 
in size for backward compatibility with 32bit Windows 7 and older.

b) Modern GPUs easily have 16GB of video memory, but most of that used
to be accessed only rarely by the CPU. Unfortunately this has changed
recently with more modern graphics APIs in userspace (Vulkan).

c) Both NVidia as well as AMD used to have a mechanism to map different 
stuff into the 256MB window, but AMD dropped this ability quite some 
time ago because it was rather inefficient.

d) Instead for hard of the last 5 years AMD implements the PCI standard 
for dynamic BAR resizing. So what we do is to extend the 256MB BAR into 
16GB (or whatever is needed) once the OS is started and the driver loads.

The problem with this approach is that sometimes bridges can't be 
reconfigured and BARs resized because we have other BARs currently in 
use under the same bridge.

So long story short you have fixed my BAR resizing problem with this 
patchset as well :D

Am 27.04.20 um 20:23 schrieb Sergei Miroshnichenko:
> Currently PCI hotplug works on top of resources which are usually reserved
> not by the kernel, but by BIOS, bootloader, firmware, etc. These resources
> are gaps in the address space where BARs of new devices may fit, and extra
> bus numbers per port, so bridges can be hot-added. This series addresses the
> BAR problem: it teaches the kernel to redistribute BARs at run time, so
> hotplug becomes predictable and cross-platform. A follow-up patchset will
> propose a solution for bus numbers.
>
> To arrange a space for BARs of new hotplugged devices, the kernel can pause
> the drivers of working PCI devices and reshuffle the assigned BARs. When a
> driver is un-paused by the kernel, it should ioremap() the new addresses of
> its BARs.
>
> Drivers indicate their support of the feature by implementing the new hooks
> .rescan_prepare() and .rescan_done() in the struct pci_driver. If a driver
> doesn't yet support the feature, BARs of its devices will be considered as
> immovable and handled in the same way as resources with the PCI_FIXED flag:
> they are guaranteed to remain untouched.

Could we let rescan_prepare() optionally return an error and then 
consider the BARs in question not movable for the current rescan? 
Alternatively would it be allowed in the implementation of the 
rescan_prepare() callback to update the PCI_FIXED flags on the BARs?

Problem is that we can't know beforehand if a BAR is currently in use or 
not or if we can block the uses until the rescan is completed.

In addition to that, I'm not an expert on the PCI code outside of the
stuff that I wrote/touched. I'm still trying to go over the set in the next
couple of days, but don't expect more than an Acked-by from me.

Cheers,
Christian.

Sergei Miroshnichenko May 4, 2020, 9:30 a.m. UTC | #2
Hello Christian,

On Tue, 2020-04-28 at 14:59 +0200, Christian König wrote:
> Well that is a really nice surprise. Just FYI the situation with GPUs
> is 
> essentially this:
> 
> a) The BAR to access video memory with the CPU is by default only
> 256MB 
> in size for backward compatibility with 32bit Windows 7 and older.
> 
> b) Modern GPUs easily have 16GB of video memory, but most of that
> used 
> to be accessed only rarely by the CPU. Unfortunately this has
> changed 
> recently by getting more modern graphics APIs in userspace (Vulkan).
> 
> c) Both NVidia as well as AMD used to have a mechanism to map
> different 
> stuff into the 256MB window, but AMD dropped this ability quite some 
> time ago because it was rather inefficient.
> 
> d) Instead for hard of the last 5 years AMD implements the PCI
> standard 
> for dynamic BAR resizing. So what we do is to extend the 256MB BAR
> into 
> 16GB (or whatever is needed) once the OS is started and the driver
> loads.
> 
> The problem with this approach is that sometimes bridges can't be 
> reconfigured and BARs resized because we have other BARs currently
> in 
> use under the same bridge.
> 
> So long story short you have fixed my BAR resizing problem with this 
> patchset as well :D
> 

Thanks for introducing me to this problem. It is not yet covered by
this code, so I'll modify pci_resize_resource(): let it try as it does
now, and if that doesn't work, try pci_rescan_bus(), which moves BARs.
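
A minimal userspace sketch of that "try as now, then rescan and retry"
order is shown below, for illustration only. It is not kernel code: the
mock_bridge type, its fields and the mock_* functions are hypothetical,
sizes are in MB, and the rescan is modelled simply as freeing some extra
space under the bridge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the space available to a BAR under one bridge. */
struct mock_bridge {
	unsigned long window_size;	/* space usable for the BAR right now */
	unsigned long movable_slack;	/* space a BAR-moving rescan could free */
};

/* Stands in for the current behaviour: resize only if it already fits. */
static bool mock_resize_in_place(const struct mock_bridge *br,
				 unsigned long new_size)
{
	return new_size <= br->window_size;
}

/* Stands in for a rescan that moves neighbouring BARs out of the way. */
static void mock_rescan_moving_bars(struct mock_bridge *br)
{
	br->window_size += br->movable_slack;
	br->movable_slack = 0;
}

/* Try the existing path first; fall back to a rescan and retry once. */
static bool mock_resize_bar(struct mock_bridge *br, unsigned long new_size)
{
	assert(br != NULL);
	if (mock_resize_in_place(br, new_size))
		return true;
	mock_rescan_moving_bars(br);
	return mock_resize_in_place(br, new_size);
}
```

The rescan is only paid for when the cheap path fails, so systems where
the resize already works would see no change in behaviour.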

To test this, do I need to trigger a BAR resizing manually, or will
drm/amdgpu do it automatically during init?

May these resized BARs change their start addresses during init?

> Am 27.04.20 um 20:23 schrieb Sergei Miroshnichenko:
> > ...
> > 
> > Drivers indicate their support of the feature by implementing the
> > new hooks
> > .rescan_prepare() and .rescan_done() in the struct pci_driver. If a
> > driver
> > doesn't yet support the feature, BARs of its devices will be
> > considered as
> > immovable and handled in the same way as resources with the
> > PCI_FIXED flag:
> > they are guaranteed to remain untouched.
> 
> Could we let rescan_prepare() optionally return an error and then 
> consider the BARs in question not movable for the current rescan? 
> Alternatively would it be allowed in the implementation of the 
> rescan_prepare() callback to update the PCI_FIXED flags on the BARs?
> 
> Problem is that we can't know beforehand if a BAR is currently in use
> or 
> not or if we can block the uses until the rescan is completed.

I guess one more optional hook may be added to the pci_driver:

  bool (*bar_fixed)(struct pci_dev *dev, struct resource *res);

So a driver can mark some BARs as fixed and others as movable at
runtime, depending on current conditions.

If the rescan_prepare() and rescan_done() hooks are set, but bar_fixed()
isn't, consider every BAR as movable. If bar_fixed() is set and returns
false for a BAR, the driver must not use that BAR between
rescan_prepare() and rescan_done().
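
The rules above could be modelled in userspace roughly as follows. This
is only a sketch of the proposal, not an implementation: the mock_* types
and the demo policy are hypothetical, and it covers only devices that
have a driver bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model; mock_* types are hypothetical stand-ins. */
struct mock_resource { unsigned long start, end; };
struct mock_pci_dev;

struct mock_pci_driver {
	void (*rescan_prepare)(struct mock_pci_dev *dev);
	void (*rescan_done)(struct mock_pci_dev *dev);
	/* Optional hook as proposed above. */
	bool (*bar_fixed)(struct mock_pci_dev *dev, struct mock_resource *res);
};

struct mock_pci_dev {
	const struct mock_pci_driver *driver;
	struct mock_resource bar[2];
};

/* Decide whether one BAR must stay in place during a rescan. */
static bool mock_bar_is_fixed(struct mock_pci_dev *dev,
			      struct mock_resource *res)
{
	const struct mock_pci_driver *drv;

	assert(dev != NULL && dev->driver != NULL && res != NULL);
	drv = dev->driver;
	if (!drv->rescan_prepare || !drv->rescan_done)
		return true;	/* no movable BARs support: all fixed */
	if (!drv->bar_fixed)
		return false;	/* hooks set, no bar_fixed(): all movable */
	return drv->bar_fixed(dev, res);	/* driver decides per BAR */
}

static void noop_hook(struct mock_pci_dev *dev)
{
	(void)dev;
}

/* Example policy: keep BAR 0 in place, allow BAR 1 to move. */
static bool demo_bar_fixed(struct mock_pci_dev *dev,
			   struct mock_resource *res)
{
	return res == &dev->bar[0];
}
```

Because bar_fixed() is queried per resource, the answer can differ
between BARs of one device and between rescans.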

Best regards,
Serge
Bjorn Helgaas Aug. 10, 2020, 10:21 p.m. UTC | #3
On Mon, Apr 27, 2020 at 09:23:34PM +0300, Sergei Miroshnichenko wrote:
> Currently PCI hotplug works on top of resources which are usually reserved
> not by the kernel, but by BIOS, bootloader, firmware, etc. These resources
> are gaps in the address space where BARs of new devices may fit, and extra
> bus numbers per port, so bridges can be hot-added. This series addresses the
> BAR problem: it teaches the kernel to redistribute BARs at run time, so
> hotplug becomes predictable and cross-platform. A follow-up patchset will
> propose a solution for bus numbers.
> 
> To arrange a space for BARs of new hotplugged devices, the kernel can pause
> the drivers of working PCI devices and reshuffle the assigned BARs. When a
> driver is un-paused by the kernel, it should ioremap() the new addresses of
> its BARs.
> 
> Drivers indicate their support of the feature by implementing the new hooks
> .rescan_prepare() and .rescan_done() in the struct pci_driver. If a driver
> doesn't yet support the feature, BARs of its devices will be considered as
> immovable and handled in the same way as resources with the PCI_FIXED flag:
> they are guaranteed to remain untouched.
> 
> Tested on a number of x86_64 machines without any special kernel command
> line arguments:
>  - PC: i7-5930K + ASUS X99-A;
>  - PC: i5-8500 + ASUS Z370-F;
>  - Supermicro Super Server/H11SSL-i: AMD EPYC 7251;
>  - HP ProLiant DL380 G5: Xeon X5460;
>  - Dell Inspiron N5010: i5 M 480;
>  - Dell Precision M6600: i7-2920XM.
> ...

There's a lot of good work here, and I apologize that we haven't made
much progress on merging it.  I suspect this will become more and more
important with Thunderbolt.

It does touch a lot of the ugliest and least maintainable code under
drivers/pci, which is *good* if we can clean it up a little bit in the
process, but it is also risky.

I expect that a few problems are inevitable because of BIOS issues,
driver issues, and devices that can't tolerate their BARs being moved.
We've tripped over a few of those devices in the past.

Those can be really hard to debug and fix since we won't have the
hardware in question.  To make them tractable, I think we will really
need some way to test at least the resource assignment pieces of this
"in vitro" without needing the actual hardware.  E.g., maybe we could
add enough diagnostics so that a dmesg log would contain all the
information needed to reproduce a PCI hierarchy, the initial resource
assignments, and subsequent hotplug events in some sort of test
fixture, maybe a qemu boot or similar.

Bjorn