mbox series

[0/3] PCI: Expose resource resizing through sysfs

Message ID 166067824399.1885802.12557332818208187324.stgit@omen (mailing list archive)
Headers show
Series PCI: Expose resource resizing through sysfs | expand

Message

Alex Williamson Aug. 16, 2022, 7:39 p.m. UTC
We have a couple graphics drivers making use of PCIe Resizable BARs
now, but I've been trying to figure out how we can make use of such
features for devices assigned to a VM.  This is a proposal for a
rather basic interface in sysfs such that we have the ability to
pre-enable larger BARs before we bind devices to vfio-pci and
attach them to a VM.

Along the way I found a double-free in the error path of creating
resource attributes, that can certainly be pulled separately (1/).

I'm using an RTX6000 for testing, which unexpectedly only supports
REBAR with smaller than default sizes, which led me to question
why we have such heavy requirements for shrinking resources (2/).

The final patch proposes the sysfs interface and I'll leave the
discussion there for whether this is a good approach.  Thanks,

Alex
---

Alex Williamson (3):
      PCI: Fix double-free in resource attribute error path
      PCI: Skip reassigning bridge resources if reducing BAR size
      PCI: Expose PCIe Resizable BAR support via sysfs


 Documentation/ABI/testing/sysfs-bus-pci |  27 +++++
 drivers/pci/pci-sysfs.c                 | 126 +++++++++++++++++++++++-
 drivers/pci/setup-res.c                 |   2 +-
 include/linux/pci.h                     |   1 +
 4 files changed, 154 insertions(+), 2 deletions(-)

Comments

Christian König Aug. 17, 2022, 10:10 a.m. UTC | #1
Am 16.08.22 um 21:39 schrieb Alex Williamson:
> We have a couple graphics drivers making use of PCIe Resizable BARs
> now, but I've been trying to figure out how we can make use of such
> features for devices assigned to a VM.  This is a proposal for a
> rather basic interface in sysfs such that we have the ability to
> pre-enable larger BARs before we bind devices to vfio-pci and
> attach them to a VM.

Ah, yes please.

I was considering doing this myself just for testing while adding the 
rebar support for the GFX drivers, but then just implementing it on the 
GFX side was simpler.

I would just add a warning that resizing BARs can easily crash the 
system even when no driver directly claimed the resource or PCIe device.

It literally took me weeks to figure out that I need to kick out the EFI 
framebuffer driver before trying to resize the BAR or otherwise I just 
get a hung system.

> Along the way I found a double-free in the error path of creating
> resource attributes, that can certainly be pulled separately (1/).
>
> I'm using an RTX6000 for testing, which unexpectedly only supports
> REBAR with smaller than default sizes, which led me to question
> why we have such heavy requirements for shrinking resources (2/).

Oh, that's easy. You got tons of ARM boards with less than 512MiB of 
address space per root PCIe complex.

If you want to get a GPU working on those you need to decrease the BAR 
size or otherwise you won't be able to fit 256MiB VRAM BAR + register 
BAR into the same hole for the PCIe root complex.

An alternative explanation is that at least AMD produced some boards 
with a messed up resize configuration word. But on those you only got 
256MiB, 512MiB and 1GiB potential BAR sizes IIRC.

Anyway, with an appropriate warning added to the sysfs documentation the 
patch #2 and #3 are Acked-by: Christian König <christian.koenig@amd.com>

Regards,
Christian.

>
> The final patch proposes the sysfs interface and I'll leave the
> discussion there for whether this is a good approach.  Thanks,
>
> Alex
> ---
>
> Alex Williamson (3):
>        PCI: Fix double-free in resource attribute error path
>        PCI: Skip reassigning bridge resources if reducing BAR size
>        PCI: Expose PCIe Resizable BAR support via sysfs
>
>
>   Documentation/ABI/testing/sysfs-bus-pci |  27 +++++
>   drivers/pci/pci-sysfs.c                 | 126 +++++++++++++++++++++++-
>   drivers/pci/setup-res.c                 |   2 +-
>   include/linux/pci.h                     |   1 +
>   4 files changed, 154 insertions(+), 2 deletions(-)
>
Alex Williamson Aug. 17, 2022, 2:02 p.m. UTC | #2
On Wed, 17 Aug 2022 12:10:44 +0200
Christian König <christian.koenig@amd.com> wrote:

> Am 16.08.22 um 21:39 schrieb Alex Williamson:
> > We have a couple graphics drivers making use of PCIe Resizable BARs
> > now, but I've been trying to figure out how we can make use of such
> > features for devices assigned to a VM.  This is a proposal for a
> > rather basic interface in sysfs such that we have the ability to
> > pre-enable larger BARs before we bind devices to vfio-pci and
> > attach them to a VM.  
> 
> Ah, yes please.
> 
> I was considering doing this myself just for testing while adding the 
> rebar support for the GFX drivers, but then just implementing it on the 
> GFX side was simpler.
> 
> I would just add a warning that resizing BARs can easily crash the 
> system even when no driver directly claimed the resource or PCIe device.
> 
> It literally took me weeks to figure out that I need to kick out the EFI 
> framebuffer driver before trying to resize the BAR or otherwise I just 
> get a hung system.

Good point, I think maybe we can avoid crashing the system though if we
use the new aperture support to remove conflicting drivers from all VGA
class devices, similar to d17378062079 ("vfio/pci: Remove console
drivers").  A note in the ABI documentation about removing console
drivers from the device when resizing resources would definitely be in
order.

> > Along the way I found a double-free in the error path of creating
> > resource attributes, that can certainly be pulled separately (1/).
> >
> > I'm using an RTX6000 for testing, which unexpectedly only supports
> > REBAR with smaller than default sizes, which led me to question
> > why we have such heavy requirements for shrinking resources (2/).  
> 
> Oh, that's easy. You got tons of ARM boards with less than 512MiB of 
> address space per root PCIe complex.
> 
> If you want to get a GPU working on those you need to decrease the
> BAR size or otherwise you won't be able to fit 256MiB VRAM BAR +
> register BAR into the same hole for the PCIe root complex.
> 
> An alternative explanation is that at least AMD produced some boards 
> with a messed up resize configuration word. But on those you only got 
> 256MiB, 512MiB and 1GiB potential BAR sizes IIRC.

An aspect of shrinking BARs that maybe I'm not giving enough
consideration to is that we might be shrinking a BAR on one device in
order to release MMIO space from a bridge aperture, that we might then
use to expand a BAR elsewhere.  The RTX6000 case only frees a rather
modest amount of MMIO space, but I could imagine more substantial
configurations.  Maybe this justifies resizing the bridge aperture even
in the shrinking case?

> Anyway, with an appropriate warning added to the sysfs documentation
> the patch #2 and #3 are Acked-by: Christian König
> <christian.koenig@amd.com>

Thanks!

Alex
Christian König Aug. 18, 2022, 11:16 a.m. UTC | #3
Am 17.08.22 um 16:02 schrieb Alex Williamson:
> On Wed, 17 Aug 2022 12:10:44 +0200
> Christian König <christian.koenig@amd.com> wrote:
>
>> Am 16.08.22 um 21:39 schrieb Alex Williamson:
>>> We have a couple graphics drivers making use of PCIe Resizable BARs
>>> now, but I've been trying to figure out how we can make use of such
>>> features for devices assigned to a VM.  This is a proposal for a
>>> rather basic interface in sysfs such that we have the ability to
>>> pre-enable larger BARs before we bind devices to vfio-pci and
>>> attach them to a VM.
>> Ah, yes please.
>>
>> I was considering doing this myself just for testing while adding the
>> rebar support for the GFX drivers, but then just implementing it on the
>> GFX side was simpler.
>>
>> I would just add a warning that resizing BARs can easily crash the
>> system even when no driver directly claimed the resource or PCIe device.
>>
>> It literally took me weeks to figure out that I need to kick out the EFI
>> framebuffer driver before trying to resize the BAR or otherwise I just
>> get a hung system.
> Good point, I think maybe we can avoid crashing the system though if we
> use the new aperture support to remove conflicting drivers from all VGA
> class devices, similar to d17378062079 ("vfio/pci: Remove console
> drivers").  A note in the ABI documentation about removing console
> drivers from the device when resizing resources would definitely be in
> order.
>
>>> Along the way I found a double-free in the error path of creating
>>> resource attributes, that can certainly be pulled separately (1/).
>>>
>>> I'm using an RTX6000 for testing, which unexpectedly only supports
>>> REBAR with smaller than default sizes, which led me to question
>>> why we have such heavy requirements for shrinking resources (2/).
>> Oh, that's easy. You got tons of ARM boards with less than 512MiB of
>> address space per root PCIe complex.
>>
>> If you want to get a GPU working on those you need to decrease the
>> BAR size or otherwise you won't be able to fit 256MiB VRAM BAR +
>> register BAR into the same hole for the PCIe root complex.
>>
>> An alternative explanation is that at least AMD produced some boards
>> with a messed up resize configuration word. But on those you only got
>> 256MiB, 512MiB and 1GiB potential BAR sizes IIRC.
> An aspect of shrinking BARs that maybe I'm not giving enough
> consideration to is that we might be shrinking a BAR on one device in
> order to release MMIO space from a bridge aperture, that we might then
> use to expand a BAR elsewhere.  The RTX6000 case only frees a rather
> modest amount of MMIO space, but I could imagine more substantial
> configurations.  Maybe this justifies resizing the bridge aperture even
> in the shrinking case?

That was the original idea why I've kept that in there, yes.

But I never really seen a case where that really mattered.

So far making BARs smaller was something only the BIOS does.

Christian.

>
>> Anyway, with an appropriate warning added to the sysfs documentation
>> the patch #2 and #3 are Acked-by: Christian König
>> <christian.koenig@amd.com>
> Thanks!
>
> Alex
>