diff mbox series

vpci: Add resizable bar support

Message ID 20241113080027.244240-1-Jiqian.Chen@amd.com (mailing list archive)
State New
Headers show
Series vpci: Add resizable bar support | expand

Commit Message

Chen, Jiqian Nov. 13, 2024, 8 a.m. UTC
Some devices, like discrete GPU of amd, support resizable bar capability,
but vpci of Xen doesn't support this feature, so they fail to resize bars
and then cause probing failure.

According to PCIe spec, each bar that support resizing has two registers,
PCI_REBAR_CAP and PCI_REBAR_CTRL, so add these two registers and their
corresponding handler into vpci.

PCI_REBAR_CAP is RO, only provide reading.

PCI_REBAR_CTRL only has bar size is RW, so add write function to support
setting the new size.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
 xen/drivers/vpci/Makefile  |  2 +-
 xen/drivers/vpci/rebar.c   | 89 ++++++++++++++++++++++++++++++++++++++
 xen/include/xen/pci_regs.h | 11 +++++
 3 files changed, 101 insertions(+), 1 deletion(-)
 create mode 100644 xen/drivers/vpci/rebar.c

Comments

Roger Pau Monne Nov. 13, 2024, 9:30 a.m. UTC | #1
On Wed, Nov 13, 2024 at 04:00:27PM +0800, Jiqian Chen wrote:
> Some devices, like discrete GPU of amd, support resizable bar capability,
> but vpci of Xen doesn't support this feature, so they fail to resize bars
> and then cause probing failure.
> 
> According to PCIe spec, each bar that support resizing has two registers,
> PCI_REBAR_CAP and PCI_REBAR_CTRL, so add these two registers and their
> corresponding handler into vpci.
> 
> PCI_REBAR_CAP is RO, only provide reading.
> 
> PCI_REBAR_CTRL only has bar size is RW, so add write function to support
> setting the new size.

I think the logic to handle resizable BAR could be much simpler.  Some
time ago I've made a patch to add support for it, but due to lack of
hardware on my side to test it I've never submitted it.

My approach would be to detect the presence of the
PCI_EXT_CAP_ID_REBAR capability in init_header(), and if the
capability is present force the sizing of BARs each time they are
mapped in modify_bars().  I don't think we need to trap accesses to
the capability itself, as resizing can only happen when memory
decoding is not enabled for the device.  It's enough to fetch the size
of the BARs ahead of each enabling of memory decoding.

Note that memory decoding implies mapping the BARs into the p2m, which
is already an expensive operation, the extra sizing is unlikely to
make much of a difference performance wise.

I've found the following on my git tree and rebased on top of staging:

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index ef6c965c081c..045aa4bdadc8 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -356,6 +356,30 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
 
         ASSERT(rangeset_is_empty(bar->mem));
 
+        if ( bar->type != VPCI_BAR_ROM && header->bars_resizable &&
+             (cmd & PCI_COMMAND_MEMORY) )
+        {
+            uint64_t addr, size;
+
+            pci_size_mem_bar(pdev->sbdf, PCI_BASE_ADDRESS_0 + i * 4,
+                             &addr, &size, 0);
+
+            if ( bar->addr != addr )
+                printk(XENLOG_G_ERR
+                       "%pp: BAR#%u address mismatch %#lx vs %#lx\n",
+                       &pdev->sbdf, i, bar->addr, addr);
+
+            if ( bar->size != size )
+            {
+                printk(XENLOG_G_DEBUG
+                       "%pp: detected BAR#%u size change (%#lx -> %#lx)\n",
+                       &pdev->sbdf, i, bar->size, size);
+                bar->size = size;
+                end = PFN_DOWN(bar->addr + size - 1);
+                end_guest = PFN_DOWN(bar->guest_addr + size - 1);
+            }
+        }
+
         /*
          * Make sure that the guest set address has the same page offset
          * as the physical address on the host or otherwise things won't work as
@@ -870,6 +894,9 @@ static int cf_check init_header(struct pci_dev *pdev)
     if ( pdev->ignore_bars )
         return 0;
 
+    header->bars_resizable = pci_find_ext_capability(pdev->sbdf,
+                                                     PCI_EXT_CAP_ID_REBAR);
+
     cmd = pci_conf_read16(pdev->sbdf, PCI_COMMAND);
 
     /*
diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
index 250ba106dbd3..c543a2b86778 100644
--- a/xen/include/xen/pci_regs.h
+++ b/xen/include/xen/pci_regs.h
@@ -459,6 +459,7 @@
 #define PCI_EXT_CAP_ID_ARI	14
 #define PCI_EXT_CAP_ID_ATS	15
 #define PCI_EXT_CAP_ID_SRIOV	16
+#define PCI_EXT_CAP_ID_REBAR	21
 
 /* Advanced Error Reporting */
 #define PCI_ERR_UNCOR_STATUS	4	/* Uncorrectable Error Status */
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 41e7c3bc2791..45ebc1bb3356 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -129,6 +129,8 @@ struct vpci {
          * upon to know whether BARs are mapped into the guest p2m.
          */
         bool bars_mapped      : 1;
+        /* Device has the Resizable BARs capability. */
+        bool bars_resizable   : 1;
         /* FIXME: currently there's no support for SR-IOV. */
     } header;
Chen, Jiqian Nov. 13, 2024, 10 a.m. UTC | #2
On 2024/11/13 17:30, Roger Pau Monné wrote:
> On Wed, Nov 13, 2024 at 04:00:27PM +0800, Jiqian Chen wrote:
>> Some devices, like discrete GPU of amd, support resizable bar capability,
>> but vpci of Xen doesn't support this feature, so they fail to resize bars
>> and then cause probing failure.
>>
>> According to PCIe spec, each bar that support resizing has two registers,
>> PCI_REBAR_CAP and PCI_REBAR_CTRL, so add these two registers and their
>> corresponding handler into vpci.
>>
>> PCI_REBAR_CAP is RO, only provide reading.
>>
>> PCI_REBAR_CTRL only has bar size is RW, so add write function to support
>> setting the new size.
> 
> I think the logic to handle resizable BAR could be much simpler.  Some
> time ago I've made a patch to add support for it, but due to lack of
> hardware on my side to test it I've never submitted it.
> 
> My approach would be to detect the presence of the
> PCI_EXT_CAP_ID_REBAR capability in init_header(), and if the
> capability is present force the sizing of BARs each time they are
> mapped in modify_bars().  I don't think we need to trap accesses to
> the capability itself, as resizing can only happen when memory
> decoding is not enabled for the device.  It's enough to fetch the size
> of the BARs ahead of each enabling of memory decoding.
> 
> Note that memory decoding implies mapping the BARs into the p2m, which
> is already an expensive operation, the extra sizing is unlikely to
> make much of a difference performance wise.
> 
> I've found the following on my git tree and rebased on top of staging:
OK.
Do you need me to validate your patch in my environment?
And I have one question: where does your patch do writing the resizing size into hardware?

> 
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index ef6c965c081c..045aa4bdadc8 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -356,6 +356,30 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
>  
>          ASSERT(rangeset_is_empty(bar->mem));
>  
> +        if ( bar->type != VPCI_BAR_ROM && header->bars_resizable &&
> +             (cmd & PCI_COMMAND_MEMORY) )
> +        {
> +            uint64_t addr, size;
> +
> +            pci_size_mem_bar(pdev->sbdf, PCI_BASE_ADDRESS_0 + i * 4,
> +                             &addr, &size, 0);
> +
> +            if ( bar->addr != addr )
> +                printk(XENLOG_G_ERR
> +                       "%pp: BAR#%u address mismatch %#lx vs %#lx\n",
> +                       &pdev->sbdf, i, bar->addr, addr);
> +
> +            if ( bar->size != size )
> +            {
> +                printk(XENLOG_G_DEBUG
> +                       "%pp: detected BAR#%u size change (%#lx -> %#lx)\n",
> +                       &pdev->sbdf, i, bar->size, size);
> +                bar->size = size;
> +                end = PFN_DOWN(bar->addr + size - 1);
> +                end_guest = PFN_DOWN(bar->guest_addr + size - 1);
> +            }
> +        }
> +
>          /*
>           * Make sure that the guest set address has the same page offset
>           * as the physical address on the host or otherwise things won't work as
> @@ -870,6 +894,9 @@ static int cf_check init_header(struct pci_dev *pdev)
>      if ( pdev->ignore_bars )
>          return 0;
>  
> +    header->bars_resizable = pci_find_ext_capability(pdev->sbdf,
> +                                                     PCI_EXT_CAP_ID_REBAR);
> +
>      cmd = pci_conf_read16(pdev->sbdf, PCI_COMMAND);
>  
>      /*
> diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
> index 250ba106dbd3..c543a2b86778 100644
> --- a/xen/include/xen/pci_regs.h
> +++ b/xen/include/xen/pci_regs.h
> @@ -459,6 +459,7 @@
>  #define PCI_EXT_CAP_ID_ARI	14
>  #define PCI_EXT_CAP_ID_ATS	15
>  #define PCI_EXT_CAP_ID_SRIOV	16
> +#define PCI_EXT_CAP_ID_REBAR	21
>  
>  /* Advanced Error Reporting */
>  #define PCI_ERR_UNCOR_STATUS	4	/* Uncorrectable Error Status */
> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> index 41e7c3bc2791..45ebc1bb3356 100644
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -129,6 +129,8 @@ struct vpci {
>           * upon to know whether BARs are mapped into the guest p2m.
>           */
>          bool bars_mapped      : 1;
> +        /* Device has the Resizable BARs capability. */
> +        bool bars_resizable   : 1;
>          /* FIXME: currently there's no support for SR-IOV. */
>      } header;
>
Roger Pau Monne Nov. 13, 2024, 10:30 a.m. UTC | #3
On Wed, Nov 13, 2024 at 10:00:33AM +0000, Chen, Jiqian wrote:
> On 2024/11/13 17:30, Roger Pau Monné wrote:
> > On Wed, Nov 13, 2024 at 04:00:27PM +0800, Jiqian Chen wrote:
> >> Some devices, like discrete GPU of amd, support resizable bar capability,
> >> but vpci of Xen doesn't support this feature, so they fail to resize bars
> >> and then cause probing failure.
> >>
> >> According to PCIe spec, each bar that support resizing has two registers,
> >> PCI_REBAR_CAP and PCI_REBAR_CTRL, so add these two registers and their
> >> corresponding handler into vpci.
> >>
> >> PCI_REBAR_CAP is RO, only provide reading.
> >>
> >> PCI_REBAR_CTRL only has bar size is RW, so add write function to support
> >> setting the new size.
> > 
> > I think the logic to handle resizable BAR could be much simpler.  Some
> > time ago I've made a patch to add support for it, but due to lack of
> > hardware on my side to test it I've never submitted it.
> > 
> > My approach would be to detect the presence of the
> > PCI_EXT_CAP_ID_REBAR capability in init_header(), and if the
> > capability is present force the sizing of BARs each time they are
> > mapped in modify_bars().  I don't think we need to trap accesses to
> > the capability itself, as resizing can only happen when memory
> > decoding is not enabled for the device.  It's enough to fetch the size
> > of the BARs ahead of each enabling of memory decoding.
> > 
> > Note that memory decoding implies mapping the BARs into the p2m, which
> > is already an expensive operation, the extra sizing is unlikely to
> > make much of a difference performance wise.
> > 
> > I've found the following on my git tree and rebased on top of staging:
> OK.
> Do you need me to validate your patch in my environment?

Yes please, I have no way to test it.  Let's see what others think
about the different approaches.

> And I have one question: where does your patch do writing the resizing size into hardware?

dom0 has unrestricted access to the resize capability, so the value
written by dom0 is propagated to the hardware without modification.

I would be wary of exposing the resize capability to untrusted
domains, as allowing a domU to change the size of BARs can lead to
overlapping if the hardware domain hasn't accounted for the increase
in BAR size.

Thanks, Roger.
Jan Beulich Nov. 13, 2024, 10:36 a.m. UTC | #4
On 13.11.2024 11:30, Roger Pau Monné wrote:
> On Wed, Nov 13, 2024 at 10:00:33AM +0000, Chen, Jiqian wrote:
>> On 2024/11/13 17:30, Roger Pau Monné wrote:
>>> On Wed, Nov 13, 2024 at 04:00:27PM +0800, Jiqian Chen wrote:
>>>> Some devices, like discrete GPU of amd, support resizable bar capability,
>>>> but vpci of Xen doesn't support this feature, so they fail to resize bars
>>>> and then cause probing failure.
>>>>
>>>> According to PCIe spec, each bar that support resizing has two registers,
>>>> PCI_REBAR_CAP and PCI_REBAR_CTRL, so add these two registers and their
>>>> corresponding handler into vpci.
>>>>
>>>> PCI_REBAR_CAP is RO, only provide reading.
>>>>
>>>> PCI_REBAR_CTRL only has bar size is RW, so add write function to support
>>>> setting the new size.
>>>
>>> I think the logic to handle resizable BAR could be much simpler.  Some
>>> time ago I've made a patch to add support for it, but due to lack of
>>> hardware on my side to test it I've never submitted it.
>>>
>>> My approach would be to detect the presence of the
>>> PCI_EXT_CAP_ID_REBAR capability in init_header(), and if the
>>> capability is present force the sizing of BARs each time they are
>>> mapped in modify_bars().  I don't think we need to trap accesses to
>>> the capability itself, as resizing can only happen when memory
>>> decoding is not enabled for the device.  It's enough to fetch the size
>>> of the BARs ahead of each enabling of memory decoding.
>>>
>>> Note that memory decoding implies mapping the BARs into the p2m, which
>>> is already an expensive operation, the extra sizing is unlikely to
>>> make much of a difference performance wise.
>>>
>>> I've found the following on my git tree and rebased on top of staging:
>> OK.
>> Do you need me to validate your patch in my environment?
> 
> Yes please, I have no way to test it.  Let's see what others think
> about the different approaches.

I'd certainly prefer your simpler form, if it's safe and fits the needs.

>> And I have one question: where does your patch do writing the resizing size into hardware?
> 
> dom0 has unrestricted access to the resize capability, so the value
> written by dom0 is propagated to the hardware without modification.
> 
> I would be wary of exposing the resize capability to untrusted
> domains, as allowing a domU to change the size of BARs can lead to
> overlapping if the hardware domain hasn't accounted for the increase
> in BAR size.

Question is how the feature is used in practice: If it was a driver to
request the re-size, I'd have a hard time seeing how we could make that
work without intercepting accesses to the capability for DomU-s (implying
to expose it in the first place, of course).

Jan
Roger Pau Monne Nov. 13, 2024, 10:56 a.m. UTC | #5
On Wed, Nov 13, 2024 at 11:36:46AM +0100, Jan Beulich wrote:
> On 13.11.2024 11:30, Roger Pau Monné wrote:
> > On Wed, Nov 13, 2024 at 10:00:33AM +0000, Chen, Jiqian wrote:
> >> On 2024/11/13 17:30, Roger Pau Monné wrote:
> >>> On Wed, Nov 13, 2024 at 04:00:27PM +0800, Jiqian Chen wrote:
> >>>> Some devices, like discrete GPU of amd, support resizable bar capability,
> >>>> but vpci of Xen doesn't support this feature, so they fail to resize bars
> >>>> and then cause probing failure.
> >>>>
> >>>> According to PCIe spec, each bar that support resizing has two registers,
> >>>> PCI_REBAR_CAP and PCI_REBAR_CTRL, so add these two registers and their
> >>>> corresponding handler into vpci.
> >>>>
> >>>> PCI_REBAR_CAP is RO, only provide reading.
> >>>>
> >>>> PCI_REBAR_CTRL only has bar size is RW, so add write function to support
> >>>> setting the new size.
> >>>
> >>> I think the logic to handle resizable BAR could be much simpler.  Some
> >>> time ago I've made a patch to add support for it, but due to lack of
> >>> hardware on my side to test it I've never submitted it.
> >>>
> >>> My approach would be to detect the presence of the
> >>> PCI_EXT_CAP_ID_REBAR capability in init_header(), and if the
> >>> capability is present force the sizing of BARs each time they are
> >>> mapped in modify_bars().  I don't think we need to trap accesses to
> >>> the capability itself, as resizing can only happen when memory
> >>> decoding is not enabled for the device.  It's enough to fetch the size
> >>> of the BARs ahead of each enabling of memory decoding.
> >>>
> >>> Note that memory decoding implies mapping the BARs into the p2m, which
> >>> is already an expensive operation, the extra sizing is unlikely to
> >>> make much of a difference performance wise.
> >>>
> >>> I've found the following on my git tree and rebased on top of staging:
> >> OK.
> >> Do you need me to validate your patch in my environment?
> > 
> > Yes please, I have no way to test it.  Let's see what others think
> > about the different approaches.
> 
> I'd certainly prefer your simpler form, if it's safe and fits the needs.
> 
> >> And I have one question: where does your patch do writing the resizing size into hardware?
> > 
> > dom0 has unrestricted access to the resize capability, so the value
> > written by dom0 is propagated to the hardware without modification.
> > 
> > I would be wary of exposing the resize capability to untrusted
> > domains, as allowing a domU to change the size of BARs can lead to
> > overlapping if the hardware domain hasn't accounted for the increase
> > in BAR size.
> 
> Question is how the feature is used in practice: If it was a driver to
> request the re-size, I'd have a hard time seeing how we could make that
> work without intercepting accesses to the capability for DomU-s (implying
> to expose it in the first place, of course).

Question is also whether the capability is required for guests, as in
OS drivers requesting it to be present for proper operation.

I haven't given much thought about how to expose to domUs.  The
current patch doesn't attempt to expose to domUs either, as the
capability is not added to the 'supported_caps' array.

Thanks, Roger.
Jan Beulich Nov. 13, 2024, 11:01 a.m. UTC | #6
On 13.11.2024 11:56, Roger Pau Monné wrote:
> On Wed, Nov 13, 2024 at 11:36:46AM +0100, Jan Beulich wrote:
>> On 13.11.2024 11:30, Roger Pau Monné wrote:
>>> On Wed, Nov 13, 2024 at 10:00:33AM +0000, Chen, Jiqian wrote:
>>>> On 2024/11/13 17:30, Roger Pau Monné wrote:
>>>>> On Wed, Nov 13, 2024 at 04:00:27PM +0800, Jiqian Chen wrote:
>>>>>> Some devices, like discrete GPU of amd, support resizable bar capability,
>>>>>> but vpci of Xen doesn't support this feature, so they fail to resize bars
>>>>>> and then cause probing failure.
>>>>>>
>>>>>> According to PCIe spec, each bar that support resizing has two registers,
>>>>>> PCI_REBAR_CAP and PCI_REBAR_CTRL, so add these two registers and their
>>>>>> corresponding handler into vpci.
>>>>>>
>>>>>> PCI_REBAR_CAP is RO, only provide reading.
>>>>>>
>>>>>> PCI_REBAR_CTRL only has bar size is RW, so add write function to support
>>>>>> setting the new size.
>>>>>
>>>>> I think the logic to handle resizable BAR could be much simpler.  Some
>>>>> time ago I've made a patch to add support for it, but due to lack of
>>>>> hardware on my side to test it I've never submitted it.
>>>>>
>>>>> My approach would be to detect the presence of the
>>>>> PCI_EXT_CAP_ID_REBAR capability in init_header(), and if the
>>>>> capability is present force the sizing of BARs each time they are
>>>>> mapped in modify_bars().  I don't think we need to trap accesses to
>>>>> the capability itself, as resizing can only happen when memory
>>>>> decoding is not enabled for the device.  It's enough to fetch the size
>>>>> of the BARs ahead of each enabling of memory decoding.
>>>>>
>>>>> Note that memory decoding implies mapping the BARs into the p2m, which
>>>>> is already an expensive operation, the extra sizing is unlikely to
>>>>> make much of a difference performance wise.
>>>>>
>>>>> I've found the following on my git tree and rebased on top of staging:
>>>> OK.
>>>> Do you need me to validate your patch in my environment?
>>>
>>> Yes please, I have no way to test it.  Let's see what others think
>>> about the different approaches.
>>
>> I'd certainly prefer your simpler form, if it's safe and fits the needs.
>>
>>>> And I have one question: where does your patch do writing the resizing size into hardware?
>>>
>>> dom0 has unrestricted access to the resize capability, so the value
>>> written by dom0 is propagated to the hardware without modification.
>>>
>>> I would be wary of exposing the resize capability to untrusted
>>> domains, as allowing a domU to change the size of BARs can lead to
>>> overlapping if the hardware domain hasn't accounted for the increase
>>> in BAR size.
>>
>> Question is how the feature is used in practice: If it was a driver to
>> request the re-size, I'd have a hard time seeing how we could make that
>> work without intercepting accesses to the capability for DomU-s (implying
>> to expose it in the first place, of course).
> 
> Question is also whether the capability is required for guests, as in
> OS drivers requesting it to be present for proper operation.
> 
> I haven't given much thought about how to expose to domUs.  The
> current patch doesn't attempt to expose to domUs either, as the
> capability is not added to the 'supported_caps' array.

Hmm, I see. Yet then adding support to vPCI, but limited to Dom0, ends up
odd in two ways: Another aspect that'll need dealing with for DomU-s, and
the same functionality remaining unavailable (or at least not properly
available, with all possible side effects) to PV Dom0.

Jan
Roger Pau Monne Nov. 13, 2024, 11:24 a.m. UTC | #7
On Wed, Nov 13, 2024 at 12:01:23PM +0100, Jan Beulich wrote:
> On 13.11.2024 11:56, Roger Pau Monné wrote:
> > On Wed, Nov 13, 2024 at 11:36:46AM +0100, Jan Beulich wrote:
> >> On 13.11.2024 11:30, Roger Pau Monné wrote:
> >>> On Wed, Nov 13, 2024 at 10:00:33AM +0000, Chen, Jiqian wrote:
> >>>> On 2024/11/13 17:30, Roger Pau Monné wrote:
> >>>>> On Wed, Nov 13, 2024 at 04:00:27PM +0800, Jiqian Chen wrote:
> >>>>>> Some devices, like discrete GPU of amd, support resizable bar capability,
> >>>>>> but vpci of Xen doesn't support this feature, so they fail to resize bars
> >>>>>> and then cause probing failure.
> >>>>>>
> >>>>>> According to PCIe spec, each bar that support resizing has two registers,
> >>>>>> PCI_REBAR_CAP and PCI_REBAR_CTRL, so add these two registers and their
> >>>>>> corresponding handler into vpci.
> >>>>>>
> >>>>>> PCI_REBAR_CAP is RO, only provide reading.
> >>>>>>
> >>>>>> PCI_REBAR_CTRL only has bar size is RW, so add write function to support
> >>>>>> setting the new size.
> >>>>>
> >>>>> I think the logic to handle resizable BAR could be much simpler.  Some
> >>>>> time ago I've made a patch to add support for it, but due to lack of
> >>>>> hardware on my side to test it I've never submitted it.
> >>>>>
> >>>>> My approach would be to detect the presence of the
> >>>>> PCI_EXT_CAP_ID_REBAR capability in init_header(), and if the
> >>>>> capability is present force the sizing of BARs each time they are
> >>>>> mapped in modify_bars().  I don't think we need to trap accesses to
> >>>>> the capability itself, as resizing can only happen when memory
> >>>>> decoding is not enabled for the device.  It's enough to fetch the size
> >>>>> of the BARs ahead of each enabling of memory decoding.
> >>>>>
> >>>>> Note that memory decoding implies mapping the BARs into the p2m, which
> >>>>> is already an expensive operation, the extra sizing is unlikely to
> >>>>> make much of a difference performance wise.
> >>>>>
> >>>>> I've found the following on my git tree and rebased on top of staging:
> >>>> OK.
> >>>> Do you need me to validate your patch in my environment?
> >>>
> >>> Yes please, I have no way to test it.  Let's see what others think
> >>> about the different approaches.
> >>
> >> I'd certainly prefer your simpler form, if it's safe and fits the needs.
> >>
> >>>> And I have one question: where does your patch do writing the resizing size into hardware?
> >>>
> >>> dom0 has unrestricted access to the resize capability, so the value
> >>> written by dom0 is propagated to the hardware without modification.
> >>>
> >>> I would be wary of exposing the resize capability to untrusted
> >>> domains, as allowing a domU to change the size of BARs can lead to
> >>> overlapping if the hardware domain hasn't accounted for the increase
> >>> in BAR size.
> >>
> >> Question is how the feature is used in practice: If it was a driver to
> >> request the re-size, I'd have a hard time seeing how we could make that
> >> work without intercepting accesses to the capability for DomU-s (implying
> >> to expose it in the first place, of course).
> > 
> > Question is also whether the capability is required for guests, as in
> > OS drivers requesting it to be present for proper operation.
> > 
> > I haven't given much thought about how to expose to domUs.  The
> > current patch doesn't attempt to expose to domUs either, as the
> > capability is not added to the 'supported_caps' array.
> 
> Hmm, I see. Yet then adding support to vPCI, but limited to Dom0, ends up
> odd in two ways: Another aspect that'll need dealing with for DomU-s, and
> the same functionality remaining unavailable (or at least not properly
> available, with all possible side effects) to PV Dom0.

I think resizable BARs should just work for PV dom0, as Xen allows PV
dom0 to map almost all physical memory.  Xen doesn't require knowing
the BAR positions and sizes like it does for PVH dom0.

Note that resizable BAR capability is not exposed to domUs now either
when using QEMU based pci-passthrough.

Thanks, Roger.
Jan Beulich Nov. 13, 2024, 11:29 a.m. UTC | #8
On 13.11.2024 12:24, Roger Pau Monné wrote:
> On Wed, Nov 13, 2024 at 12:01:23PM +0100, Jan Beulich wrote:
>> On 13.11.2024 11:56, Roger Pau Monné wrote:
>>> On Wed, Nov 13, 2024 at 11:36:46AM +0100, Jan Beulich wrote:
>>>> On 13.11.2024 11:30, Roger Pau Monné wrote:
>>>>> On Wed, Nov 13, 2024 at 10:00:33AM +0000, Chen, Jiqian wrote:
>>>>>> On 2024/11/13 17:30, Roger Pau Monné wrote:
>>>>>>> On Wed, Nov 13, 2024 at 04:00:27PM +0800, Jiqian Chen wrote:
>>>>>>>> Some devices, like discrete GPU of amd, support resizable bar capability,
>>>>>>>> but vpci of Xen doesn't support this feature, so they fail to resize bars
>>>>>>>> and then cause probing failure.
>>>>>>>>
>>>>>>>> According to PCIe spec, each bar that support resizing has two registers,
>>>>>>>> PCI_REBAR_CAP and PCI_REBAR_CTRL, so add these two registers and their
>>>>>>>> corresponding handler into vpci.
>>>>>>>>
>>>>>>>> PCI_REBAR_CAP is RO, only provide reading.
>>>>>>>>
>>>>>>>> PCI_REBAR_CTRL only has bar size is RW, so add write function to support
>>>>>>>> setting the new size.
>>>>>>>
>>>>>>> I think the logic to handle resizable BAR could be much simpler.  Some
>>>>>>> time ago I've made a patch to add support for it, but due to lack of
>>>>>>> hardware on my side to test it I've never submitted it.
>>>>>>>
>>>>>>> My approach would be to detect the presence of the
>>>>>>> PCI_EXT_CAP_ID_REBAR capability in init_header(), and if the
>>>>>>> capability is present force the sizing of BARs each time they are
>>>>>>> mapped in modify_bars().  I don't think we need to trap accesses to
>>>>>>> the capability itself, as resizing can only happen when memory
>>>>>>> decoding is not enabled for the device.  It's enough to fetch the size
>>>>>>> of the BARs ahead of each enabling of memory decoding.
>>>>>>>
>>>>>>> Note that memory decoding implies mapping the BARs into the p2m, which
>>>>>>> is already an expensive operation, the extra sizing is unlikely to
>>>>>>> make much of a difference performance wise.
>>>>>>>
>>>>>>> I've found the following on my git tree and rebased on top of staging:
>>>>>> OK.
>>>>>> Do you need me to validate your patch in my environment?
>>>>>
>>>>> Yes please, I have no way to test it.  Let's see what others think
>>>>> about the different approaches.
>>>>
>>>> I'd certainly prefer your simpler form, if it's safe and fits the needs.
>>>>
>>>>>> And I have one question: where does your patch do writing the resizing size into hardware?
>>>>>
>>>>> dom0 has unrestricted access to the resize capability, so the value
>>>>> written by dom0 is propagated to the hardware without modification.
>>>>>
>>>>> I would be wary of exposing the resize capability to untrusted
>>>>> domains, as allowing a domU to change the size of BARs can lead to
>>>>> overlapping if the hardware domain hasn't accounted for the increase
>>>>> in BAR size.
>>>>
>>>> Question is how the feature is used in practice: If it was a driver to
>>>> request the re-size, I'd have a hard time seeing how we could make that
>>>> work without intercepting accesses to the capability for DomU-s (implying
>>>> to expose it in the first place, of course).
>>>
>>> Question is also whether the capability is required for guests, as in
>>> OS drivers requesting it to be present for proper operation.
>>>
>>> I haven't given much thought about how to expose to domUs.  The
>>> current patch doesn't attempt to expose to domUs either, as the
>>> capability is not added to the 'supported_caps' array.
>>
>> Hmm, I see. Yet then adding support to vPCI, but limited to Dom0, ends up
>> odd in two ways: Another aspect that'll need dealing with for DomU-s, and
>> the same functionality remaining unavailable (or at least not properly
>> available, with all possible side effects) to PV Dom0.
> 
> I think resizable BARs should just work for PV dom0, as Xen allows PV
> dom0 to map almost all physical memory.  Xen doesn't require knowing
> the BAR positions and sizes like it does for PVH dom0.

Does it really not need to know in any (corner) case? Are there guarantees
that e.g. MSI-X table or PBA can't move when the size of the BAR covering
them changes?

> Note that resizable BAR capability is not exposed to domUs now either
> when using QEMU based pci-passthrough.

Of course.

Jan
Roger Pau Monne Nov. 13, 2024, 12:13 p.m. UTC | #9
On Wed, Nov 13, 2024 at 12:29:11PM +0100, Jan Beulich wrote:
> On 13.11.2024 12:24, Roger Pau Monné wrote:
> > On Wed, Nov 13, 2024 at 12:01:23PM +0100, Jan Beulich wrote:
> >> On 13.11.2024 11:56, Roger Pau Monné wrote:
> >>> On Wed, Nov 13, 2024 at 11:36:46AM +0100, Jan Beulich wrote:
> >>>> On 13.11.2024 11:30, Roger Pau Monné wrote:
> >>>>> On Wed, Nov 13, 2024 at 10:00:33AM +0000, Chen, Jiqian wrote:
> >>>>>> On 2024/11/13 17:30, Roger Pau Monné wrote:
> >>>>>>> On Wed, Nov 13, 2024 at 04:00:27PM +0800, Jiqian Chen wrote:
> >>>>>>>> Some devices, like discrete GPU of amd, support resizable bar capability,
> >>>>>>>> but vpci of Xen doesn't support this feature, so they fail to resize bars
> >>>>>>>> and then cause probing failure.
> >>>>>>>>
> >>>>>>>> According to PCIe spec, each bar that support resizing has two registers,
> >>>>>>>> PCI_REBAR_CAP and PCI_REBAR_CTRL, so add these two registers and their
> >>>>>>>> corresponding handler into vpci.
> >>>>>>>>
> >>>>>>>> PCI_REBAR_CAP is RO, only provide reading.
> >>>>>>>>
> >>>>>>>> PCI_REBAR_CTRL only has bar size is RW, so add write function to support
> >>>>>>>> setting the new size.
> >>>>>>>
> >>>>>>> I think the logic to handle resizable BAR could be much simpler.  Some
> >>>>>>> time ago I've made a patch to add support for it, but due to lack of
> >>>>>>> hardware on my side to test it I've never submitted it.
> >>>>>>>
> >>>>>>> My approach would be to detect the presence of the
> >>>>>>> PCI_EXT_CAP_ID_REBAR capability in init_header(), and if the
> >>>>>>> capability is present force the sizing of BARs each time they are
> >>>>>>> mapped in modify_bars().  I don't think we need to trap accesses to
> >>>>>>> the capability itself, as resizing can only happen when memory
> >>>>>>> decoding is not enabled for the device.  It's enough to fetch the size
> >>>>>>> of the BARs ahead of each enabling of memory decoding.
> >>>>>>>
> >>>>>>> Note that memory decoding implies mapping the BARs into the p2m, which
> >>>>>>> is already an expensive operation, the extra sizing is unlikely to
> >>>>>>> make much of a difference performance wise.
> >>>>>>>
> >>>>>>> I've found the following on my git tree and rebased on top of staging:
> >>>>>> OK.
> >>>>>> Do you need me to validate your patch in my environment?
> >>>>>
> >>>>> Yes please, I have no way to test it.  Let's see what others think
> >>>>> about the different approaches.
> >>>>
> >>>> I'd certainly prefer your simpler form, if it's safe and fits the needs.
> >>>>
> >>>>>> And I have one question: where does your patch do writing the resizing size into hardware?
> >>>>>
> >>>>> dom0 has unrestricted access to the resize capability, so the value
> >>>>> written by dom0 is propagated to the hardware without modification.
> >>>>>
> >>>>> I would be wary of exposing the resize capability to untrusted
> >>>>> domains, as allowing a domU to change the size of BARs can lead to
> >>>>> overlapping if the hardware domain hasn't accounted for the increase
> >>>>> in BAR size.
> >>>>
> >>>> Question is how the feature is used in practice: If it was a driver to
> >>>> request the re-size, I'd have a hard time seeing how we could make that
> >>>> work without intercepting accesses to the capability for DomU-s (implying
> >>>> to expose it in the first place, of course).
> >>>
> >>> Question is also whether the capability is required for guests, as in
> >>> OS drivers requesting it to be present for proper operation.
> >>>
> >>> I haven't given much thought about how to expose to domUs.  The
> >>> current patch doesn't attempt to expose to domUs either, as the
> >>> capability is not added to the 'supported_caps' array.
> >>
> >> Hmm, I see. Yet then adding support to vPCI, but limited to Dom0, ends up
> >> odd in two ways: Another aspect that'll need dealing with for DomU-s, and
> >> the same functionality remaining unavailable (or at least not properly
> >> available, with all possible side effects) to PV Dom0.
> > 
> > I think resizable BARs should just work for PV dom0, as Xen allows PV
> > dom0 to map almost all physical memory.  Xen doesn't require knowing
> > the BAR positions and sizes like it does for PVH dom0.
> 
> Does it really not need to know in any (corner) case? Are there guarantees
> that e.g. MSI-X table or PBA can't move when the size of the BAR covering
> them changes?

Those are the same guarantees that Xen has from a PV dom0 not moving
the position of the BARs after having MSI-X enabled for example.  IOW:
dom0 should set the BAR size before enabling MSI-X, just like it does
for the BAR positions currently.

The specification doesn't mention anything about MSI-X table or PBA
explicitly not changing it's offset if the BAR where it resides is
resized, so I don't think we have any guarantees.  Albeit I would
think it's unexpected for the MSI-X BIR or offset to change as a
result of a BAR resize.

There's an implementation note in the PCI specification that the BAR
should be resized during resource allocation, prior to assigning the
base address to the BAR.

Thanks, Roger.
diff mbox series

Patch

diff --git a/xen/drivers/vpci/Makefile b/xen/drivers/vpci/Makefile
index 1a1413b93e76..a7c8a30a8956 100644
--- a/xen/drivers/vpci/Makefile
+++ b/xen/drivers/vpci/Makefile
@@ -1,2 +1,2 @@ 
-obj-y += vpci.o header.o
+obj-y += vpci.o header.o rebar.o
 obj-$(CONFIG_HAS_PCI_MSI) += msi.o msix.o
diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
new file mode 100644
index 000000000000..84dbd84b0745
--- /dev/null
+++ b/xen/drivers/vpci/rebar.c
@@ -0,0 +1,89 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
+ *
+ * Author: Jiqian Chen <Jiqian.Chen@amd.com>
+ */
+
+#include <xen/hypercall.h>
+#include <xen/vpci.h>
+
+/*
+ * The number value of the BAR Size in PCI_REBAR_CTRL register reprent:
+ * 0    1 MB (2^20 bytes)
+ * 1    2 MB (2^21 bytes)
+ * 2    4 MB (2^22 bytes)
+ *  ...
+ * 43   8 EB (2^63 bytes)
+ */
+#define PCI_REBAR_CTRL_BAR_UNIT (1ULL << 20)
+
+static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
+                                      unsigned int reg,
+                                      uint32_t val,
+                                      void *data)
+{
+    uint32_t ctrl, index;
+    struct vpci_bar *bars = pdev->vpci->header.bars;
+
+    ctrl = pci_conf_read32(pdev->sbdf, reg);
+    if ( ctrl == val )
+        return;
+
+    ctrl &= ~PCI_REBAR_CTRL_BAR_SIZE;
+    if ( ctrl != ( val & ~PCI_REBAR_CTRL_BAR_SIZE ) )
+        return;
+
+    index = ctrl & PCI_REBAR_CTRL_BAR_IDX;
+    bars[index].size = (1 << ((val & PCI_REBAR_CTRL_BAR_SIZE) >>
+                              PCI_REBAR_CTRL_BAR_SHIFT)) *
+                       PCI_REBAR_CTRL_BAR_UNIT;
+
+    pci_conf_write32(pdev->sbdf, reg, val);
+}
+
+static int cf_check init_rebar(struct pci_dev *pdev)
+{
+    unsigned int rebar_offset;
+    uint32_t ctrl, nbars;
+    int rc = 0;
+
+    rebar_offset = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_REBAR);
+
+    if ( !rebar_offset )
+        return rc;
+
+    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL);
+    nbars = (ctrl & PCI_REBAR_CTRL_NBAR_MASK) >> PCI_REBAR_CTRL_NBAR_SHIFT;
+
+    for ( int i = 0; i < nbars; i++, rebar_offset += 8 ) {
+        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, NULL,
+                               rebar_offset + PCI_REBAR_CAP, 4, NULL);
+        if ( rc ) {
+            printk("%s: %pp: add register for PCI_REBAR_CAP failed (rc=%d)\n",
+                   __func__, &pdev->sbdf, rc);
+            break;
+        }
+
+        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
+                               rebar_offset + PCI_REBAR_CTRL, 4, NULL);
+        if ( rc ) {
+            printk("%s: %pp: add register for PCI_REBAR_CTRL failed (rc=%d)\n",
+                   __func__, &pdev->sbdf, rc);
+            break;
+        }
+    }
+
+    return rc;
+}
+REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
index 250ba106dbd3..5d2aa130916e 100644
--- a/xen/include/xen/pci_regs.h
+++ b/xen/include/xen/pci_regs.h
@@ -459,6 +459,7 @@ 
 #define PCI_EXT_CAP_ID_ARI	14
 #define PCI_EXT_CAP_ID_ATS	15
 #define PCI_EXT_CAP_ID_SRIOV	16
+#define PCI_EXT_CAP_ID_REBAR	21	/* Resizable BAR */
 
 /* Advanced Error Reporting */
 #define PCI_ERR_UNCOR_STATUS	4	/* Uncorrectable Error Status */
@@ -541,6 +542,16 @@ 
 #define  PCI_VNDR_HEADER_REV(x)	(((x) >> 16) & 0xf)
 #define  PCI_VNDR_HEADER_LEN(x)	(((x) >> 20) & 0xfff)
 
+/* Resizable BARs */
+#define PCI_REBAR_CAP		4	/* capability register */
+#define  PCI_REBAR_CAP_SIZES		0x00FFFFF0  /* supported BAR sizes */
+#define PCI_REBAR_CTRL		8	/* control register */
+#define  PCI_REBAR_CTRL_BAR_IDX		0x00000007  /* BAR index */
+#define  PCI_REBAR_CTRL_NBAR_MASK	0x000000E0  /* # of resizable BARs */
+#define  PCI_REBAR_CTRL_NBAR_SHIFT	5	    /* shift for # of BARs */
+#define  PCI_REBAR_CTRL_BAR_SIZE	0x00001F00  /* BAR size */
+#define  PCI_REBAR_CTRL_BAR_SHIFT	8	    /* shift for BAR size */
+
 /*
  * Hypertransport sub capability types
  *