diff mbox series

[v3] vpci: Add resizable bar support

Message ID 20241213054232.2638640-1-Jiqian.Chen@amd.com (mailing list archive)
State Superseded
Headers show
Series [v3] vpci: Add resizable bar support | expand

Commit Message

Jiqian Chen Dec. 13, 2024, 5:42 a.m. UTC
Some devices, like discrete GPU of amd, support resizable bar
capability, but vpci of Xen doesn't support this feature, so
they fail to resize bars and then cause probing failure.

According to PCIe spec, each bar that supports resizing has
two registers, PCI_REBAR_CAP and PCI_REBAR_CTRL. So, add
handlers for them to support resizing the size of BARs.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
Hi all,
v2->v3 changes:
* Used "bar->enabled" to replace "pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY",
  and added comments why it needs this check.
* Added "!is_hardware_domain(pdev->domain)" check in init_rebar() to return EOPNOTSUPP for domUs.
* Moved BAR type and index check into init_rebar(), then only need to check once.
* Added 'U' suffix for macro PCI_REBAR_CAP_SIZES.
* Added macro PCI_REBAR_SIZE_BIAS to represent 20.

TODO: need to hide ReBar capability from hardware domain when init_rebar() fails.

Best regards,
Jiqian Chen.

v1->v2 changes:
* In rebar_ctrl_write, to check if memory decoding is enabled, and added
  some checks for the type of Bar.
* Added vpci_hw_write32 to handle PCI_REBAR_CAP's write, since there is
  no write limitation of dom0.
* And has many other minor modifications as well.
---
 xen/drivers/vpci/Makefile  |   2 +-
 xen/drivers/vpci/rebar.c   | 130 +++++++++++++++++++++++++++++++++++++
 xen/drivers/vpci/vpci.c    |   6 ++
 xen/include/xen/pci_regs.h |  13 ++++
 xen/include/xen/vpci.h     |   2 +
 5 files changed, 152 insertions(+), 1 deletion(-)
 create mode 100644 xen/drivers/vpci/rebar.c

Comments

Roger Pau Monné Dec. 16, 2024, 10:24 a.m. UTC | #1
On Fri, Dec 13, 2024 at 01:42:32PM +0800, Jiqian Chen wrote:
> Some devices, like discrete GPU of amd, support resizable bar
> capability, but vpci of Xen doesn't support this feature, so
> they fail to resize bars and then cause probing failure.
> 
> According to PCIe spec, each bar that supports resizing has
> two registers, PCI_REBAR_CAP and PCI_REBAR_CTRL. So, add
> handlers for them to support resizing the size of BARs.
> 
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> Hi all,
> v2->v3 changes:
> * Used "bar->enabled" to replace "pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY",
>   and added comments why it needs this check.
> * Added "!is_hardware_domain(pdev->domain)" check in init_rebar() to return EOPNOTSUPP for domUs.
> * Moved BAR type and index check into init_rebar(), then only need to check once.
> * Added 'U' suffix for macro PCI_REBAR_CAP_SIZES.
> * Added macro PCI_REBAR_SIZE_BIAS to represent 20.
> 
> TODO: need to hide ReBar capability from hardware domain when init_rebar() fails.
> 
> Best regards,
> Jiqian Chen.
> 
> v1->v2 changes:
> * In rebar_ctrl_write, to check if memory decoding is enabled, and added
>   some checks for the type of Bar.
> * Added vpci_hw_write32 to handle PCI_REBAR_CAP's write, since there is
>   no write limitation of dom0.
> * And has many other minor modifications as well.
> ---
>  xen/drivers/vpci/Makefile  |   2 +-
>  xen/drivers/vpci/rebar.c   | 130 +++++++++++++++++++++++++++++++++++++
>  xen/drivers/vpci/vpci.c    |   6 ++
>  xen/include/xen/pci_regs.h |  13 ++++
>  xen/include/xen/vpci.h     |   2 +
>  5 files changed, 152 insertions(+), 1 deletion(-)
>  create mode 100644 xen/drivers/vpci/rebar.c
> 
> diff --git a/xen/drivers/vpci/Makefile b/xen/drivers/vpci/Makefile
> index 1a1413b93e76..a7c8a30a8956 100644
> --- a/xen/drivers/vpci/Makefile
> +++ b/xen/drivers/vpci/Makefile
> @@ -1,2 +1,2 @@
> -obj-y += vpci.o header.o
> +obj-y += vpci.o header.o rebar.o
>  obj-$(CONFIG_HAS_PCI_MSI) += msi.o msix.o
> diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
> new file mode 100644
> index 000000000000..39432c3271a4
> --- /dev/null
> +++ b/xen/drivers/vpci/rebar.c
> @@ -0,0 +1,130 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
> + *
> + * Author: Jiqian Chen <Jiqian.Chen@amd.com>
> + */
> +
> +#include <xen/hypercall.h>

Do you really need the hypercall header?

> +#include <xen/vpci.h>
> +
> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
> +                                      unsigned int reg,
> +                                      uint32_t val,
> +                                      void *data)
> +{
> +    uint64_t size;
> +    struct vpci_bar *bar = data;
> +
> +    size = PCI_REBAR_CTRL_SIZE(val);
> +    if ( bar->enabled )
> +    {
> +        if ( size == bar->size )
> +            return;
> +
> +        /*

Indentation

> +        * Refuse to resize a BAR while memory decoding is enabled, as
> +        * otherwise the size of the mapped region in the p2m would become
> +        * stale with the newly set BAR size, and the position of the BAR
> +        * would be reset to undefined.  Note the PCIe specification also
> +        * forbids resizing a BAR with memory decoding enabled.
> +        */
> +        gprintk(XENLOG_ERR,
> +                "%pp: refuse to resize BAR with memory decoding enabled\n",
> +                &pdev->sbdf);
> +        return;
> +    }

Nit: just realized this could be made shorter:

if ( bar->enabled )
{
   /*
    * Refuse to resize a BAR while memory decoding is enabled, as
    * otherwise the size of the mapped region in the p2m would become
    * stale with the newly set BAR size, and the position of the BAR
    * would be reset to undefined.  Note the PCIe specification also
    * forbids resizing a BAR with memory decoding enabled.
    */
    if ( size != bar->size )
        gprintk(XENLOG_ERR,
                "%pp: refuse to resize BAR with memory decoding enabled\n",
                &pdev->sbdf);

    return;
}

> +
> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) &
> +           MASK_EXTR(pci_conf_read32(pdev->sbdf,
> +                                     reg - PCI_REBAR_CTRL + PCI_REBAR_CAP),
> +                                     PCI_REBAR_CAP_SIZES)) )

Would it be possible to cache this information.  It's my understand
the supported sizes won't change, and hence Xen could read and cache
them in init_rebar() to avoid repeated reads to the register on every
access?

> +        gprintk(XENLOG_WARNING,
> +                "%pp: new size %#lx is not supported by hardware\n",
> +                &pdev->sbdf, size);
> +
> +    bar->size = size;
> +    bar->addr = 0;
> +    bar->guest_addr = 0;
> +    pci_conf_write32(pdev->sbdf, reg, val);
> +}
> +
> +static int cf_check init_rebar(struct pci_dev *pdev)
> +{
> +    uint32_t ctrl;
> +    unsigned int rebar_offset, nbars;
> +
> +    rebar_offset = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_REBAR);

You can do the init at definition:

    uint32_t ctrl;
    unsigned int nbars;
    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
                                                        PCI_EXT_CAP_ID_REBAR);


> +
> +    if ( !rebar_offset )
> +        return 0;
> +
> +    if ( !is_hardware_domain(pdev->domain) )
> +    {
> +        printk("ReBar is not supported for domUs\n");

This needs a bit more information IMO:

printk(XENLOG_ERR
       "%pd %pp: resizable BAR capability not supported for unprivileged domains\n",
       pdev->domain, &pdev->sbdf);

I wonder if this should instead be an XSM check, but that would
require a new XSM hook to process permissions for PCI capabilities.

> +        return -EOPNOTSUPP;
> +    }
> +
> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL);
> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
> +
> +    for ( unsigned int i = 0; i < nbars; i++, rebar_offset += PCI_REBAR_CTRL )
> +    {
> +        int rc;
> +        unsigned int index;
> +        struct vpci_bar *bars = pdev->vpci->header.bars;
> +
> +        index = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL) &
> +                PCI_REBAR_CTRL_BAR_IDX;

You could initialize index at definition.

> +
> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
> +        {
> +            /*
> +             * TODO: for failed pathes, need to hide ReBar capability
> +             * from hardware domain instead of returning an error.
> +             */
> +            printk("%pp: BAR number %u in REBAR_CTRL register is too big\n",
> +                   &pdev->sdf, index);

XENLOG_ERR, plus we could print the domain the device was assigned to
(pdev->domain).

> +            return -EINVAL;
> +        }
> +
> +        if ( bars[index].type != VPCI_BAR_MEM64_LO &&
> +             bars[index].type != VPCI_BAR_MEM32 )
> +        {
> +            printk("%pp: BAR%u is not in memory space\n", &pdev->sbdf, index);
> +            return -EINVAL;
> +        }
> +
> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
> +                               rebar_offset + PCI_REBAR_CAP, 4, NULL);
> +        if ( rc )
> +        {
> +            printk("%pp: add register for PCI_REBAR_CAP failed (rc=%d)\n",
> +                   &pdev->sbdf, rc);
> +            return rc;
> +        }
> +
> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
> +                               rebar_offset + PCI_REBAR_CTRL, 4,
> +                               &bars[index]);
> +        if ( rc )
> +        {
> +            printk("%pp: add register for PCI_REBAR_CTRL failed %d\n",
> +                   &pdev->sbdf, rc);
> +            return rc;
> +        }

All the log messages above need the XENLOG_ERR prefix, plus possibly
printing the assigned domain.

Thanks, Roger.
Jan Beulich Dec. 16, 2024, 10:30 a.m. UTC | #2
On 16.12.2024 11:24, Roger Pau Monné wrote:
> On Fri, Dec 13, 2024 at 01:42:32PM +0800, Jiqian Chen wrote:
>> +static int cf_check init_rebar(struct pci_dev *pdev)
>> +{
>> +    uint32_t ctrl;
>> +    unsigned int rebar_offset, nbars;
>> +
>> +    rebar_offset = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_REBAR);
> 
> You can do the init at definition:
> 
>     uint32_t ctrl;
>     unsigned int nbars;
>     unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
>                                                         PCI_EXT_CAP_ID_REBAR);
> 
> 
>> +
>> +    if ( !rebar_offset )
>> +        return 0;
>> +
>> +    if ( !is_hardware_domain(pdev->domain) )
>> +    {
>> +        printk("ReBar is not supported for domUs\n");
> 
> This needs a bit more information IMO:
> 
> printk(XENLOG_ERR
>        "%pd %pp: resizable BAR capability not supported for unprivileged domains\n",
>        pdev->domain, &pdev->sbdf);
> 
> I wonder if this should instead be an XSM check, but that would
> require a new XSM hook to process permissions for PCI capabilities.

Ultimately perhaps, but right now we need to bail here irrespective of
XSM policy, as the DomU side simply is unimplemented.

Jan
Roger Pau Monné Dec. 16, 2024, 10:38 a.m. UTC | #3
On Mon, Dec 16, 2024 at 11:30:22AM +0100, Jan Beulich wrote:
> On 16.12.2024 11:24, Roger Pau Monné wrote:
> > On Fri, Dec 13, 2024 at 01:42:32PM +0800, Jiqian Chen wrote:
> >> +static int cf_check init_rebar(struct pci_dev *pdev)
> >> +{
> >> +    uint32_t ctrl;
> >> +    unsigned int rebar_offset, nbars;
> >> +
> >> +    rebar_offset = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_REBAR);
> > 
> > You can do the init at definition:
> > 
> >     uint32_t ctrl;
> >     unsigned int nbars;
> >     unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
> >                                                         PCI_EXT_CAP_ID_REBAR);
> > 
> > 
> >> +
> >> +    if ( !rebar_offset )
> >> +        return 0;
> >> +
> >> +    if ( !is_hardware_domain(pdev->domain) )
> >> +    {
> >> +        printk("ReBar is not supported for domUs\n");
> > 
> > This needs a bit more information IMO:
> > 
> > printk(XENLOG_ERR
> >        "%pd %pp: resizable BAR capability not supported for unprivileged domains\n",
> >        pdev->domain, &pdev->sbdf);
> > 
> > I wonder if this should instead be an XSM check, but that would
> > require a new XSM hook to process permissions for PCI capabilities.
> 
> Ultimately perhaps, but right now we need to bail here irrespective of
> XSM policy, as the DomU side simply is unimplemented.

Yes, I should have said additionally rather than instead of the
is_hardware_domain() check.

Thanks, Roger.
Jan Beulich Dec. 16, 2024, 11:24 a.m. UTC | #4
On 13.12.2024 06:42, Jiqian Chen wrote:
> --- /dev/null
> +++ b/xen/drivers/vpci/rebar.c
> @@ -0,0 +1,130 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
> + *
> + * Author: Jiqian Chen <Jiqian.Chen@amd.com>
> + */
> +
> +#include <xen/hypercall.h>
> +#include <xen/vpci.h>
> +
> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
> +                                      unsigned int reg,
> +                                      uint32_t val,
> +                                      void *data)
> +{
> +    uint64_t size;
> +    struct vpci_bar *bar = data;
> +
> +    size = PCI_REBAR_CTRL_SIZE(val);
> +    if ( bar->enabled )
> +    {
> +        if ( size == bar->size )
> +            return;
> +
> +        /*
> +        * Refuse to resize a BAR while memory decoding is enabled, as
> +        * otherwise the size of the mapped region in the p2m would become
> +        * stale with the newly set BAR size, and the position of the BAR
> +        * would be reset to undefined.  Note the PCIe specification also
> +        * forbids resizing a BAR with memory decoding enabled.
> +        */
> +        gprintk(XENLOG_ERR,
> +                "%pp: refuse to resize BAR with memory decoding enabled\n",
> +                &pdev->sbdf);
> +        return;
> +    }
> +
> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) &
> +           MASK_EXTR(pci_conf_read32(pdev->sbdf,
> +                                     reg - PCI_REBAR_CTRL + PCI_REBAR_CAP),
> +                                     PCI_REBAR_CAP_SIZES)) )
> +        gprintk(XENLOG_WARNING,
> +                "%pp: new size %#lx is not supported by hardware\n",
> +                &pdev->sbdf, size);

This only covers the 1Mb ... 128Tb range. What about the 256Tb ... 8Eb one?

> +static int cf_check init_rebar(struct pci_dev *pdev)
> +{
> +    uint32_t ctrl;
> +    unsigned int rebar_offset, nbars;
> +
> +    rebar_offset = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_REBAR);
> +
> +    if ( !rebar_offset )
> +        return 0;
> +
> +    if ( !is_hardware_domain(pdev->domain) )
> +    {
> +        printk("ReBar is not supported for domUs\n");
> +        return -EOPNOTSUPP;
> +    }
> +
> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL);
> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
> +
> +    for ( unsigned int i = 0; i < nbars; i++, rebar_offset += PCI_REBAR_CTRL )

PCI_REBAR_CTRL is an offset; it can't be used to bump rebar_offset here.
That'll need a separate constant, even if both evaluate to 8.

Jan
Jiqian Chen Dec. 17, 2024, 8:20 a.m. UTC | #5
On 2024/12/16 18:24, Roger Pau Monné wrote:
> On Fri, Dec 13, 2024 at 01:42:32PM +0800, Jiqian Chen wrote:
>> Some devices, like discrete GPU of amd, support resizable bar
>> capability, but vpci of Xen doesn't support this feature, so
>> they fail to resize bars and then cause probing failure.
>>
>> According to PCIe spec, each bar that supports resizing has
>> two registers, PCI_REBAR_CAP and PCI_REBAR_CTRL. So, add
>> handlers for them to support resizing the size of BARs.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>> Hi all,
>> v2->v3 changes:
>> * Used "bar->enabled" to replace "pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY",
>>   and added comments why it needs this check.
>> * Added "!is_hardware_domain(pdev->domain)" check in init_rebar() to return EOPNOTSUPP for domUs.
>> * Moved BAR type and index check into init_rebar(), then only need to check once.
>> * Added 'U' suffix for macro PCI_REBAR_CAP_SIZES.
>> * Added macro PCI_REBAR_SIZE_BIAS to represent 20.
>>
>> TODO: need to hide ReBar capability from hardware domain when init_rebar() fails.
>>
>> Best regards,
>> Jiqian Chen.
>>
>> v1->v2 changes:
>> * In rebar_ctrl_write, to check if memory decoding is enabled, and added
>>   some checks for the type of Bar.
>> * Added vpci_hw_write32 to handle PCI_REBAR_CAP's write, since there is
>>   no write limitation of dom0.
>> * And has many other minor modifications as well.
>> ---
>>  xen/drivers/vpci/Makefile  |   2 +-
>>  xen/drivers/vpci/rebar.c   | 130 +++++++++++++++++++++++++++++++++++++
>>  xen/drivers/vpci/vpci.c    |   6 ++
>>  xen/include/xen/pci_regs.h |  13 ++++
>>  xen/include/xen/vpci.h     |   2 +
>>  5 files changed, 152 insertions(+), 1 deletion(-)
>>  create mode 100644 xen/drivers/vpci/rebar.c
>>
>> diff --git a/xen/drivers/vpci/Makefile b/xen/drivers/vpci/Makefile
>> index 1a1413b93e76..a7c8a30a8956 100644
>> --- a/xen/drivers/vpci/Makefile
>> +++ b/xen/drivers/vpci/Makefile
>> @@ -1,2 +1,2 @@
>> -obj-y += vpci.o header.o
>> +obj-y += vpci.o header.o rebar.o
>>  obj-$(CONFIG_HAS_PCI_MSI) += msi.o msix.o
>> diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
>> new file mode 100644
>> index 000000000000..39432c3271a4
>> --- /dev/null
>> +++ b/xen/drivers/vpci/rebar.c
>> @@ -0,0 +1,130 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
>> + *
>> + * Author: Jiqian Chen <Jiqian.Chen@amd.com>
>> + */
>> +
>> +#include <xen/hypercall.h>
> 
> Do you really need the hypercall header?
I will change to <xen/sched.h> since is_hardware_domain() needs.

> 
>> +#include <xen/vpci.h>
>> +
>> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
>> +                                      unsigned int reg,
>> +                                      uint32_t val,
>> +                                      void *data)
>> +{
>> +    uint64_t size;
>> +    struct vpci_bar *bar = data;
>> +
>> +    size = PCI_REBAR_CTRL_SIZE(val);
>> +    if ( bar->enabled )
>> +    {
>> +        if ( size == bar->size )
>> +            return;
>> +
>> +        /*
> 
> Indentation
> 
>> +        * Refuse to resize a BAR while memory decoding is enabled, as
>> +        * otherwise the size of the mapped region in the p2m would become
>> +        * stale with the newly set BAR size, and the position of the BAR
>> +        * would be reset to undefined.  Note the PCIe specification also
>> +        * forbids resizing a BAR with memory decoding enabled.
>> +        */
>> +        gprintk(XENLOG_ERR,
>> +                "%pp: refuse to resize BAR with memory decoding enabled\n",
>> +                &pdev->sbdf);
>> +        return;
>> +    }
> 
> Nit: just realized this could be made shorter:
> 
> if ( bar->enabled )
> {
>    /*
>     * Refuse to resize a BAR while memory decoding is enabled, as
>     * otherwise the size of the mapped region in the p2m would become
>     * stale with the newly set BAR size, and the position of the BAR
>     * would be reset to undefined.  Note the PCIe specification also
>     * forbids resizing a BAR with memory decoding enabled.
>     */
>     if ( size != bar->size )
>         gprintk(XENLOG_ERR,
>                 "%pp: refuse to resize BAR with memory decoding enabled\n",
>                 &pdev->sbdf);
> 
>     return;
> }
> 
>> +
>> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) &
>> +           MASK_EXTR(pci_conf_read32(pdev->sbdf,
>> +                                     reg - PCI_REBAR_CTRL + PCI_REBAR_CAP),
>> +                                     PCI_REBAR_CAP_SIZES)) )
> 
> Would it be possible to cache this information.  It's my understand
> the supported sizes won't change, and hence Xen could read and cache
> them in init_rebar() to avoid repeated reads to the register on every
> access?
Thanks, I will change in next version.
Then I think I need to add a parameter to struct vpci_bar.
Maybe I can name it "resizable_sizes" ?

> 
>> +        gprintk(XENLOG_WARNING,
>> +                "%pp: new size %#lx is not supported by hardware\n",
>> +                &pdev->sbdf, size);
>> +
>> +    bar->size = size;
>> +    bar->addr = 0;
>> +    bar->guest_addr = 0;
>> +    pci_conf_write32(pdev->sbdf, reg, val);
>> +}
>> +
>> +static int cf_check init_rebar(struct pci_dev *pdev)
>> +{
>> +    uint32_t ctrl;
>> +    unsigned int rebar_offset, nbars;
>> +
>> +    rebar_offset = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_REBAR);
> 
> You can do the init at definition:
> 
>     uint32_t ctrl;
>     unsigned int nbars;
>     unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
>                                                         PCI_EXT_CAP_ID_REBAR);
> 
> 
>> +
>> +    if ( !rebar_offset )
>> +        return 0;
>> +
>> +    if ( !is_hardware_domain(pdev->domain) )
>> +    {
>> +        printk("ReBar is not supported for domUs\n");
> 
> This needs a bit more information IMO:
> 
> printk(XENLOG_ERR
>        "%pd %pp: resizable BAR capability not supported for unprivileged domains\n",
>        pdev->domain, &pdev->sbdf);
OK, will change.
If the length of code of printing more than 80 characters in one line, is it fine?

> 
> I wonder if this should instead be an XSM check, but that would
> require a new XSM hook to process permissions for PCI capabilities.
> 
>> +        return -EOPNOTSUPP;
>> +    }
>> +
>> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL);
>> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
>> +
>> +    for ( unsigned int i = 0; i < nbars; i++, rebar_offset += PCI_REBAR_CTRL )
>> +    {
>> +        int rc;
>> +        unsigned int index;
>> +        struct vpci_bar *bars = pdev->vpci->header.bars;
>> +
>> +        index = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL) &
>> +                PCI_REBAR_CTRL_BAR_IDX;
> 
> You could initialize index at definition.
> 
>> +
>> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
>> +        {
>> +            /*
>> +             * TODO: for failed pathes, need to hide ReBar capability
>> +             * from hardware domain instead of returning an error.
>> +             */
>> +            printk("%pp: BAR number %u in REBAR_CTRL register is too big\n",
>> +                   &pdev->sdf, index);
> 
> XENLOG_ERR, plus we could print the domain the device was assigned to
> (pdev->domain).
> 
>> +            return -EINVAL;
>> +        }
>> +
>> +        if ( bars[index].type != VPCI_BAR_MEM64_LO &&
>> +             bars[index].type != VPCI_BAR_MEM32 )
>> +        {
>> +            printk("%pp: BAR%u is not in memory space\n", &pdev->sbdf, index);
>> +            return -EINVAL;
>> +        }
>> +
>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
>> +                               rebar_offset + PCI_REBAR_CAP, 4, NULL);
>> +        if ( rc )
>> +        {
>> +            printk("%pp: add register for PCI_REBAR_CAP failed (rc=%d)\n",
>> +                   &pdev->sbdf, rc);
>> +            return rc;
>> +        }
>> +
>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
>> +                               rebar_offset + PCI_REBAR_CTRL, 4,
>> +                               &bars[index]);
>> +        if ( rc )
>> +        {
>> +            printk("%pp: add register for PCI_REBAR_CTRL failed %d\n",
>> +                   &pdev->sbdf, rc);
>> +            return rc;
>> +        }
> 
> All the log messages above need the XENLOG_ERR prefix, plus possibly
> printing the assigned domain.
Will change according to your all comments, thank you!

> 
> Thanks, Roger.
Jiqian Chen Dec. 17, 2024, 8:22 a.m. UTC | #6
On 2024/12/16 19:24, Jan Beulich wrote:
> On 13.12.2024 06:42, Jiqian Chen wrote:
>> --- /dev/null
>> +++ b/xen/drivers/vpci/rebar.c
>> @@ -0,0 +1,130 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
>> + *
>> + * Author: Jiqian Chen <Jiqian.Chen@amd.com>
>> + */
>> +
>> +#include <xen/hypercall.h>
>> +#include <xen/vpci.h>
>> +
>> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
>> +                                      unsigned int reg,
>> +                                      uint32_t val,
>> +                                      void *data)
>> +{
>> +    uint64_t size;
>> +    struct vpci_bar *bar = data;
>> +
>> +    size = PCI_REBAR_CTRL_SIZE(val);
>> +    if ( bar->enabled )
>> +    {
>> +        if ( size == bar->size )
>> +            return;
>> +
>> +        /*
>> +        * Refuse to resize a BAR while memory decoding is enabled, as
>> +        * otherwise the size of the mapped region in the p2m would become
>> +        * stale with the newly set BAR size, and the position of the BAR
>> +        * would be reset to undefined.  Note the PCIe specification also
>> +        * forbids resizing a BAR with memory decoding enabled.
>> +        */
>> +        gprintk(XENLOG_ERR,
>> +                "%pp: refuse to resize BAR with memory decoding enabled\n",
>> +                &pdev->sbdf);
>> +        return;
>> +    }
>> +
>> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) &
>> +           MASK_EXTR(pci_conf_read32(pdev->sbdf,
>> +                                     reg - PCI_REBAR_CTRL + PCI_REBAR_CAP),
>> +                                     PCI_REBAR_CAP_SIZES)) )
>> +        gprintk(XENLOG_WARNING,
>> +                "%pp: new size %#lx is not supported by hardware\n",
>> +                &pdev->sbdf, size);
> 
> This only covers the 1Mb ... 128Tb range. What about the 256Tb ... 8Eb one?
Thank you for reminding me!
I will consider that in next version.

> 
>> +static int cf_check init_rebar(struct pci_dev *pdev)
>> +{
>> +    uint32_t ctrl;
>> +    unsigned int rebar_offset, nbars;
>> +
>> +    rebar_offset = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_REBAR);
>> +
>> +    if ( !rebar_offset )
>> +        return 0;
>> +
>> +    if ( !is_hardware_domain(pdev->domain) )
>> +    {
>> +        printk("ReBar is not supported for domUs\n");
>> +        return -EOPNOTSUPP;
>> +    }
>> +
>> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL);
>> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
>> +
>> +    for ( unsigned int i = 0; i < nbars; i++, rebar_offset += PCI_REBAR_CTRL )
> 
> PCI_REBAR_CTRL is an offset; it can't be used to bump rebar_offset here.
> That'll need a separate constant, even if both evaluate to 8.
I will add a new macro to represent the '8' in rebar.c
Maybe I can name it "PCI_REBAR_SINGLE_BAR_LEN" ?

> 
> Jan
Jan Beulich Dec. 17, 2024, 8:44 a.m. UTC | #7
On 17.12.2024 09:22, Chen, Jiqian wrote:
> On 2024/12/16 19:24, Jan Beulich wrote:
>> On 13.12.2024 06:42, Jiqian Chen wrote:
>>> +static int cf_check init_rebar(struct pci_dev *pdev)
>>> +{
>>> +    uint32_t ctrl;
>>> +    unsigned int rebar_offset, nbars;
>>> +
>>> +    rebar_offset = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_REBAR);
>>> +
>>> +    if ( !rebar_offset )
>>> +        return 0;
>>> +
>>> +    if ( !is_hardware_domain(pdev->domain) )
>>> +    {
>>> +        printk("ReBar is not supported for domUs\n");
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +
>>> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL);
>>> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
>>> +
>>> +    for ( unsigned int i = 0; i < nbars; i++, rebar_offset += PCI_REBAR_CTRL )
>>
>> PCI_REBAR_CTRL is an offset; it can't be used to bump rebar_offset here.
>> That'll need a separate constant, even if both evaluate to 8.
> I will add a new macro to represent the '8' in rebar.c
> Maybe I can name it "PCI_REBAR_SINGLE_BAR_LEN" ?

Naming is a 2nd step only, I think (and no really suitable name comes to mind).
Before thinking of names, I think the approach of doing the accesses here wants
reconsidering. It isn't quite right to bump rebar_offset. When using #define-s,
I'd instead expect to first move _just_ past the capability header, and then
use constants to get at capability and control registers. Alternatively, if we
want to express everything relative to rebar_offset, I think we'd want

#define PCI_REBAR_CAP(n) (4 + 8 *(n))
#define PCI_REBAR_CTRL(n) (8 + 8 *(n))

eliminating the need to alter rebar_offset (and hence disconnecting variable
name from its purpose).

Jan
Jan Beulich Dec. 17, 2024, 8:49 a.m. UTC | #8
On 17.12.2024 09:20, Chen, Jiqian wrote:
> On 2024/12/16 18:24, Roger Pau Monné wrote:
>> On Fri, Dec 13, 2024 at 01:42:32PM +0800, Jiqian Chen wrote:
>>> +    if ( !is_hardware_domain(pdev->domain) )
>>> +    {
>>> +        printk("ReBar is not supported for domUs\n");
>>
>> This needs a bit more information IMO:
>>
>> printk(XENLOG_ERR
>>        "%pd %pp: resizable BAR capability not supported for unprivileged domains\n",
>>        pdev->domain, &pdev->sbdf);
> OK, will change.
> If the length of code of printing more than 80 characters in one line, is it fine?

Technically: Yes. Imo in such cases one ought to strive though to shorten
wording as much as possible, without losing information. E.g. in this case
possibly:

    printk(XENLOG_ERR
           "%pp: resizable BARs unsupported for unpriv %pd\n",
           &pdev->sbdf, pdev->domain);

Jan
Jiqian Chen Dec. 17, 2024, 9:35 a.m. UTC | #9
On 2024/12/17 16:44, Jan Beulich wrote:
> On 17.12.2024 09:22, Chen, Jiqian wrote:
>> On 2024/12/16 19:24, Jan Beulich wrote:
>>> On 13.12.2024 06:42, Jiqian Chen wrote:
>>>> +static int cf_check init_rebar(struct pci_dev *pdev)
>>>> +{
>>>> +    uint32_t ctrl;
>>>> +    unsigned int rebar_offset, nbars;
>>>> +
>>>> +    rebar_offset = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_REBAR);
>>>> +
>>>> +    if ( !rebar_offset )
>>>> +        return 0;
>>>> +
>>>> +    if ( !is_hardware_domain(pdev->domain) )
>>>> +    {
>>>> +        printk("ReBar is not supported for domUs\n");
>>>> +        return -EOPNOTSUPP;
>>>> +    }
>>>> +
>>>> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL);
>>>> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
>>>> +
>>>> +    for ( unsigned int i = 0; i < nbars; i++, rebar_offset += PCI_REBAR_CTRL )
>>>
>>> PCI_REBAR_CTRL is an offset; it can't be used to bump rebar_offset here.
>>> That'll need a separate constant, even if both evaluate to 8.
>> I will add a new macro to represent the '8' in rebar.c
>> Maybe I can name it "PCI_REBAR_SINGLE_BAR_LEN" ?
> 
> Naming is a 2nd step only, I think (and no really suitable name comes to mind).
> Before thinking of names, I think the approach of doing the accesses here wants
> reconsidering. It isn't quite right to bump rebar_offset. When using #define-s,
> I'd instead expect to first move _just_ past the capability header, and then
> use constants to get at capability and control registers. Alternatively, if we
> want to express everything relative to rebar_offset, I think we'd want
> 
> #define PCI_REBAR_CAP(n) (4 + 8 *(n))
> #define PCI_REBAR_CTRL(n) (8 + 8 *(n))
> 
> eliminating the need to alter rebar_offset (and hence disconnecting variable
> name from its purpose).
It sounds much better, thank you very much!

> 
> Jan
diff mbox series

Patch

diff --git a/xen/drivers/vpci/Makefile b/xen/drivers/vpci/Makefile
index 1a1413b93e76..a7c8a30a8956 100644
--- a/xen/drivers/vpci/Makefile
+++ b/xen/drivers/vpci/Makefile
@@ -1,2 +1,2 @@ 
-obj-y += vpci.o header.o
+obj-y += vpci.o header.o rebar.o
 obj-$(CONFIG_HAS_PCI_MSI) += msi.o msix.o
diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
new file mode 100644
index 000000000000..39432c3271a4
--- /dev/null
+++ b/xen/drivers/vpci/rebar.c
@@ -0,0 +1,130 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
+ *
+ * Author: Jiqian Chen <Jiqian.Chen@amd.com>
+ */
+
+#include <xen/hypercall.h>
+#include <xen/vpci.h>
+
+static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
+                                      unsigned int reg,
+                                      uint32_t val,
+                                      void *data)
+{
+    uint64_t size;
+    struct vpci_bar *bar = data;
+
+    size = PCI_REBAR_CTRL_SIZE(val);
+    if ( bar->enabled )
+    {
+        if ( size == bar->size )
+            return;
+
+        /*
+        * Refuse to resize a BAR while memory decoding is enabled, as
+        * otherwise the size of the mapped region in the p2m would become
+        * stale with the newly set BAR size, and the position of the BAR
+        * would be reset to undefined.  Note the PCIe specification also
+        * forbids resizing a BAR with memory decoding enabled.
+        */
+        gprintk(XENLOG_ERR,
+                "%pp: refuse to resize BAR with memory decoding enabled\n",
+                &pdev->sbdf);
+        return;
+    }
+
+    if ( !((size >> PCI_REBAR_SIZE_BIAS) &
+           MASK_EXTR(pci_conf_read32(pdev->sbdf,
+                                     reg - PCI_REBAR_CTRL + PCI_REBAR_CAP),
+                                     PCI_REBAR_CAP_SIZES)) )
+        gprintk(XENLOG_WARNING,
+                "%pp: new size %#lx is not supported by hardware\n",
+                &pdev->sbdf, size);
+
+    bar->size = size;
+    bar->addr = 0;
+    bar->guest_addr = 0;
+    pci_conf_write32(pdev->sbdf, reg, val);
+}
+
+static int cf_check init_rebar(struct pci_dev *pdev)
+{
+    uint32_t ctrl;
+    unsigned int rebar_offset, nbars;
+
+    rebar_offset = pci_find_ext_capability(pdev->sbdf, PCI_EXT_CAP_ID_REBAR);
+
+    if ( !rebar_offset )
+        return 0;
+
+    if ( !is_hardware_domain(pdev->domain) )
+    {
+        printk("ReBar is not supported for domUs\n");
+        return -EOPNOTSUPP;
+    }
+
+    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL);
+    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
+
+    for ( unsigned int i = 0; i < nbars; i++, rebar_offset += PCI_REBAR_CTRL )
+    {
+        int rc;
+        unsigned int index;
+        struct vpci_bar *bars = pdev->vpci->header.bars;
+
+        index = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL) &
+                PCI_REBAR_CTRL_BAR_IDX;
+
+        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
+        {
+            /*
+             * TODO: for failed pathes, need to hide ReBar capability
+             * from hardware domain instead of returning an error.
+             */
+            printk("%pp: BAR number %u in REBAR_CTRL register is too big\n",
+                   &pdev->sbdf, index);
+            return -EINVAL;
+        }
+
+        if ( bars[index].type != VPCI_BAR_MEM64_LO &&
+             bars[index].type != VPCI_BAR_MEM32 )
+        {
+            printk("%pp: BAR%u is not in memory space\n", &pdev->sbdf, index);
+            return -EINVAL;
+        }
+
+        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
+                               rebar_offset + PCI_REBAR_CAP, 4, NULL);
+        if ( rc )
+        {
+            printk("%pp: add register for PCI_REBAR_CAP failed (rc=%d)\n",
+                   &pdev->sbdf, rc);
+            return rc;
+        }
+
+        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
+                               rebar_offset + PCI_REBAR_CTRL, 4,
+                               &bars[index]);
+        if ( rc )
+        {
+            printk("%pp: add register for PCI_REBAR_CTRL failed %d\n",
+                   &pdev->sbdf, rc);
+            return rc;
+        }
+    }
+
+    return 0;
+}
+REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 1e6aa5d799b9..3349b98389b8 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -232,6 +232,12 @@  void cf_check vpci_hw_write16(
     pci_conf_write16(pdev->sbdf, reg, val);
 }
 
+void cf_check vpci_hw_write32(
+    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
+{
+    pci_conf_write32(pdev->sbdf, reg, val);
+}
+
 int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler,
                            vpci_write_t *write_handler, unsigned int offset,
                            unsigned int size, void *data, uint32_t ro_mask,
diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
index 250ba106dbd3..c6bd3545d7a5 100644
--- a/xen/include/xen/pci_regs.h
+++ b/xen/include/xen/pci_regs.h
@@ -459,6 +459,7 @@ 
 #define PCI_EXT_CAP_ID_ARI	14
 #define PCI_EXT_CAP_ID_ATS	15
 #define PCI_EXT_CAP_ID_SRIOV	16
+#define PCI_EXT_CAP_ID_REBAR	21	/* Resizable BAR */
 
 /* Advanced Error Reporting */
 #define PCI_ERR_UNCOR_STATUS	4	/* Uncorrectable Error Status */
@@ -541,6 +542,18 @@ 
 #define  PCI_VNDR_HEADER_REV(x)	(((x) >> 16) & 0xf)
 #define  PCI_VNDR_HEADER_LEN(x)	(((x) >> 20) & 0xfff)
 
+/* Resizable BARs */
+#define PCI_REBAR_SIZE_BIAS	20
+#define PCI_REBAR_CAP		4	/* capability register */
+#define  PCI_REBAR_CAP_SIZES		0xFFFFFFF0U  /* supported BAR sizes */
+#define PCI_REBAR_CTRL		8	/* control register */
+#define  PCI_REBAR_CTRL_BAR_IDX	0x00000007  /* BAR index */
+#define  PCI_REBAR_CTRL_NBAR_MASK	0x000000E0  /* # of resizable BARs */
+#define  PCI_REBAR_CTRL_BAR_SIZE	0x00001F00  /* BAR size */
+#define  PCI_REBAR_CTRL_SIZE(v) \
+            (1UL << (MASK_EXTR(v, PCI_REBAR_CTRL_BAR_SIZE) \
+                     + PCI_REBAR_SIZE_BIAS))
+
 /*
  * Hypertransport sub capability types
  *
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 41e7c3bc2791..72992e93cece 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -78,6 +78,8 @@  uint32_t cf_check vpci_hw_read32(
     const struct pci_dev *pdev, unsigned int reg, void *data);
 void cf_check vpci_hw_write16(
     const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data);
+void cf_check vpci_hw_write32(
+    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data);
 
 /*
  * Check for pending vPCI operations on this vcpu. Returns true if the vcpu