diff mbox series

[v5,08/11] xen/arm: Enable the existing x86 virtual PCI support for ARM.

Message ID 9bdca2cda5d2e83f94dc2423e55714273539760a.1633540842.git.rahul.singh@arm.com (mailing list archive)
State New, archived
Headers show
Series PCI devices passthrough on Arm | expand

Commit Message

Rahul Singh Oct. 6, 2021, 5:40 p.m. UTC
The existing VPCI support available for X86 is adapted for Arm.
When the device is added to XEN via the hyper call
“PHYSDEVOP_pci_device_add”, VPCI handler for the config space
access is added to the Xen to emulate the PCI devices config space.

A MMIO trap handler for the PCI ECAM space is registered in XEN
so that when guest is trying to access the PCI config space,XEN
will trap the access and emulate read/write using the VPCI and
not the real PCI hardware.

For Dom0less systems scan_pci_devices() would be used to discover the
PCI device in XEN and VPCI handler will be added during XEN boots.

Signed-off-by: Rahul Singh <rahul.singh@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
Change in v5:
- Add pci_cleanup_msi(pdev) in cleanup part.
- Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Change in v4:
- Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
Change in v3:
- Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
- Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
- Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
Change in v2:
- Add new XEN_DOMCTL_CDF_vpci flag
- modify has_vpci() to include XEN_DOMCTL_CDF_vpci
- enable vpci support when pci-passthough option is enabled.
---
---
 xen/arch/arm/Makefile         |   1 +
 xen/arch/arm/domain.c         |   4 ++
 xen/arch/arm/domain_build.c   |   3 +
 xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
 xen/arch/arm/vpci.h           |  36 ++++++++++++
 xen/drivers/passthrough/pci.c |  18 ++++++
 xen/include/asm-arm/domain.h  |   7 ++-
 xen/include/asm-x86/pci.h     |   2 -
 xen/include/public/arch-arm.h |   7 +++
 xen/include/xen/pci.h         |   2 +
 10 files changed, 179 insertions(+), 3 deletions(-)
 create mode 100644 xen/arch/arm/vpci.c
 create mode 100644 xen/arch/arm/vpci.h

Comments

Jan Beulich Oct. 7, 2021, 1:43 p.m. UTC | #1
On 06.10.2021 19:40, Rahul Singh wrote:
> --- /dev/null
> +++ b/xen/arch/arm/vpci.c
> @@ -0,0 +1,102 @@
> +/*
> + * xen/arch/arm/vpci.c
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +#include <xen/sched.h>
> +
> +#include <asm/mmio.h>
> +
> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)

Nit: Stray blank (like you had in an earlier version for MMCFG_BDF()).
Also isn't this effectively part of the public interface (along with
MMCFG_BDF()), alongside GUEST_VPCI_ECAM_{BASE,SIZE}?

> +/* Do some sanity checks. */
> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
> +{
> +    /* Check access size. */
> +    if ( len > 8 )
> +        return false;

struct hsr_dabt's size field doesn't allow len to be above 8. I could
see that you may want to sanity check things, but that's not helpful
if done incompletely. Elsewhere you assume the value to be non-zero,
and ...

> +    /* Check that access is size aligned. */
> +    if ( (reg & (len - 1)) )

... right here you assume the value to be a power of 2. While I'm not
a maintainer, I'd still like to suggest consistency: Do all pertinent
checks or none of them (relying on the caller).

Independent of this - is bare metal Arm enforcing this same
alignment restriction (unconditionally)? Iirc on x86 we felt we'd
better synthesize misaligned accesses.

> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
> +                          register_t *r, void *p)
> +{
> +    unsigned int reg;
> +    pci_sbdf_t sbdf;
> +    unsigned long data = ~0UL;

What use is this initializer? The error path further down doesn't
forward the value into *r, and subsequently the value gets fully
overwritten.

> +    unsigned int size = 1U << info->dabt.size;
> +
> +    sbdf.sbdf = MMCFG_BDF(info->gpa);

This implies segment to be zero. While probably fine for now, I
wonder if this wouldn't warrant a comment.

> +    reg = REGISTER_OFFSET(info->gpa);
> +
> +    if ( !vpci_mmio_access_allowed(reg, size) )
> +        return 0;
> +
> +    data = vpci_read(sbdf, reg, min(4u, size));
> +    if ( size == 8 )
> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;

Throughout this series I haven't been able to spot where the HAS_VPCI
Kconfig symbol would get selected. Hence I cannot tell whether all of
this is Arm64-specific. Otherwise I wonder whether size 8 actually
can occur on Arm32.

> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
> +                           register_t r, void *p)
> +{
> +    unsigned int reg;
> +    pci_sbdf_t sbdf;
> +    unsigned long data = r;

A little like in the read function - what use is this local variable?
Can't you use r directly?

> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -766,6 +766,24 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
>      else
>          iommu_enable_device(pdev);

Please note the context above; ...

> +#ifdef CONFIG_ARM
> +    /*
> +     * On ARM PCI devices discovery will be done by Dom0. Add vpci handler when
> +     * Dom0 inform XEN to add the PCI devices in XEN.
> +     */
> +    ret = vpci_add_handlers(pdev);
> +    if ( ret )
> +    {
> +        printk(XENLOG_ERR "setup of vPCI failed: %d\n", ret);
> +        pci_cleanup_msi(pdev);
> +        ret = iommu_remove_device(pdev);
> +        if ( pdev->domain )
> +            list_del(&pdev->domain_list);
> +        free_pdev(pseg, pdev);

... you unconditionally undo the if() part of the earlier conditional;
is there any guarantee that the other path can't ever be taken, now
and forever? If it can't be taken now (which I think is the case as
long as Dom0 wouldn't report the same device twice), then at least some
precaution wants taking. Maybe moving your addition into that if()
could be an option.

Furthermore I continue to wonder whether this ordering is indeed
preferable over doing software setup before hardware arrangements. This
would address the above issue as well as long as vpci_add_handlers() is
idempotent. And it would likely simplify error cleanup.

Jan
Roger Pau Monné Oct. 11, 2021, 10:51 a.m. UTC | #2
On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
> The existing VPCI support available for X86 is adapted for Arm.
> When the device is added to XEN via the hyper call
> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
> access is added to the Xen to emulate the PCI devices config space.
> 
> A MMIO trap handler for the PCI ECAM space is registered in XEN
> so that when guest is trying to access the PCI config space,XEN
> will trap the access and emulate read/write using the VPCI and
> not the real PCI hardware.
> 
> For Dom0less systems scan_pci_devices() would be used to discover the
> PCI device in XEN and VPCI handler will be added during XEN boots.
> 
> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> ---
> Change in v5:
> - Add pci_cleanup_msi(pdev) in cleanup part.
> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> Change in v4:
> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
> Change in v3:
> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
> Change in v2:
> - Add new XEN_DOMCTL_CDF_vpci flag
> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
> - enable vpci support when pci-passthough option is enabled.
> ---
> ---
>  xen/arch/arm/Makefile         |   1 +
>  xen/arch/arm/domain.c         |   4 ++
>  xen/arch/arm/domain_build.c   |   3 +
>  xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vpci.h           |  36 ++++++++++++
>  xen/drivers/passthrough/pci.c |  18 ++++++
>  xen/include/asm-arm/domain.h  |   7 ++-
>  xen/include/asm-x86/pci.h     |   2 -
>  xen/include/public/arch-arm.h |   7 +++
>  xen/include/xen/pci.h         |   2 +
>  10 files changed, 179 insertions(+), 3 deletions(-)
>  create mode 100644 xen/arch/arm/vpci.c
>  create mode 100644 xen/arch/arm/vpci.h
> 
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 44d7cc81fa..fb9c976ea2 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
>  obj-y += platforms/
>  endif
>  obj-$(CONFIG_TEE) += tee/
> +obj-$(CONFIG_HAS_VPCI) += vpci.o
>  
>  obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>  obj-y += bootfdt.init.o
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 36138c1b2e..fbb52f78f1 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -39,6 +39,7 @@
>  #include <asm/vgic.h>
>  #include <asm/vtimer.h>
>  
> +#include "vpci.h"
>  #include "vuart.h"
>  
>  DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
>      if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
>          goto fail;
>  
> +    if ( (rc = domain_vpci_init(d)) != 0 )
> +        goto fail;
> +
>      return 0;
>  
>  fail:
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index c5afbe2e05..f4c89bde8c 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
>      if ( iommu_enabled )
>          dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>  
> +    if ( is_pci_passthrough_enabled() )
> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;

I think I'm confused with this. You seem to enable vPCI for dom0, but
then domain_vpci_init will setup traps for the guest virtual ECAM
layout, not the native one that dom0 will be using.

> +
>      dom0 = domain_create(0, &dom0_cfg, true);
>      if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
>          panic("Error creating domain 0\n");
> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
> new file mode 100644
> index 0000000000..76c12b9281
> --- /dev/null
> +++ b/xen/arch/arm/vpci.c
> @@ -0,0 +1,102 @@
> +/*
> + * xen/arch/arm/vpci.c
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +#include <xen/sched.h>
> +
> +#include <asm/mmio.h>
> +
> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
> +
> +/* Do some sanity checks. */
> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
> +{
> +    /* Check access size. */
> +    if ( len > 8 )
> +        return false;
> +
> +    /* Check that access is size aligned. */
> +    if ( (reg & (len - 1)) )
> +        return false;
> +
> +    return true;
> +}

There's a vpci_access_allowed which I think you could generalize and
use here, there's no need to have this duplicated code.

Thanks, Roger.
Bertrand Marquis Oct. 11, 2021, 12:41 p.m. UTC | #3
Hi Jan,

As Rahul is on leave, I will answer you and make the changes needed.

> On 7 Oct 2021, at 14:43, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 06.10.2021 19:40, Rahul Singh wrote:
>> --- /dev/null
>> +++ b/xen/arch/arm/vpci.c
>> @@ -0,0 +1,102 @@
>> +/*
>> + * xen/arch/arm/vpci.c
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +#include <xen/sched.h>
>> +
>> +#include <asm/mmio.h>
>> +
>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
> 
> Nit: Stray blank (like you had in an earlier version for MMCFG_BDF()).
> Also isn't this effectively part of the public interface (along with
> MMCFG_BDF()), alongside GUEST_VPCI_ECAM_{BASE,SIZE}?

I will move that in the next version to xen/pci.h and rename it MMCFG_REG_OFFSET.
Would that be ok ?

> 
>> +/* Do some sanity checks. */
>> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
>> +{
>> +    /* Check access size. */
>> +    if ( len > 8 )
>> +        return false;
> 
> struct hsr_dabt's size field doesn't allow len to be above 8. I could
> see that you may want to sanity check things, but that's not helpful
> if done incompletely. Elsewhere you assume the value to be non-zero,
> and ...
> 
>> +    /* Check that access is size aligned. */
>> +    if ( (reg & (len - 1)) )
> 
> ... right here you assume the value to be a power of 2. While I'm not
> a maintainer, I'd still like to suggest consistency: Do all pertinent
> checks or none of them (relying on the caller).

I will remove the check for len > 8 as dabt.size cannot have a value
greater than 3.

But I will have to introduce a check for len > 4 on 32 bit systems (see after).

> 
> Independent of this - is bare metal Arm enforcing this same
> alignment restriction (unconditionally)? Iirc on x86 we felt we'd
> better synthesize misaligned accesses.

Unaligned IO access could be synthesise also on arm to but I would
rather not make such a change now without testing it (and there is
also a question of it making sense).

So if it is ok with you I will keep that check and discuss it with Rahul
when he is back. I will add a comment in the code to make that clear.

> 
>> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>> +                          register_t *r, void *p)
>> +{
>> +    unsigned int reg;
>> +    pci_sbdf_t sbdf;
>> +    unsigned long data = ~0UL;
> 
> What use is this initializer? The error path further down doesn't
> forward the value into *r, and subsequently the value gets fully
> overwritten.

Right I will remove it.

> 
>> +    unsigned int size = 1U << info->dabt.size;
>> +
>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
> 
> This implies segment to be zero. While probably fine for now, I
> wonder if this wouldn't warrant a comment.

I will add the following comment just before:
/* We ignore segment part and always handle segment 0 */

> 
>> +    reg = REGISTER_OFFSET(info->gpa);
>> +
>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>> +        return 0;
>> +
>> +    data = vpci_read(sbdf, reg, min(4u, size));
>> +    if ( size == 8 )
>> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
> 
> Throughout this series I haven't been able to spot where the HAS_VPCI
> Kconfig symbol would get selected. Hence I cannot tell whether all of
> this is Arm64-specific. Otherwise I wonder whether size 8 actually
> can occur on Arm32.

Dabt.size could be 3 even on ARM32 but we should not allow 64bit
access on mmio regions on arm32.

So I will surround this code with ifdef CONFIG_ARM_64 and add a test
for len > 4 to prevent this case on 32bit.

To be completely right we should disable this also for 32bit guests but
this change would be a bit more invasive.

> 
>> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>> +                           register_t r, void *p)
>> +{
>> +    unsigned int reg;
>> +    pci_sbdf_t sbdf;
>> +    unsigned long data = r;
> 
> A little like in the read function - what use is this local variable?
> Can't you use r directly?

We can and I will remove the data variable.

> 
>> --- a/xen/drivers/passthrough/pci.c
>> +++ b/xen/drivers/passthrough/pci.c
>> @@ -766,6 +766,24 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
>>     else
>>         iommu_enable_device(pdev);
> 
> Please note the context above; ...
> 
>> +#ifdef CONFIG_ARM
>> +    /*
>> +     * On ARM PCI devices discovery will be done by Dom0. Add vpci handler when
>> +     * Dom0 inform XEN to add the PCI devices in XEN.
>> +     */
>> +    ret = vpci_add_handlers(pdev);
>> +    if ( ret )
>> +    {
>> +        printk(XENLOG_ERR "setup of vPCI failed: %d\n", ret);
>> +        pci_cleanup_msi(pdev);
>> +        ret = iommu_remove_device(pdev);
>> +        if ( pdev->domain )
>> +            list_del(&pdev->domain_list);
>> +        free_pdev(pseg, pdev);
> 
> ... you unconditionally undo the if() part of the earlier conditional;
> is there any guarantee that the other path can't ever be taken, now
> and forever? If it can't be taken now (which I think is the case as
> long as Dom0 wouldn't report the same device twice), then at least some
> precaution wants taking. Maybe moving your addition into that if()
> could be an option.
> 
> Furthermore I continue to wonder whether this ordering is indeed
> preferable over doing software setup before hardware arrangements. This
> would address the above issue as well as long as vpci_add_handlers() is
> idempotent. And it would likely simplify error cleanup.

I agree with you so I will move this code block before iommu part.

But digging deeper into this, I would have 2 questions:

- msi_cleanup was done there after a request from Stefano, but is not
done in case or iommu error, is there an issue to solve here ?
Same could also go for the free_pdev ?

- cleanup code was exactly the same as pci_remove_device code.
Should the question about the path also be checked there ?

Regards
Bertrand


> 
> Jan
> 
>
Jan Beulich Oct. 11, 2021, 1:09 p.m. UTC | #4
On 11.10.2021 14:41, Bertrand Marquis wrote:
>> On 7 Oct 2021, at 14:43, Jan Beulich <jbeulich@suse.com> wrote:
>> On 06.10.2021 19:40, Rahul Singh wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/arm/vpci.c
>>> @@ -0,0 +1,102 @@
>>> +/*
>>> + * xen/arch/arm/vpci.c
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License as published by
>>> + * the Free Software Foundation; either version 2 of the License, or
>>> + * (at your option) any later version.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> + * GNU General Public License for more details.
>>> + */
>>> +#include <xen/sched.h>
>>> +
>>> +#include <asm/mmio.h>
>>> +
>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>>
>> Nit: Stray blank (like you had in an earlier version for MMCFG_BDF()).
>> Also isn't this effectively part of the public interface (along with
>> MMCFG_BDF()), alongside GUEST_VPCI_ECAM_{BASE,SIZE}?
> 
> I will move that in the next version to xen/pci.h and rename it MMCFG_REG_OFFSET.
> Would that be ok ?

That would be okay and make sense when put next to MMCFG_BDF(), but
it would not address my comment: That still wouldn't be the public
interface. Otoh you only mimic hardware behavior, so perhaps the
splitting of the address isn't as relevant to put there as would be
GUEST_VPCI_ECAM_{BASE,SIZE}.

>>> --- a/xen/drivers/passthrough/pci.c
>>> +++ b/xen/drivers/passthrough/pci.c
>>> @@ -766,6 +766,24 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
>>>     else
>>>         iommu_enable_device(pdev);
>>
>> Please note the context above; ...
>>
>>> +#ifdef CONFIG_ARM
>>> +    /*
>>> +     * On ARM PCI devices discovery will be done by Dom0. Add vpci handler when
>>> +     * Dom0 inform XEN to add the PCI devices in XEN.
>>> +     */
>>> +    ret = vpci_add_handlers(pdev);
>>> +    if ( ret )
>>> +    {
>>> +        printk(XENLOG_ERR "setup of vPCI failed: %d\n", ret);
>>> +        pci_cleanup_msi(pdev);
>>> +        ret = iommu_remove_device(pdev);
>>> +        if ( pdev->domain )
>>> +            list_del(&pdev->domain_list);
>>> +        free_pdev(pseg, pdev);
>>
>> ... you unconditionally undo the if() part of the earlier conditional;
>> is there any guarantee that the other path can't ever be taken, now
>> and forever? If it can't be taken now (which I think is the case as
>> long as Dom0 wouldn't report the same device twice), then at least some
>> precaution wants taking. Maybe moving your addition into that if()
>> could be an option.
>>
>> Furthermore I continue to wonder whether this ordering is indeed
>> preferable over doing software setup before hardware arrangements. This
>> would address the above issue as well as long as vpci_add_handlers() is
>> idempotent. And it would likely simplify error cleanup.
> 
> I agree with you so I will move this code block before iommu part.
> 
> But digging deeper into this, I would have 2 questions:
> 
> - msi_cleanup was done there after a request from Stefano, but is not
> done in case or iommu error, is there an issue to solve here ?

Maybe, but I'm not sure. This very much depends on what a domain
could in principle do with a partly set-up device. Plus let's
not forget that we're talking of only Dom0 here (for now at least,
i.e. not considering the dom0less case).

But I'd also like to further defer to Stefano.

> Same could also go for the free_pdev ?

I think it's wrong to free_pdev() here. We want to internally keep
record of the device, even if the device ends up unusable. The only
place where free_pdev() ought to be called is imo pci_remove_device().

> - cleanup code was exactly the same as pci_remove_device code.
> Should the question about the path also be checked there ?

I'm sorry, but I'm afraid I don't see what "the path" refers to
here. You can't mean the conditional in pci_add_device() selecting
between iommu_add_device() and iommu_enable_device(), as "remove"
can only mean "remove", never "disable".

Jan
Bertrand Marquis Oct. 11, 2021, 1:34 p.m. UTC | #5
Hi Jan,

> On 11 Oct 2021, at 14:09, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 11.10.2021 14:41, Bertrand Marquis wrote:
>>> On 7 Oct 2021, at 14:43, Jan Beulich <jbeulich@suse.com> wrote:
>>> On 06.10.2021 19:40, Rahul Singh wrote:
>>>> --- /dev/null
>>>> +++ b/xen/arch/arm/vpci.c
>>>> @@ -0,0 +1,102 @@
>>>> +/*
>>>> + * xen/arch/arm/vpci.c
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> + * it under the terms of the GNU General Public License as published by
>>>> + * the Free Software Foundation; either version 2 of the License, or
>>>> + * (at your option) any later version.
>>>> + *
>>>> + * This program is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>>> + * GNU General Public License for more details.
>>>> + */
>>>> +#include <xen/sched.h>
>>>> +
>>>> +#include <asm/mmio.h>
>>>> +
>>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>>> 
>>> Nit: Stray blank (like you had in an earlier version for MMCFG_BDF()).
>>> Also isn't this effectively part of the public interface (along with
>>> MMCFG_BDF()), alongside GUEST_VPCI_ECAM_{BASE,SIZE}?
>> 
>> I will move that in the next version to xen/pci.h and rename itMMCFG_REG_OFFSET.
>> Would that be ok ?
> 
> That would be okay and make sense when put next to MMCFG_BDF(), but
> it would not address my comment: That still wouldn't be the public
> interface. Otoh you only mimic hardware behavior, so perhaps the
> splitting of the address isn't as relevant to put there as would be
> GUEST_VPCI_ECAM_{BASE,SIZE}.

Ok now I get what you wanted.

You would actually like both MMCFG_BDF and MMCFG_REG_OFFSET to
be moved to arch-arm.h.

Then I am not quite sure to follow why.
Those are not macros coming out of a way we have to define this but from
how it works in standard PCI.
The base and size are needed to know where the PCI bus will be.

So why should those be needed in public headers ?

> 
>>>> --- a/xen/drivers/passthrough/pci.c
>>>> +++ b/xen/drivers/passthrough/pci.c
>>>> @@ -766,6 +766,24 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
>>>>    else
>>>>        iommu_enable_device(pdev);
>>> 
>>> Please note the context above; ...
>>> 
>>>> +#ifdef CONFIG_ARM
>>>> +    /*
>>>> +     * On ARM PCI devices discovery will be done by Dom0. Add vpci handler when
>>>> +     * Dom0 inform XEN to add the PCI devices in XEN.
>>>> +     */
>>>> +    ret = vpci_add_handlers(pdev);
>>>> +    if ( ret )
>>>> +    {
>>>> +        printk(XENLOG_ERR "setup of vPCI failed: %d\n", ret);
>>>> +        pci_cleanup_msi(pdev);
>>>> +        ret = iommu_remove_device(pdev);
>>>> +        if ( pdev->domain )
>>>> +            list_del(&pdev->domain_list);
>>>> +        free_pdev(pseg, pdev);
>>> 
>>> ... you unconditionally undo the if() part of the earlier conditional;
>>> is there any guarantee that the other path can't ever be taken, now
>>> and forever? If it can't be taken now (which I think is the case as
>>> long as Dom0 wouldn't report the same device twice), then at least some
>>> precaution wants taking. Maybe moving your addition into that if()
>>> could be an option.
>>> 
>>> Furthermore I continue to wonder whether this ordering is indeed
>>> preferable over doing software setup before hardware arrangements. This
>>> would address the above issue as well as long as vpci_add_handlers() is
>>> idempotent. And it would likely simplify error cleanup.
>> 
>> I agree with you so I will move this code block before iommu part.
>> 
>> But digging deeper into this, I would have 2 questions:
>> 
>> - msi_cleanup was done there after a request from Stefano, but is not
>> done in case or iommu error, is there an issue to solve here ?
> 
> Maybe, but I'm not sure. This very much depends on what a domain
> could in principle do with a partly set-up device. Plus let's
> not forget that we're talking of only Dom0 here (for now at least,
> i.e. not considering the dom0less case).
> 
> But I'd also like to further defer to Stefano.

Ok, I must admit I do not really see at that stage why doing an MSI cleanup
could be needed so I will wait for Stefano to know if I need to keep this when
moving the block up (at the end it is theoretical right now as this is empty).

> 
>> Same could also go for the free_pdev ?
> 
> I think it's wrong to free_pdev() here. We want to internally keep
> record of the device, even if the device ends up unusable. The only
> place where free_pdev() ought to be called is imo pci_remove_device().

Ok.

> 
>> - cleanup code was exactly the same as pci_remove_device code.
>> Should the question about the path also be checked there ?
> 
> I'm sorry, but I'm afraid I don't see what "the path" refers to
> here. You can't mean the conditional in pci_add_device() selecting
> between iommu_add_device() and iommu_enable_device(), as "remove"
> can only mean "remove", never "disable".

I will try to explain: when we just enable we do not add an entry in the list but
we still remove an entry from the list every time (as the condition becomes
always true because pdev->domain is at the end always set)

Regards
Bertrand

> 
> Jan
>
Jan Beulich Oct. 11, 2021, 2:10 p.m. UTC | #6
On 11.10.2021 15:34, Bertrand Marquis wrote:
>> On 11 Oct 2021, at 14:09, Jan Beulich <jbeulich@suse.com> wrote:
>> On 11.10.2021 14:41, Bertrand Marquis wrote:
>>>> On 7 Oct 2021, at 14:43, Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 06.10.2021 19:40, Rahul Singh wrote:
>>>>> --- /dev/null
>>>>> +++ b/xen/arch/arm/vpci.c
>>>>> @@ -0,0 +1,102 @@
>>>>> +/*
>>>>> + * xen/arch/arm/vpci.c
>>>>> + *
>>>>> + * This program is free software; you can redistribute it and/or modify
>>>>> + * it under the terms of the GNU General Public License as published by
>>>>> + * the Free Software Foundation; either version 2 of the License, or
>>>>> + * (at your option) any later version.
>>>>> + *
>>>>> + * This program is distributed in the hope that it will be useful,
>>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>>>> + * GNU General Public License for more details.
>>>>> + */
>>>>> +#include <xen/sched.h>
>>>>> +
>>>>> +#include <asm/mmio.h>
>>>>> +
>>>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>>>>
>>>> Nit: Stray blank (like you had in an earlier version for MMCFG_BDF()).
>>>> Also isn't this effectively part of the public interface (along with
>>>> MMCFG_BDF()), alongside GUEST_VPCI_ECAM_{BASE,SIZE}?
>>>
>>> I will move that in the next version to xen/pci.h and rename itMMCFG_REG_OFFSET.
>>> Would that be ok ?
>>
>> That would be okay and make sense when put next to MMCFG_BDF(), but
>> it would not address my comment: That still wouldn't be the public
>> interface. Otoh you only mimic hardware behavior, so perhaps the
>> splitting of the address isn't as relevant to put there as would be
>> GUEST_VPCI_ECAM_{BASE,SIZE}.
> 
> Ok now I get what you wanted.
> 
> You would actually like both MMCFG_BDF and MMCFG_REG_OFFSET to
> be moved to arch-arm.h.
> 
> Then I am not quite sure to follow why.
> Those are not macros coming out of a way we have to define this but from
> how it works in standard PCI.
> The base and size are needed to know where the PCI bus will be.
> 
> So why should those be needed in public headers ?

Well, see my "Otoh ..." in the earlier reply. Keeping the two
address splitting macros out of there is okay.

>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>> @@ -766,6 +766,24 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
>>>>>    else
>>>>>        iommu_enable_device(pdev);
>>>>
>>>> Please note the context above; ...
>>>>
>>>>> +#ifdef CONFIG_ARM
>>>>> +    /*
>>>>> +     * On ARM PCI devices discovery will be done by Dom0. Add vpci handler when
>>>>> +     * Dom0 inform XEN to add the PCI devices in XEN.
>>>>> +     */
>>>>> +    ret = vpci_add_handlers(pdev);
>>>>> +    if ( ret )
>>>>> +    {
>>>>> +        printk(XENLOG_ERR "setup of vPCI failed: %d\n", ret);
>>>>> +        pci_cleanup_msi(pdev);
>>>>> +        ret = iommu_remove_device(pdev);
>>>>> +        if ( pdev->domain )
>>>>> +            list_del(&pdev->domain_list);
>>>>> +        free_pdev(pseg, pdev);
>>>>
>>>> ... you unconditionally undo the if() part of the earlier conditional;
>>>> is there any guarantee that the other path can't ever be taken, now
>>>> and forever? If it can't be taken now (which I think is the case as
>>>> long as Dom0 wouldn't report the same device twice), then at least some
>>>> precaution wants taking. Maybe moving your addition into that if()
>>>> could be an option.
>>>>
>>>> Furthermore I continue to wonder whether this ordering is indeed
>>>> preferable over doing software setup before hardware arrangements. This
>>>> would address the above issue as well as long as vpci_add_handlers() is
>>>> idempotent. And it would likely simplify error cleanup.
>>>
>>> I agree with you so I will move this code block before iommu part.
>>>
>>> But digging deeper into this, I would have 2 questions:
>>>
>>> - msi_cleanup was done there after a request from Stefano, but is not
>>> done in case or iommu error, is there an issue to solve here ?
>>
>> Maybe, but I'm not sure. This very much depends on what a domain
>> could in principle do with a partly set-up device. Plus let's
>> not forget that we're talking of only Dom0 here (for now at least,
>> i.e. not considering the dom0less case).
>>
>> But I'd also like to further defer to Stefano.
> 
> Ok, I must admit I do not really see at that stage why doing an MSI cleanup
> could be needed so I will wait for Stefano to know if I need to keep this when
> moving the block up (at the end it is theoretical right now as this is empty).
> 
>>
>>> Same could also go for the free_pdev ?
>>
>> I think it's wrong to free_pdev() here. We want to internally keep
>> record of the device, even if the device ends up unusable. The only
>> place where free_pdev() ought to be called is imo pci_remove_device().
> 
> Ok.
> 
>>
>>> - cleanup code was exactly the same as pci_remove_device code.
>>> Should the question about the path also be checked there ?
>>
>> I'm sorry, but I'm afraid I don't see what "the path" refers to
>> here. You can't mean the conditional in pci_add_device() selecting
>> between iommu_add_device() and iommu_enable_device(), as "remove"
>> can only mean "remove", never "disable".
> 
> I will try to explain: when we just enable we do not add an entry in the list but
> we still remove an entry from the list every time (as the condition becomes
> always true because pdev->domain is at the end always set)

Well, that anomaly is what I did point out in my review remarks to
Rahul. We shouldn't remove an entry from the list if we didn't add
one. But quite likely, if we don't free the pdev, we shouldn't be
removing the list entry in either case.

Jan
Bertrand Marquis Oct. 11, 2021, 2:52 p.m. UTC | #7
Hi Jan,

> On 11 Oct 2021, at 15:10, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 11.10.2021 15:34, Bertrand Marquis wrote:
>>> On 11 Oct 2021, at 14:09, Jan Beulich <jbeulich@suse.com> wrote:
>>> On 11.10.2021 14:41, Bertrand Marquis wrote:
>>>>> On 7 Oct 2021, at 14:43, Jan Beulich <jbeulich@suse.com> wrote:
>>>>> On 06.10.2021 19:40, Rahul Singh wrote:
>>>>>> --- /dev/null
>>>>>> +++ b/xen/arch/arm/vpci.c
>>>>>> @@ -0,0 +1,102 @@
>>>>>> +/*
>>>>>> + * xen/arch/arm/vpci.c
>>>>>> + *
>>>>>> + * This program is free software; you can redistribute it and/or modify
>>>>>> + * it under the terms of the GNU General Public License as published by
>>>>>> + * the Free Software Foundation; either version 2 of the License, or
>>>>>> + * (at your option) any later version.
>>>>>> + *
>>>>>> + * This program is distributed in the hope that it will be useful,
>>>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>>>>> + * GNU General Public License for more details.
>>>>>> + */
>>>>>> +#include <xen/sched.h>
>>>>>> +
>>>>>> +#include <asm/mmio.h>
>>>>>> +
>>>>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>>>>> 
>>>>> Nit: Stray blank (like you had in an earlier version for MMCFG_BDF()).
>>>>> Also isn't this effectively part of the public interface (along with
>>>>> MMCFG_BDF()), alongside GUEST_VPCI_ECAM_{BASE,SIZE}?
>>>> 
>>>> I will move that in the next version to xen/pci.h and rename itMMCFG_REG_OFFSET.
>>>> Would that be ok ?
>>> 
>>> That would be okay and make sense when put next to MMCFG_BDF(), but
>>> it would not address my comment: That still wouldn't be the public
>>> interface. Otoh you only mimic hardware behavior, so perhaps the
>>> splitting of the address isn't as relevant to put there as would be
>>> GUEST_VPCI_ECAM_{BASE,SIZE}.
>> 
>> Ok now I get what you wanted.
>> 
>> You would actually like both MMCFG_BDF and MMCFG_REG_OFFSET to
>> be moved to arch-arm.h.
>> 
>> Then I am not quite sure to follow why.
>> Those are not macros coming out of a way we have to define this but from
>> how it works in standard PCI.
>> The base and size are needed to know where the PCI bus will be.
>> 
>> So why should those be needed in public headers ?
> 
> Well, see my "Otoh ..." in the earlier reply. Keeping the two
> address splitting macros out of there is okay.

Ok.

> 
>>>>>> --- a/xen/drivers/passthrough/pci.c
>>>>>> +++ b/xen/drivers/passthrough/pci.c
>>>>>> @@ -766,6 +766,24 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
>>>>>>   else
>>>>>>       iommu_enable_device(pdev);
>>>>> 
>>>>> Please note the context above; ...
>>>>> 
>>>>>> +#ifdef CONFIG_ARM
>>>>>> +    /*
>>>>>> +     * On ARM PCI devices discovery will be done by Dom0. Add vpci handler when
>>>>>> +     * Dom0 inform XEN to add the PCI devices in XEN.
>>>>>> +     */
>>>>>> +    ret = vpci_add_handlers(pdev);
>>>>>> +    if ( ret )
>>>>>> +    {
>>>>>> +        printk(XENLOG_ERR "setup of vPCI failed: %d\n", ret);
>>>>>> +        pci_cleanup_msi(pdev);
>>>>>> +        ret = iommu_remove_device(pdev);
>>>>>> +        if ( pdev->domain )
>>>>>> +            list_del(&pdev->domain_list);
>>>>>> +        free_pdev(pseg, pdev);
>>>>> 
>>>>> ... you unconditionally undo the if() part of the earlier conditional;
>>>>> is there any guarantee that the other path can't ever be taken, now
>>>>> and forever? If it can't be taken now (which I think is the case as
>>>>> long as Dom0 wouldn't report the same device twice), then at least some
>>>>> precaution wants taking. Maybe moving your addition into that if()
>>>>> could be an option.
>>>>> 
>>>>> Furthermore I continue to wonder whether this ordering is indeed
>>>>> preferable over doing software setup before hardware arrangements. This
>>>>> would address the above issue as well as long as vpci_add_handlers() is
>>>>> idempotent. And it would likely simplify error cleanup.
>>>> 
>>>> I agree with you so I will move this code block before iommu part.
>>>> 
>>>> But digging deeper into this, I would have 2 questions:
>>>> 
>>>> - msi_cleanup was done there after a request from Stefano, but is not
>>>> done in case or iommu error, is there an issue to solve here ?
>>> 
>>> Maybe, but I'm not sure. This very much depends on what a domain
>>> could in principle do with a partly set-up device. Plus let's
>>> not forget that we're talking of only Dom0 here (for now at least,
>>> i.e. not considering the dom0less case).
>>> 
>>> But I'd also like to further defer to Stefano.
>> 
>> Ok, I must admit I do not really see at that stage why doing an MSI cleanup
>> could be needed so I will wait for Stefano to know if I need to keep this when
>> moving the block up (at the end it is theoretical right now as this is empty).
>> 
>>> 
>>>> Same could also go for the free_pdev ?
>>> 
>>> I think it's wrong to free_pdev() here. We want to internally keep
>>> record of the device, even if the device ends up unusable. The only
>>> place where free_pdev() ought to be called is imo pci_remove_device().
>> 
>> Ok.
>> 
>>> 
>>>> - cleanup code was exactly the same as pci_remove_device code.
>>>> Should the question about the path also be checked there ?
>>> 
>>> I'm sorry, but I'm afraid I don't see what "the path" refers to
>>> here. You can't mean the conditional in pci_add_device() selecting
>>> between iommu_add_device() and iommu_enable_device(), as "remove"
>>> can only mean "remove", never "disable".
>> 
>> I will try to explain: when we just enable we do not add an entry in the list but
>> we still remove an entry from the list every time (as the condition becomes
>> always true because pdev->domain is at the end always set)
> 
> Well, that anomaly is what I did point out in my review remarks to
> Rahul. We shouldn't remove an entry from the list if we didn't add
> one. But quite likely, if we don't free the pdev, we shouldn't be
> removing the list entry in either case.

This problem will not exist anymore when I will move the code up but I will add to adapt the error case in iommu to also remove the vpci handlers.
To be coherent I will do the same in the pci_remove_device code.

I will do all those in the v6 of the serie.

Thanks a lot for the answers.

Cheers
Bertrand

> 
> Jan
>
Bertrand Marquis Oct. 11, 2021, 4:12 p.m. UTC | #8
Hi Roger,

> On 11 Oct 2021, at 11:51, Roger Pau Monné <roger.pau@citrix.com> wrote:
> 
> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>> The existing VPCI support available for X86 is adapted for Arm.
>> When the device is added to XEN via the hyper call
>> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
>> access is added to the Xen to emulate the PCI devices config space.
>> 
>> A MMIO trap handler for the PCI ECAM space is registered in XEN
>> so that when guest is trying to access the PCI config space,XEN
>> will trap the access and emulate read/write using the VPCI and
>> not the real PCI hardware.
>> 
>> For Dom0less systems scan_pci_devices() would be used to discover the
>> PCI device in XEN and VPCI handler will be added during XEN boots.
>> 
>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>> ---
>> Change in v5:
>> - Add pci_cleanup_msi(pdev) in cleanup part.
>> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>> Change in v4:
>> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
>> Change in v3:
>> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
>> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
>> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
>> Change in v2:
>> - Add new XEN_DOMCTL_CDF_vpci flag
>> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
>> - enable vpci support when pci-passthough option is enabled.
>> ---
>> ---
>> xen/arch/arm/Makefile         |   1 +
>> xen/arch/arm/domain.c         |   4 ++
>> xen/arch/arm/domain_build.c   |   3 +
>> xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
>> xen/arch/arm/vpci.h           |  36 ++++++++++++
>> xen/drivers/passthrough/pci.c |  18 ++++++
>> xen/include/asm-arm/domain.h  |   7 ++-
>> xen/include/asm-x86/pci.h     |   2 -
>> xen/include/public/arch-arm.h |   7 +++
>> xen/include/xen/pci.h         |   2 +
>> 10 files changed, 179 insertions(+), 3 deletions(-)
>> create mode 100644 xen/arch/arm/vpci.c
>> create mode 100644 xen/arch/arm/vpci.h
>> 
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 44d7cc81fa..fb9c976ea2 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
>> obj-y += platforms/
>> endif
>> obj-$(CONFIG_TEE) += tee/
>> +obj-$(CONFIG_HAS_VPCI) += vpci.o
>> 
>> obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>> obj-y += bootfdt.init.o
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 36138c1b2e..fbb52f78f1 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -39,6 +39,7 @@
>> #include <asm/vgic.h>
>> #include <asm/vtimer.h>
>> 
>> +#include "vpci.h"
>> #include "vuart.h"
>> 
>> DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
>>     if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
>>         goto fail;
>> 
>> +    if ( (rc = domain_vpci_init(d)) != 0 )
>> +        goto fail;
>> +
>>     return 0;
>> 
>> fail:
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index c5afbe2e05..f4c89bde8c 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
>>     if ( iommu_enabled )
>>         dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>> 
>> +    if ( is_pci_passthrough_enabled() )
>> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
> 
> I think I'm confused with this. You seem to enable vPCI for dom0, but
> then domain_vpci_init will setup traps for the guest virtual ECAM
> layout, not the native one that dom0 will be using.

I think after the last discussions, it was decided to also installed the vpci
handler for dom0. I will have to look into this and come back to you.
@Oleksandr: Could you comment on this.

> 
>> +
>>     dom0 = domain_create(0, &dom0_cfg, true);
>>     if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
>>         panic("Error creating domain 0\n");
>> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
>> new file mode 100644
>> index 0000000000..76c12b9281
>> --- /dev/null
>> +++ b/xen/arch/arm/vpci.c
>> @@ -0,0 +1,102 @@
>> +/*
>> + * xen/arch/arm/vpci.c
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +#include <xen/sched.h>
>> +
>> +#include <asm/mmio.h>
>> +
>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>> +
>> +/* Do some sanity checks. */
>> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
>> +{
>> +    /* Check access size. */
>> +    if ( len > 8 )
>> +        return false;
>> +
>> +    /* Check that access is size aligned. */
>> +    if ( (reg & (len - 1)) )
>> +        return false;
>> +
>> +    return true;
>> +}
> 
> There's a vpci_access_allowed which I think you could generalize and
> use here, there's no need to have this duplicated code.

It definitely looks exactly like what we want to do.

I would need to turn this into a static inline and put it in some global header.
It is currently in arch/x86/hvm/io.c, any suggestion on the header to move it to ?

Regards
Bertrand

> 
> Thanks, Roger.
Oleksandr Andrushchenko Oct. 11, 2021, 4:20 p.m. UTC | #9
On 11.10.21 19:12, Bertrand Marquis wrote:
> Hi Roger,
>
>> On 11 Oct 2021, at 11:51, Roger Pau Monné <roger.pau@citrix.com> wrote:
>>
>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>>> The existing VPCI support available for X86 is adapted for Arm.
>>> When the device is added to XEN via the hyper call
>>> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
>>> access is added to the Xen to emulate the PCI devices config space.
>>>
>>> A MMIO trap handler for the PCI ECAM space is registered in XEN
>>> so that when guest is trying to access the PCI config space,XEN
>>> will trap the access and emulate read/write using the VPCI and
>>> not the real PCI hardware.
>>>
>>> For Dom0less systems scan_pci_devices() would be used to discover the
>>> PCI device in XEN and VPCI handler will be added during XEN boots.
>>>
>>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>> ---
>>> Change in v5:
>>> - Add pci_cleanup_msi(pdev) in cleanup part.
>>> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>> Change in v4:
>>> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
>>> Change in v3:
>>> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
>>> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
>>> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
>>> Change in v2:
>>> - Add new XEN_DOMCTL_CDF_vpci flag
>>> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
>>> - enable vpci support when pci-passthough option is enabled.
>>> ---
>>> ---
>>> xen/arch/arm/Makefile         |   1 +
>>> xen/arch/arm/domain.c         |   4 ++
>>> xen/arch/arm/domain_build.c   |   3 +
>>> xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
>>> xen/arch/arm/vpci.h           |  36 ++++++++++++
>>> xen/drivers/passthrough/pci.c |  18 ++++++
>>> xen/include/asm-arm/domain.h  |   7 ++-
>>> xen/include/asm-x86/pci.h     |   2 -
>>> xen/include/public/arch-arm.h |   7 +++
>>> xen/include/xen/pci.h         |   2 +
>>> 10 files changed, 179 insertions(+), 3 deletions(-)
>>> create mode 100644 xen/arch/arm/vpci.c
>>> create mode 100644 xen/arch/arm/vpci.h
>>>
>>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>>> index 44d7cc81fa..fb9c976ea2 100644
>>> --- a/xen/arch/arm/Makefile
>>> +++ b/xen/arch/arm/Makefile
>>> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
>>> obj-y += platforms/
>>> endif
>>> obj-$(CONFIG_TEE) += tee/
>>> +obj-$(CONFIG_HAS_VPCI) += vpci.o
>>>
>>> obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>>> obj-y += bootfdt.init.o
>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>> index 36138c1b2e..fbb52f78f1 100644
>>> --- a/xen/arch/arm/domain.c
>>> +++ b/xen/arch/arm/domain.c
>>> @@ -39,6 +39,7 @@
>>> #include <asm/vgic.h>
>>> #include <asm/vtimer.h>
>>>
>>> +#include "vpci.h"
>>> #include "vuart.h"
>>>
>>> DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>>> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
>>>      if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
>>>          goto fail;
>>>
>>> +    if ( (rc = domain_vpci_init(d)) != 0 )
>>> +        goto fail;
>>> +
>>>      return 0;
>>>
>>> fail:
>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>> index c5afbe2e05..f4c89bde8c 100644
>>> --- a/xen/arch/arm/domain_build.c
>>> +++ b/xen/arch/arm/domain_build.c
>>> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
>>>      if ( iommu_enabled )
>>>          dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>>>
>>> +    if ( is_pci_passthrough_enabled() )
>>> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
>> I think I'm confused with this. You seem to enable vPCI for dom0, but
>> then domain_vpci_init will setup traps for the guest virtual ECAM
>> layout, not the native one that dom0 will be using.
> I think after the last discussions, it was decided to also installed the vpci
> handler for dom0. I will have to look into this and come back to you.
> @Oleksandr: Could you comment on this.
Yes, we do trap Dom0 as well. The Dom0 traps are not in this series, but
are in mine as it needs more preparatory work for that. Please see [1]

Thank you,
Oleksandr

[1] https://patchwork.kernel.org/project/xen-devel/patch/20211008055535.337436-9-andr2000@gmail.com/
Roger Pau Monné Oct. 11, 2021, 4:43 p.m. UTC | #10
On Mon, Oct 11, 2021 at 04:20:14PM +0000, Oleksandr Andrushchenko wrote:
> 
> 
> On 11.10.21 19:12, Bertrand Marquis wrote:
> > Hi Roger,
> >
> >> On 11 Oct 2021, at 11:51, Roger Pau Monné <roger.pau@citrix.com> wrote:
> >>
> >> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
> >>> The existing VPCI support available for X86 is adapted for Arm.
> >>> When the device is added to XEN via the hyper call
> >>> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
> >>> access is added to the Xen to emulate the PCI devices config space.
> >>>
> >>> A MMIO trap handler for the PCI ECAM space is registered in XEN
> >>> so that when guest is trying to access the PCI config space,XEN
> >>> will trap the access and emulate read/write using the VPCI and
> >>> not the real PCI hardware.
> >>>
> >>> For Dom0less systems scan_pci_devices() would be used to discover the
> >>> PCI device in XEN and VPCI handler will be added during XEN boots.
> >>>
> >>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
> >>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> >>> ---
> >>> Change in v5:
> >>> - Add pci_cleanup_msi(pdev) in cleanup part.
> >>> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> >>> Change in v4:
> >>> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
> >>> Change in v3:
> >>> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
> >>> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
> >>> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
> >>> Change in v2:
> >>> - Add new XEN_DOMCTL_CDF_vpci flag
> >>> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
> >>> - enable vpci support when pci-passthough option is enabled.
> >>> ---
> >>> ---
> >>> xen/arch/arm/Makefile         |   1 +
> >>> xen/arch/arm/domain.c         |   4 ++
> >>> xen/arch/arm/domain_build.c   |   3 +
> >>> xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
> >>> xen/arch/arm/vpci.h           |  36 ++++++++++++
> >>> xen/drivers/passthrough/pci.c |  18 ++++++
> >>> xen/include/asm-arm/domain.h  |   7 ++-
> >>> xen/include/asm-x86/pci.h     |   2 -
> >>> xen/include/public/arch-arm.h |   7 +++
> >>> xen/include/xen/pci.h         |   2 +
> >>> 10 files changed, 179 insertions(+), 3 deletions(-)
> >>> create mode 100644 xen/arch/arm/vpci.c
> >>> create mode 100644 xen/arch/arm/vpci.h
> >>>
> >>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> >>> index 44d7cc81fa..fb9c976ea2 100644
> >>> --- a/xen/arch/arm/Makefile
> >>> +++ b/xen/arch/arm/Makefile
> >>> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
> >>> obj-y += platforms/
> >>> endif
> >>> obj-$(CONFIG_TEE) += tee/
> >>> +obj-$(CONFIG_HAS_VPCI) += vpci.o
> >>>
> >>> obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
> >>> obj-y += bootfdt.init.o
> >>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> >>> index 36138c1b2e..fbb52f78f1 100644
> >>> --- a/xen/arch/arm/domain.c
> >>> +++ b/xen/arch/arm/domain.c
> >>> @@ -39,6 +39,7 @@
> >>> #include <asm/vgic.h>
> >>> #include <asm/vtimer.h>
> >>>
> >>> +#include "vpci.h"
> >>> #include "vuart.h"
> >>>
> >>> DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
> >>> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
> >>>      if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
> >>>          goto fail;
> >>>
> >>> +    if ( (rc = domain_vpci_init(d)) != 0 )
> >>> +        goto fail;
> >>> +
> >>>      return 0;
> >>>
> >>> fail:
> >>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >>> index c5afbe2e05..f4c89bde8c 100644
> >>> --- a/xen/arch/arm/domain_build.c
> >>> +++ b/xen/arch/arm/domain_build.c
> >>> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
> >>>      if ( iommu_enabled )
> >>>          dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
> >>>
> >>> +    if ( is_pci_passthrough_enabled() )
> >>> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
> >> I think I'm confused with this. You seem to enable vPCI for dom0, but
> >> then domain_vpci_init will setup traps for the guest virtual ECAM
> >> layout, not the native one that dom0 will be using.
> > I think after the last discussions, it was decided to also installed the vpci
> > handler for dom0. I will have to look into this and come back to you.
> > @Oleksandr: Could you comment on this.
> Yes, we do trap Dom0 as well. The Dom0 traps are not in this series, but
> are in mine as it needs more preparatory work for that. Please see [1]

Then I don't think we should set XEN_DOMCTL_CDF_vpci for dom0 here, it
should instead be done in the patch where dom0 support is introduced.

Thanks, Roger.
Bertrand Marquis Oct. 11, 2021, 5:15 p.m. UTC | #11
Hi Roger,

> On 11 Oct 2021, at 17:43, Roger Pau Monné <roger.pau@citrix.com> wrote:
> 
> On Mon, Oct 11, 2021 at 04:20:14PM +0000, Oleksandr Andrushchenko wrote:
>> 
>> 
>> On 11.10.21 19:12, Bertrand Marquis wrote:
>>> Hi Roger,
>>> 
>>>> On 11 Oct 2021, at 11:51, Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>> 
>>>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>>>>> The existing VPCI support available for X86 is adapted for Arm.
>>>>> When the device is added to XEN via the hyper call
>>>>> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
>>>>> access is added to the Xen to emulate the PCI devices config space.
>>>>> 
>>>>> A MMIO trap handler for the PCI ECAM space is registered in XEN
>>>>> so that when guest is trying to access the PCI config space,XEN
>>>>> will trap the access and emulate read/write using the VPCI and
>>>>> not the real PCI hardware.
>>>>> 
>>>>> For Dom0less systems scan_pci_devices() would be used to discover the
>>>>> PCI device in XEN and VPCI handler will be added during XEN boots.
>>>>> 
>>>>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>>>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>> ---
>>>>> Change in v5:
>>>>> - Add pci_cleanup_msi(pdev) in cleanup part.
>>>>> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>> Change in v4:
>>>>> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
>>>>> Change in v3:
>>>>> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
>>>>> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
>>>>> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
>>>>> Change in v2:
>>>>> - Add new XEN_DOMCTL_CDF_vpci flag
>>>>> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
>>>>> - enable vpci support when pci-passthough option is enabled.
>>>>> ---
>>>>> ---
>>>>> xen/arch/arm/Makefile         |   1 +
>>>>> xen/arch/arm/domain.c         |   4 ++
>>>>> xen/arch/arm/domain_build.c   |   3 +
>>>>> xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
>>>>> xen/arch/arm/vpci.h           |  36 ++++++++++++
>>>>> xen/drivers/passthrough/pci.c |  18 ++++++
>>>>> xen/include/asm-arm/domain.h  |   7 ++-
>>>>> xen/include/asm-x86/pci.h     |   2 -
>>>>> xen/include/public/arch-arm.h |   7 +++
>>>>> xen/include/xen/pci.h         |   2 +
>>>>> 10 files changed, 179 insertions(+), 3 deletions(-)
>>>>> create mode 100644 xen/arch/arm/vpci.c
>>>>> create mode 100644 xen/arch/arm/vpci.h
>>>>> 
>>>>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>>>>> index 44d7cc81fa..fb9c976ea2 100644
>>>>> --- a/xen/arch/arm/Makefile
>>>>> +++ b/xen/arch/arm/Makefile
>>>>> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
>>>>> obj-y += platforms/
>>>>> endif
>>>>> obj-$(CONFIG_TEE) += tee/
>>>>> +obj-$(CONFIG_HAS_VPCI) += vpci.o
>>>>> 
>>>>> obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>>>>> obj-y += bootfdt.init.o
>>>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>>>> index 36138c1b2e..fbb52f78f1 100644
>>>>> --- a/xen/arch/arm/domain.c
>>>>> +++ b/xen/arch/arm/domain.c
>>>>> @@ -39,6 +39,7 @@
>>>>> #include <asm/vgic.h>
>>>>> #include <asm/vtimer.h>
>>>>> 
>>>>> +#include "vpci.h"
>>>>> #include "vuart.h"
>>>>> 
>>>>> DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>>>>> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
>>>>>     if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
>>>>>         goto fail;
>>>>> 
>>>>> +    if ( (rc = domain_vpci_init(d)) != 0 )
>>>>> +        goto fail;
>>>>> +
>>>>>     return 0;
>>>>> 
>>>>> fail:
>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>> index c5afbe2e05..f4c89bde8c 100644
>>>>> --- a/xen/arch/arm/domain_build.c
>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
>>>>>     if ( iommu_enabled )
>>>>>         dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>>>>> 
>>>>> +    if ( is_pci_passthrough_enabled() )
>>>>> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
>>>> I think I'm confused with this. You seem to enable vPCI for dom0, but
>>>> then domain_vpci_init will setup traps for the guest virtual ECAM
>>>> layout, not the native one that dom0 will be using.
>>> I think after the last discussions, it was decided to also installed the vpci
>>> handler for dom0. I will have to look into this and come back to you.
>>> @Oleksandr: Could you comment on this.
>> Yes, we do trap Dom0 as well. The Dom0 traps are not in this series, but
>> are in mine as it needs more preparatory work for that. Please see [1]
> 
> Then I don't think we should set XEN_DOMCTL_CDF_vpci for dom0 here, it
> should instead be done in the patch where dom0 support is introduced.

Ok I will check to remove this and include the change in v6.

Cheers
Bertrand

> 
> Thanks, Roger.
Stefano Stabellini Oct. 11, 2021, 6:18 p.m. UTC | #12
On Mon, 11 Oct 2021, Jan Beulich wrote:
> On 11.10.2021 15:34, Bertrand Marquis wrote:
> >> On 11 Oct 2021, at 14:09, Jan Beulich <jbeulich@suse.com> wrote:
> >> On 11.10.2021 14:41, Bertrand Marquis wrote:
> >>>> On 7 Oct 2021, at 14:43, Jan Beulich <jbeulich@suse.com> wrote:
> >>>> On 06.10.2021 19:40, Rahul Singh wrote:
> >>>>> --- /dev/null
> >>>>> +++ b/xen/arch/arm/vpci.c
> >>>>> @@ -0,0 +1,102 @@
> >>>>> +/*
> >>>>> + * xen/arch/arm/vpci.c
> >>>>> + *
> >>>>> + * This program is free software; you can redistribute it and/or modify
> >>>>> + * it under the terms of the GNU General Public License as published by
> >>>>> + * the Free Software Foundation; either version 2 of the License, or
> >>>>> + * (at your option) any later version.
> >>>>> + *
> >>>>> + * This program is distributed in the hope that it will be useful,
> >>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >>>>> + * GNU General Public License for more details.
> >>>>> + */
> >>>>> +#include <xen/sched.h>
> >>>>> +
> >>>>> +#include <asm/mmio.h>
> >>>>> +
> >>>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
> >>>>
> >>>> Nit: Stray blank (like you had in an earlier version for MMCFG_BDF()).
> >>>> Also isn't this effectively part of the public interface (along with
> >>>> MMCFG_BDF()), alongside GUEST_VPCI_ECAM_{BASE,SIZE}?
> >>>
> >>> I will move that in the next version to xen/pci.h and rename itMMCFG_REG_OFFSET.
> >>> Would that be ok ?
> >>
> >> That would be okay and make sense when put next to MMCFG_BDF(), but
> >> it would not address my comment: That still wouldn't be the public
> >> interface. Otoh you only mimic hardware behavior, so perhaps the
> >> splitting of the address isn't as relevant to put there as would be
> >> GUEST_VPCI_ECAM_{BASE,SIZE}.
> > 
> > Ok now I get what you wanted.
> > 
> > You would actually like both MMCFG_BDF and MMCFG_REG_OFFSET to
> > be moved to arch-arm.h.
> > 
> > Then I am not quite sure to follow why.
> > Those are not macros coming out of a way we have to define this but from
> > how it works in standard PCI.
> > The base and size are needed to know where the PCI bus will be.
> > 
> > So why should those be needed in public headers ?
> 
> Well, see my "Otoh ..." in the earlier reply. Keeping the two
> address splitting macros out of there is okay.
> 
> >>>>> --- a/xen/drivers/passthrough/pci.c
> >>>>> +++ b/xen/drivers/passthrough/pci.c
> >>>>> @@ -766,6 +766,24 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
> >>>>>    else
> >>>>>        iommu_enable_device(pdev);
> >>>>
> >>>> Please note the context above; ...
> >>>>
> >>>>> +#ifdef CONFIG_ARM
> >>>>> +    /*
> >>>>> +     * On ARM PCI devices discovery will be done by Dom0. Add vpci handler when
> >>>>> +     * Dom0 inform XEN to add the PCI devices in XEN.
> >>>>> +     */
> >>>>> +    ret = vpci_add_handlers(pdev);
> >>>>> +    if ( ret )
> >>>>> +    {
> >>>>> +        printk(XENLOG_ERR "setup of vPCI failed: %d\n", ret);
> >>>>> +        pci_cleanup_msi(pdev);
> >>>>> +        ret = iommu_remove_device(pdev);
> >>>>> +        if ( pdev->domain )
> >>>>> +            list_del(&pdev->domain_list);
> >>>>> +        free_pdev(pseg, pdev);
> >>>>
> >>>> ... you unconditionally undo the if() part of the earlier conditional;
> >>>> is there any guarantee that the other path can't ever be taken, now
> >>>> and forever? If it can't be taken now (which I think is the case as
> >>>> long as Dom0 wouldn't report the same device twice), then at least some
> >>>> precaution wants taking. Maybe moving your addition into that if()
> >>>> could be an option.
> >>>>
> >>>> Furthermore I continue to wonder whether this ordering is indeed
> >>>> preferable over doing software setup before hardware arrangements. This
> >>>> would address the above issue as well as long as vpci_add_handlers() is
> >>>> idempotent. And it would likely simplify error cleanup.
> >>>
> >>> I agree with you so I will move this code block before iommu part.
> >>>
> >>> But digging deeper into this, I would have 2 questions:
> >>>
> >>> - msi_cleanup was done there after a request from Stefano, but is not
> >>> done in case or iommu error, is there an issue to solve here ?
> >>
> >> Maybe, but I'm not sure. This very much depends on what a domain
> >> could in principle do with a partly set-up device. Plus let's
> >> not forget that we're talking of only Dom0 here (for now at least,
> >> i.e. not considering the dom0less case).
> >>
> >> But I'd also like to further defer to Stefano.
> > 
> > Ok, I must admit I do not really see at that stage why doing an MSI cleanup
> > could be needed so I will wait for Stefano to know if I need to keep this when
> > moving the block up (at the end it is theoretical right now as this is empty).

I know that MSIs are not supported yet on ARM (pci_cleanup_msi does
nothing). But I wanted to make sure that the pci_cleanup_msi() calls are
present anywhere necessary, especially on the error paths. So that once
we add MSI support, we don't need to search through the code to find all
the error paths missing a pci_cleanup_msi() call.

To answer your first question: you are right, we are also missing a
pci_cleanup_msi() call in the case of IOMMU error. So it might be better
to move the call to pci_cleanup_msi() under the "out" label so that we
can do it once for both cases.

To answer your second point about whether it is necessary at all: if
MSIs and MSI-Xs cannot be already setup at this point at all (not even
the enable bit), then we don't need any call to pci_cleanup_msi() in
pci_add_device.
Oleksandr Andrushchenko Oct. 11, 2021, 6:30 p.m. UTC | #13
On 11.10.21 20:15, Bertrand Marquis wrote:
> Hi Roger,
>
>> On 11 Oct 2021, at 17:43, Roger Pau Monné <roger.pau@citrix.com> wrote:
>>
>> On Mon, Oct 11, 2021 at 04:20:14PM +0000, Oleksandr Andrushchenko wrote:
>>>
>>> On 11.10.21 19:12, Bertrand Marquis wrote:
>>>> Hi Roger,
>>>>
>>>>> On 11 Oct 2021, at 11:51, Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>>
>>>>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>>>>>> The existing VPCI support available for X86 is adapted for Arm.
>>>>>> When the device is added to XEN via the hyper call
>>>>>> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
>>>>>> access is added to the Xen to emulate the PCI devices config space.
>>>>>>
>>>>>> A MMIO trap handler for the PCI ECAM space is registered in XEN
>>>>>> so that when guest is trying to access the PCI config space,XEN
>>>>>> will trap the access and emulate read/write using the VPCI and
>>>>>> not the real PCI hardware.
>>>>>>
>>>>>> For Dom0less systems scan_pci_devices() would be used to discover the
>>>>>> PCI device in XEN and VPCI handler will be added during XEN boots.
>>>>>>
>>>>>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>>>>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>> ---
>>>>>> Change in v5:
>>>>>> - Add pci_cleanup_msi(pdev) in cleanup part.
>>>>>> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>> Change in v4:
>>>>>> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
>>>>>> Change in v3:
>>>>>> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
>>>>>> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
>>>>>> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
>>>>>> Change in v2:
>>>>>> - Add new XEN_DOMCTL_CDF_vpci flag
>>>>>> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
>>>>>> - enable vpci support when pci-passthough option is enabled.
>>>>>> ---
>>>>>> ---
>>>>>> xen/arch/arm/Makefile         |   1 +
>>>>>> xen/arch/arm/domain.c         |   4 ++
>>>>>> xen/arch/arm/domain_build.c   |   3 +
>>>>>> xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
>>>>>> xen/arch/arm/vpci.h           |  36 ++++++++++++
>>>>>> xen/drivers/passthrough/pci.c |  18 ++++++
>>>>>> xen/include/asm-arm/domain.h  |   7 ++-
>>>>>> xen/include/asm-x86/pci.h     |   2 -
>>>>>> xen/include/public/arch-arm.h |   7 +++
>>>>>> xen/include/xen/pci.h         |   2 +
>>>>>> 10 files changed, 179 insertions(+), 3 deletions(-)
>>>>>> create mode 100644 xen/arch/arm/vpci.c
>>>>>> create mode 100644 xen/arch/arm/vpci.h
>>>>>>
>>>>>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>>>>>> index 44d7cc81fa..fb9c976ea2 100644
>>>>>> --- a/xen/arch/arm/Makefile
>>>>>> +++ b/xen/arch/arm/Makefile
>>>>>> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
>>>>>> obj-y += platforms/
>>>>>> endif
>>>>>> obj-$(CONFIG_TEE) += tee/
>>>>>> +obj-$(CONFIG_HAS_VPCI) += vpci.o
>>>>>>
>>>>>> obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>>>>>> obj-y += bootfdt.init.o
>>>>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>>>>> index 36138c1b2e..fbb52f78f1 100644
>>>>>> --- a/xen/arch/arm/domain.c
>>>>>> +++ b/xen/arch/arm/domain.c
>>>>>> @@ -39,6 +39,7 @@
>>>>>> #include <asm/vgic.h>
>>>>>> #include <asm/vtimer.h>
>>>>>>
>>>>>> +#include "vpci.h"
>>>>>> #include "vuart.h"
>>>>>>
>>>>>> DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>>>>>> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
>>>>>>      if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
>>>>>>          goto fail;
>>>>>>
>>>>>> +    if ( (rc = domain_vpci_init(d)) != 0 )
>>>>>> +        goto fail;
>>>>>> +
>>>>>>      return 0;
>>>>>>
>>>>>> fail:
>>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>>> index c5afbe2e05..f4c89bde8c 100644
>>>>>> --- a/xen/arch/arm/domain_build.c
>>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>>> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
>>>>>>      if ( iommu_enabled )
>>>>>>          dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>>>>>>
>>>>>> +    if ( is_pci_passthrough_enabled() )
>>>>>> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
>>>>> I think I'm confused with this. You seem to enable vPCI for dom0, but
>>>>> then domain_vpci_init will setup traps for the guest virtual ECAM
>>>>> layout, not the native one that dom0 will be using.
>>>> I think after the last discussions, it was decided to also installed the vpci
>>>> handler for dom0. I will have to look into this and come back to you.
>>>> @Oleksandr: Could you comment on this.
>>> Yes, we do trap Dom0 as well. The Dom0 traps are not in this series, but
>>> are in mine as it needs more preparatory work for that. Please see [1]
>> Then I don't think we should set XEN_DOMCTL_CDF_vpci for dom0 here, it
>> should instead be done in the patch where dom0 support is introduced.
> Ok I will check to remove this and include the change in v6.
Just to make it clear: do we want to remove this piece from this patch
and instead have a dedicated patch on top of my series, so it is enabled
right after we have the code that sets up the trap handlers for Dom0?
If so, then do we want that patch to be chained in my series or sent as
a follow up right after it separately?

Thanks,
Oleksandr
>
> Cheers
> Bertrand
>
>> Thanks, Roger.
Stefano Stabellini Oct. 11, 2021, 7:27 p.m. UTC | #14
On Mon, 11 Oct 2021, Oleksandr Andrushchenko wrote:
> On 11.10.21 20:15, Bertrand Marquis wrote:
> > Hi Roger,
> >
> >> On 11 Oct 2021, at 17:43, Roger Pau Monné <roger.pau@citrix.com> wrote:
> >>
> >> On Mon, Oct 11, 2021 at 04:20:14PM +0000, Oleksandr Andrushchenko wrote:
> >>>
> >>> On 11.10.21 19:12, Bertrand Marquis wrote:
> >>>> Hi Roger,
> >>>>
> >>>>> On 11 Oct 2021, at 11:51, Roger Pau Monné <roger.pau@citrix.com> wrote:
> >>>>>
> >>>>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
> >>>>>> The existing VPCI support available for X86 is adapted for Arm.
> >>>>>> When the device is added to XEN via the hyper call
> >>>>>> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
> >>>>>> access is added to the Xen to emulate the PCI devices config space.
> >>>>>>
> >>>>>> A MMIO trap handler for the PCI ECAM space is registered in XEN
> >>>>>> so that when guest is trying to access the PCI config space,XEN
> >>>>>> will trap the access and emulate read/write using the VPCI and
> >>>>>> not the real PCI hardware.
> >>>>>>
> >>>>>> For Dom0less systems scan_pci_devices() would be used to discover the
> >>>>>> PCI device in XEN and VPCI handler will be added during XEN boots.
> >>>>>>
> >>>>>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
> >>>>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> >>>>>> ---
> >>>>>> Change in v5:
> >>>>>> - Add pci_cleanup_msi(pdev) in cleanup part.
> >>>>>> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> >>>>>> Change in v4:
> >>>>>> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
> >>>>>> Change in v3:
> >>>>>> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
> >>>>>> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
> >>>>>> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
> >>>>>> Change in v2:
> >>>>>> - Add new XEN_DOMCTL_CDF_vpci flag
> >>>>>> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
> >>>>>> - enable vpci support when pci-passthough option is enabled.
> >>>>>> ---
> >>>>>> ---
> >>>>>> xen/arch/arm/Makefile         |   1 +
> >>>>>> xen/arch/arm/domain.c         |   4 ++
> >>>>>> xen/arch/arm/domain_build.c   |   3 +
> >>>>>> xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
> >>>>>> xen/arch/arm/vpci.h           |  36 ++++++++++++
> >>>>>> xen/drivers/passthrough/pci.c |  18 ++++++
> >>>>>> xen/include/asm-arm/domain.h  |   7 ++-
> >>>>>> xen/include/asm-x86/pci.h     |   2 -
> >>>>>> xen/include/public/arch-arm.h |   7 +++
> >>>>>> xen/include/xen/pci.h         |   2 +
> >>>>>> 10 files changed, 179 insertions(+), 3 deletions(-)
> >>>>>> create mode 100644 xen/arch/arm/vpci.c
> >>>>>> create mode 100644 xen/arch/arm/vpci.h
> >>>>>>
> >>>>>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> >>>>>> index 44d7cc81fa..fb9c976ea2 100644
> >>>>>> --- a/xen/arch/arm/Makefile
> >>>>>> +++ b/xen/arch/arm/Makefile
> >>>>>> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
> >>>>>> obj-y += platforms/
> >>>>>> endif
> >>>>>> obj-$(CONFIG_TEE) += tee/
> >>>>>> +obj-$(CONFIG_HAS_VPCI) += vpci.o
> >>>>>>
> >>>>>> obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
> >>>>>> obj-y += bootfdt.init.o
> >>>>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> >>>>>> index 36138c1b2e..fbb52f78f1 100644
> >>>>>> --- a/xen/arch/arm/domain.c
> >>>>>> +++ b/xen/arch/arm/domain.c
> >>>>>> @@ -39,6 +39,7 @@
> >>>>>> #include <asm/vgic.h>
> >>>>>> #include <asm/vtimer.h>
> >>>>>>
> >>>>>> +#include "vpci.h"
> >>>>>> #include "vuart.h"
> >>>>>>
> >>>>>> DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
> >>>>>> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
> >>>>>>      if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
> >>>>>>          goto fail;
> >>>>>>
> >>>>>> +    if ( (rc = domain_vpci_init(d)) != 0 )
> >>>>>> +        goto fail;
> >>>>>> +
> >>>>>>      return 0;
> >>>>>>
> >>>>>> fail:
> >>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >>>>>> index c5afbe2e05..f4c89bde8c 100644
> >>>>>> --- a/xen/arch/arm/domain_build.c
> >>>>>> +++ b/xen/arch/arm/domain_build.c
> >>>>>> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
> >>>>>>      if ( iommu_enabled )
> >>>>>>          dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
> >>>>>>
> >>>>>> +    if ( is_pci_passthrough_enabled() )
> >>>>>> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
> >>>>> I think I'm confused with this. You seem to enable vPCI for dom0, but
> >>>>> then domain_vpci_init will setup traps for the guest virtual ECAM
> >>>>> layout, not the native one that dom0 will be using.
> >>>> I think after the last discussions, it was decided to also installed the vpci
> >>>> handler for dom0. I will have to look into this and come back to you.
> >>>> @Oleksandr: Could you comment on this.
> >>> Yes, we do trap Dom0 as well. The Dom0 traps are not in this series, but
> >>> are in mine as it needs more preparatory work for that. Please see [1]
> >> Then I don't think we should set XEN_DOMCTL_CDF_vpci for dom0 here, it
> >> should instead be done in the patch where dom0 support is introduced.
> > Ok I will check to remove this and include the change in v6.
> Just to make it clear: do we want to remove this piece from this patch
> and instead have a dedicated patch on top of my series, so it is enabled
> right after we have the code that sets up the trap handlers for Dom0?
> If so, then do we want that patch to be chained in my series or sent as
> a follow up right after it separately?

I think we want to remove the XEN_DOMCTL_CDF_vpci chunk from this patch.

Where exactly it should be introduced, I am not sure. I think it would
be OK as a separate single patch at the end. I doesn't have to be part
of the outstanding series, considering that we are also missing the
patch to add "select HAS_PCI" for ARM.
Oleksandr Andrushchenko Oct. 12, 2021, 5:34 a.m. UTC | #15
On 11.10.21 22:27, Stefano Stabellini wrote:
> On Mon, 11 Oct 2021, Oleksandr Andrushchenko wrote:
>> On 11.10.21 20:15, Bertrand Marquis wrote:
>>> Hi Roger,
>>>
>>>> On 11 Oct 2021, at 17:43, Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>
>>>> On Mon, Oct 11, 2021 at 04:20:14PM +0000, Oleksandr Andrushchenko wrote:
>>>>> On 11.10.21 19:12, Bertrand Marquis wrote:
>>>>>> Hi Roger,
>>>>>>
>>>>>>> On 11 Oct 2021, at 11:51, Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>>>>
>>>>>>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>>>>>>>> The existing VPCI support available for X86 is adapted for Arm.
>>>>>>>> When the device is added to XEN via the hyper call
>>>>>>>> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
>>>>>>>> access is added to the Xen to emulate the PCI devices config space.
>>>>>>>>
>>>>>>>> A MMIO trap handler for the PCI ECAM space is registered in XEN
>>>>>>>> so that when guest is trying to access the PCI config space,XEN
>>>>>>>> will trap the access and emulate read/write using the VPCI and
>>>>>>>> not the real PCI hardware.
>>>>>>>>
>>>>>>>> For Dom0less systems scan_pci_devices() would be used to discover the
>>>>>>>> PCI device in XEN and VPCI handler will be added during XEN boots.
>>>>>>>>
>>>>>>>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>>>>>>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>>>> ---
>>>>>>>> Change in v5:
>>>>>>>> - Add pci_cleanup_msi(pdev) in cleanup part.
>>>>>>>> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>>>> Change in v4:
>>>>>>>> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
>>>>>>>> Change in v3:
>>>>>>>> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
>>>>>>>> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
>>>>>>>> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
>>>>>>>> Change in v2:
>>>>>>>> - Add new XEN_DOMCTL_CDF_vpci flag
>>>>>>>> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
>>>>>>>> - enable vpci support when pci-passthough option is enabled.
>>>>>>>> ---
>>>>>>>> ---
>>>>>>>> xen/arch/arm/Makefile         |   1 +
>>>>>>>> xen/arch/arm/domain.c         |   4 ++
>>>>>>>> xen/arch/arm/domain_build.c   |   3 +
>>>>>>>> xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
>>>>>>>> xen/arch/arm/vpci.h           |  36 ++++++++++++
>>>>>>>> xen/drivers/passthrough/pci.c |  18 ++++++
>>>>>>>> xen/include/asm-arm/domain.h  |   7 ++-
>>>>>>>> xen/include/asm-x86/pci.h     |   2 -
>>>>>>>> xen/include/public/arch-arm.h |   7 +++
>>>>>>>> xen/include/xen/pci.h         |   2 +
>>>>>>>> 10 files changed, 179 insertions(+), 3 deletions(-)
>>>>>>>> create mode 100644 xen/arch/arm/vpci.c
>>>>>>>> create mode 100644 xen/arch/arm/vpci.h
>>>>>>>>
>>>>>>>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>>>>>>>> index 44d7cc81fa..fb9c976ea2 100644
>>>>>>>> --- a/xen/arch/arm/Makefile
>>>>>>>> +++ b/xen/arch/arm/Makefile
>>>>>>>> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
>>>>>>>> obj-y += platforms/
>>>>>>>> endif
>>>>>>>> obj-$(CONFIG_TEE) += tee/
>>>>>>>> +obj-$(CONFIG_HAS_VPCI) += vpci.o
>>>>>>>>
>>>>>>>> obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>>>>>>>> obj-y += bootfdt.init.o
>>>>>>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>>>>>>> index 36138c1b2e..fbb52f78f1 100644
>>>>>>>> --- a/xen/arch/arm/domain.c
>>>>>>>> +++ b/xen/arch/arm/domain.c
>>>>>>>> @@ -39,6 +39,7 @@
>>>>>>>> #include <asm/vgic.h>
>>>>>>>> #include <asm/vtimer.h>
>>>>>>>>
>>>>>>>> +#include "vpci.h"
>>>>>>>> #include "vuart.h"
>>>>>>>>
>>>>>>>> DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>>>>>>>> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
>>>>>>>>       if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
>>>>>>>>           goto fail;
>>>>>>>>
>>>>>>>> +    if ( (rc = domain_vpci_init(d)) != 0 )
>>>>>>>> +        goto fail;
>>>>>>>> +
>>>>>>>>       return 0;
>>>>>>>>
>>>>>>>> fail:
>>>>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>>>>> index c5afbe2e05..f4c89bde8c 100644
>>>>>>>> --- a/xen/arch/arm/domain_build.c
>>>>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>>>>> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
>>>>>>>>       if ( iommu_enabled )
>>>>>>>>           dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>>>>>>>>
>>>>>>>> +    if ( is_pci_passthrough_enabled() )
>>>>>>>> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
>>>>>>> I think I'm confused with this. You seem to enable vPCI for dom0, but
>>>>>>> then domain_vpci_init will setup traps for the guest virtual ECAM
>>>>>>> layout, not the native one that dom0 will be using.
>>>>>> I think after the last discussions, it was decided to also installed the vpci
>>>>>> handler for dom0. I will have to look into this and come back to you.
>>>>>> @Oleksandr: Could you comment on this.
>>>>> Yes, we do trap Dom0 as well. The Dom0 traps are not in this series, but
>>>>> are in mine as it needs more preparatory work for that. Please see [1]
>>>> Then I don't think we should set XEN_DOMCTL_CDF_vpci for dom0 here, it
>>>> should instead be done in the patch where dom0 support is introduced.
>>> Ok I will check to remove this and include the change in v6.
>> Just to make it clear: do we want to remove this piece from this patch
>> and instead have a dedicated patch on top of my series, so it is enabled
>> right after we have the code that sets up the trap handlers for Dom0?
>> If so, then do we want that patch to be chained in my series or sent as
>> a follow up right after it separately?
> I think we want to remove the XEN_DOMCTL_CDF_vpci chunk from this patch.
>
> Where exactly it should be introduced, I am not sure. I think it would
> be OK as a separate single patch at the end. I doesn't have to be part
> of the outstanding series, considering that we are also missing the
> patch to add "select HAS_PCI" for ARM.
Yes, makes sense
Bertrand Marquis Oct. 12, 2021, 7:44 a.m. UTC | #16
Hi Stefano,

> On 11 Oct 2021, at 20:27, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Mon, 11 Oct 2021, Oleksandr Andrushchenko wrote:
>> On 11.10.21 20:15, Bertrand Marquis wrote:
>>> Hi Roger,
>>> 
>>>> On 11 Oct 2021, at 17:43, Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>> 
>>>> On Mon, Oct 11, 2021 at 04:20:14PM +0000, Oleksandr Andrushchenko wrote:
>>>>> 
>>>>> On 11.10.21 19:12, Bertrand Marquis wrote:
>>>>>> Hi Roger,
>>>>>> 
>>>>>>> On 11 Oct 2021, at 11:51, Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>>>> 
>>>>>>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>>>>>>>> The existing VPCI support available for X86 is adapted for Arm.
>>>>>>>> When the device is added to XEN via the hyper call
>>>>>>>> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
>>>>>>>> access is added to the Xen to emulate the PCI devices config space.
>>>>>>>> 
>>>>>>>> A MMIO trap handler for the PCI ECAM space is registered in XEN
>>>>>>>> so that when guest is trying to access the PCI config space,XEN
>>>>>>>> will trap the access and emulate read/write using the VPCI and
>>>>>>>> not the real PCI hardware.
>>>>>>>> 
>>>>>>>> For Dom0less systems scan_pci_devices() would be used to discover the
>>>>>>>> PCI device in XEN and VPCI handler will be added during XEN boots.
>>>>>>>> 
>>>>>>>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>>>>>>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>>>> ---
>>>>>>>> Change in v5:
>>>>>>>> - Add pci_cleanup_msi(pdev) in cleanup part.
>>>>>>>> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>>>> Change in v4:
>>>>>>>> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
>>>>>>>> Change in v3:
>>>>>>>> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
>>>>>>>> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
>>>>>>>> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
>>>>>>>> Change in v2:
>>>>>>>> - Add new XEN_DOMCTL_CDF_vpci flag
>>>>>>>> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
>>>>>>>> - enable vpci support when pci-passthough option is enabled.
>>>>>>>> ---
>>>>>>>> ---
>>>>>>>> xen/arch/arm/Makefile         |   1 +
>>>>>>>> xen/arch/arm/domain.c         |   4 ++
>>>>>>>> xen/arch/arm/domain_build.c   |   3 +
>>>>>>>> xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
>>>>>>>> xen/arch/arm/vpci.h           |  36 ++++++++++++
>>>>>>>> xen/drivers/passthrough/pci.c |  18 ++++++
>>>>>>>> xen/include/asm-arm/domain.h  |   7 ++-
>>>>>>>> xen/include/asm-x86/pci.h     |   2 -
>>>>>>>> xen/include/public/arch-arm.h |   7 +++
>>>>>>>> xen/include/xen/pci.h         |   2 +
>>>>>>>> 10 files changed, 179 insertions(+), 3 deletions(-)
>>>>>>>> create mode 100644 xen/arch/arm/vpci.c
>>>>>>>> create mode 100644 xen/arch/arm/vpci.h
>>>>>>>> 
>>>>>>>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>>>>>>>> index 44d7cc81fa..fb9c976ea2 100644
>>>>>>>> --- a/xen/arch/arm/Makefile
>>>>>>>> +++ b/xen/arch/arm/Makefile
>>>>>>>> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
>>>>>>>> obj-y += platforms/
>>>>>>>> endif
>>>>>>>> obj-$(CONFIG_TEE) += tee/
>>>>>>>> +obj-$(CONFIG_HAS_VPCI) += vpci.o
>>>>>>>> 
>>>>>>>> obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>>>>>>>> obj-y += bootfdt.init.o
>>>>>>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>>>>>>> index 36138c1b2e..fbb52f78f1 100644
>>>>>>>> --- a/xen/arch/arm/domain.c
>>>>>>>> +++ b/xen/arch/arm/domain.c
>>>>>>>> @@ -39,6 +39,7 @@
>>>>>>>> #include <asm/vgic.h>
>>>>>>>> #include <asm/vtimer.h>
>>>>>>>> 
>>>>>>>> +#include "vpci.h"
>>>>>>>> #include "vuart.h"
>>>>>>>> 
>>>>>>>> DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>>>>>>>> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
>>>>>>>>     if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
>>>>>>>>         goto fail;
>>>>>>>> 
>>>>>>>> +    if ( (rc = domain_vpci_init(d)) != 0 )
>>>>>>>> +        goto fail;
>>>>>>>> +
>>>>>>>>     return 0;
>>>>>>>> 
>>>>>>>> fail:
>>>>>>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>>>>>>> index c5afbe2e05..f4c89bde8c 100644
>>>>>>>> --- a/xen/arch/arm/domain_build.c
>>>>>>>> +++ b/xen/arch/arm/domain_build.c
>>>>>>>> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
>>>>>>>>     if ( iommu_enabled )
>>>>>>>>         dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>>>>>>>> 
>>>>>>>> +    if ( is_pci_passthrough_enabled() )
>>>>>>>> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
>>>>>>> I think I'm confused with this. You seem to enable vPCI for dom0, but
>>>>>>> then domain_vpci_init will setup traps for the guest virtual ECAM
>>>>>>> layout, not the native one that dom0 will be using.
>>>>>> I think after the last discussions, it was decided to also installed the vpci
>>>>>> handler for dom0. I will have to look into this and come back to you.
>>>>>> @Oleksandr: Could you comment on this.
>>>>> Yes, we do trap Dom0 as well. The Dom0 traps are not in this series, but
>>>>> are in mine as it needs more preparatory work for that. Please see [1]
>>>> Then I don't think we should set XEN_DOMCTL_CDF_vpci for dom0 here, it
>>>> should instead be done in the patch where dom0 support is introduced.
>>> Ok I will check to remove this and include the change in v6.
>> Just to make it clear: do we want to remove this piece from this patch
>> and instead have a dedicated patch on top of my series, so it is enabled
>> right after we have the code that sets up the trap handlers for Dom0?
>> If so, then do we want that patch to be chained in my series or sent as
>> a follow up right after it separately?
> 
> I think we want to remove the XEN_DOMCTL_CDF_vpci chunk from this patch.
> 
> Where exactly it should be introduced, I am not sure. I think it would
> be OK as a separate single patch at the end. I doesn't have to be part
> of the outstanding series, considering that we are also missing the
> patch to add "select HAS_PCI" for ARM.

Agree, I will remove that from v6.

Cheers
Bertrand
Jan Beulich Oct. 12, 2021, 8:04 a.m. UTC | #17
On 11.10.2021 20:18, Stefano Stabellini wrote:
> On Mon, 11 Oct 2021, Jan Beulich wrote:
>> On 11.10.2021 15:34, Bertrand Marquis wrote:
>>>> On 11 Oct 2021, at 14:09, Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 11.10.2021 14:41, Bertrand Marquis wrote:
>>>>> But digging deeper into this, I would have 2 questions:
>>>>>
>>>>> - msi_cleanup was done there after a request from Stefano, but is not
>>>>> done in case or iommu error, is there an issue to solve here ?
>>>>
>>>> Maybe, but I'm not sure. This very much depends on what a domain
>>>> could in principle do with a partly set-up device. Plus let's
>>>> not forget that we're talking of only Dom0 here (for now at least,
>>>> i.e. not considering the dom0less case).
>>>>
>>>> But I'd also like to further defer to Stefano.
>>>
>>> Ok, I must admit I do not really see at that stage why doing an MSI cleanup
>>> could be needed so I will wait for Stefano to know if I need to keep this when
>>> moving the block up (at the end it is theoretical right now as this is empty).
> 
> I know that MSIs are not supported yet on ARM (pci_cleanup_msi does
> nothing). But I wanted to make sure that the pci_cleanup_msi() calls are
> present anywhere necessary, especially on the error paths. So that once
> we add MSI support, we don't need to search through the code to find all
> the error paths missing a pci_cleanup_msi() call.
> 
> To answer your first question: you are right, we are also missing a
> pci_cleanup_msi() call in the case of IOMMU error. So it might be better
> to move the call to pci_cleanup_msi() under the "out" label so that we
> can do it once for both cases.
> 
> To answer your second point about whether it is necessary at all: if
> MSIs and MSI-Xs cannot be already setup at this point at all (not even
> the enable bit), then we don't need any call to pci_cleanup_msi() in
> pci_add_device.

Well, at the very least MSI can't be set up ahead of the traps getting
put in place. Whether partial success of putting traps in place may
allow a cunning guest to set up MSI may depend on further aspects.

Jan
Julien Grall Oct. 12, 2021, 2:32 p.m. UTC | #18
Hi Rahul,

On 06/10/2021 18:40, Rahul Singh wrote:
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index c9277b5c6d..91d614b37e 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -2,6 +2,7 @@
>   #define __ASM_DOMAIN_H__
>   
>   #include <xen/cache.h>
> +#include <xen/nospec.h>
>   #include <xen/timer.h>
>   #include <asm/page.h>
>   #include <asm/p2m.h>
> @@ -262,7 +263,11 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
>   
>   #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
>   
> -#define has_vpci(d)    ({ (void)(d); false; })
> +/*
> + * For X86 VPCI is enabled and tested for PVH DOM0 only but
> + * for ARM we enable support VPCI for guest domain also.
> + */

This is the sort of comment that will easily get rot if we change the 
behavior on x86. But I find a bit odd to justify the implementation 
based on x86. Can we simply avoid to mention x86?

> +#define has_vpci(d) evaluate_nospec((d)->options & XEN_DOMCTL_CDF_vpci)
>   
>   #endif /* __ASM_DOMAIN_H__ */
>   
> diff --git a/xen/include/asm-x86/pci.h b/xen/include/asm-x86/pci.h
> index e076951032..c4a4fdcbc2 100644
> --- a/xen/include/asm-x86/pci.h
> +++ b/xen/include/asm-x86/pci.h
> @@ -6,8 +6,6 @@
>   #define CF8_ADDR_HI(cf8) (  ((cf8) & 0x0f000000) >> 16)
>   #define CF8_ENABLED(cf8) (!!((cf8) & 0x80000000))
>   
> -#define MMCFG_BDF(addr)  ( ((addr) & 0x0ffff000) >> 12)
> -
>   #define IS_SNB_GFX(id) (id == 0x01068086 || id == 0x01168086 \
>                           || id == 0x01268086 || id == 0x01028086 \
>                           || id == 0x01128086 || id == 0x01228086 \
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index d46c61fca9..44be337dec 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -418,6 +418,13 @@ typedef uint64_t xen_callback_t;
>   #define GUEST_GICV3_GICR0_BASE     xen_mk_ullong(0x03020000) /* vCPU0..127 */
>   #define GUEST_GICV3_GICR0_SIZE     xen_mk_ullong(0x01000000)
>   
> +/*
> + * 256 MB is reserved for VPCI configuration space based on calculation
> + * 256 buses × 32 devices × 8 functions × 4 KB = 256 MB
> + */
> +#define GUEST_VPCI_ECAM_BASE    xen_mk_ullong(0x10000000)
> +#define GUEST_VPCI_ECAM_SIZE    xen_mk_ullong(0x10000000)
> +
>   /* ACPI tables physical address */
>   #define GUEST_ACPI_BASE xen_mk_ullong(0x20000000)
>   #define GUEST_ACPI_SIZE xen_mk_ullong(0x02000000)
> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
> index 70ac25345c..43b8a08170 100644
> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -40,6 +40,8 @@
>   #define PCI_SBDF3(s,b,df) \
>       ((pci_sbdf_t){ .sbdf = (((s) & 0xffff) << 16) | PCI_BDF2(b, df) })
>   
> +#define MMCFG_BDF(addr)  (((addr) & 0x0ffff000) >> 12)
> +
>   typedef union {
>       uint32_t sbdf;
>       struct {
> 

Cheers,
Bertrand Marquis Oct. 12, 2021, 2:34 p.m. UTC | #19
Hi Julien,

> On 12 Oct 2021, at 15:32, Julien Grall <julien@xen.org> wrote:
> 
> Hi Rahul,
> 
> On 06/10/2021 18:40, Rahul Singh wrote:
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index c9277b5c6d..91d614b37e 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -2,6 +2,7 @@
>>  #define __ASM_DOMAIN_H__
>>    #include <xen/cache.h>
>> +#include <xen/nospec.h>
>>  #include <xen/timer.h>
>>  #include <asm/page.h>
>>  #include <asm/p2m.h>
>> @@ -262,7 +263,11 @@ static inline void arch_vcpu_block(struct vcpu *v) {}
>>    #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
>>  -#define has_vpci(d)    ({ (void)(d); false; })
>> +/*
>> + * For X86 VPCI is enabled and tested for PVH DOM0 only but
>> + * for ARM we enable support VPCI for guest domain also.
>> + */
> 
> This is the sort of comment that will easily get rot if we change the behavior on x86. But I find a bit odd to justify the implementation based on x86. Can we simply avoid to mention x86?

Yes I will remove the x86 part of it.

Cheers
Bertrand

> 
>> +#define has_vpci(d) evaluate_nospec((d)->options & XEN_DOMCTL_CDF_vpci)
>>    #endif /* __ASM_DOMAIN_H__ */
>>  diff --git a/xen/include/asm-x86/pci.h b/xen/include/asm-x86/pci.h
>> index e076951032..c4a4fdcbc2 100644
>> --- a/xen/include/asm-x86/pci.h
>> +++ b/xen/include/asm-x86/pci.h
>> @@ -6,8 +6,6 @@
>>  #define CF8_ADDR_HI(cf8) (  ((cf8) & 0x0f000000) >> 16)
>>  #define CF8_ENABLED(cf8) (!!((cf8) & 0x80000000))
>>  -#define MMCFG_BDF(addr)  ( ((addr) & 0x0ffff000) >> 12)
>> -
>>  #define IS_SNB_GFX(id) (id == 0x01068086 || id == 0x01168086 \
>>                          || id == 0x01268086 || id == 0x01028086 \
>>                          || id == 0x01128086 || id == 0x01228086 \
>> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
>> index d46c61fca9..44be337dec 100644
>> --- a/xen/include/public/arch-arm.h
>> +++ b/xen/include/public/arch-arm.h
>> @@ -418,6 +418,13 @@ typedef uint64_t xen_callback_t;
>>  #define GUEST_GICV3_GICR0_BASE     xen_mk_ullong(0x03020000) /* vCPU0..127 */
>>  #define GUEST_GICV3_GICR0_SIZE     xen_mk_ullong(0x01000000)
>>  +/*
>> + * 256 MB is reserved for VPCI configuration space based on calculation
>> + * 256 buses × 32 devices × 8 functions × 4 KB = 256 MB
>> + */
>> +#define GUEST_VPCI_ECAM_BASE    xen_mk_ullong(0x10000000)
>> +#define GUEST_VPCI_ECAM_SIZE    xen_mk_ullong(0x10000000)
>> +
>>  /* ACPI tables physical address */
>>  #define GUEST_ACPI_BASE xen_mk_ullong(0x20000000)
>>  #define GUEST_ACPI_SIZE xen_mk_ullong(0x02000000)
>> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
>> index 70ac25345c..43b8a08170 100644
>> --- a/xen/include/xen/pci.h
>> +++ b/xen/include/xen/pci.h
>> @@ -40,6 +40,8 @@
>>  #define PCI_SBDF3(s,b,df) \
>>      ((pci_sbdf_t){ .sbdf = (((s) & 0xffff) << 16) | PCI_BDF2(b, df) })
>>  +#define MMCFG_BDF(addr)  (((addr) & 0x0ffff000) >> 12)
>> +
>>  typedef union {
>>      uint32_t sbdf;
>>      struct {
> 
> Cheers,
> 
> -- 
> Julien Grall
Julien Grall Oct. 12, 2021, 3:04 p.m. UTC | #20
On 11/10/2021 13:41, Bertrand Marquis wrote:
> Hi Jan,

Hi Bertrand,

> As Rahul is on leave, I will answer you and make the changes needed.
> 
>> On 7 Oct 2021, at 14:43, Jan Beulich <jbeulich@suse.com> wrote:
>> Independent of this - is bare metal Arm enforcing this same
>> alignment restriction (unconditionally)? Iirc on x86 we felt we'd
>> better synthesize misaligned accesses.
> 
> Unaligned IO access could be synthesise also on arm to but I would
> rather not make such a change now without testing it (and there is
> also a question of it making sense).

Yes it makes sense. I actually have an item in my TODO list to forbid 
unaligned access because they should not happen on any device we 
currently emulate.

Although, I am not aware of any issue other than the guest would shoot 
itself in the foot if this happens.

> 
> So if it is ok with you I will keep that check and discuss it with Rahul
> when he is back. I will add a comment in the code to make that clear.

I am OK with it.

[...]

>> Throughout this series I haven't been able to spot where the HAS_VPCI
>> Kconfig symbol would get selected. Hence I cannot tell whether all of
>> this is Arm64-specific. Otherwise I wonder whether size 8 actually
>> can occur on Arm32.
> 
> Dabt.size could be 3 even on ARM32 but we should not allow 64bit
> access on mmio regions on arm32.

Hmmm... Looking at the Armv7 and Armv8 spec, ldrd/strd (64-bit read) 
would not present a valid ISV. So I think it is not be possible to have 
dabt.size = 3 for 32-bit domain. But I agree we probably want to harden 
the code.

> 
> So I will surround this code with ifdef CONFIG_ARM_64 and add a test
> for len > 4 to prevent this case on 32bit.
> 
> To be completely right we should disable this also for 32bit guests but
> this change would be a bit more invasive.

I think the following should be sufficient:

if ( is_domain_32bit_domain() && len > 4 )
   return ...;

Cheers,
Bertrand Marquis Oct. 12, 2021, 4:12 p.m. UTC | #21
Hi Julien,

> On 12 Oct 2021, at 16:04, Julien Grall <julien@xen.org> wrote:
> 
> On 11/10/2021 13:41, Bertrand Marquis wrote:
>> Hi Jan,
> 
> Hi Bertrand,
> 
>> As Rahul is on leave, I will answer you and make the changes needed.
>>> On 7 Oct 2021, at 14:43, Jan Beulich <jbeulich@suse.com> wrote:
>>> Independent of this - is bare metal Arm enforcing this same
>>> alignment restriction (unconditionally)? Iirc on x86 we felt we'd
>>> better synthesize misaligned accesses.
>> Unaligned IO access could be synthesise also on arm to but I would
>> rather not make such a change now without testing it (and there is
>> also a question of it making sense).
> 
> Yes it makes sense. I actually have an item in my TODO list to forbid unaligned access because they should not happen on any device we currently emulate.
> 
> Although, I am not aware of any issue other than the guest would shoot itself in the foot if this happens.
> 
>> So if it is ok with you I will keep that check and discuss it with Rahul
>> when he is back. I will add a comment in the code to make that clear.
> 
> I am OK with it.
> 
> [...]
> 
>>> Throughout this series I haven't been able to spot where the HAS_VPCI
>>> Kconfig symbol would get selected. Hence I cannot tell whether all of
>>> this is Arm64-specific. Otherwise I wonder whether size 8 actually
>>> can occur on Arm32.
>> Dabt.size could be 3 even on ARM32 but we should not allow 64bit
>> access on mmio regions on arm32.
> 
> Hmmm... Looking at the Armv7 and Armv8 spec, ldrd/strd (64-bit read) would not present a valid ISV. So I think it is not be possible to have dabt.size = 3 for 32-bit domain. But I agree we probably want to harden the code.
> 
>> So I will surround this code with ifdef CONFIG_ARM_64 and add a test
>> for len > 4 to prevent this case on 32bit.
>> To be completely right we should disable this also for 32bit guests but
>> this change would be a bit more invasive.
> 
> I think the following should be sufficient:
> 
> if ( is_domain_32bit_domain() && len > 4 )
>  return ...;

With the last request from Roger to use the function implemented in arch/x86/hw/io.c, the function will move to vpci.h so using is_32bit_domain will not be possible without ifdefery CONFIG_ARM.
Also I have no access to the domain there.

So the best I can do for now would be something like:
#ifdef CONFIG_ARM_32
If (len == 8)
    return false
#endif

A 32bit guest on 64bit xen would not be checked.

Would that be ok for now ?

I could add a comment in the code to warn about the limitation.

Cheers
Bertrand

> 
> Cheers,
> 
> -- 
> Julien Grall
Julien Grall Oct. 12, 2021, 4:20 p.m. UTC | #22
On 12/10/2021 17:12, Bertrand Marquis wrote:
> Hi Julien,
> 
>> On 12 Oct 2021, at 16:04, Julien Grall <julien@xen.org> wrote:
>>
>> On 11/10/2021 13:41, Bertrand Marquis wrote:
>>> Hi Jan,
>>
>> Hi Bertrand,
>>
>>> As Rahul is on leave, I will answer you and make the changes needed.
>>>> On 7 Oct 2021, at 14:43, Jan Beulich <jbeulich@suse.com> wrote:
>>>> Independent of this - is bare metal Arm enforcing this same
>>>> alignment restriction (unconditionally)? Iirc on x86 we felt we'd
>>>> better synthesize misaligned accesses.
>>> Unaligned IO access could be synthesise also on arm to but I would
>>> rather not make such a change now without testing it (and there is
>>> also a question of it making sense).
>>
>> Yes it makes sense. I actually have an item in my TODO list to forbid unaligned access because they should not happen on any device we currently emulate.
>>
>> Although, I am not aware of any issue other than the guest would shoot itself in the foot if this happens.
>>
>>> So if it is ok with you I will keep that check and discuss it with Rahul
>>> when he is back. I will add a comment in the code to make that clear.
>>
>> I am OK with it.
>>
>> [...]
>>
>>>> Throughout this series I haven't been able to spot where the HAS_VPCI
>>>> Kconfig symbol would get selected. Hence I cannot tell whether all of
>>>> this is Arm64-specific. Otherwise I wonder whether size 8 actually
>>>> can occur on Arm32.
>>> Dabt.size could be 3 even on ARM32 but we should not allow 64bit
>>> access on mmio regions on arm32.
>>
>> Hmmm... Looking at the Armv7 and Armv8 spec, ldrd/strd (64-bit read) would not present a valid ISV. So I think it is not be possible to have dabt.size = 3 for 32-bit domain. But I agree we probably want to harden the code.
>>
>>> So I will surround this code with ifdef CONFIG_ARM_64 and add a test
>>> for len > 4 to prevent this case on 32bit.
>>> To be completely right we should disable this also for 32bit guests but
>>> this change would be a bit more invasive.
>>
>> I think the following should be sufficient:
>>
>> if ( is_domain_32bit_domain() && len > 4 )
>>   return ...;
> 
> With the last request from Roger to use the function implemented in arch/x86/hw/io.c, the function will move to vpci.h so using is_32bit_domain will not be possible without ifdefery CONFIG_ARM.
> Also I have no access to the domain there.
> 
> So the best I can do for now would be something like:
> #ifdef CONFIG_ARM_32
> If (len == 8)
>      return false
> #endif
> 
> A 32bit guest on 64bit xen would not be checked.
> 
> Would that be ok for now ?

I think the #ifdef is a bit pointless. My preference would be to not add 
the #ifdef but instead add...

> 
> I could add a comment in the code to warn about the limitation.

.. a comment for now as this is only an hardening problem.

Cheers,
Bertrand Marquis Oct. 12, 2021, 5:50 p.m. UTC | #23
Hi Julien,

> On 12 Oct 2021, at 17:20, Julien Grall <julien@xen.org> wrote:
> 
> 
> 
> On 12/10/2021 17:12, Bertrand Marquis wrote:
>> Hi Julien,
>>> On 12 Oct 2021, at 16:04, Julien Grall <julien@xen.org> wrote:
>>> 
>>> On 11/10/2021 13:41, Bertrand Marquis wrote:
>>>> Hi Jan,
>>> 
>>> Hi Bertrand,
>>> 
>>>> As Rahul is on leave, I will answer you and make the changes needed.
>>>>> On 7 Oct 2021, at 14:43, Jan Beulich <jbeulich@suse.com> wrote:
>>>>> Independent of this - is bare metal Arm enforcing this same
>>>>> alignment restriction (unconditionally)? Iirc on x86 we felt we'd
>>>>> better synthesize misaligned accesses.
>>>> Unaligned IO access could be synthesise also on arm to but I would
>>>> rather not make such a change now without testing it (and there is
>>>> also a question of it making sense).
>>> 
>>> Yes it makes sense. I actually have an item in my TODO list to forbid unaligned access because they should not happen on any device we currently emulate.
>>> 
>>> Although, I am not aware of any issue other than the guest would shoot itself in the foot if this happens.
>>> 
>>>> So if it is ok with you I will keep that check and discuss it with Rahul
>>>> when he is back. I will add a comment in the code to make that clear.
>>> 
>>> I am OK with it.
>>> 
>>> [...]
>>> 
>>>>> Throughout this series I haven't been able to spot where the HAS_VPCI
>>>>> Kconfig symbol would get selected. Hence I cannot tell whether all of
>>>>> this is Arm64-specific. Otherwise I wonder whether size 8 actually
>>>>> can occur on Arm32.
>>>> Dabt.size could be 3 even on ARM32 but we should not allow 64bit
>>>> access on mmio regions on arm32.
>>> 
>>> Hmmm... Looking at the Armv7 and Armv8 spec, ldrd/strd (64-bit read) would not present a valid ISV. So I think it is not be possible to have dabt.size = 3 for 32-bit domain. But I agree we probably want to harden the code.
>>> 
>>>> So I will surround this code with ifdef CONFIG_ARM_64 and add a test
>>>> for len > 4 to prevent this case on 32bit.
>>>> To be completely right we should disable this also for 32bit guests but
>>>> this change would be a bit more invasive.
>>> 
>>> I think the following should be sufficient:
>>> 
>>> if ( is_domain_32bit_domain() && len > 4 )
>>>  return ...;
>> With the last request from Roger to use the function implemented in arch/x86/hw/io.c, the function will move to vpci.h so using is_32bit_domain will not be possible without ifdefery CONFIG_ARM.
>> Also I have no access to the domain there.
>> So the best I can do for now would be something like:
>> #ifdef CONFIG_ARM_32
>> If (len == 8)
>>     return false
>> #endif
>> A 32bit guest on 64bit xen would not be checked.
>> Would that be ok for now ?
> 
> I think the #ifdef is a bit pointless. My preference would be to not add the #ifdef but instead add...
> 
>> I could add a comment in the code to warn about the limitation.
> 
> .. a comment for now as this is only an hardening problem.

Agree I will do that.

Cheers
Bertrand

> 
> Cheers,
> 
> -- 
> Julien Grall
Stefano Stabellini Oct. 12, 2021, 9:37 p.m. UTC | #24
On Tue, 12 Oct 2021, Jan Beulich wrote:
> On 11.10.2021 20:18, Stefano Stabellini wrote:
> > On Mon, 11 Oct 2021, Jan Beulich wrote:
> >> On 11.10.2021 15:34, Bertrand Marquis wrote:
> >>>> On 11 Oct 2021, at 14:09, Jan Beulich <jbeulich@suse.com> wrote:
> >>>> On 11.10.2021 14:41, Bertrand Marquis wrote:
> >>>>> But digging deeper into this, I would have 2 questions:
> >>>>>
> >>>>> - msi_cleanup was done there after a request from Stefano, but is not
> >>>>> done in case or iommu error, is there an issue to solve here ?
> >>>>
> >>>> Maybe, but I'm not sure. This very much depends on what a domain
> >>>> could in principle do with a partly set-up device. Plus let's
> >>>> not forget that we're talking of only Dom0 here (for now at least,
> >>>> i.e. not considering the dom0less case).
> >>>>
> >>>> But I'd also like to further defer to Stefano.
> >>>
> >>> Ok, I must admit I do not really see at that stage why doing an MSI cleanup
> >>> could be needed so I will wait for Stefano to know if I need to keep this when
> >>> moving the block up (at the end it is theoretical right now as this is empty).
> > 
> > I know that MSIs are not supported yet on ARM (pci_cleanup_msi does
> > nothing). But I wanted to make sure that the pci_cleanup_msi() calls are
> > present anywhere necessary, especially on the error paths. So that once
> > we add MSI support, we don't need to search through the code to find all
> > the error paths missing a pci_cleanup_msi() call.
> > 
> > To answer your first question: you are right, we are also missing a
> > pci_cleanup_msi() call in the case of IOMMU error. So it might be better
> > to move the call to pci_cleanup_msi() under the "out" label so that we
> > can do it once for both cases.
> > 
> > To answer your second point about whether it is necessary at all: if
> > MSIs and MSI-Xs cannot be already setup at this point at all (not even
> > the enable bit), then we don't need any call to pci_cleanup_msi() in
> > pci_add_device.
> 
> Well, at the very least MSI can't be set up ahead of the traps getting
> put in place. Whether partial success of putting traps in place may
> allow a cunning guest to set up MSI may depend on further aspects.

Good point about MSIs not being setup before the traps. We should remove
the call to pci_cleanup_msi() in the error path then.
Jan Beulich Oct. 13, 2021, 6:10 a.m. UTC | #25
On 12.10.2021 23:37, Stefano Stabellini wrote:
> On Tue, 12 Oct 2021, Jan Beulich wrote:
>> On 11.10.2021 20:18, Stefano Stabellini wrote:
>>> On Mon, 11 Oct 2021, Jan Beulich wrote:
>>>> On 11.10.2021 15:34, Bertrand Marquis wrote:
>>>>>> On 11 Oct 2021, at 14:09, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> On 11.10.2021 14:41, Bertrand Marquis wrote:
>>>>>>> But digging deeper into this, I would have 2 questions:
>>>>>>>
>>>>>>> - msi_cleanup was done there after a request from Stefano, but is not
>>>>>>> done in case or iommu error, is there an issue to solve here ?
>>>>>>
>>>>>> Maybe, but I'm not sure. This very much depends on what a domain
>>>>>> could in principle do with a partly set-up device. Plus let's
>>>>>> not forget that we're talking of only Dom0 here (for now at least,
>>>>>> i.e. not considering the dom0less case).
>>>>>>
>>>>>> But I'd also like to further defer to Stefano.
>>>>>
>>>>> Ok, I must admit I do not really see at that stage why doing an MSI cleanup
>>>>> could be needed so I will wait for Stefano to know if I need to keep this when
>>>>> moving the block up (at the end it is theoretical right now as this is empty).
>>>
>>> I know that MSIs are not supported yet on ARM (pci_cleanup_msi does
>>> nothing). But I wanted to make sure that the pci_cleanup_msi() calls are
>>> present anywhere necessary, especially on the error paths. So that once
>>> we add MSI support, we don't need to search through the code to find all
>>> the error paths missing a pci_cleanup_msi() call.
>>>
>>> To answer your first question: you are right, we are also missing a
>>> pci_cleanup_msi() call in the case of IOMMU error. So it might be better
>>> to move the call to pci_cleanup_msi() under the "out" label so that we
>>> can do it once for both cases.
>>>
>>> To answer your second point about whether it is necessary at all: if
>>> MSIs and MSI-Xs cannot be already setup at this point at all (not even
>>> the enable bit), then we don't need any call to pci_cleanup_msi() in
>>> pci_add_device.
>>
>> Well, at the very least MSI can't be set up ahead of the traps getting
>> put in place. Whether partial success of putting traps in place may
>> allow a cunning guest to set up MSI may depend on further aspects.
> 
> Good point about MSIs not being setup before the traps. We should remove
> the call to pci_cleanup_msi() in the error path then.

Your reply makes me fear you didn't pay enough attention to the "partial"
in my earlier reply. The traps for the various registers can't all be set
up atomically, so there may be a transient period where enough traps are
already in place for a cunning guest to arrange for setup. Unless, as
said, there are further setup steps needed before a guest could succeed
in doing so.

But even if partial trap setup alone was sufficient, I think the cleaning
up of MSI then might still better go on the error path there than on that
of pci_add_device().

Jan
Roger Pau Monné Oct. 13, 2021, 8:45 a.m. UTC | #26
On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
> The existing VPCI support available for X86 is adapted for Arm.
> When the device is added to XEN via the hyper call
> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
> access is added to the Xen to emulate the PCI devices config space.
> 
> A MMIO trap handler for the PCI ECAM space is registered in XEN
> so that when guest is trying to access the PCI config space,XEN
> will trap the access and emulate read/write using the VPCI and
> not the real PCI hardware.
> 
> For Dom0less systems scan_pci_devices() would be used to discover the
> PCI device in XEN and VPCI handler will be added during XEN boots.
> 
> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> ---
> Change in v5:
> - Add pci_cleanup_msi(pdev) in cleanup part.
> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> Change in v4:
> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
> Change in v3:
> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
> Change in v2:
> - Add new XEN_DOMCTL_CDF_vpci flag
> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
> - enable vpci support when pci-passthough option is enabled.
> ---
> ---
>  xen/arch/arm/Makefile         |   1 +
>  xen/arch/arm/domain.c         |   4 ++
>  xen/arch/arm/domain_build.c   |   3 +
>  xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vpci.h           |  36 ++++++++++++
>  xen/drivers/passthrough/pci.c |  18 ++++++
>  xen/include/asm-arm/domain.h  |   7 ++-
>  xen/include/asm-x86/pci.h     |   2 -
>  xen/include/public/arch-arm.h |   7 +++
>  xen/include/xen/pci.h         |   2 +
>  10 files changed, 179 insertions(+), 3 deletions(-)
>  create mode 100644 xen/arch/arm/vpci.c
>  create mode 100644 xen/arch/arm/vpci.h
> 
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 44d7cc81fa..fb9c976ea2 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
>  obj-y += platforms/
>  endif
>  obj-$(CONFIG_TEE) += tee/
> +obj-$(CONFIG_HAS_VPCI) += vpci.o
>  
>  obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>  obj-y += bootfdt.init.o
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 36138c1b2e..fbb52f78f1 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -39,6 +39,7 @@
>  #include <asm/vgic.h>
>  #include <asm/vtimer.h>
>  
> +#include "vpci.h"
>  #include "vuart.h"
>  
>  DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
>      if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
>          goto fail;
>  
> +    if ( (rc = domain_vpci_init(d)) != 0 )
> +        goto fail;
> +
>      return 0;
>  
>  fail:
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index c5afbe2e05..f4c89bde8c 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
>      if ( iommu_enabled )
>          dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>  
> +    if ( is_pci_passthrough_enabled() )
> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
> +
>      dom0 = domain_create(0, &dom0_cfg, true);
>      if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
>          panic("Error creating domain 0\n");
> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
> new file mode 100644
> index 0000000000..76c12b9281
> --- /dev/null
> +++ b/xen/arch/arm/vpci.c
> @@ -0,0 +1,102 @@
> +/*
> + * xen/arch/arm/vpci.c
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +#include <xen/sched.h>
> +
> +#include <asm/mmio.h>
> +
> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
> +
> +/* Do some sanity checks. */
> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
> +{
> +    /* Check access size. */
> +    if ( len > 8 )
> +        return false;
> +
> +    /* Check that access is size aligned. */
> +    if ( (reg & (len - 1)) )
> +        return false;
> +
> +    return true;
> +}
> +
> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
> +                          register_t *r, void *p)
> +{
> +    unsigned int reg;
> +    pci_sbdf_t sbdf;
> +    unsigned long data = ~0UL;
> +    unsigned int size = 1U << info->dabt.size;
> +
> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
> +    reg = REGISTER_OFFSET(info->gpa);
> +
> +    if ( !vpci_mmio_access_allowed(reg, size) )
> +        return 0;
> +
> +    data = vpci_read(sbdf, reg, min(4u, size));
> +    if ( size == 8 )
> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
> +
> +    *r = data;
> +
> +    return 1;
> +}
> +
> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
> +                           register_t r, void *p)
> +{
> +    unsigned int reg;
> +    pci_sbdf_t sbdf;
> +    unsigned long data = r;
> +    unsigned int size = 1U << info->dabt.size;
> +
> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
> +    reg = REGISTER_OFFSET(info->gpa);
> +
> +    if ( !vpci_mmio_access_allowed(reg, size) )
> +        return 0;
> +
> +    vpci_write(sbdf, reg, min(4u, size), data);
> +    if ( size == 8 )
> +        vpci_write(sbdf, reg + 4, 4, data >> 32);

I think those two helpers (and vpci_mmio_access_allowed) are very
similar to the existing x86 ones (see vpci_mmcfg_{read,write}), up to
the point where I would consider moving the shared code to vpci.c as
vpci_ecam_{read,write} and call them from the arch specific trap
handlers.

Thanks, Roger.
Bertrand Marquis Oct. 13, 2021, 9:48 a.m. UTC | #27
Hi Roger,

> On 13 Oct 2021, at 09:45, Roger Pau Monné <roger.pau@citrix.com> wrote:
> 
> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>> The existing VPCI support available for X86 is adapted for Arm.
>> When the device is added to XEN via the hyper call
>> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
>> access is added to the Xen to emulate the PCI devices config space.
>> 
>> A MMIO trap handler for the PCI ECAM space is registered in XEN
>> so that when guest is trying to access the PCI config space,XEN
>> will trap the access and emulate read/write using the VPCI and
>> not the real PCI hardware.
>> 
>> For Dom0less systems scan_pci_devices() would be used to discover the
>> PCI device in XEN and VPCI handler will be added during XEN boots.
>> 
>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>> ---
>> Change in v5:
>> - Add pci_cleanup_msi(pdev) in cleanup part.
>> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>> Change in v4:
>> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
>> Change in v3:
>> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
>> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
>> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
>> Change in v2:
>> - Add new XEN_DOMCTL_CDF_vpci flag
>> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
>> - enable vpci support when pci-passthough option is enabled.
>> ---
>> ---
>> xen/arch/arm/Makefile         |   1 +
>> xen/arch/arm/domain.c         |   4 ++
>> xen/arch/arm/domain_build.c   |   3 +
>> xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
>> xen/arch/arm/vpci.h           |  36 ++++++++++++
>> xen/drivers/passthrough/pci.c |  18 ++++++
>> xen/include/asm-arm/domain.h  |   7 ++-
>> xen/include/asm-x86/pci.h     |   2 -
>> xen/include/public/arch-arm.h |   7 +++
>> xen/include/xen/pci.h         |   2 +
>> 10 files changed, 179 insertions(+), 3 deletions(-)
>> create mode 100644 xen/arch/arm/vpci.c
>> create mode 100644 xen/arch/arm/vpci.h
>> 
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 44d7cc81fa..fb9c976ea2 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
>> obj-y += platforms/
>> endif
>> obj-$(CONFIG_TEE) += tee/
>> +obj-$(CONFIG_HAS_VPCI) += vpci.o
>> 
>> obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>> obj-y += bootfdt.init.o
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 36138c1b2e..fbb52f78f1 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -39,6 +39,7 @@
>> #include <asm/vgic.h>
>> #include <asm/vtimer.h>
>> 
>> +#include "vpci.h"
>> #include "vuart.h"
>> 
>> DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
>>     if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
>>         goto fail;
>> 
>> +    if ( (rc = domain_vpci_init(d)) != 0 )
>> +        goto fail;
>> +
>>     return 0;
>> 
>> fail:
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index c5afbe2e05..f4c89bde8c 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
>>     if ( iommu_enabled )
>>         dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>> 
>> +    if ( is_pci_passthrough_enabled() )
>> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
>> +
>>     dom0 = domain_create(0, &dom0_cfg, true);
>>     if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
>>         panic("Error creating domain 0\n");
>> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
>> new file mode 100644
>> index 0000000000..76c12b9281
>> --- /dev/null
>> +++ b/xen/arch/arm/vpci.c
>> @@ -0,0 +1,102 @@
>> +/*
>> + * xen/arch/arm/vpci.c
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +#include <xen/sched.h>
>> +
>> +#include <asm/mmio.h>
>> +
>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>> +
>> +/* Do some sanity checks. */
>> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
>> +{
>> +    /* Check access size. */
>> +    if ( len > 8 )
>> +        return false;
>> +
>> +    /* Check that access is size aligned. */
>> +    if ( (reg & (len - 1)) )
>> +        return false;
>> +
>> +    return true;
>> +}
>> +
>> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>> +                          register_t *r, void *p)
>> +{
>> +    unsigned int reg;
>> +    pci_sbdf_t sbdf;
>> +    unsigned long data = ~0UL;
>> +    unsigned int size = 1U << info->dabt.size;
>> +
>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>> +    reg = REGISTER_OFFSET(info->gpa);
>> +
>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>> +        return 0;
>> +
>> +    data = vpci_read(sbdf, reg, min(4u, size));
>> +    if ( size == 8 )
>> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
>> +
>> +    *r = data;
>> +
>> +    return 1;
>> +}
>> +
>> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>> +                           register_t r, void *p)
>> +{
>> +    unsigned int reg;
>> +    pci_sbdf_t sbdf;
>> +    unsigned long data = r;
>> +    unsigned int size = 1U << info->dabt.size;
>> +
>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>> +    reg = REGISTER_OFFSET(info->gpa);
>> +
>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>> +        return 0;
>> +
>> +    vpci_write(sbdf, reg, min(4u, size), data);
>> +    if ( size == 8 )
>> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
> 
> I think those two helpers (and vpci_mmio_access_allowed) are very
> similar to the existing x86 ones (see vpci_mmcfg_{read,write}), up to
> the point where I would consider moving the shared code to vpci.c as
> vpci_ecam_{read,write} and call them from the arch specific trap
> handlers.

Would it be ok to have ecam specific code in the vpci common code ?

This is an optimisation and we could do that later on as this is an other
change to be done and tested which will start to make things very
challenging with the friday deadline.

Cheers
Bertrand
Bertrand Marquis Oct. 13, 2021, 10:02 a.m. UTC | #28
Hi Jan,

> On 13 Oct 2021, at 07:10, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 12.10.2021 23:37, Stefano Stabellini wrote:
>> On Tue, 12 Oct 2021, Jan Beulich wrote:
>>> On 11.10.2021 20:18, Stefano Stabellini wrote:
>>>> On Mon, 11 Oct 2021, Jan Beulich wrote:
>>>>> On 11.10.2021 15:34, Bertrand Marquis wrote:
>>>>>>> On 11 Oct 2021, at 14:09, Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>> On 11.10.2021 14:41, Bertrand Marquis wrote:
>>>>>>>> But digging deeper into this, I would have 2 questions:
>>>>>>>> 
>>>>>>>> - msi_cleanup was done there after a request from Stefano, but is not
>>>>>>>> done in case or iommu error, is there an issue to solve here ?
>>>>>>> 
>>>>>>> Maybe, but I'm not sure. This very much depends on what a domain
>>>>>>> could in principle do with a partly set-up device. Plus let's
>>>>>>> not forget that we're talking of only Dom0 here (for now at least,
>>>>>>> i.e. not considering the dom0less case).
>>>>>>> 
>>>>>>> But I'd also like to further defer to Stefano.
>>>>>> 
>>>>>> Ok, I must admit I do not really see at that stage why doing an MSI cleanup
>>>>>> could be needed so I will wait for Stefano to know if I need to keep this when
>>>>>> moving the block up (at the end it is theoretical right now as this is empty).
>>>> 
>>>> I know that MSIs are not supported yet on ARM (pci_cleanup_msi does
>>>> nothing). But I wanted to make sure that the pci_cleanup_msi() calls are
>>>> present anywhere necessary, especially on the error paths. So that once
>>>> we add MSI support, we don't need to search through the code to find all
>>>> the error paths missing a pci_cleanup_msi() call.
>>>> 
>>>> To answer your first question: you are right, we are also missing a
>>>> pci_cleanup_msi() call in the case of IOMMU error. So it might be better
>>>> to move the call to pci_cleanup_msi() under the "out" label so that we
>>>> can do it once for both cases.
>>>> 
>>>> To answer your second point about whether it is necessary at all: if
>>>> MSIs and MSI-Xs cannot be already setup at this point at all (not even
>>>> the enable bit), then we don't need any call to pci_cleanup_msi() in
>>>> pci_add_device.
>>> 
>>> Well, at the very least MSI can't be set up ahead of the traps getting
>>> put in place. Whether partial success of putting traps in place may
>>> allow a cunning guest to set up MSI may depend on further aspects.
>> 
>> Good point about MSIs not being setup before the traps. We should remove
>> the call to pci_cleanup_msi() in the error path then.
> 
> Your reply makes me fear you didn't pay enough attention to the "partial"
> in my earlier reply. The traps for the various registers can't all be set
> up atomically, so there may be a transient period where enough traps are
> already in place for a cunning guest to arrange for setup. Unless, as
> said, there are further setup steps needed before a guest could succeed
> in doing so.
> 
> But even if partial trap setup alone was sufficient, I think the cleaning
> up of MSI then might still better go on the error path there than on that
> of pci_add_device().

I think I should put the msi_cleanup in the exit path if pdev is not null but
we got a non null ret (in an else if ( pdev ) ).
This would cover all exit paths, especially as I will move the add_handler
before the iommu init.

Would that be ok for everyone ?

Cheers
Bertrand
Roger Pau Monné Oct. 13, 2021, 10:33 a.m. UTC | #29
On Wed, Oct 13, 2021 at 09:48:42AM +0000, Bertrand Marquis wrote:
> Hi Roger,
> 
> > On 13 Oct 2021, at 09:45, Roger Pau Monné <roger.pau@citrix.com> wrote:
> > 
> > On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
> >> The existing VPCI support available for X86 is adapted for Arm.
> >> When the device is added to XEN via the hyper call
> >> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
> >> access is added to the Xen to emulate the PCI devices config space.
> >> 
> >> A MMIO trap handler for the PCI ECAM space is registered in XEN
> >> so that when guest is trying to access the PCI config space,XEN
> >> will trap the access and emulate read/write using the VPCI and
> >> not the real PCI hardware.
> >> 
> >> For Dom0less systems scan_pci_devices() would be used to discover the
> >> PCI device in XEN and VPCI handler will be added during XEN boots.
> >> 
> >> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
> >> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> >> ---
> >> Change in v5:
> >> - Add pci_cleanup_msi(pdev) in cleanup part.
> >> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> >> Change in v4:
> >> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
> >> Change in v3:
> >> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
> >> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
> >> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
> >> Change in v2:
> >> - Add new XEN_DOMCTL_CDF_vpci flag
> >> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
> >> - enable vpci support when pci-passthough option is enabled.
> >> ---
> >> ---
> >> xen/arch/arm/Makefile         |   1 +
> >> xen/arch/arm/domain.c         |   4 ++
> >> xen/arch/arm/domain_build.c   |   3 +
> >> xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
> >> xen/arch/arm/vpci.h           |  36 ++++++++++++
> >> xen/drivers/passthrough/pci.c |  18 ++++++
> >> xen/include/asm-arm/domain.h  |   7 ++-
> >> xen/include/asm-x86/pci.h     |   2 -
> >> xen/include/public/arch-arm.h |   7 +++
> >> xen/include/xen/pci.h         |   2 +
> >> 10 files changed, 179 insertions(+), 3 deletions(-)
> >> create mode 100644 xen/arch/arm/vpci.c
> >> create mode 100644 xen/arch/arm/vpci.h
> >> 
> >> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> >> index 44d7cc81fa..fb9c976ea2 100644
> >> --- a/xen/arch/arm/Makefile
> >> +++ b/xen/arch/arm/Makefile
> >> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
> >> obj-y += platforms/
> >> endif
> >> obj-$(CONFIG_TEE) += tee/
> >> +obj-$(CONFIG_HAS_VPCI) += vpci.o
> >> 
> >> obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
> >> obj-y += bootfdt.init.o
> >> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> >> index 36138c1b2e..fbb52f78f1 100644
> >> --- a/xen/arch/arm/domain.c
> >> +++ b/xen/arch/arm/domain.c
> >> @@ -39,6 +39,7 @@
> >> #include <asm/vgic.h>
> >> #include <asm/vtimer.h>
> >> 
> >> +#include "vpci.h"
> >> #include "vuart.h"
> >> 
> >> DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
> >> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
> >>     if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
> >>         goto fail;
> >> 
> >> +    if ( (rc = domain_vpci_init(d)) != 0 )
> >> +        goto fail;
> >> +
> >>     return 0;
> >> 
> >> fail:
> >> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> >> index c5afbe2e05..f4c89bde8c 100644
> >> --- a/xen/arch/arm/domain_build.c
> >> +++ b/xen/arch/arm/domain_build.c
> >> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
> >>     if ( iommu_enabled )
> >>         dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
> >> 
> >> +    if ( is_pci_passthrough_enabled() )
> >> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
> >> +
> >>     dom0 = domain_create(0, &dom0_cfg, true);
> >>     if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
> >>         panic("Error creating domain 0\n");
> >> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
> >> new file mode 100644
> >> index 0000000000..76c12b9281
> >> --- /dev/null
> >> +++ b/xen/arch/arm/vpci.c
> >> @@ -0,0 +1,102 @@
> >> +/*
> >> + * xen/arch/arm/vpci.c
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify
> >> + * it under the terms of the GNU General Public License as published by
> >> + * the Free Software Foundation; either version 2 of the License, or
> >> + * (at your option) any later version.
> >> + *
> >> + * This program is distributed in the hope that it will be useful,
> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> + * GNU General Public License for more details.
> >> + */
> >> +#include <xen/sched.h>
> >> +
> >> +#include <asm/mmio.h>
> >> +
> >> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
> >> +
> >> +/* Do some sanity checks. */
> >> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
> >> +{
> >> +    /* Check access size. */
> >> +    if ( len > 8 )
> >> +        return false;
> >> +
> >> +    /* Check that access is size aligned. */
> >> +    if ( (reg & (len - 1)) )
> >> +        return false;
> >> +
> >> +    return true;
> >> +}
> >> +
> >> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
> >> +                          register_t *r, void *p)
> >> +{
> >> +    unsigned int reg;
> >> +    pci_sbdf_t sbdf;
> >> +    unsigned long data = ~0UL;
> >> +    unsigned int size = 1U << info->dabt.size;
> >> +
> >> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
> >> +    reg = REGISTER_OFFSET(info->gpa);
> >> +
> >> +    if ( !vpci_mmio_access_allowed(reg, size) )
> >> +        return 0;
> >> +
> >> +    data = vpci_read(sbdf, reg, min(4u, size));
> >> +    if ( size == 8 )
> >> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
> >> +
> >> +    *r = data;
> >> +
> >> +    return 1;
> >> +}
> >> +
> >> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
> >> +                           register_t r, void *p)
> >> +{
> >> +    unsigned int reg;
> >> +    pci_sbdf_t sbdf;
> >> +    unsigned long data = r;
> >> +    unsigned int size = 1U << info->dabt.size;
> >> +
> >> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
> >> +    reg = REGISTER_OFFSET(info->gpa);
> >> +
> >> +    if ( !vpci_mmio_access_allowed(reg, size) )
> >> +        return 0;
> >> +
> >> +    vpci_write(sbdf, reg, min(4u, size), data);
> >> +    if ( size == 8 )
> >> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
> > 
> > I think those two helpers (and vpci_mmio_access_allowed) are very
> > similar to the existing x86 ones (see vpci_mmcfg_{read,write}), up to
> > the point where I would consider moving the shared code to vpci.c as
> > vpci_ecam_{read,write} and call them from the arch specific trap
> > handlers.
> 
> Would it be ok to have ecam specific code in the vpci common code ?

I think so, ECAM is part of the PCI specification and architecture
agnostic, so fine to place in common code.

> This is an optimisation and we could do that later on as this is an other
> change to be done and tested which will start to make things very
> challenging with the friday deadline.

I guess this could be done, by adding a TODO note that the handlers
should be unified with the x86 ones and placed in common code.

I however get the feeling that such work would ultimately be ignored,
as there's always more pressing stuff to be done, and people tend to
forget about those cleanups once things get committed.

Thanks, Roger.
Jan Beulich Oct. 13, 2021, 12:21 p.m. UTC | #30
On 13.10.2021 12:02, Bertrand Marquis wrote:
>> On 13 Oct 2021, at 07:10, Jan Beulich <jbeulich@suse.com> wrote:
>> On 12.10.2021 23:37, Stefano Stabellini wrote:
>>> Good point about MSIs not being setup before the traps. We should remove
>>> the call to pci_cleanup_msi() in the error path then.
>>
>> Your reply makes me fear you didn't pay enough attention to the "partial"
>> in my earlier reply. The traps for the various registers can't all be set
>> up atomically, so there may be a transient period where enough traps are
>> already in place for a cunning guest to arrange for setup. Unless, as
>> said, there are further setup steps needed before a guest could succeed
>> in doing so.
>>
>> But even if partial trap setup alone was sufficient, I think the cleaning
>> up of MSI then might still better go on the error path there than on that
>> of pci_add_device().
> 
> I think I should put the msi_cleanup in the exit path if pdev is not null but
> we got a non null ret (in an else if ( pdev ) ).
> This would cover all exit paths, especially as I will move the add_handler
> before the iommu init.
> 
> Would that be ok for everyone ?

Sounds reasonable at the first glance.

Jan
Jan Beulich Oct. 13, 2021, 1 p.m. UTC | #31
On 13.10.2021 10:45, Roger Pau Monné wrote:
> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>> The existing VPCI support available for X86 is adapted for Arm.
>> When the device is added to XEN via the hyper call
>> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
>> access is added to the Xen to emulate the PCI devices config space.
>>
>> A MMIO trap handler for the PCI ECAM space is registered in XEN
>> so that when guest is trying to access the PCI config space,XEN
>> will trap the access and emulate read/write using the VPCI and
>> not the real PCI hardware.
>>
>> For Dom0less systems scan_pci_devices() would be used to discover the
>> PCI device in XEN and VPCI handler will be added during XEN boots.
>>
>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>> ---
>> Change in v5:
>> - Add pci_cleanup_msi(pdev) in cleanup part.
>> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>> Change in v4:
>> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
>> Change in v3:
>> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
>> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
>> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
>> Change in v2:
>> - Add new XEN_DOMCTL_CDF_vpci flag
>> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
>> - enable vpci support when pci-passthough option is enabled.
>> ---
>> ---
>>  xen/arch/arm/Makefile         |   1 +
>>  xen/arch/arm/domain.c         |   4 ++
>>  xen/arch/arm/domain_build.c   |   3 +
>>  xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/vpci.h           |  36 ++++++++++++
>>  xen/drivers/passthrough/pci.c |  18 ++++++
>>  xen/include/asm-arm/domain.h  |   7 ++-
>>  xen/include/asm-x86/pci.h     |   2 -
>>  xen/include/public/arch-arm.h |   7 +++
>>  xen/include/xen/pci.h         |   2 +
>>  10 files changed, 179 insertions(+), 3 deletions(-)
>>  create mode 100644 xen/arch/arm/vpci.c
>>  create mode 100644 xen/arch/arm/vpci.h
>>
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 44d7cc81fa..fb9c976ea2 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
>>  obj-y += platforms/
>>  endif
>>  obj-$(CONFIG_TEE) += tee/
>> +obj-$(CONFIG_HAS_VPCI) += vpci.o
>>  
>>  obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>>  obj-y += bootfdt.init.o
>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 36138c1b2e..fbb52f78f1 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -39,6 +39,7 @@
>>  #include <asm/vgic.h>
>>  #include <asm/vtimer.h>
>>  
>> +#include "vpci.h"
>>  #include "vuart.h"
>>  
>>  DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
>>      if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
>>          goto fail;
>>  
>> +    if ( (rc = domain_vpci_init(d)) != 0 )
>> +        goto fail;
>> +
>>      return 0;
>>  
>>  fail:
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index c5afbe2e05..f4c89bde8c 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
>>      if ( iommu_enabled )
>>          dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>>  
>> +    if ( is_pci_passthrough_enabled() )
>> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
>> +
>>      dom0 = domain_create(0, &dom0_cfg, true);
>>      if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
>>          panic("Error creating domain 0\n");
>> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
>> new file mode 100644
>> index 0000000000..76c12b9281
>> --- /dev/null
>> +++ b/xen/arch/arm/vpci.c
>> @@ -0,0 +1,102 @@
>> +/*
>> + * xen/arch/arm/vpci.c
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +#include <xen/sched.h>
>> +
>> +#include <asm/mmio.h>
>> +
>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>> +
>> +/* Do some sanity checks. */
>> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
>> +{
>> +    /* Check access size. */
>> +    if ( len > 8 )
>> +        return false;
>> +
>> +    /* Check that access is size aligned. */
>> +    if ( (reg & (len - 1)) )
>> +        return false;
>> +
>> +    return true;
>> +}
>> +
>> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>> +                          register_t *r, void *p)
>> +{
>> +    unsigned int reg;
>> +    pci_sbdf_t sbdf;
>> +    unsigned long data = ~0UL;
>> +    unsigned int size = 1U << info->dabt.size;
>> +
>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>> +    reg = REGISTER_OFFSET(info->gpa);
>> +
>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>> +        return 0;
>> +
>> +    data = vpci_read(sbdf, reg, min(4u, size));
>> +    if ( size == 8 )
>> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
>> +
>> +    *r = data;
>> +
>> +    return 1;
>> +}
>> +
>> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>> +                           register_t r, void *p)
>> +{
>> +    unsigned int reg;
>> +    pci_sbdf_t sbdf;
>> +    unsigned long data = r;
>> +    unsigned int size = 1U << info->dabt.size;
>> +
>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>> +    reg = REGISTER_OFFSET(info->gpa);
>> +
>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>> +        return 0;
>> +
>> +    vpci_write(sbdf, reg, min(4u, size), data);
>> +    if ( size == 8 )
>> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
> 
> I think those two helpers (and vpci_mmio_access_allowed) are very
> similar to the existing x86 ones (see vpci_mmcfg_{read,write}), up to
> the point where I would consider moving the shared code to vpci.c as
> vpci_ecam_{read,write} and call them from the arch specific trap
> handlers.

Except that please can we stick to mcfg or mmcfg instead of ecam
in names, as that's how the thing has been named in Xen from its
introduction? I've just grep-ed the code base (case insensitively)
and found no mention of ECAM. There are only a few "became".

Jan
Oleksandr Andrushchenko Oct. 13, 2021, 2:51 p.m. UTC | #32
On 13.10.21 16:00, Jan Beulich wrote:
> On 13.10.2021 10:45, Roger Pau Monné wrote:
>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>>> The existing VPCI support available for X86 is adapted for Arm.
>>> When the device is added to XEN via the hyper call
>>> “PHYSDEVOP_pci_device_add”, VPCI handler for the config space
>>> access is added to the Xen to emulate the PCI devices config space.
>>>
>>> A MMIO trap handler for the PCI ECAM space is registered in XEN
>>> so that when guest is trying to access the PCI config space,XEN
>>> will trap the access and emulate read/write using the VPCI and
>>> not the real PCI hardware.
>>>
>>> For Dom0less systems scan_pci_devices() would be used to discover the
>>> PCI device in XEN and VPCI handler will be added during XEN boots.
>>>
>>> Signed-off-by: Rahul Singh <rahul.singh@arm.com>
>>> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>> ---
>>> Change in v5:
>>> - Add pci_cleanup_msi(pdev) in cleanup part.
>>> - Added Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>>> Change in v4:
>>> - Move addition of XEN_DOMCTL_CDF_vpci flag to separate patch
>>> Change in v3:
>>> - Use is_pci_passthrough_enabled() in place of pci_passthrough_enabled variable
>>> - Reject XEN_DOMCTL_CDF_vpci for x86 in arch_sanitise_domain_config()
>>> - Remove IS_ENABLED(CONFIG_HAS_VPCI) from has_vpci()
>>> Change in v2:
>>> - Add new XEN_DOMCTL_CDF_vpci flag
>>> - modify has_vpci() to include XEN_DOMCTL_CDF_vpci
>>> - enable vpci support when pci-passthough option is enabled.
>>> ---
>>> ---
>>>   xen/arch/arm/Makefile         |   1 +
>>>   xen/arch/arm/domain.c         |   4 ++
>>>   xen/arch/arm/domain_build.c   |   3 +
>>>   xen/arch/arm/vpci.c           | 102 ++++++++++++++++++++++++++++++++++
>>>   xen/arch/arm/vpci.h           |  36 ++++++++++++
>>>   xen/drivers/passthrough/pci.c |  18 ++++++
>>>   xen/include/asm-arm/domain.h  |   7 ++-
>>>   xen/include/asm-x86/pci.h     |   2 -
>>>   xen/include/public/arch-arm.h |   7 +++
>>>   xen/include/xen/pci.h         |   2 +
>>>   10 files changed, 179 insertions(+), 3 deletions(-)
>>>   create mode 100644 xen/arch/arm/vpci.c
>>>   create mode 100644 xen/arch/arm/vpci.h
>>>
>>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>>> index 44d7cc81fa..fb9c976ea2 100644
>>> --- a/xen/arch/arm/Makefile
>>> +++ b/xen/arch/arm/Makefile
>>> @@ -7,6 +7,7 @@ ifneq ($(CONFIG_NO_PLAT),y)
>>>   obj-y += platforms/
>>>   endif
>>>   obj-$(CONFIG_TEE) += tee/
>>> +obj-$(CONFIG_HAS_VPCI) += vpci.o
>>>   
>>>   obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>>>   obj-y += bootfdt.init.o
>>> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>>> index 36138c1b2e..fbb52f78f1 100644
>>> --- a/xen/arch/arm/domain.c
>>> +++ b/xen/arch/arm/domain.c
>>> @@ -39,6 +39,7 @@
>>>   #include <asm/vgic.h>
>>>   #include <asm/vtimer.h>
>>>   
>>> +#include "vpci.h"
>>>   #include "vuart.h"
>>>   
>>>   DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
>>> @@ -767,6 +768,9 @@ int arch_domain_create(struct domain *d,
>>>       if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
>>>           goto fail;
>>>   
>>> +    if ( (rc = domain_vpci_init(d)) != 0 )
>>> +        goto fail;
>>> +
>>>       return 0;
>>>   
>>>   fail:
>>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>>> index c5afbe2e05..f4c89bde8c 100644
>>> --- a/xen/arch/arm/domain_build.c
>>> +++ b/xen/arch/arm/domain_build.c
>>> @@ -3053,6 +3053,9 @@ void __init create_dom0(void)
>>>       if ( iommu_enabled )
>>>           dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>>>   
>>> +    if ( is_pci_passthrough_enabled() )
>>> +        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
>>> +
>>>       dom0 = domain_create(0, &dom0_cfg, true);
>>>       if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
>>>           panic("Error creating domain 0\n");
>>> diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
>>> new file mode 100644
>>> index 0000000000..76c12b9281
>>> --- /dev/null
>>> +++ b/xen/arch/arm/vpci.c
>>> @@ -0,0 +1,102 @@
>>> +/*
>>> + * xen/arch/arm/vpci.c
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License as published by
>>> + * the Free Software Foundation; either version 2 of the License, or
>>> + * (at your option) any later version.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> + * GNU General Public License for more details.
>>> + */
>>> +#include <xen/sched.h>
>>> +
>>> +#include <asm/mmio.h>
>>> +
>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>>> +
>>> +/* Do some sanity checks. */
>>> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
>>> +{
>>> +    /* Check access size. */
>>> +    if ( len > 8 )
>>> +        return false;
>>> +
>>> +    /* Check that access is size aligned. */
>>> +    if ( (reg & (len - 1)) )
>>> +        return false;
>>> +
>>> +    return true;
>>> +}
>>> +
>>> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>> +                          register_t *r, void *p)
>>> +{
>>> +    unsigned int reg;
>>> +    pci_sbdf_t sbdf;
>>> +    unsigned long data = ~0UL;
>>> +    unsigned int size = 1U << info->dabt.size;
>>> +
>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>> +    reg = REGISTER_OFFSET(info->gpa);
>>> +
>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>> +        return 0;
>>> +
>>> +    data = vpci_read(sbdf, reg, min(4u, size));
>>> +    if ( size == 8 )
>>> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
>>> +
>>> +    *r = data;
>>> +
>>> +    return 1;
>>> +}
>>> +
>>> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>> +                           register_t r, void *p)
>>> +{
>>> +    unsigned int reg;
>>> +    pci_sbdf_t sbdf;
>>> +    unsigned long data = r;
>>> +    unsigned int size = 1U << info->dabt.size;
>>> +
>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>> +    reg = REGISTER_OFFSET(info->gpa);
>>> +
>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>> +        return 0;
>>> +
>>> +    vpci_write(sbdf, reg, min(4u, size), data);
>>> +    if ( size == 8 )
>>> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
>> I think those two helpers (and vpci_mmio_access_allowed) are very
>> similar to the existing x86 ones (see vpci_mmcfg_{read,write}), up to
>> the point where I would consider moving the shared code to vpci.c as
>> vpci_ecam_{read,write} and call them from the arch specific trap
>> handlers.
> Except that please can we stick to mcfg or mmcfg instead of ecam
> in names, as that's how the thing has been named in Xen from its
> introduction? I've just grep-ed the code base (case insensitively)
> and found no mention of ECAM. There are only a few "became".
I do understand that this is historically that we do not have ECAM in Xen,
but PCI is not about Xen. Thus, I think it is also acceptable to use
a commonly known ECAM for the code that works with ECAM.
At least for the new code
> Jan
>
>
Jan Beulich Oct. 13, 2021, 3:15 p.m. UTC | #33
On 13.10.2021 16:51, Oleksandr Andrushchenko wrote:
> On 13.10.21 16:00, Jan Beulich wrote:
>> On 13.10.2021 10:45, Roger Pau Monné wrote:
>>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>>>> --- /dev/null
>>>> +++ b/xen/arch/arm/vpci.c
>>>> @@ -0,0 +1,102 @@
>>>> +/*
>>>> + * xen/arch/arm/vpci.c
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> + * it under the terms of the GNU General Public License as published by
>>>> + * the Free Software Foundation; either version 2 of the License, or
>>>> + * (at your option) any later version.
>>>> + *
>>>> + * This program is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>>> + * GNU General Public License for more details.
>>>> + */
>>>> +#include <xen/sched.h>
>>>> +
>>>> +#include <asm/mmio.h>
>>>> +
>>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>>>> +
>>>> +/* Do some sanity checks. */
>>>> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
>>>> +{
>>>> +    /* Check access size. */
>>>> +    if ( len > 8 )
>>>> +        return false;
>>>> +
>>>> +    /* Check that access is size aligned. */
>>>> +    if ( (reg & (len - 1)) )
>>>> +        return false;
>>>> +
>>>> +    return true;
>>>> +}
>>>> +
>>>> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>>> +                          register_t *r, void *p)
>>>> +{
>>>> +    unsigned int reg;
>>>> +    pci_sbdf_t sbdf;
>>>> +    unsigned long data = ~0UL;
>>>> +    unsigned int size = 1U << info->dabt.size;
>>>> +
>>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>>> +    reg = REGISTER_OFFSET(info->gpa);
>>>> +
>>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>>> +        return 0;
>>>> +
>>>> +    data = vpci_read(sbdf, reg, min(4u, size));
>>>> +    if ( size == 8 )
>>>> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
>>>> +
>>>> +    *r = data;
>>>> +
>>>> +    return 1;
>>>> +}
>>>> +
>>>> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>>> +                           register_t r, void *p)
>>>> +{
>>>> +    unsigned int reg;
>>>> +    pci_sbdf_t sbdf;
>>>> +    unsigned long data = r;
>>>> +    unsigned int size = 1U << info->dabt.size;
>>>> +
>>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>>> +    reg = REGISTER_OFFSET(info->gpa);
>>>> +
>>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>>> +        return 0;
>>>> +
>>>> +    vpci_write(sbdf, reg, min(4u, size), data);
>>>> +    if ( size == 8 )
>>>> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
>>> I think those two helpers (and vpci_mmio_access_allowed) are very
>>> similar to the existing x86 ones (see vpci_mmcfg_{read,write}), up to
>>> the point where I would consider moving the shared code to vpci.c as
>>> vpci_ecam_{read,write} and call them from the arch specific trap
>>> handlers.
>> Except that please can we stick to mcfg or mmcfg instead of ecam
>> in names, as that's how the thing has been named in Xen from its
>> introduction? I've just grep-ed the code base (case insensitively)
>> and found no mention of ECAM. There are only a few "became".
> I do understand that this is historically that we do not have ECAM in Xen,
> but PCI is not about Xen. Thus, I think it is also acceptable to use
> a commonly known ECAM for the code that works with ECAM.

ACPI, afaik, also doesn't call this ECAM. That's where MCFG / MMCFG
actually come from, I believe.

Jan
Stefano Stabellini Oct. 13, 2021, 7:27 p.m. UTC | #34
On Wed, 13 Oct 2021, Jan Beulich wrote:
> On 13.10.2021 16:51, Oleksandr Andrushchenko wrote:
> > On 13.10.21 16:00, Jan Beulich wrote:
> >> On 13.10.2021 10:45, Roger Pau Monné wrote:
> >>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
> >>>> --- /dev/null
> >>>> +++ b/xen/arch/arm/vpci.c
> >>>> @@ -0,0 +1,102 @@
> >>>> +/*
> >>>> + * xen/arch/arm/vpci.c
> >>>> + *
> >>>> + * This program is free software; you can redistribute it and/or modify
> >>>> + * it under the terms of the GNU General Public License as published by
> >>>> + * the Free Software Foundation; either version 2 of the License, or
> >>>> + * (at your option) any later version.
> >>>> + *
> >>>> + * This program is distributed in the hope that it will be useful,
> >>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >>>> + * GNU General Public License for more details.
> >>>> + */
> >>>> +#include <xen/sched.h>
> >>>> +
> >>>> +#include <asm/mmio.h>
> >>>> +
> >>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
> >>>> +
> >>>> +/* Do some sanity checks. */
> >>>> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
> >>>> +{
> >>>> +    /* Check access size. */
> >>>> +    if ( len > 8 )
> >>>> +        return false;
> >>>> +
> >>>> +    /* Check that access is size aligned. */
> >>>> +    if ( (reg & (len - 1)) )
> >>>> +        return false;
> >>>> +
> >>>> +    return true;
> >>>> +}
> >>>> +
> >>>> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
> >>>> +                          register_t *r, void *p)
> >>>> +{
> >>>> +    unsigned int reg;
> >>>> +    pci_sbdf_t sbdf;
> >>>> +    unsigned long data = ~0UL;
> >>>> +    unsigned int size = 1U << info->dabt.size;
> >>>> +
> >>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
> >>>> +    reg = REGISTER_OFFSET(info->gpa);
> >>>> +
> >>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
> >>>> +        return 0;
> >>>> +
> >>>> +    data = vpci_read(sbdf, reg, min(4u, size));
> >>>> +    if ( size == 8 )
> >>>> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
> >>>> +
> >>>> +    *r = data;
> >>>> +
> >>>> +    return 1;
> >>>> +}
> >>>> +
> >>>> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
> >>>> +                           register_t r, void *p)
> >>>> +{
> >>>> +    unsigned int reg;
> >>>> +    pci_sbdf_t sbdf;
> >>>> +    unsigned long data = r;
> >>>> +    unsigned int size = 1U << info->dabt.size;
> >>>> +
> >>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
> >>>> +    reg = REGISTER_OFFSET(info->gpa);
> >>>> +
> >>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
> >>>> +        return 0;
> >>>> +
> >>>> +    vpci_write(sbdf, reg, min(4u, size), data);
> >>>> +    if ( size == 8 )
> >>>> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
> >>> I think those two helpers (and vpci_mmio_access_allowed) are very
> >>> similar to the existing x86 ones (see vpci_mmcfg_{read,write}), up to
> >>> the point where I would consider moving the shared code to vpci.c as
> >>> vpci_ecam_{read,write} and call them from the arch specific trap
> >>> handlers.
> >> Except that please can we stick to mcfg or mmcfg instead of ecam
> >> in names, as that's how the thing has been named in Xen from its
> >> introduction? I've just grep-ed the code base (case insensitively)
> >> and found no mention of ECAM. There are only a few "became".
> > I do understand that this is historically that we do not have ECAM in Xen,
> > but PCI is not about Xen. Thus, I think it is also acceptable to use
> > a commonly known ECAM for the code that works with ECAM.
> 
> ACPI, afaik, also doesn't call this ECAM. That's where MCFG / MMCFG
> actually come from, I believe.

My understanding is that "MCFG" is the name of the ACPI table that
describes the PCI config space [1]. The underlying PCI standard for the
memory mapped layout of the PCI config space is called ECAM. Here, it
makes sense to call it ECAM as it is firmware independent.

[1] https://wiki.osdev.org/PCI_Express
Jan Beulich Oct. 14, 2021, 6:33 a.m. UTC | #35
On 13.10.2021 21:27, Stefano Stabellini wrote:
> On Wed, 13 Oct 2021, Jan Beulich wrote:
>> On 13.10.2021 16:51, Oleksandr Andrushchenko wrote:
>>> On 13.10.21 16:00, Jan Beulich wrote:
>>>> On 13.10.2021 10:45, Roger Pau Monné wrote:
>>>>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>>>>>> --- /dev/null
>>>>>> +++ b/xen/arch/arm/vpci.c
>>>>>> @@ -0,0 +1,102 @@
>>>>>> +/*
>>>>>> + * xen/arch/arm/vpci.c
>>>>>> + *
>>>>>> + * This program is free software; you can redistribute it and/or modify
>>>>>> + * it under the terms of the GNU General Public License as published by
>>>>>> + * the Free Software Foundation; either version 2 of the License, or
>>>>>> + * (at your option) any later version.
>>>>>> + *
>>>>>> + * This program is distributed in the hope that it will be useful,
>>>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>>>>> + * GNU General Public License for more details.
>>>>>> + */
>>>>>> +#include <xen/sched.h>
>>>>>> +
>>>>>> +#include <asm/mmio.h>
>>>>>> +
>>>>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>>>>>> +
>>>>>> +/* Do some sanity checks. */
>>>>>> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
>>>>>> +{
>>>>>> +    /* Check access size. */
>>>>>> +    if ( len > 8 )
>>>>>> +        return false;
>>>>>> +
>>>>>> +    /* Check that access is size aligned. */
>>>>>> +    if ( (reg & (len - 1)) )
>>>>>> +        return false;
>>>>>> +
>>>>>> +    return true;
>>>>>> +}
>>>>>> +
>>>>>> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>>>>> +                          register_t *r, void *p)
>>>>>> +{
>>>>>> +    unsigned int reg;
>>>>>> +    pci_sbdf_t sbdf;
>>>>>> +    unsigned long data = ~0UL;
>>>>>> +    unsigned int size = 1U << info->dabt.size;
>>>>>> +
>>>>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>>>>> +    reg = REGISTER_OFFSET(info->gpa);
>>>>>> +
>>>>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    data = vpci_read(sbdf, reg, min(4u, size));
>>>>>> +    if ( size == 8 )
>>>>>> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
>>>>>> +
>>>>>> +    *r = data;
>>>>>> +
>>>>>> +    return 1;
>>>>>> +}
>>>>>> +
>>>>>> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>>>>> +                           register_t r, void *p)
>>>>>> +{
>>>>>> +    unsigned int reg;
>>>>>> +    pci_sbdf_t sbdf;
>>>>>> +    unsigned long data = r;
>>>>>> +    unsigned int size = 1U << info->dabt.size;
>>>>>> +
>>>>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>>>>> +    reg = REGISTER_OFFSET(info->gpa);
>>>>>> +
>>>>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>>>>> +        return 0;
>>>>>> +
>>>>>> +    vpci_write(sbdf, reg, min(4u, size), data);
>>>>>> +    if ( size == 8 )
>>>>>> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
>>>>> I think those two helpers (and vpci_mmio_access_allowed) are very
>>>>> similar to the existing x86 ones (see vpci_mmcfg_{read,write}), up to
>>>>> the point where I would consider moving the shared code to vpci.c as
>>>>> vpci_ecam_{read,write} and call them from the arch specific trap
>>>>> handlers.
>>>> Except that please can we stick to mcfg or mmcfg instead of ecam
>>>> in names, as that's how the thing has been named in Xen from its
>>>> introduction? I've just grep-ed the code base (case insensitively)
>>>> and found no mention of ECAM. There are only a few "became".
>>> I do understand that this is historically that we do not have ECAM in Xen,
>>> but PCI is not about Xen. Thus, I think it is also acceptable to use
>>> a commonly known ECAM for the code that works with ECAM.
>>
>> ACPI, afaik, also doesn't call this ECAM. That's where MCFG / MMCFG
>> actually come from, I believe.
> 
> My understanding is that "MCFG" is the name of the ACPI table that
> describes the PCI config space [1]. The underlying PCI standard for the
> memory mapped layout of the PCI config space is called ECAM. Here, it
> makes sense to call it ECAM as it is firmware independent.
> 
> [1] https://wiki.osdev.org/PCI_Express

Okay, I can accept this, but then I'd expect all existing uses of
MCFG / MMCFG in the code which aren't directly ACPI-related to get
replaced. Otherwise it's needlessly harder to identify that one
piece of code relates to certain other pieces.

Jan
Bertrand Marquis Oct. 14, 2021, 7:53 a.m. UTC | #36
Hi,

> On 14 Oct 2021, at 07:33, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 13.10.2021 21:27, Stefano Stabellini wrote:
>> On Wed, 13 Oct 2021, Jan Beulich wrote:
>>> On 13.10.2021 16:51, Oleksandr Andrushchenko wrote:
>>>> On 13.10.21 16:00, Jan Beulich wrote:
>>>>> On 13.10.2021 10:45, Roger Pau Monné wrote:
>>>>>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>>>>>>> --- /dev/null
>>>>>>> +++ b/xen/arch/arm/vpci.c
>>>>>>> @@ -0,0 +1,102 @@
>>>>>>> +/*
>>>>>>> + * xen/arch/arm/vpci.c
>>>>>>> + *
>>>>>>> + * This program is free software; you can redistribute it and/or modify
>>>>>>> + * it under the terms of the GNU General Public License as published by
>>>>>>> + * the Free Software Foundation; either version 2 of the License, or
>>>>>>> + * (at your option) any later version.
>>>>>>> + *
>>>>>>> + * This program is distributed in the hope that it will be useful,
>>>>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>>>>>> + * GNU General Public License for more details.
>>>>>>> + */
>>>>>>> +#include <xen/sched.h>
>>>>>>> +
>>>>>>> +#include <asm/mmio.h>
>>>>>>> +
>>>>>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>>>>>>> +
>>>>>>> +/* Do some sanity checks. */
>>>>>>> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
>>>>>>> +{
>>>>>>> +    /* Check access size. */
>>>>>>> +    if ( len > 8 )
>>>>>>> +        return false;
>>>>>>> +
>>>>>>> +    /* Check that access is size aligned. */
>>>>>>> +    if ( (reg & (len - 1)) )
>>>>>>> +        return false;
>>>>>>> +
>>>>>>> +    return true;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>>>>>> +                          register_t *r, void *p)
>>>>>>> +{
>>>>>>> +    unsigned int reg;
>>>>>>> +    pci_sbdf_t sbdf;
>>>>>>> +    unsigned long data = ~0UL;
>>>>>>> +    unsigned int size = 1U << info->dabt.size;
>>>>>>> +
>>>>>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>>>>>> +    reg = REGISTER_OFFSET(info->gpa);
>>>>>>> +
>>>>>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>>>>>> +        return 0;
>>>>>>> +
>>>>>>> +    data = vpci_read(sbdf, reg, min(4u, size));
>>>>>>> +    if ( size == 8 )
>>>>>>> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
>>>>>>> +
>>>>>>> +    *r = data;
>>>>>>> +
>>>>>>> +    return 1;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>>>>>> +                           register_t r, void *p)
>>>>>>> +{
>>>>>>> +    unsigned int reg;
>>>>>>> +    pci_sbdf_t sbdf;
>>>>>>> +    unsigned long data = r;
>>>>>>> +    unsigned int size = 1U << info->dabt.size;
>>>>>>> +
>>>>>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>>>>>> +    reg = REGISTER_OFFSET(info->gpa);
>>>>>>> +
>>>>>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>>>>>> +        return 0;
>>>>>>> +
>>>>>>> +    vpci_write(sbdf, reg, min(4u, size), data);
>>>>>>> +    if ( size == 8 )
>>>>>>> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
>>>>>> I think those two helpers (and vpci_mmio_access_allowed) are very
>>>>>> similar to the existing x86 ones (see vpci_mmcfg_{read,write}), up to
>>>>>> the point where I would consider moving the shared code to vpci.c as
>>>>>> vpci_ecam_{read,write} and call them from the arch specific trap
>>>>>> handlers.
>>>>> Except that please can we stick to mcfg or mmcfg instead of ecam
>>>>> in names, as that's how the thing has been named in Xen from its
>>>>> introduction? I've just grep-ed the code base (case insensitively)
>>>>> and found no mention of ECAM. There are only a few "became".
>>>> I do understand that this is historically that we do not have ECAM in Xen,
>>>> but PCI is not about Xen. Thus, I think it is also acceptable to use
>>>> a commonly known ECAM for the code that works with ECAM.
>>> 
>>> ACPI, afaik, also doesn't call this ECAM. That's where MCFG / MMCFG
>>> actually come from, I believe.
>> 
>> My understanding is that "MCFG" is the name of the ACPI table that
>> describes the PCI config space [1]. The underlying PCI standard for the
>> memory mapped layout of the PCI config space is called ECAM. Here, it
>> makes sense to call it ECAM as it is firmware independent.
>> 
>> [1] https://wiki.osdev.org/PCI_Express
> 
> Okay, I can accept this, but then I'd expect all existing uses of
> MCFG / MMCFG in the code which aren't directly ACPI-related to get
> replaced. Otherwise it's needlessly harder to identify that one
> piece of code relates to certain other pieces.

To sum up on this subject, here what I will do in version 6:
- move vpci_ecam_{read/write} to vpci.c and rename them
- same for access allowed and I will rename it to vpci_ecam_access_allowed

I will work on this and try to push v6 before the end of the day (also handling other remarks on this patch).

Cheers
Bertrand

> 
> Jan
>
Bertrand Marquis Oct. 14, 2021, 9:03 a.m. UTC | #37
Hi Jan,

> On 14 Oct 2021, at 07:33, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 13.10.2021 21:27, Stefano Stabellini wrote:
>> On Wed, 13 Oct 2021, Jan Beulich wrote:
>>> On 13.10.2021 16:51, Oleksandr Andrushchenko wrote:
>>>> On 13.10.21 16:00, Jan Beulich wrote:
>>>>> On 13.10.2021 10:45, Roger Pau Monné wrote:
>>>>>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>>>>>>> --- /dev/null
>>>>>>> +++ b/xen/arch/arm/vpci.c
>>>>>>> @@ -0,0 +1,102 @@
>>>>>>> +/*
>>>>>>> + * xen/arch/arm/vpci.c
>>>>>>> + *
>>>>>>> + * This program is free software; you can redistribute it and/or modify
>>>>>>> + * it under the terms of the GNU General Public License as published by
>>>>>>> + * the Free Software Foundation; either version 2 of the License, or
>>>>>>> + * (at your option) any later version.
>>>>>>> + *
>>>>>>> + * This program is distributed in the hope that it will be useful,
>>>>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>>>>>> + * GNU General Public License for more details.
>>>>>>> + */
>>>>>>> +#include <xen/sched.h>
>>>>>>> +
>>>>>>> +#include <asm/mmio.h>
>>>>>>> +
>>>>>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>>>>>>> +
>>>>>>> +/* Do some sanity checks. */
>>>>>>> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
>>>>>>> +{
>>>>>>> +    /* Check access size. */
>>>>>>> +    if ( len > 8 )
>>>>>>> +        return false;
>>>>>>> +
>>>>>>> +    /* Check that access is size aligned. */
>>>>>>> +    if ( (reg & (len - 1)) )
>>>>>>> +        return false;
>>>>>>> +
>>>>>>> +    return true;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>>>>>> +                          register_t *r, void *p)
>>>>>>> +{
>>>>>>> +    unsigned int reg;
>>>>>>> +    pci_sbdf_t sbdf;
>>>>>>> +    unsigned long data = ~0UL;
>>>>>>> +    unsigned int size = 1U << info->dabt.size;
>>>>>>> +
>>>>>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>>>>>> +    reg = REGISTER_OFFSET(info->gpa);
>>>>>>> +
>>>>>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>>>>>> +        return 0;
>>>>>>> +
>>>>>>> +    data = vpci_read(sbdf, reg, min(4u, size));
>>>>>>> +    if ( size == 8 )
>>>>>>> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
>>>>>>> +
>>>>>>> +    *r = data;
>>>>>>> +
>>>>>>> +    return 1;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>>>>>> +                           register_t r, void *p)
>>>>>>> +{
>>>>>>> +    unsigned int reg;
>>>>>>> +    pci_sbdf_t sbdf;
>>>>>>> +    unsigned long data = r;
>>>>>>> +    unsigned int size = 1U << info->dabt.size;
>>>>>>> +
>>>>>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>>>>>> +    reg = REGISTER_OFFSET(info->gpa);
>>>>>>> +
>>>>>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>>>>>> +        return 0;
>>>>>>> +
>>>>>>> +    vpci_write(sbdf, reg, min(4u, size), data);
>>>>>>> +    if ( size == 8 )
>>>>>>> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
>>>>>> I think those two helpers (and vpci_mmio_access_allowed) are very
>>>>>> similar to the existing x86 ones (see vpci_mmcfg_{read,write}), up to
>>>>>> the point where I would consider moving the shared code to vpci.c as
>>>>>> vpci_ecam_{read,write} and call them from the arch specific trap
>>>>>> handlers.
>>>>> Except that please can we stick to mcfg or mmcfg instead of ecam
>>>>> in names, as that's how the thing has been named in Xen from its
>>>>> introduction? I've just grep-ed the code base (case insensitively)
>>>>> and found no mention of ECAM. There are only a few "became".
>>>> I do understand that this is historically that we do not have ECAM in Xen,
>>>> but PCI is not about Xen. Thus, I think it is also acceptable to use
>>>> a commonly known ECAM for the code that works with ECAM.
>>> 
>>> ACPI, afaik, also doesn't call this ECAM. That's where MCFG / MMCFG
>>> actually come from, I believe.
>> 
>> My understanding is that "MCFG" is the name of the ACPI table that
>> describes the PCI config space [1]. The underlying PCI standard for the
>> memory mapped layout of the PCI config space is called ECAM. Here, it
>> makes sense to call it ECAM as it is firmware independent.
>> 
>> [1] https://wiki.osdev.org/PCI_Express
> 
> Okay, I can accept this, but then I'd expect all existing uses of
> MCFG / MMCFG in the code which aren't directly ACPI-related to get
> replaced. Otherwise it's needlessly harder to identify that one
> piece of code relates to certain other pieces.

If that is ok with I will:
- move function from x86/hw/io.c to vpci.c renaming them to ECAM
- rename MMCFG_{BDF/REG_OFFSET) to ECAM_{BDF/REG_OFFSET}

For the rest of the occurrences in x86 I will not modify any of them as some
are related to ACPI so using (M)MCFG is right and for the others I am not 100%
sure.

Do you agree ?

Cheers
Bertrand

> 
> Jan
>
Jan Beulich Oct. 14, 2021, 9:24 a.m. UTC | #38
On 14.10.2021 11:03, Bertrand Marquis wrote:
> Hi Jan,
> 
>> On 14 Oct 2021, at 07:33, Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 13.10.2021 21:27, Stefano Stabellini wrote:
>>> On Wed, 13 Oct 2021, Jan Beulich wrote:
>>>> On 13.10.2021 16:51, Oleksandr Andrushchenko wrote:
>>>>> On 13.10.21 16:00, Jan Beulich wrote:
>>>>>> On 13.10.2021 10:45, Roger Pau Monné wrote:
>>>>>>> On Wed, Oct 06, 2021 at 06:40:34PM +0100, Rahul Singh wrote:
>>>>>>>> --- /dev/null
>>>>>>>> +++ b/xen/arch/arm/vpci.c
>>>>>>>> @@ -0,0 +1,102 @@
>>>>>>>> +/*
>>>>>>>> + * xen/arch/arm/vpci.c
>>>>>>>> + *
>>>>>>>> + * This program is free software; you can redistribute it and/or modify
>>>>>>>> + * it under the terms of the GNU General Public License as published by
>>>>>>>> + * the Free Software Foundation; either version 2 of the License, or
>>>>>>>> + * (at your option) any later version.
>>>>>>>> + *
>>>>>>>> + * This program is distributed in the hope that it will be useful,
>>>>>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>>>>>>> + * GNU General Public License for more details.
>>>>>>>> + */
>>>>>>>> +#include <xen/sched.h>
>>>>>>>> +
>>>>>>>> +#include <asm/mmio.h>
>>>>>>>> +
>>>>>>>> +#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
>>>>>>>> +
>>>>>>>> +/* Do some sanity checks. */
>>>>>>>> +static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
>>>>>>>> +{
>>>>>>>> +    /* Check access size. */
>>>>>>>> +    if ( len > 8 )
>>>>>>>> +        return false;
>>>>>>>> +
>>>>>>>> +    /* Check that access is size aligned. */
>>>>>>>> +    if ( (reg & (len - 1)) )
>>>>>>>> +        return false;
>>>>>>>> +
>>>>>>>> +    return true;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
>>>>>>>> +                          register_t *r, void *p)
>>>>>>>> +{
>>>>>>>> +    unsigned int reg;
>>>>>>>> +    pci_sbdf_t sbdf;
>>>>>>>> +    unsigned long data = ~0UL;
>>>>>>>> +    unsigned int size = 1U << info->dabt.size;
>>>>>>>> +
>>>>>>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>>>>>>> +    reg = REGISTER_OFFSET(info->gpa);
>>>>>>>> +
>>>>>>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>>>>>>> +        return 0;
>>>>>>>> +
>>>>>>>> +    data = vpci_read(sbdf, reg, min(4u, size));
>>>>>>>> +    if ( size == 8 )
>>>>>>>> +        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
>>>>>>>> +
>>>>>>>> +    *r = data;
>>>>>>>> +
>>>>>>>> +    return 1;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
>>>>>>>> +                           register_t r, void *p)
>>>>>>>> +{
>>>>>>>> +    unsigned int reg;
>>>>>>>> +    pci_sbdf_t sbdf;
>>>>>>>> +    unsigned long data = r;
>>>>>>>> +    unsigned int size = 1U << info->dabt.size;
>>>>>>>> +
>>>>>>>> +    sbdf.sbdf = MMCFG_BDF(info->gpa);
>>>>>>>> +    reg = REGISTER_OFFSET(info->gpa);
>>>>>>>> +
>>>>>>>> +    if ( !vpci_mmio_access_allowed(reg, size) )
>>>>>>>> +        return 0;
>>>>>>>> +
>>>>>>>> +    vpci_write(sbdf, reg, min(4u, size), data);
>>>>>>>> +    if ( size == 8 )
>>>>>>>> +        vpci_write(sbdf, reg + 4, 4, data >> 32);
>>>>>>> I think those two helpers (and vpci_mmio_access_allowed) are very
>>>>>>> similar to the existing x86 ones (see vpci_mmcfg_{read,write}), up to
>>>>>>> the point where I would consider moving the shared code to vpci.c as
>>>>>>> vpci_ecam_{read,write} and call them from the arch specific trap
>>>>>>> handlers.
>>>>>> Except that please can we stick to mcfg or mmcfg instead of ecam
>>>>>> in names, as that's how the thing has been named in Xen from its
>>>>>> introduction? I've just grep-ed the code base (case insensitively)
>>>>>> and found no mention of ECAM. There are only a few "became".
>>>>> I do understand that this is historically that we do not have ECAM in Xen,
>>>>> but PCI is not about Xen. Thus, I think it is also acceptable to use
>>>>> a commonly known ECAM for the code that works with ECAM.
>>>>
>>>> ACPI, afaik, also doesn't call this ECAM. That's where MCFG / MMCFG
>>>> actually come from, I believe.
>>>
>>> My understanding is that "MCFG" is the name of the ACPI table that
>>> describes the PCI config space [1]. The underlying PCI standard for the
>>> memory mapped layout of the PCI config space is called ECAM. Here, it
>>> makes sense to call it ECAM as it is firmware independent.
>>>
>>> [1] https://wiki.osdev.org/PCI_Express
>>
>> Okay, I can accept this, but then I'd expect all existing uses of
>> MCFG / MMCFG in the code which aren't directly ACPI-related to get
>> replaced. Otherwise it's needlessly harder to identify that one
>> piece of code relates to certain other pieces.
> 
> If that is ok with I will:
> - move function from x86/hw/io.c to vpci.c renaming them to ECAM
> - rename MMCFG_{BDF/REG_OFFSET) to ECAM_{BDF/REG_OFFSET}
> 
> For the rest of the occurrences in x86 I will not modify any of them as some
> are related to ACPI so using (M)MCFG is right and for the others I am not 100%
> sure.
> 
> Do you agree ?

Probably not, but I don't want to put time into checking existing
names right now. Yet I'm getting the impression that I'm being
overruled here anyway, so there may be little point in investing any
time here in the first place.

Jan
diff mbox series

Patch

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 44d7cc81fa..fb9c976ea2 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -7,6 +7,7 @@  ifneq ($(CONFIG_NO_PLAT),y)
 obj-y += platforms/
 endif
 obj-$(CONFIG_TEE) += tee/
+obj-$(CONFIG_HAS_VPCI) += vpci.o
 
 obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
 obj-y += bootfdt.init.o
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 36138c1b2e..fbb52f78f1 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -39,6 +39,7 @@ 
 #include <asm/vgic.h>
 #include <asm/vtimer.h>
 
+#include "vpci.h"
 #include "vuart.h"
 
 DEFINE_PER_CPU(struct vcpu *, curr_vcpu);
@@ -767,6 +768,9 @@  int arch_domain_create(struct domain *d,
     if ( is_hardware_domain(d) && (rc = domain_vuart_init(d)) )
         goto fail;
 
+    if ( (rc = domain_vpci_init(d)) != 0 )
+        goto fail;
+
     return 0;
 
 fail:
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index c5afbe2e05..f4c89bde8c 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -3053,6 +3053,9 @@  void __init create_dom0(void)
     if ( iommu_enabled )
         dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
 
+    if ( is_pci_passthrough_enabled() )
+        dom0_cfg.flags |= XEN_DOMCTL_CDF_vpci;
+
     dom0 = domain_create(0, &dom0_cfg, true);
     if ( IS_ERR(dom0) || (alloc_dom0_vcpu0(dom0) == NULL) )
         panic("Error creating domain 0\n");
diff --git a/xen/arch/arm/vpci.c b/xen/arch/arm/vpci.c
new file mode 100644
index 0000000000..76c12b9281
--- /dev/null
+++ b/xen/arch/arm/vpci.c
@@ -0,0 +1,102 @@ 
+/*
+ * xen/arch/arm/vpci.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#include <xen/sched.h>
+
+#include <asm/mmio.h>
+
+#define REGISTER_OFFSET(addr)  ( (addr) & 0x00000fff)
+
+/* Do some sanity checks. */
+static bool vpci_mmio_access_allowed(unsigned int reg, unsigned int len)
+{
+    /* Check access size. */
+    if ( len > 8 )
+        return false;
+
+    /* Check that access is size aligned. */
+    if ( (reg & (len - 1)) )
+        return false;
+
+    return true;
+}
+
+static int vpci_mmio_read(struct vcpu *v, mmio_info_t *info,
+                          register_t *r, void *p)
+{
+    unsigned int reg;
+    pci_sbdf_t sbdf;
+    unsigned long data = ~0UL;
+    unsigned int size = 1U << info->dabt.size;
+
+    sbdf.sbdf = MMCFG_BDF(info->gpa);
+    reg = REGISTER_OFFSET(info->gpa);
+
+    if ( !vpci_mmio_access_allowed(reg, size) )
+        return 0;
+
+    data = vpci_read(sbdf, reg, min(4u, size));
+    if ( size == 8 )
+        data |= (uint64_t)vpci_read(sbdf, reg + 4, 4) << 32;
+
+    *r = data;
+
+    return 1;
+}
+
+static int vpci_mmio_write(struct vcpu *v, mmio_info_t *info,
+                           register_t r, void *p)
+{
+    unsigned int reg;
+    pci_sbdf_t sbdf;
+    unsigned long data = r;
+    unsigned int size = 1U << info->dabt.size;
+
+    sbdf.sbdf = MMCFG_BDF(info->gpa);
+    reg = REGISTER_OFFSET(info->gpa);
+
+    if ( !vpci_mmio_access_allowed(reg, size) )
+        return 0;
+
+    vpci_write(sbdf, reg, min(4u, size), data);
+    if ( size == 8 )
+        vpci_write(sbdf, reg + 4, 4, data >> 32);
+
+    return 1;
+}
+
+static const struct mmio_handler_ops vpci_mmio_handler = {
+    .read  = vpci_mmio_read,
+    .write = vpci_mmio_write,
+};
+
+int domain_vpci_init(struct domain *d)
+{
+    if ( !has_vpci(d) )
+        return 0;
+
+    register_mmio_handler(d, &vpci_mmio_handler,
+                          GUEST_VPCI_ECAM_BASE, GUEST_VPCI_ECAM_SIZE, NULL);
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
+
diff --git a/xen/arch/arm/vpci.h b/xen/arch/arm/vpci.h
new file mode 100644
index 0000000000..d8a7b0e3e8
--- /dev/null
+++ b/xen/arch/arm/vpci.h
@@ -0,0 +1,36 @@ 
+/*
+ * xen/arch/arm/vpci.h
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __ARCH_ARM_VPCI_H__
+#define __ARCH_ARM_VPCI_H__
+
+#ifdef CONFIG_HAS_VPCI
+int domain_vpci_init(struct domain *d);
+#else
+static inline int domain_vpci_init(struct domain *d)
+{
+    return 0;
+}
+#endif
+
+#endif /* __ARCH_ARM_VPCI_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index e1b735d9e8..e568457e60 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -766,6 +766,24 @@  int pci_add_device(u16 seg, u8 bus, u8 devfn,
     else
         iommu_enable_device(pdev);
 
+#ifdef CONFIG_ARM
+    /*
+     * On ARM PCI devices discovery will be done by Dom0. Add vpci handler when
+     * Dom0 inform XEN to add the PCI devices in XEN.
+     */
+    ret = vpci_add_handlers(pdev);
+    if ( ret )
+    {
+        printk(XENLOG_ERR "setup of vPCI failed: %d\n", ret);
+        pci_cleanup_msi(pdev);
+        ret = iommu_remove_device(pdev);
+        if ( pdev->domain )
+            list_del(&pdev->domain_list);
+        free_pdev(pseg, pdev);
+        goto out;
+    }
+#endif
+
     pci_enable_acs(pdev);
 
 out:
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index c9277b5c6d..91d614b37e 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -2,6 +2,7 @@ 
 #define __ASM_DOMAIN_H__
 
 #include <xen/cache.h>
+#include <xen/nospec.h>
 #include <xen/timer.h>
 #include <asm/page.h>
 #include <asm/p2m.h>
@@ -262,7 +263,11 @@  static inline void arch_vcpu_block(struct vcpu *v) {}
 
 #define arch_vm_assist_valid_mask(d) (1UL << VMASST_TYPE_runstate_update_flag)
 
-#define has_vpci(d)    ({ (void)(d); false; })
+/*
+ * For X86 VPCI is enabled and tested for PVH DOM0 only but
+ * for ARM we enable support VPCI for guest domain also.
+ */
+#define has_vpci(d) evaluate_nospec((d)->options & XEN_DOMCTL_CDF_vpci)
 
 #endif /* __ASM_DOMAIN_H__ */
 
diff --git a/xen/include/asm-x86/pci.h b/xen/include/asm-x86/pci.h
index e076951032..c4a4fdcbc2 100644
--- a/xen/include/asm-x86/pci.h
+++ b/xen/include/asm-x86/pci.h
@@ -6,8 +6,6 @@ 
 #define CF8_ADDR_HI(cf8) (  ((cf8) & 0x0f000000) >> 16)
 #define CF8_ENABLED(cf8) (!!((cf8) & 0x80000000))
 
-#define MMCFG_BDF(addr)  ( ((addr) & 0x0ffff000) >> 12)
-
 #define IS_SNB_GFX(id) (id == 0x01068086 || id == 0x01168086 \
                         || id == 0x01268086 || id == 0x01028086 \
                         || id == 0x01128086 || id == 0x01228086 \
diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index d46c61fca9..44be337dec 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -418,6 +418,13 @@  typedef uint64_t xen_callback_t;
 #define GUEST_GICV3_GICR0_BASE     xen_mk_ullong(0x03020000) /* vCPU0..127 */
 #define GUEST_GICV3_GICR0_SIZE     xen_mk_ullong(0x01000000)
 
+/*
+ * 256 MB is reserved for VPCI configuration space based on calculation
+ * 256 buses × 32 devices × 8 functions × 4 KB = 256 MB
+ */
+#define GUEST_VPCI_ECAM_BASE    xen_mk_ullong(0x10000000)
+#define GUEST_VPCI_ECAM_SIZE    xen_mk_ullong(0x10000000)
+
 /* ACPI tables physical address */
 #define GUEST_ACPI_BASE xen_mk_ullong(0x20000000)
 #define GUEST_ACPI_SIZE xen_mk_ullong(0x02000000)
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 70ac25345c..43b8a08170 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -40,6 +40,8 @@ 
 #define PCI_SBDF3(s,b,df) \
     ((pci_sbdf_t){ .sbdf = (((s) & 0xffff) << 16) | PCI_BDF2(b, df) })
 
+#define MMCFG_BDF(addr)  (((addr) & 0x0ffff000) >> 12)
+
 typedef union {
     uint32_t sbdf;
     struct {