diff mbox series

[v2,2/8] vpci/header: Emulate legacy capability list for host

Message ID 20250409064528.405573-3-Jiqian.Chen@amd.com (mailing list archive)
State New
Headers show
Series Support hiding capability when its initialization fails | expand

Commit Message

Chen, Jiqian April 9, 2025, 6:45 a.m. UTC
Current logic of init_header() only emulates legacy capability list
for guest, expand it to emulate for host too. So that it will be
easy to hide a capability whose initialization fails and no need
to distinguish host or guest.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
v1->v2 changes:
new patch

Best regards,
Jiqian Chen.
---
 xen/drivers/vpci/header.c | 139 ++++++++++++++++++++------------------
 1 file changed, 74 insertions(+), 65 deletions(-)

Comments

Jan Beulich April 10, 2025, 12:40 p.m. UTC | #1
On 09.04.2025 08:45, Jiqian Chen wrote:
> Current logic of init_header() only emulates legacy capability list
> for guest, expand it to emulate for host too. So that it will be
> easy to hide a capability whose initialization fails and no need
> to distinguish host or guest.

Before I (maybe) look at the patch as a hole: When you say "guest" you
mean "DomU" (which is a common pair of synonyms we use), whereas when
you say "host" you mean "Dom0"? The latter is not a common pair of
synonyms; "host" generally covers the entire system, including Xen and
all domains. See e.g. wording that we use in XSAs.

Jan
Chen, Jiqian April 11, 2025, 2:54 a.m. UTC | #2
On 2025/4/10 20:40, Jan Beulich wrote:
> On 09.04.2025 08:45, Jiqian Chen wrote:
>> Current logic of init_header() only emulates legacy capability list
>> for guest, expand it to emulate for host too. So that it will be
>> easy to hide a capability whose initialization fails and no need
>> to distinguish host or guest.
> 
> Before I (maybe) look at the patch as a hole: When you say "guest" you
> mean "DomU" (which is a common pair of synonyms we use), whereas when
> you say "host" you mean "Dom0"? The latter is not a common pair of
> synonyms; "host" generally covers the entire system, including Xen and
> all domains. See e.g. wording that we use in XSAs.
Yes, guest means domU, host means dom0(or hardware domain?), I will change my word in next version
Thank you!

> 
> Jan
Jan Beulich April 11, 2025, 8:01 a.m. UTC | #3
On 11.04.2025 04:54, Chen, Jiqian wrote:
> On 2025/4/10 20:40, Jan Beulich wrote:
>> On 09.04.2025 08:45, Jiqian Chen wrote:
>>> Current logic of init_header() only emulates legacy capability list
>>> for guest, expand it to emulate for host too. So that it will be
>>> easy to hide a capability whose initialization fails and no need
>>> to distinguish host or guest.
>>
>> Before I (maybe) look at the patch as a hole: When you say "guest" you
>> mean "DomU" (which is a common pair of synonyms we use), whereas when
>> you say "host" you mean "Dom0"? The latter is not a common pair of
>> synonyms; "host" generally covers the entire system, including Xen and
>> all domains. See e.g. wording that we use in XSAs.
> Yes, guest means domU, host means dom0(or hardware domain?),

We're less strict about that latter distinction - in most contexts
dom0 == hwdom is assumed.

> I will change my word in next version

Thanks.

Jan
Roger Pau Monné April 15, 2025, 9:25 a.m. UTC | #4
On Wed, Apr 09, 2025 at 02:45:22PM +0800, Jiqian Chen wrote:
> Current logic of init_header() only emulates legacy capability list
> for guest, expand it to emulate for host too. So that it will be
> easy to hide a capability whose initialization fails and no need
> to distinguish host or guest.

It might be best if the initial code movement of the logic in
init_header() into it's own separate function was done as a
non-functional change, and a later patch added support for dom0.

It's easier to then spot the differences that you are adding to
support dom0.

> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> ---
> v1->v2 changes:
> new patch
> 
> Best regards,
> Jiqian Chen.
> ---
>  xen/drivers/vpci/header.c | 139 ++++++++++++++++++++------------------
>  1 file changed, 74 insertions(+), 65 deletions(-)
> 
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index ef6c965c081c..0910eb940e23 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -745,6 +745,76 @@ static int bar_add_rangeset(const struct pci_dev *pdev, struct vpci_bar *bar,
>      return !bar->mem ? -ENOMEM : 0;
>  }
>  
> +/* These capabilities can be exposed to the guest, that vPCI can handle. */
> +static const unsigned int guest_supported_caps[] = {
> +    PCI_CAP_ID_MSI,
> +    PCI_CAP_ID_MSIX,
> +};

Is there a reason this needs to be defined outside of the function
scope?  So far it's only used by vpci_init_capability_list().

> +
> +static int vpci_init_capability_list(struct pci_dev *pdev)
> +{
> +    int rc;
> +    bool mask_cap_list = false;
> +    bool is_hwdom = is_hardware_domain(pdev->domain);
> +    const unsigned int *caps = is_hwdom ? NULL : guest_supported_caps;
> +    const unsigned int n = is_hwdom ? 0 : ARRAY_SIZE(guest_supported_caps);
> +
> +    if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
> +    {
> +        unsigned int next, ttl = 48;
> +
> +        next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
> +                                     caps, n, &ttl);
> +
> +        rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> +                               PCI_CAPABILITY_LIST, 1,
> +                               (void *)(uintptr_t)next);
> +        if ( rc )
> +            return rc;
> +
> +        next &= ~3;
> +
> +        if ( !next && !is_hwdom )
> +            /*
> +             * If we don't have any supported capabilities to expose to the
> +             * guest, mask the PCI_STATUS_CAP_LIST bit in the status register.
> +             */
> +            mask_cap_list = true;
> +
> +        while ( next && ttl )
> +        {
> +            unsigned int pos = next;
> +
> +            next = pci_find_next_cap_ttl(pdev->sbdf, pos + PCI_CAP_LIST_NEXT,
> +                                         caps, n, &ttl);
> +
> +            rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> +                                   pos + PCI_CAP_LIST_ID, 1, NULL);

There's no need to add this handler for the hardware domain, that's
already the default behavior in that case.

Thanks, Roger.
Chen, Jiqian April 15, 2025, 10:07 a.m. UTC | #5
On 2025/4/15 17:25, Roger Pau Monné wrote:
> On Wed, Apr 09, 2025 at 02:45:22PM +0800, Jiqian Chen wrote:
>> Current logic of init_header() only emulates legacy capability list
>> for guest, expand it to emulate for host too. So that it will be
>> easy to hide a capability whose initialization fails and no need
>> to distinguish host or guest.
> 
> It might be best if the initial code movement of the logic in
> init_header() into it's own separate function was done as a
> non-functional change, and a later patch added support for dom0.
> 
> It's easier to then spot the differences that you are adding to
> support dom0.
Got it, I will re-arrange my patch in next version.

> 
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>> cc: "Roger Pau Monné" <roger.pau@citrix.com>
>> ---
>> v1->v2 changes:
>> new patch
>>
>> Best regards,
>> Jiqian Chen.
>> ---
>>  xen/drivers/vpci/header.c | 139 ++++++++++++++++++++------------------
>>  1 file changed, 74 insertions(+), 65 deletions(-)
>>
>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>> index ef6c965c081c..0910eb940e23 100644
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -745,6 +745,76 @@ static int bar_add_rangeset(const struct pci_dev *pdev, struct vpci_bar *bar,
>>      return !bar->mem ? -ENOMEM : 0;
>>  }
>>  
>> +/* These capabilities can be exposed to the guest, that vPCI can handle. */
>> +static const unsigned int guest_supported_caps[] = {
>> +    PCI_CAP_ID_MSI,
>> +    PCI_CAP_ID_MSIX,
>> +};
> 
> Is there a reason this needs to be defined outside of the function
> scope?  So far it's only used by vpci_init_capability_list().
Because, for dom0 I don't need to pass this array, so I need to set below parameter "caps" to be NULL or guest_supported_caps according to the type of domain.
If inside the function, I can't to do that since "caps" is const, I think.
Do you have any suggestions?

> 
>> +
>> +static int vpci_init_capability_list(struct pci_dev *pdev)
>> +{
>> +    int rc;
>> +    bool mask_cap_list = false;
>> +    bool is_hwdom = is_hardware_domain(pdev->domain);
>> +    const unsigned int *caps = is_hwdom ? NULL : guest_supported_caps;
>> +    const unsigned int n = is_hwdom ? 0 : ARRAY_SIZE(guest_supported_caps);
>> +
>> +    if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
>> +    {
>> +        unsigned int next, ttl = 48;
>> +
>> +        next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
>> +                                     caps, n, &ttl);
>> +
>> +        rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
>> +                               PCI_CAPABILITY_LIST, 1,
>> +                               (void *)(uintptr_t)next);
>> +        if ( rc )
>> +            return rc;
>> +
>> +        next &= ~3;
>> +
>> +        if ( !next && !is_hwdom )
>> +            /*
>> +             * If we don't have any supported capabilities to expose to the
>> +             * guest, mask the PCI_STATUS_CAP_LIST bit in the status register.
>> +             */
>> +            mask_cap_list = true;
>> +
>> +        while ( next && ttl )
>> +        {
>> +            unsigned int pos = next;
>> +
>> +            next = pci_find_next_cap_ttl(pdev->sbdf, pos + PCI_CAP_LIST_NEXT,
>> +                                         caps, n, &ttl);
>> +
>> +            rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
>> +                                   pos + PCI_CAP_LIST_ID, 1, NULL);
> 
> There's no need to add this handler for the hardware domain, that's
> already the default behavior in that case.
But if not, I have no handler to remove from capability list in next patch's hiding function vpci_capability_mask(),
then I can't success to hide it.

> 
> Thanks, Roger.
Jan Beulich April 15, 2025, 10:08 a.m. UTC | #6
On 15.04.2025 12:07, Chen, Jiqian wrote:
> On 2025/4/15 17:25, Roger Pau Monné wrote:
>> On Wed, Apr 09, 2025 at 02:45:22PM +0800, Jiqian Chen wrote:
>>> --- a/xen/drivers/vpci/header.c
>>> +++ b/xen/drivers/vpci/header.c
>>> @@ -745,6 +745,76 @@ static int bar_add_rangeset(const struct pci_dev *pdev, struct vpci_bar *bar,
>>>      return !bar->mem ? -ENOMEM : 0;
>>>  }
>>>  
>>> +/* These capabilities can be exposed to the guest, that vPCI can handle. */
>>> +static const unsigned int guest_supported_caps[] = {
>>> +    PCI_CAP_ID_MSI,
>>> +    PCI_CAP_ID_MSIX,
>>> +};
>>
>> Is there a reason this needs to be defined outside of the function
>> scope?  So far it's only used by vpci_init_capability_list().
> Because, for dom0 I don't need to pass this array, so I need to set below parameter "caps" to be NULL or guest_supported_caps according to the type of domain.
> If inside the function, I can't to do that since "caps" is const, I think.

Why would the more narrow scope of the array affect how it can be used?

Jan
Roger Pau Monné April 15, 2025, 10:10 a.m. UTC | #7
On Wed, Apr 09, 2025 at 02:45:22PM +0800, Jiqian Chen wrote:
> Current logic of init_header() only emulates legacy capability list
> for guest, expand it to emulate for host too. So that it will be
> easy to hide a capability whose initialization fails and no need
> to distinguish host or guest.
> 
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> ---
> v1->v2 changes:
> new patch
> 
> Best regards,
> Jiqian Chen.
> ---
>  xen/drivers/vpci/header.c | 139 ++++++++++++++++++++------------------
>  1 file changed, 74 insertions(+), 65 deletions(-)
> 
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index ef6c965c081c..0910eb940e23 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -745,6 +745,76 @@ static int bar_add_rangeset(const struct pci_dev *pdev, struct vpci_bar *bar,
>      return !bar->mem ? -ENOMEM : 0;
>  }
>  
> +/* These capabilities can be exposed to the guest, that vPCI can handle. */
> +static const unsigned int guest_supported_caps[] = {
> +    PCI_CAP_ID_MSI,
> +    PCI_CAP_ID_MSIX,
> +};
> +
> +static int vpci_init_capability_list(struct pci_dev *pdev)
> +{
> +    int rc;
> +    bool mask_cap_list = false;
> +    bool is_hwdom = is_hardware_domain(pdev->domain);
> +    const unsigned int *caps = is_hwdom ? NULL : guest_supported_caps;
> +    const unsigned int n = is_hwdom ? 0 : ARRAY_SIZE(guest_supported_caps);
> +
> +    if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
> +    {
> +        unsigned int next, ttl = 48;
> +
> +        next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
> +                                     caps, n, &ttl);
> +
> +        rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> +                               PCI_CAPABILITY_LIST, 1,
> +                               (void *)(uintptr_t)next);
> +        if ( rc )
> +            return rc;
> +
> +        next &= ~3;
> +
> +        if ( !next && !is_hwdom )
> +            /*
> +             * If we don't have any supported capabilities to expose to the
> +             * guest, mask the PCI_STATUS_CAP_LIST bit in the status register.
> +             */
> +            mask_cap_list = true;
> +
> +        while ( next && ttl )
> +        {
> +            unsigned int pos = next;
> +
> +            next = pci_find_next_cap_ttl(pdev->sbdf, pos + PCI_CAP_LIST_NEXT,
> +                                         caps, n, &ttl);
> +
> +            rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> +                                   pos + PCI_CAP_LIST_ID, 1, NULL);
> +            if ( rc )
> +                return rc;
> +
> +            rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> +                                   pos + PCI_CAP_LIST_NEXT, 1,
> +                                   (void *)(uintptr_t)next);
> +            if ( rc )
> +                return rc;
> +
> +            next &= ~3;
> +        }
> +    }
> +
> +    /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
> +    rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
> +                                PCI_STATUS, 2, NULL,
> +                                PCI_STATUS_RO_MASK &
> +                                    ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
> +                                PCI_STATUS_RW1C_MASK,
> +                                mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
> +                                PCI_STATUS_RSVDZ_MASK);

One further remark I've forgot to make on the previous reply: I'm
unsure we want to do this for dom0, as we in general allow dom0
unfiltered write access (unless it's for accesses mediated by Xen).

Thanks, Roger.
Chen, Jiqian April 15, 2025, 10:23 a.m. UTC | #8
On 2025/4/15 18:08, Jan Beulich wrote:
> On 15.04.2025 12:07, Chen, Jiqian wrote:
>> On 2025/4/15 17:25, Roger Pau Monné wrote:
>>> On Wed, Apr 09, 2025 at 02:45:22PM +0800, Jiqian Chen wrote:
>>>> --- a/xen/drivers/vpci/header.c
>>>> +++ b/xen/drivers/vpci/header.c
>>>> @@ -745,6 +745,76 @@ static int bar_add_rangeset(const struct pci_dev *pdev, struct vpci_bar *bar,
>>>>      return !bar->mem ? -ENOMEM : 0;
>>>>  }
>>>>  
>>>> +/* These capabilities can be exposed to the guest, that vPCI can handle. */
>>>> +static const unsigned int guest_supported_caps[] = {
>>>> +    PCI_CAP_ID_MSI,
>>>> +    PCI_CAP_ID_MSIX,
>>>> +};
>>>
>>> Is there a reason this needs to be defined outside of the function
>>> scope?  So far it's only used by vpci_init_capability_list().
>> Because, for dom0 I don't need to pass this array, so I need to set below parameter "caps" to be NULL or guest_supported_caps according to the type of domain.
>> If inside the function, I can't to do that since "caps" is const, I think.
> 
> Why would the more narrow scope of the array affect how it can be used?
Sorry, I have thought about it again and it can be put it inside the function. I will change in the next version.

> 
> Jan
Roger Pau Monné April 15, 2025, 10:50 a.m. UTC | #9
On Tue, Apr 15, 2025 at 10:07:14AM +0000, Chen, Jiqian wrote:
> On 2025/4/15 17:25, Roger Pau Monné wrote:
> > On Wed, Apr 09, 2025 at 02:45:22PM +0800, Jiqian Chen wrote:
> >> +static int vpci_init_capability_list(struct pci_dev *pdev)
> >> +{
> >> +    int rc;
> >> +    bool mask_cap_list = false;
> >> +    bool is_hwdom = is_hardware_domain(pdev->domain);
> >> +    const unsigned int *caps = is_hwdom ? NULL : guest_supported_caps;
> >> +    const unsigned int n = is_hwdom ? 0 : ARRAY_SIZE(guest_supported_caps);
> >> +
> >> +    if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
> >> +    {
> >> +        unsigned int next, ttl = 48;
> >> +
> >> +        next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
> >> +                                     caps, n, &ttl);
> >> +
> >> +        rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> >> +                               PCI_CAPABILITY_LIST, 1,
> >> +                               (void *)(uintptr_t)next);
> >> +        if ( rc )
> >> +            return rc;
> >> +
> >> +        next &= ~3;
> >> +
> >> +        if ( !next && !is_hwdom )
> >> +            /*
> >> +             * If we don't have any supported capabilities to expose to the
> >> +             * guest, mask the PCI_STATUS_CAP_LIST bit in the status register.
> >> +             */
> >> +            mask_cap_list = true;
> >> +
> >> +        while ( next && ttl )
> >> +        {
> >> +            unsigned int pos = next;
> >> +
> >> +            next = pci_find_next_cap_ttl(pdev->sbdf, pos + PCI_CAP_LIST_NEXT,
> >> +                                         caps, n, &ttl);
> >> +
> >> +            rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> >> +                                   pos + PCI_CAP_LIST_ID, 1, NULL);
> > 
> > There's no need to add this handler for the hardware domain, that's
> > already the default behavior in that case.
> But if not, I have no handler to remove from capability list in next patch's hiding function vpci_capability_mask(),
> then I can't success to hide it.

Oh, I see, I have further comments on that approach, see the comments
on the followup patches.

Thanks, Roger.
Chen, Jiqian April 16, 2025, 2:51 a.m. UTC | #10
On 2025/4/15 18:10, Roger Pau Monné wrote:
> On Wed, Apr 09, 2025 at 02:45:22PM +0800, Jiqian Chen wrote:
>> +        while ( next && ttl )
>> +        {
>> +            unsigned int pos = next;
>> +
>> +            next = pci_find_next_cap_ttl(pdev->sbdf, pos + PCI_CAP_LIST_NEXT,
>> +                                         caps, n, &ttl);
>> +
>> +            rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
>> +                                   pos + PCI_CAP_LIST_ID, 1, NULL);
>> +            if ( rc )
>> +                return rc;
>> +
>> +            rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
>> +                                   pos + PCI_CAP_LIST_NEXT, 1,
>> +                                   (void *)(uintptr_t)next);
>> +            if ( rc )
>> +                return rc;
>> +
>> +            next &= ~3;
>> +        }
>> +    }
>> +
>> +    /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
>> +    rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
>> +                                PCI_STATUS, 2, NULL,
>> +                                PCI_STATUS_RO_MASK &
>> +                                    ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
>> +                                PCI_STATUS_RW1C_MASK,
>> +                                mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
>> +                                PCI_STATUS_RSVDZ_MASK);
> 
> One further remark I've forgot to make on the previous reply: I'm
> unsure we want to do this for dom0, as we in general allow dom0
> unfiltered write access (unless it's for accesses mediated by Xen).
This part is the original implementation before my patches.
If you want to restrict this only for domUs, not for dom0.
I will add a new first patch to do that.

> 
> Thanks, Roger.
Roger Pau Monné April 16, 2025, 7:32 a.m. UTC | #11
On Wed, Apr 16, 2025 at 02:51:18AM +0000, Chen, Jiqian wrote:
> On 2025/4/15 18:10, Roger Pau Monné wrote:
> > On Wed, Apr 09, 2025 at 02:45:22PM +0800, Jiqian Chen wrote:
> >> +        while ( next && ttl )
> >> +        {
> >> +            unsigned int pos = next;
> >> +
> >> +            next = pci_find_next_cap_ttl(pdev->sbdf, pos + PCI_CAP_LIST_NEXT,
> >> +                                         caps, n, &ttl);
> >> +
> >> +            rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
> >> +                                   pos + PCI_CAP_LIST_ID, 1, NULL);
> >> +            if ( rc )
> >> +                return rc;
> >> +
> >> +            rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
> >> +                                   pos + PCI_CAP_LIST_NEXT, 1,
> >> +                                   (void *)(uintptr_t)next);
> >> +            if ( rc )
> >> +                return rc;
> >> +
> >> +            next &= ~3;
> >> +        }
> >> +    }
> >> +
> >> +    /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
> >> +    rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
> >> +                                PCI_STATUS, 2, NULL,
> >> +                                PCI_STATUS_RO_MASK &
> >> +                                    ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
> >> +                                PCI_STATUS_RW1C_MASK,
> >> +                                mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
> >> +                                PCI_STATUS_RSVDZ_MASK);
> > 
> > One further remark I've forgot to make on the previous reply: I'm
> > unsure we want to do this for dom0, as we in general allow dom0
> > unfiltered write access (unless it's for accesses mediated by Xen).
> This part is the original implementation before my patches.
> If you want to restrict this only for domUs, not for dom0.

Oh, indeed.  I was confused thinking it was inside the
!is_hardware_domain() conditional.  Leave it like that, your series is
already long enough.

Thanks, Roger.
diff mbox series

Patch

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index ef6c965c081c..0910eb940e23 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -745,6 +745,76 @@  static int bar_add_rangeset(const struct pci_dev *pdev, struct vpci_bar *bar,
     return !bar->mem ? -ENOMEM : 0;
 }
 
+/* These capabilities can be exposed to the guest, that vPCI can handle. */
+static const unsigned int guest_supported_caps[] = {
+    PCI_CAP_ID_MSI,
+    PCI_CAP_ID_MSIX,
+};
+
+static int vpci_init_capability_list(struct pci_dev *pdev)
+{
+    int rc;
+    bool mask_cap_list = false;
+    bool is_hwdom = is_hardware_domain(pdev->domain);
+    const unsigned int *caps = is_hwdom ? NULL : guest_supported_caps;
+    const unsigned int n = is_hwdom ? 0 : ARRAY_SIZE(guest_supported_caps);
+
+    if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
+    {
+        unsigned int next, ttl = 48;
+
+        next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
+                                     caps, n, &ttl);
+
+        rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
+                               PCI_CAPABILITY_LIST, 1,
+                               (void *)(uintptr_t)next);
+        if ( rc )
+            return rc;
+
+        next &= ~3;
+
+        if ( !next && !is_hwdom )
+            /*
+             * If we don't have any supported capabilities to expose to the
+             * guest, mask the PCI_STATUS_CAP_LIST bit in the status register.
+             */
+            mask_cap_list = true;
+
+        while ( next && ttl )
+        {
+            unsigned int pos = next;
+
+            next = pci_find_next_cap_ttl(pdev->sbdf, pos + PCI_CAP_LIST_NEXT,
+                                         caps, n, &ttl);
+
+            rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
+                                   pos + PCI_CAP_LIST_ID, 1, NULL);
+            if ( rc )
+                return rc;
+
+            rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
+                                   pos + PCI_CAP_LIST_NEXT, 1,
+                                   (void *)(uintptr_t)next);
+            if ( rc )
+                return rc;
+
+            next &= ~3;
+        }
+    }
+
+    /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
+    rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
+                                PCI_STATUS, 2, NULL,
+                                PCI_STATUS_RO_MASK &
+                                    ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
+                                PCI_STATUS_RW1C_MASK,
+                                mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
+                                PCI_STATUS_RSVDZ_MASK);
+
+    return rc;
+}
+
 static int cf_check init_header(struct pci_dev *pdev)
 {
     uint16_t cmd;
@@ -753,7 +823,6 @@  static int cf_check init_header(struct pci_dev *pdev)
     struct vpci_header *header = &pdev->vpci->header;
     struct vpci_bar *bars = header->bars;
     int rc;
-    bool mask_cap_list = false;
     bool is_hwdom = is_hardware_domain(pdev->domain);
 
     ASSERT(rw_is_write_locked(&pdev->domain->pci_lock));
@@ -794,61 +863,12 @@  static int cf_check init_header(struct pci_dev *pdev)
     if ( rc )
         return rc;
 
+    rc = vpci_init_capability_list(pdev);
+    if ( rc )
+        return rc;
+
     if ( !is_hwdom )
     {
-        if ( pci_conf_read16(pdev->sbdf, PCI_STATUS) & PCI_STATUS_CAP_LIST )
-        {
-            /* Only expose capabilities to the guest that vPCI can handle. */
-            unsigned int next, ttl = 48;
-            static const unsigned int supported_caps[] = {
-                PCI_CAP_ID_MSI,
-                PCI_CAP_ID_MSIX,
-            };
-
-            next = pci_find_next_cap_ttl(pdev->sbdf, PCI_CAPABILITY_LIST,
-                                         supported_caps,
-                                         ARRAY_SIZE(supported_caps), &ttl);
-
-            rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
-                                   PCI_CAPABILITY_LIST, 1,
-                                   (void *)(uintptr_t)next);
-            if ( rc )
-                return rc;
-
-            next &= ~3;
-
-            if ( !next )
-                /*
-                 * If we don't have any supported capabilities to expose to the
-                 * guest, mask the PCI_STATUS_CAP_LIST bit in the status
-                 * register.
-                 */
-                mask_cap_list = true;
-
-            while ( next && ttl )
-            {
-                unsigned int pos = next;
-
-                next = pci_find_next_cap_ttl(pdev->sbdf,
-                                             pos + PCI_CAP_LIST_NEXT,
-                                             supported_caps,
-                                             ARRAY_SIZE(supported_caps), &ttl);
-
-                rc = vpci_add_register(pdev->vpci, vpci_hw_read8, NULL,
-                                       pos + PCI_CAP_LIST_ID, 1, NULL);
-                if ( rc )
-                    return rc;
-
-                rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL,
-                                       pos + PCI_CAP_LIST_NEXT, 1,
-                                       (void *)(uintptr_t)next);
-                if ( rc )
-                    return rc;
-
-                next &= ~3;
-            }
-        }
-
         /* Extended capabilities read as zero, write ignore */
         rc = vpci_add_register(pdev->vpci, vpci_read_val, NULL, 0x100, 4,
                                (void *)0);
@@ -856,17 +876,6 @@  static int cf_check init_header(struct pci_dev *pdev)
             return rc;
     }
 
-    /* Utilize rsvdp_mask to hide PCI_STATUS_CAP_LIST from the guest. */
-    rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16,
-                                PCI_STATUS, 2, NULL,
-                                PCI_STATUS_RO_MASK &
-                                    ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0),
-                                PCI_STATUS_RW1C_MASK,
-                                mask_cap_list ? PCI_STATUS_CAP_LIST : 0,
-                                PCI_STATUS_RSVDZ_MASK);
-    if ( rc )
-        return rc;
-
     if ( pdev->ignore_bars )
         return 0;