diff mbox series

[2/2] tools/light: Revoke permissions when a PCI detach for HVM domain

Message ID 20230809103305.30561-3-julien@xen.org (mailing list archive)
State New, archived
Headers show
Series Revoke IOMEM/IO port/IRQ permissions on PCI detach for HVM guest | expand

Commit Message

Julien Grall Aug. 9, 2023, 10:33 a.m. UTC
From: Julien Grall <jgrall@amazon.com>

Currently, libxl will grant IOMEM, I/O port and IRQ permissions when
a PCI is attached (see pci_add_dm_done()) for all domain types. However,
the permissions are only revoked for non-HVM domain (see do_pci_remove()).

This means that HVM domains will be left with extra permissions. While
this look bad on the paper, the IRQ permissions should be revoked
when the Device Model call xc_physdev_unmap_pirq() and such domain
cannot directly mapped I/O port and IOMEM regions. Instead, this has to
be done by a Device Model.

The Device Model can only run in dom0 or PV stubdomain (upstream libxl
doesn't have support for HVM/PVH stubdomain).

For PV/PVH stubdomain, the permission are properly revoked, so there is
no security concern.

This leaves dom0. There are two cases:
  1) Privileged: Anyone gaining access to the Device Model would already
     have large control on the host.
  2) Deprivileged: PCI passthrough require PHYSDEV operations which
     are not accessible when the Device Model is restricted.

So overall, it is believed that the extra permissions cannot be exploited.

Rework the code so the permissions are all removed for HVM domains.
This needs to happen after the QEMU has detached the device. So
the revocation is now moved in a separate function which is called
from pci_remove_detached().

Signed-off-by: Julien Grall <jgrall@amazon.com>

---

TODO: I am getting a bit confused with the async work in libxl. I am
not entirely sure whether pci_remove_detached() is the correct place
to revoke.

TODO: For HVM, we are now getting the following error on detach:
libxl: error: libxl_pci.c:2009:pci_revoke_permissions: Domain 3:xc_physdev_unmap_pirq irq=23: Invalid argument

This is because the IRQ was unmapped by QEMU. It doesn't feel
right to skip the call. So maybe we can ignore the error?
---
 tools/libs/light/libxl_pci.c | 142 ++++++++++++++++++++---------------
 1 file changed, 80 insertions(+), 62 deletions(-)

Comments

Jason Andryuk Aug. 10, 2023, 8:02 p.m. UTC | #1
On Wed, Aug 9, 2023 at 6:34 AM Julien Grall <julien@xen.org> wrote:
>
> From: Julien Grall <jgrall@amazon.com>
>
> Currently, libxl will grant IOMEM, I/O port and IRQ permissions when
> a PCI is attached (see pci_add_dm_done()) for all domain types. However,
> the permissions are only revoked for non-HVM domain (see do_pci_remove()).
>
> This means that HVM domains will be left with extra permissions. While
> this look bad on the paper, the IRQ permissions should be revoked
> when the Device Model call xc_physdev_unmap_pirq() and such domain
> cannot directly mapped I/O port and IOMEM regions. Instead, this has to
> be done by a Device Model.
>
> The Device Model can only run in dom0 or PV stubdomain (upstream libxl
> doesn't have support for HVM/PVH stubdomain).
>
> For PV/PVH stubdomain, the permission are properly revoked, so there is
> no security concern.
>
> This leaves dom0. There are two cases:
>   1) Privileged: Anyone gaining access to the Device Model would already
>      have large control on the host.
>   2) Deprivileged: PCI passthrough require PHYSDEV operations which
>      are not accessible when the Device Model is restricted.
>
> So overall, it is believed that the extra permissions cannot be exploited.
>
> Rework the code so the permissions are all removed for HVM domains.
> This needs to happen after the QEMU has detached the device. So
> the revocation is now moved in a separate function which is called
> from pci_remove_detached().
>
> Signed-off-by: Julien Grall <jgrall@amazon.com>

Reviewed-by: Jason Andryuk <jandryuk@gmail.com>

With one suggestion below

> ---
>
> TODO: I am getting a bit confused with the async work in libxl. I am
> not entirely sure whether pci_remove_detached() is the correct place
> to revoke.

I think the location is fine.

> TODO: For HVM, we are now getting the following error on detach:
> libxl: error: libxl_pci.c:2009:pci_revoke_permissions: Domain 3:xc_physdev_unmap_pirq irq=23: Invalid argument
>
> This is because the IRQ was unmapped by QEMU. It doesn't feel
> right to skip the call. So maybe we can ignore the error?

Sounds reasonable.  Would be better if we could clearly differentiate
between QEMU already unmapped and some other EINVAL error.

> ---
>  tools/libs/light/libxl_pci.c | 142 ++++++++++++++++++++---------------
>  1 file changed, 80 insertions(+), 62 deletions(-)
>
> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index 7f5f170e6eb0..f5a4b88eb2c0 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1943,6 +1943,79 @@ static void pci_remove_stubdom_done(libxl__egc *egc,
>  static void pci_remove_done(libxl__egc *egc,
>      pci_remove_state *prs, int rc);
>
> +static void pci_revoke_permissions(libxl__egc *egc, pci_remove_state *prs)
> +{
> +    STATE_AO_GC(prs->aodev->ao);
> +    libxl_ctx *ctx = libxl__gc_owner(gc);
> +    const libxl_device_pci *pci = &prs->pci;
> +    const char *sysfs_path;
> +    uint32_t domid = prs->domid;
> +    FILE *f;
> +    unsigned int start = 0, end = 0, flags = 0, size = 0;

These variables ...

> +    int irq = 0;
> +    int i, rc;
> +
> +    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/resource",
> +                           pci->domain, pci->bus, pci->dev, pci->func);
> +
> +    f = fopen(sysfs_path, "r");
> +    if (f == NULL) {
> +        LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
> +        goto skip_bar;
> +    }
> +
> +    for (i = 0; i < PROC_PCI_NUM_RESOURCES; i++) {

... could move into the loop here.

> +        if (fscanf(f, "0x%x 0x%x 0x%x\n", &start, &end, &flags) != 3)
> +            continue;
> +        size = end - start + 1;

Regards,
Jason
Julien Grall Aug. 11, 2023, 10:32 a.m. UTC | #2
Hi Jason,

On 10/08/2023 21:02, Jason Andryuk wrote:
> On Wed, Aug 9, 2023 at 6:34 AM Julien Grall <julien@xen.org> wrote:
>>
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Currently, libxl will grant IOMEM, I/O port and IRQ permissions when
>> a PCI is attached (see pci_add_dm_done()) for all domain types. However,
>> the permissions are only revoked for non-HVM domain (see do_pci_remove()).
>>
>> This means that HVM domains will be left with extra permissions. While
>> this look bad on the paper, the IRQ permissions should be revoked
>> when the Device Model call xc_physdev_unmap_pirq() and such domain
>> cannot directly mapped I/O port and IOMEM regions. Instead, this has to
>> be done by a Device Model.
>>
>> The Device Model can only run in dom0 or PV stubdomain (upstream libxl
>> doesn't have support for HVM/PVH stubdomain).
>>
>> For PV/PVH stubdomain, the permission are properly revoked, so there is
>> no security concern.
>>
>> This leaves dom0. There are two cases:
>>    1) Privileged: Anyone gaining access to the Device Model would already
>>       have large control on the host.
>>    2) Deprivileged: PCI passthrough require PHYSDEV operations which
>>       are not accessible when the Device Model is restricted.
>>
>> So overall, it is believed that the extra permissions cannot be exploited.
>>
>> Rework the code so the permissions are all removed for HVM domains.
>> This needs to happen after the QEMU has detached the device. So
>> the revocation is now moved in a separate function which is called
>> from pci_remove_detached().
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>

Thanks for the review!

> 
> With one suggestion below
> 
>> ---
>>
>> TODO: I am getting a bit confused with the async work in libxl. I am
>> not entirely sure whether pci_remove_detached() is the correct place
>> to revoke.
> 
> I think the location is fine.
> 
>> TODO: For HVM, we are now getting the following error on detach:
>> libxl: error: libxl_pci.c:2009:pci_revoke_permissions: Domain 3:xc_physdev_unmap_pirq irq=23: Invalid argument
>>
>> This is because the IRQ was unmapped by QEMU. It doesn't feel
>> right to skip the call. So maybe we can ignore the error?
> 
> Sounds reasonable.  Would be better if we could clearly differentiate
> between QEMU already unmapped and some other EINVAL error.

The only way I could think is to have a distinct errno in Xen to 
indicate the PIRQ is not mapped. CCing the x86 maintainers to check if 
they would be ok with that.

> 
>> ---
>>   tools/libs/light/libxl_pci.c | 142 ++++++++++++++++++++---------------
>>   1 file changed, 80 insertions(+), 62 deletions(-)
>>
>> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
>> index 7f5f170e6eb0..f5a4b88eb2c0 100644
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -1943,6 +1943,79 @@ static void pci_remove_stubdom_done(libxl__egc *egc,
>>   static void pci_remove_done(libxl__egc *egc,
>>       pci_remove_state *prs, int rc);
>>
>> +static void pci_revoke_permissions(libxl__egc *egc, pci_remove_state *prs)
>> +{
>> +    STATE_AO_GC(prs->aodev->ao);
>> +    libxl_ctx *ctx = libxl__gc_owner(gc);
>> +    const libxl_device_pci *pci = &prs->pci;
>> +    const char *sysfs_path;
>> +    uint32_t domid = prs->domid;
>> +    FILE *f;
>> +    unsigned int start = 0, end = 0, flags = 0, size = 0;
> 
> These variables ...
> 
>> +    int irq = 0;
>> +    int i, rc;
>> +
>> +    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/resource",
>> +                           pci->domain, pci->bus, pci->dev, pci->func);
>> +
>> +    f = fopen(sysfs_path, "r");
>> +    if (f == NULL) {
>> +        LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
>> +        goto skip_bar;
>> +    }
>> +
>> +    for (i = 0; i < PROC_PCI_NUM_RESOURCES; i++) {
> 
> ... could move into the loop here.

You are right. I will respin the patch with this change and whatever we 
decide for the unmap error.

Cheers,
Anthony PERARD Aug. 18, 2023, 5:04 p.m. UTC | #3
On Wed, Aug 09, 2023 at 11:33:05AM +0100, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
> 
> Currently, libxl will grant IOMEM, I/O port and IRQ permissions when
> a PCI is attached (see pci_add_dm_done()) for all domain types. However,
> the permissions are only revoked for non-HVM domain (see do_pci_remove()).
> 
> This means that HVM domains will be left with extra permissions. While
> this look bad on the paper, the IRQ permissions should be revoked
> when the Device Model call xc_physdev_unmap_pirq() and such domain
> cannot directly mapped I/O port and IOMEM regions. Instead, this has to
> be done by a Device Model.
> 
> The Device Model can only run in dom0 or PV stubdomain (upstream libxl
> doesn't have support for HVM/PVH stubdomain).
> 
> For PV/PVH stubdomain, the permission are properly revoked, so there is
> no security concern.
> 
> This leaves dom0. There are two cases:
>   1) Privileged: Anyone gaining access to the Device Model would already
>      have large control on the host.
>   2) Deprivileged: PCI passthrough require PHYSDEV operations which
>      are not accessible when the Device Model is restricted.
> 
> So overall, it is believed that the extra permissions cannot be exploited.
> 
> Rework the code so the permissions are all removed for HVM domains.
> This needs to happen after the QEMU has detached the device. So
> the revocation is now moved in a separate function which is called
> from pci_remove_detached().
> 
> Signed-off-by: Julien Grall <jgrall@amazon.com>
> 
> ---
> 
> TODO: I am getting a bit confused with the async work in libxl. I am
> not entirely sure whether pci_remove_detached() is the correct place
> to revoke.

Whenever an async task in libxl takes more than one function to
complete, the next function (or callback) that is going to be executed
is further down in the current source file (usually). This is to try to
avoid too much confusion when reading through a set of async calls. So
pci_remove_detached() is after all the DM stuff are done, and it's
before we deal with stubdom case which will go through these step again,
so it seems appropriate.

So, this new pci_revoke_permissions() function been place before
do_pci_remove() will make it harder to follow what do_pci_remove() does.
Does it need to be a separate function? Can't you inline it in
pci_remove_detached() ?

If it does needs to be a separate function, a better way to lay it down
would be to replace calls to pci_remove_detached() by
pci_revoke_permissions() as appropriate, and rename it with the prefixed
"pci_remove_", that is pci_remove_revoke_permissions().

> TODO: For HVM, we are now getting the following error on detach:
> libxl: error: libxl_pci.c:2009:pci_revoke_permissions: Domain 3:xc_physdev_unmap_pirq irq=23: Invalid argument
> 
> This is because the IRQ was unmapped by QEMU. It doesn't feel
> right to skip the call. So maybe we can ignore the error?

The error is already ignore. But I guess you just want to skip writing
an error message. But I think we should still write something, at least
a DEBUG message. Also add a comment that QEMU also unmap it, so errors
are expected.

> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index 7f5f170e6eb0..f5a4b88eb2c0 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1980,75 +2052,19 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
>              prs->xswait.timeout_ms = LIBXL_DEVICE_MODEL_START_TIMEOUT * 1000;
>              prs->xswait.callback = pci_remove_qemu_trad_watch_state_cb;
>              rc = libxl__xswait_start(gc, &prs->xswait);
> -            if (rc) goto out_fail;
> -            return;
> +            if (!rc) return;

This is confusing, we usually check for error condition in libxl, not
success condition. So the currently written code is better.

> +            break;
>          case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
>              pci_remove_qmp_device_del(egc, prs); /* must be last */
>              return;
>          default:
>              rc = ERROR_INVAL;
> -            goto out_fail;
> +            break;

You can keep the goto here, this is the usual way to deal with error.
(except a label named "out" would be more appropriate, but out_fail is
fine).

>          }
>      } else {
> +        rc = 0;

You don't need to set rc in the else block and just set it after the if.
The true block of the "if(hvm)" can skip to out_fail on error to avoid
the rc=0. That's an expected pattern in libxl.


>      }
> -skip_irq:
> -    rc = 0;
> +
>  out_fail:
>      pci_remove_detached(egc, prs, rc); /* must be last */
>  }
> @@ -2242,6 +2258,8 @@ static void pci_remove_detached(libxl__egc *egc,
>      if (rc && !prs->force)
>          goto out;
>  
> +    pci_revoke_permissions(egc, prs);
> +
>      isstubdom = libxl_is_stubdom(CTX, domid, &domainid);
>  
>      /* don't do multiple resets while some functions are still passed through */

Thanks,
Julien Grall Aug. 24, 2023, 11:15 a.m. UTC | #4
Hi Anthony,

On 18/08/2023 18:04, Anthony PERARD wrote:
> On Wed, Aug 09, 2023 at 11:33:05AM +0100, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Currently, libxl will grant IOMEM, I/O port and IRQ permissions when
>> a PCI is attached (see pci_add_dm_done()) for all domain types. However,
>> the permissions are only revoked for non-HVM domain (see do_pci_remove()).
>>
>> This means that HVM domains will be left with extra permissions. While
>> this look bad on the paper, the IRQ permissions should be revoked
>> when the Device Model call xc_physdev_unmap_pirq() and such domain
>> cannot directly mapped I/O port and IOMEM regions. Instead, this has to
>> be done by a Device Model.
>>
>> The Device Model can only run in dom0 or PV stubdomain (upstream libxl
>> doesn't have support for HVM/PVH stubdomain).
>>
>> For PV/PVH stubdomain, the permission are properly revoked, so there is
>> no security concern.
>>
>> This leaves dom0. There are two cases:
>>    1) Privileged: Anyone gaining access to the Device Model would already
>>       have large control on the host.
>>    2) Deprivileged: PCI passthrough require PHYSDEV operations which
>>       are not accessible when the Device Model is restricted.
>>
>> So overall, it is believed that the extra permissions cannot be exploited.
>>
>> Rework the code so the permissions are all removed for HVM domains.
>> This needs to happen after the QEMU has detached the device. So
>> the revocation is now moved in a separate function which is called
>> from pci_remove_detached().
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>
>> ---
>>
>> TODO: I am getting a bit confused with the async work in libxl. I am
>> not entirely sure whether pci_remove_detached() is the correct place
>> to revoke.
> 
> Whenever an async task in libxl takes more than one function to
> complete, the next function (or callback) that is going to be executed
> is further down in the current source file (usually). This is to try to
> avoid too much confusion when reading through a set of async calls. So
> pci_remove_detached() is after all the DM stuff are done, and it's
> before we deal with stubdom case which will go through these step again,
> so it seems appropriate.

Ah I didn't realize there was a logic in the ordering. This will help to 
understand the code in the future.

> 
> So, this new pci_revoke_permissions() function been place before
> do_pci_remove() will make it harder to follow what do_pci_remove() does.
> Does it need to be a separate function? Can't you inline it in
> pci_remove_detached() ?

I decided to go with an inline function to avoid increasing the size of 
pci_remove_detached() and also separate the logic from cleaning-up QMP 
an resetting the PCI device.

> 
> If it does needs to be a separate function, a better way to lay it down
> would be to replace calls to pci_remove_detached() by
> pci_revoke_permissions() as appropriate, and rename it with the prefixed
> "pci_remove_", that is pci_remove_revoke_permissions().

I don't understand this suggestion. pci_revoke_permissions() is called 
right in the middle of pci_remove_detached(). So it is not clear how it 
can be called ahead.

Also, if I replace pci_remove_detached() with pci_revoke_permissions(), 
does this mean you are expecting the latter to call the former?

> 
>> TODO: For HVM, we are now getting the following error on detach:
>> libxl: error: libxl_pci.c:2009:pci_revoke_permissions: Domain 3:xc_physdev_unmap_pirq irq=23: Invalid argument
>>
>> This is because the IRQ was unmapped by QEMU. It doesn't feel
>> right to skip the call. So maybe we can ignore the error?
> 
> The error is already ignore. But I guess you just want to skip writing
> an error message. But I think we should still write something, at least
> a DEBUG message. Also add a comment that QEMU also unmap it, so errors
> are expected.
> 
>> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
>> index 7f5f170e6eb0..f5a4b88eb2c0 100644
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -1980,75 +2052,19 @@ static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
>>               prs->xswait.timeout_ms = LIBXL_DEVICE_MODEL_START_TIMEOUT * 1000;
>>               prs->xswait.callback = pci_remove_qemu_trad_watch_state_cb;
>>               rc = libxl__xswait_start(gc, &prs->xswait);
>> -            if (rc) goto out_fail;
>> -            return;
>> +            if (!rc) return;
> 
> This is confusing, we usually check for error condition in libxl, not
> success condition. So the currently written code is better.
> 
>> +            break;
>>           case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
>>               pci_remove_qmp_device_del(egc, prs); /* must be last */
>>               return;
>>           default:
>>               rc = ERROR_INVAL;
>> -            goto out_fail;
>> +            break;
> 
> You can keep the goto here, this is the usual way to deal with error. > (except a label named "out" would be more appropriate, but out_fail is
> fine).
> 
>>           }
>>       } else {
>> +        rc = 0;
> 
> You don't need to set rc in the else block and just set it after the if.

This was done this way because the code after could be called from an 
error path. So setting 'rc = 0' would be wrong.

Anyway, above, you said 'goto' is the way to deal with error in libxl. 
So it would be possible to use 'rc = 0' below.

Cheers,
Anthony PERARD Aug. 24, 2023, 4:34 p.m. UTC | #5
On Thu, Aug 24, 2023 at 12:15:39PM +0100, Julien Grall wrote:
> On 18/08/2023 18:04, Anthony PERARD wrote:
> > So, this new pci_revoke_permissions() function been place before
> > do_pci_remove() will make it harder to follow what do_pci_remove() does.
> > Does it need to be a separate function? Can't you inline it in
> > pci_remove_detached() ?
> 
> I decided to go with an inline function to avoid increasing the size of
> pci_remove_detached() and also separate the logic from cleaning-up QMP an
> resetting the PCI device.

It's fine to have a long function, if there's that much to do. You can
add a comment if you want to separate a bit from the rest.

Having a new function would be ok if it was used from different places,
or if it was needed for a new callback in the chain of callbacks of a
long-running operation.

> > 
> > If it does needs to be a separate function, a better way to lay it down
> > would be to replace calls to pci_remove_detached() by
> > pci_revoke_permissions() as appropriate, and rename it with the prefixed
> > "pci_remove_", that is pci_remove_revoke_permissions().
> 
> I don't understand this suggestion. pci_revoke_permissions() is called right
> in the middle of pci_remove_detached(). So it is not clear how it can be
> called ahead.
> 
> Also, if I replace pci_remove_detached() with pci_revoke_permissions(), does
> this mean you are expecting the latter to call the former?

Let's just keep things simpler, and just add the code into
pci_remove_detached().

Cheers,
Julien Grall Aug. 24, 2023, 4:46 p.m. UTC | #6
Hi Anthony,

On 24/08/2023 17:34, Anthony PERARD wrote:
> On Thu, Aug 24, 2023 at 12:15:39PM +0100, Julien Grall wrote:
>> On 18/08/2023 18:04, Anthony PERARD wrote:
>>> So, this new pci_revoke_permissions() function been place before
>>> do_pci_remove() will make it harder to follow what do_pci_remove() does.
>>> Does it need to be a separate function? Can't you inline it in
>>> pci_remove_detached() ?
>>
>> I decided to go with an inline function to avoid increasing the size of
>> pci_remove_detached() and also separate the logic from cleaning-up QMP an
>> resetting the PCI device.
> 
> It's fine to have a long function, if there's that much to do. You can
> add a comment if you want to separate a bit from the rest.
> 
> Having a new function would be ok if it was used from different places,
> or if it was needed for a new callback in the chain of callbacks of a
> long-running operation.

I don't agree with this definition on when to create a new function. 
Smaller functions help to logically split your code and also avoids to 
have to define 20 local variables right at the beginning of the function 
(this is what will happen with folding the function) among other advantages.

> 
>>>
>>> If it does needs to be a separate function, a better way to lay it down
>>> would be to replace calls to pci_remove_detached() by
>>> pci_revoke_permissions() as appropriate, and rename it with the prefixed
>>> "pci_remove_", that is pci_remove_revoke_permissions().
>>
>> I don't understand this suggestion. pci_revoke_permissions() is called right
>> in the middle of pci_remove_detached(). So it is not clear how it can be
>> called ahead.
>>
>> Also, if I replace pci_remove_detached() with pci_revoke_permissions(), does
>> this mean you are expecting the latter to call the former?
> 
> Let's just keep things simpler, and just add the code into
> pci_remove_detached().

I will attempt to fold the code. But I am not convinced about the 
simplicity and readability.

Cheers,
Anthony PERARD Aug. 24, 2023, 4:58 p.m. UTC | #7
On Thu, Aug 24, 2023 at 05:46:45PM +0100, Julien Grall wrote:
> Hi Anthony,
> 
> On 24/08/2023 17:34, Anthony PERARD wrote:
> > On Thu, Aug 24, 2023 at 12:15:39PM +0100, Julien Grall wrote:
> > > On 18/08/2023 18:04, Anthony PERARD wrote:
> > > > So, this new pci_revoke_permissions() function been place before
> > > > do_pci_remove() will make it harder to follow what do_pci_remove() does.
> > > > Does it need to be a separate function? Can't you inline it in
> > > > pci_remove_detached() ?
> > > 
> > > I decided to go with an inline function to avoid increasing the size of
> > > pci_remove_detached() and also separate the logic from cleaning-up QMP an
> > > resetting the PCI device.
> > 
> > It's fine to have a long function, if there's that much to do. You can
> > add a comment if you want to separate a bit from the rest.
> > 
> > Having a new function would be ok if it was used from different places,
> > or if it was needed for a new callback in the chain of callbacks of a
> > long-running operation.
> 
> I don't agree with this definition on when to create a new function. Smaller
> functions help to logically split your code and also avoids to have to
> define 20 local variables right at the beginning of the function (this is
> what will happen with folding the function) among other advantages.

You can always create a new block {} in the function, if that help with
local variables.

Cheers,
Julien Grall Aug. 24, 2023, 5:27 p.m. UTC | #8
On 24/08/2023 17:58, Anthony PERARD wrote:
> On Thu, Aug 24, 2023 at 05:46:45PM +0100, Julien Grall wrote:
>> Hi Anthony,
>>
>> On 24/08/2023 17:34, Anthony PERARD wrote:
>>> On Thu, Aug 24, 2023 at 12:15:39PM +0100, Julien Grall wrote:
>>>> On 18/08/2023 18:04, Anthony PERARD wrote:
>>>>> So, this new pci_revoke_permissions() function been place before
>>>>> do_pci_remove() will make it harder to follow what do_pci_remove() does.
>>>>> Does it need to be a separate function? Can't you inline it in
>>>>> pci_remove_detached() ?
>>>>
>>>> I decided to go with an inline function to avoid increasing the size of
>>>> pci_remove_detached() and also separate the logic from cleaning-up QMP an
>>>> resetting the PCI device.
>>>
>>> It's fine to have a long function, if there's that much to do. You can
>>> add a comment if you want to separate a bit from the rest.
>>>
>>> Having a new function would be ok if it was used from different places,
>>> or if it was needed for a new callback in the chain of callbacks of a
>>> long-running operation.
>>
>> I don't agree with this definition on when to create a new function. Smaller
>> functions help to logically split your code and also avoids to have to
>> define 20 local variables right at the beginning of the function (this is
>> what will happen with folding the function) among other advantages.
> 
> You can always create a new block {} in the function, if that help with
> local variables.

I thought about it... But this is just a function in disguised with 
downside (the extra layer of indentation).

I was actually going to try the folding version. But given this 
suggestion, I am now struggling to understand why this is a problem to 
have a function only called once. This is fairly common in Xen 
Hypervisor and I can see at least one instance in libxl (see 
libxl_pci_assignable()). So what is the rationale behind this request?

I would also like the opinion from others (such as Juergen) before going 
ahead with any changes.

Cheers,
Anthony PERARD Sept. 11, 2023, 11:03 a.m. UTC | #9
On Thu, Aug 24, 2023 at 06:27:12PM +0100, Julien Grall wrote:
> On 24/08/2023 17:58, Anthony PERARD wrote:
> > On Thu, Aug 24, 2023 at 05:46:45PM +0100, Julien Grall wrote:
> > > On 24/08/2023 17:34, Anthony PERARD wrote:
> > > > On Thu, Aug 24, 2023 at 12:15:39PM +0100, Julien Grall wrote:
> > > > > On 18/08/2023 18:04, Anthony PERARD wrote:
> > > > > > So, this new pci_revoke_permissions() function been place before
> > > > > > do_pci_remove() will make it harder to follow what do_pci_remove() does.
> > > > > > Does it need to be a separate function? Can't you inline it in
> > > > > > pci_remove_detached() ?
> > > > > 
> > > > > I decided to go with an inline function to avoid increasing the size of
> > > > > pci_remove_detached() and also separate the logic from cleaning-up QMP an
> > > > > resetting the PCI device.
> > > > 
> > > > It's fine to have a long function, if there's that much to do. You can
> > > > add a comment if you want to separate a bit from the rest.
> > > > 
> > > > Having a new function would be ok if it was used from different places,
> > > > or if it was needed for a new callback in the chain of callbacks of a
> > > > long-running operation.
> > > 
> > > I don't agree with this definition on when to create a new function. Smaller
> > > functions help to logically split your code and also avoids to have to
> > > define 20 local variables right at the beginning of the function (this is
> > > what will happen with folding the function) among other advantages.
> > 
> > You can always create a new block {} in the function, if that help with
> > local variables.
> 
> I thought about it... But this is just a function in disguised with downside
> (the extra layer of indentation).
> 
> I was actually going to try the folding version. But given this suggestion,
> I am now struggling to understand why this is a problem to have a function
> only called once. This is fairly common in Xen Hypervisor and I can see at
> least one instance in libxl (see libxl_pci_assignable()). So what is the
> rationale behind this request?

It is to try to keep the code laid out in chronological order for this
async operation. See CODING_STYLE L153:
    https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libs/light/CODING_STYLE;h=b1a6a45c74df083cdd793e5cb6a67a76cb5c1174;hb=refs/heads/master#l153

(libxl_pci_assignable() isn't a good example, as it should be part of
the API ;-) )

Cheers,
diff mbox series

Patch

diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 7f5f170e6eb0..f5a4b88eb2c0 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -1943,6 +1943,79 @@  static void pci_remove_stubdom_done(libxl__egc *egc,
 static void pci_remove_done(libxl__egc *egc,
     pci_remove_state *prs, int rc);
 
+static void pci_revoke_permissions(libxl__egc *egc, pci_remove_state *prs)
+{
+    STATE_AO_GC(prs->aodev->ao);
+    libxl_ctx *ctx = libxl__gc_owner(gc);
+    const libxl_device_pci *pci = &prs->pci;
+    const char *sysfs_path;
+    uint32_t domid = prs->domid;
+    FILE *f;
+    unsigned int start = 0, end = 0, flags = 0, size = 0;
+    int irq = 0;
+    int i, rc;
+
+    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/resource",
+                           pci->domain, pci->bus, pci->dev, pci->func);
+
+    f = fopen(sysfs_path, "r");
+    if (f == NULL) {
+        LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
+        goto skip_bar;
+    }
+
+    for (i = 0; i < PROC_PCI_NUM_RESOURCES; i++) {
+        if (fscanf(f, "0x%x 0x%x 0x%x\n", &start, &end, &flags) != 3)
+            continue;
+        size = end - start + 1;
+        if (start) {
+            if (flags & PCI_BAR_IO) {
+                rc = xc_domain_ioport_permission(ctx->xch, domid, start, size, 0);
+                if (rc < 0)
+                    LOGED(ERROR, domid,
+                          "xc_domain_ioport_permission error 0x%x/0x%x",
+                          start,
+                          size);
+            } else {
+                rc = xc_domain_iomem_permission(ctx->xch, domid, start>>XC_PAGE_SHIFT,
+                                                (size+(XC_PAGE_SIZE-1))>>XC_PAGE_SHIFT, 0);
+                if (rc < 0)
+                    LOGED(ERROR, domid,
+                          "xc_domain_iomem_permission error 0x%x/0x%x",
+                          start,
+                          size);
+            }
+        }
+    }
+    fclose(f);
+
+skip_bar:
+    if (!pci_supp_legacy_irq())
+        return;
+
+    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
+                           pci->bus, pci->dev, pci->func);
+
+    f = fopen(sysfs_path, "r");
+    if (f == NULL) {
+        LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
+        return;
+    }
+
+    if ((fscanf(f, "%u", &irq) == 1) && irq) {
+        rc = xc_physdev_unmap_pirq(ctx->xch, domid, irq);
+        if (rc < 0) {
+            LOGED(ERROR, domid, "xc_physdev_unmap_pirq irq=%d", irq);
+        }
+        rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
+        if (rc < 0) {
+            LOGED(ERROR, domid, "xc_domain_irq_permission irq=%d", irq);
+        }
+    }
+
+    fclose(f);
+}
+
 static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
 {
     STATE_AO_GC(prs->aodev->ao);
@@ -1968,7 +2041,6 @@  static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
         goto out_fail;
     }
 
-    rc = ERROR_FAIL;
     if (type == LIBXL_DOMAIN_TYPE_HVM) {
         prs->hvm = true;
         switch (libxl__device_model_version_running(gc, domid)) {
@@ -1980,75 +2052,19 @@  static void do_pci_remove(libxl__egc *egc, pci_remove_state *prs)
             prs->xswait.timeout_ms = LIBXL_DEVICE_MODEL_START_TIMEOUT * 1000;
             prs->xswait.callback = pci_remove_qemu_trad_watch_state_cb;
             rc = libxl__xswait_start(gc, &prs->xswait);
-            if (rc) goto out_fail;
-            return;
+            if (!rc) return;
+            break;
         case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
             pci_remove_qmp_device_del(egc, prs); /* must be last */
             return;
         default:
             rc = ERROR_INVAL;
-            goto out_fail;
+            break;
         }
     } else {
-        char *sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/resource", pci->domain,
-                                     pci->bus, pci->dev, pci->func);
-        FILE *f = fopen(sysfs_path, "r");
-        unsigned int start = 0, end = 0, flags = 0, size = 0;
-        int irq = 0;
-        int i;
-
-        if (f == NULL) {
-            LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
-            goto skip1;
-        }
-        for (i = 0; i < PROC_PCI_NUM_RESOURCES; i++) {
-            if (fscanf(f, "0x%x 0x%x 0x%x\n", &start, &end, &flags) != 3)
-                continue;
-            size = end - start + 1;
-            if (start) {
-                if (flags & PCI_BAR_IO) {
-                    rc = xc_domain_ioport_permission(ctx->xch, domid, start, size, 0);
-                    if (rc < 0)
-                        LOGED(ERROR, domid,
-                              "xc_domain_ioport_permission error 0x%x/0x%x",
-                              start,
-                              size);
-                } else {
-                    rc = xc_domain_iomem_permission(ctx->xch, domid, start>>XC_PAGE_SHIFT,
-                                                    (size+(XC_PAGE_SIZE-1))>>XC_PAGE_SHIFT, 0);
-                    if (rc < 0)
-                        LOGED(ERROR, domid,
-                              "xc_domain_iomem_permission error 0x%x/0x%x",
-                              start,
-                              size);
-                }
-            }
-        }
-        fclose(f);
-skip1:
-        if (!pci_supp_legacy_irq())
-            goto skip_irq;
-        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
-                               pci->bus, pci->dev, pci->func);
-        f = fopen(sysfs_path, "r");
-        if (f == NULL) {
-            LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
-            goto skip_irq;
-        }
-        if ((fscanf(f, "%u", &irq) == 1) && irq) {
-            rc = xc_physdev_unmap_pirq(ctx->xch, domid, irq);
-            if (rc < 0) {
-                LOGED(ERROR, domid, "xc_physdev_unmap_pirq irq=%d", irq);
-            }
-            rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
-            if (rc < 0) {
-                LOGED(ERROR, domid, "xc_domain_irq_permission irq=%d", irq);
-            }
-        }
-        fclose(f);
+        rc = 0;
     }
-skip_irq:
-    rc = 0;
+
 out_fail:
     pci_remove_detached(egc, prs, rc); /* must be last */
 }
@@ -2242,6 +2258,8 @@  static void pci_remove_detached(libxl__egc *egc,
     if (rc && !prs->force)
         goto out;
 
+    pci_revoke_permissions(egc, prs);
+
     isstubdom = libxl_is_stubdom(CTX, domid, &domainid);
 
     /* don't do multiple resets while some functions are still passed through */