diff mbox series

[3/3] hvm/pirq: allow control domains usage of PHYSDEVOP_{un,}map_pirq

Message ID 20220303103057.49181-4-roger.pau@citrix.com (mailing list archive)
State New, archived
Headers show
Series x86/hvm: PIRQ related cleanup and a fix | expand

Commit Message

Roger Pau Monné March 3, 2022, 10:30 a.m. UTC
Control domains (including domains having control over a single other
guest) need access to PHYSDEVOP_{un,}map_pirq in order to setup
bindings of interrupts from devices assigned to the controlled guest.

As such relax the check for HVM based guests and allow the usage of
the hypercalls for any control domains. Note that further safety
checks will be performed in order to assert that the current domain
has the right permissions against the target of the hypercall.

Reported-by: Alex Olson <this.is.a0lson@gmail.com>
Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/hypercall.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Andrew Cooper March 3, 2022, 10:45 a.m. UTC | #1
On 03/03/2022 10:30, Roger Pau Monne wrote:
> Control domains (including domains having control over a single other
> guest) need access to PHYSDEVOP_{un,}map_pirq in order to setup
> bindings of interrupts from devices assigned to the controlled guest.
>
> As such relax the check for HVM based guests and allow the usage of
> the hypercalls for any control domains. Note that further safety
> checks will be performed in order to assert that the current domain
> has the right permissions against the target of the hypercall.
>
> Reported-by: Alex Olson <this.is.a0lson@gmail.com>
> Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
>  xen/arch/x86/hvm/hypercall.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> index 030243810e..9128e4d025 100644
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -87,6 +87,13 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>      {
>      case PHYSDEVOP_map_pirq:
>      case PHYSDEVOP_unmap_pirq:
> +        /*
> +         * Control domain (and domains controlling others) need to use
> +         * PHYSDEVOP_{un,}map_pirq in order to setup interrupts for passthrough
> +         * devices on behalf of other guests.
> +         */
> +        if ( is_control_domain(currd) || currd->target )
> +            break;

Hmm.  In a split control/hardware domain model, then qemu is in the
hardware domain rather than the control domain.  I suspect this wants
extending with || is_hardware_domain(currd).

Also, the sentence about later safety checks really ought to be in this
source comment too.

~Andrew
Jan Beulich March 3, 2022, 10:59 a.m. UTC | #2
On 03.03.2022 11:45, Andrew Cooper wrote:
> On 03/03/2022 10:30, Roger Pau Monne wrote:
>> --- a/xen/arch/x86/hvm/hypercall.c
>> +++ b/xen/arch/x86/hvm/hypercall.c
>> @@ -87,6 +87,13 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>      {
>>      case PHYSDEVOP_map_pirq:
>>      case PHYSDEVOP_unmap_pirq:
>> +        /*
>> +         * Control domain (and domains controlling others) need to use
>> +         * PHYSDEVOP_{un,}map_pirq in order to setup interrupts for passthrough
>> +         * devices on behalf of other guests.
>> +         */
>> +        if ( is_control_domain(currd) || currd->target )
>> +            break;
> 
> Hmm.  In a split control/hardware domain model, then qemu is in the
> hardware domain rather than the control domain.

Interesting. I would have expected it to be the other way around, with
qemu for domains with pass-through devices living in a stubdom.

Jan

>  I suspect this wants
> extending with || is_hardware_domain(currd).
> 
> Also, the sentence about later safety checks really ought to be in this
> source comment too.
> 
> ~Andrew
Roger Pau Monné March 3, 2022, 11:40 a.m. UTC | #3
On Thu, Mar 03, 2022 at 10:45:54AM +0000, Andrew Cooper wrote:
> On 03/03/2022 10:30, Roger Pau Monne wrote:
> > Control domains (including domains having control over a single other
> > guest) need access to PHYSDEVOP_{un,}map_pirq in order to setup
> > bindings of interrupts from devices assigned to the controlled guest.
> >
> > As such relax the check for HVM based guests and allow the usage of
> > the hypercalls for any control domains. Note that further safety
> > checks will be performed in order to assert that the current domain
> > has the right permissions against the target of the hypercall.
> >
> > Reported-by: Alex Olson <this.is.a0lson@gmail.com>
> > Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> >  xen/arch/x86/hvm/hypercall.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> > index 030243810e..9128e4d025 100644
> > --- a/xen/arch/x86/hvm/hypercall.c
> > +++ b/xen/arch/x86/hvm/hypercall.c
> > @@ -87,6 +87,13 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> >      {
> >      case PHYSDEVOP_map_pirq:
> >      case PHYSDEVOP_unmap_pirq:
> > +        /*
> > +         * Control domain (and domains controlling others) need to use
> > +         * PHYSDEVOP_{un,}map_pirq in order to setup interrupts for passthrough
> > +         * devices on behalf of other guests.
> > +         */
> > +        if ( is_control_domain(currd) || currd->target )
> > +            break;
> 
> Hmm.  In a split control/hardware domain model, then qemu is in the
> hardware domain rather than the control domain.  I suspect this wants
> extending with || is_hardware_domain(currd).

The binding of GSIs is exclusively done by the control domain because
those are static and known at domain creation.  The mapping and
binding of MSI interrupts is however done by QEMU at runtime, so it
needs extending to the hardware domain.

However just extending here won't be enough: we would also need to
modify xsm_default_action, as currently XSM_DM_PRIV will only be
allowed if src->target == target or is_control_domain(src).

diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index 58afc1d589..ac40a24a22 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -88,7 +88,8 @@ static always_inline int xsm_default_action(
         }
         /* fall through */
     case XSM_DM_PRIV:
-        if ( target && evaluate_nospec(src->target == target) )
+        if ( is_hardware_domain(src) ||
+             (target && evaluate_nospec(src->target == target)) )
             return 0;
         /* fall through */
     case XSM_PRIV:

That would however also give the hardware domain access to XSM_TARGET
and XSM_XS_PRIV operations that where previously forbidden.

Or do we just require people with split control/hardware domain model
to also use a different XSM policy?

> Also, the sentence about later safety checks really ought to be in this
> source comment too.

Will add.

Thanks, Roger.
Jan Beulich March 3, 2022, 11:43 a.m. UTC | #4
On 03.03.2022 12:40, Roger Pau Monné wrote:
> Or do we just require people with split control/hardware domain model
> to also use a different XSM policy?

I've been under the impression that this is the intended model, yes.

Jan
Alex Olson March 3, 2022, 4:45 p.m. UTC | #5
Hi Roger,

Thanks for the patches.  In trying them out, I found some other PHYSDEVOP
commands that were being blocked by the "default" case and were being failed
with -ENOSYS... 

Would something like the change below make sense?  Or is defaulting to failure
incorrect?   (I saw denials for the "add" commands, and also added the
remove/release commands for symmetry).

With this change, I was able to achieve a functional virtual function passed
through to a HVM domain with PVH dom0.

Thanks

-Alex


diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index b8becab475..6abaa626a3 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -84,6 +84,17 @@ static long hvm_physdev_op(int cmd,
XEN_GUEST_HANDLE_PARAM(void) arg)
 
     switch ( cmd )
     {
+
+    case PHYSDEVOP_manage_pci_add:
+    case PHYSDEVOP_manage_pci_remove:
+    case PHYSDEVOP_pci_device_add:
+    case PHYSDEVOP_pci_device_remove:
+    case PHYSDEVOP_manage_pci_add_ext:
+    case PHYSDEVOP_prepare_msix:
+    case PHYSDEVOP_release_msix:
+        if ( is_control_domain(currd) )
+            break;
+
     case PHYSDEVOP_map_pirq:
     case PHYSDEVOP_unmap_pirq:
         /*



On Thu, 2022-03-03 at 11:30 +0100, Roger Pau Monne wrote:
> Control domains (including domains having control over a single other
> guest) need access to PHYSDEVOP_{un,}map_pirq in order to setup
> bindings of interrupts from devices assigned to the controlled guest.
> 
> As such relax the check for HVM based guests and allow the usage of
> the hypercalls for any control domains. Note that further safety
> checks will be performed in order to assert that the current domain
> has the right permissions against the target of the hypercall.
> 
> Reported-by: Alex Olson <this.is.a0lson@gmail.com>
> Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
>  xen/arch/x86/hvm/hypercall.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
> index 030243810e..9128e4d025 100644
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -87,6 +87,13 @@ static long hvm_physdev_op(int cmd,
> XEN_GUEST_HANDLE_PARAM(void) arg)
>      {
>      case PHYSDEVOP_map_pirq:
>      case PHYSDEVOP_unmap_pirq:
> +        /*
> +         * Control domain (and domains controlling others) need to use
> +         * PHYSDEVOP_{un,}map_pirq in order to setup interrupts for
> passthrough
> +         * devices on behalf of other guests.
> +         */
> +        if ( is_control_domain(currd) || currd->target )
> +            break;
>      case PHYSDEVOP_eoi:
>      case PHYSDEVOP_irq_status_query:
>      case PHYSDEVOP_get_free_pirq:
Jan Beulich March 3, 2022, 4:47 p.m. UTC | #6
On 03.03.2022 17:45, Alex Olson wrote:
> --- a/xen/arch/x86/hvm/hypercall.c
> +++ b/xen/arch/x86/hvm/hypercall.c
> @@ -84,6 +84,17 @@ static long hvm_physdev_op(int cmd,
> XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>      switch ( cmd )
>      {
> +
> +    case PHYSDEVOP_manage_pci_add:
> +    case PHYSDEVOP_manage_pci_remove:
> +    case PHYSDEVOP_pci_device_add:
> +    case PHYSDEVOP_pci_device_remove:
> +    case PHYSDEVOP_manage_pci_add_ext:
> +    case PHYSDEVOP_prepare_msix:
> +    case PHYSDEVOP_release_msix:
> +        if ( is_control_domain(currd) )
> +            break;

These are all operations which I think are purposefully permitted to
be invoked by the hardware domain only. That's where all the devices
live when they're not passed through to guests.

Jan
Alex Olson March 3, 2022, 5:14 p.m. UTC | #7
I wasn't sure of the distinction between hardware domain and control domain for these commands, but they appear to be blocked at the moment when dom0 executes them, including a lot at boot.  Are you suggesting to use is_hardware_domain(currd) instead in my diff?    

Or should the hardware domain always be able to execute any physdev op command? (such as to bypass the switch statement entirely)


It looks like hvm_physdev_op() is the only real caller of do_physdev_op(), and  several other commands (besides the ones in the diff below) are also being blocked by the default case of hvm_physdev_op.

PHYSDEVOP_pirq_eoi_gmfn_v2
PHYSDEVOP_pirq_eoi_gmfn_v1
PHYSDEVOP_IRQ_UNMASK_NOTIFY // legacy?
PHYSDEVOP_apic_read
PHYSDEVOP_apic_write
PHYSDEVOP_alloc_irq_vector
PHYSDEVOP_set_iopl
PHYSDEVOP_set_iobitmap
PHYSDEVOP_restore_msi
PHYSDEVOP_restore_msi_ext
PHYSDEVOP_setup_gsi
PHYSDEVOP_get_free_pirq
PHYSDEVOP_dbgp_op

Thanks

-Alex

On Thu, 2022-03-03 at 17:47 +0100, Jan Beulich wrote:
> On 03.03.2022 17:45, Alex Olson wrote:
> > --- a/xen/arch/x86/hvm/hypercall.c
> > +++ b/xen/arch/x86/hvm/hypercall.c
> > @@ -84,6 +84,17 @@ static long hvm_physdev_op(int cmd,
> > XEN_GUEST_HANDLE_PARAM(void) arg)
> >  
> >      switch ( cmd )
> >      {
> > +
> > +    case PHYSDEVOP_manage_pci_add:
> > +    case PHYSDEVOP_manage_pci_remove:
> > +    case PHYSDEVOP_pci_device_add:
> > +    case PHYSDEVOP_pci_device_remove:
> > +    case PHYSDEVOP_manage_pci_add_ext:
> > +    case PHYSDEVOP_prepare_msix:
> > +    case PHYSDEVOP_release_msix:
> > +        if ( is_control_domain(currd) )
> > +            break;
> 
> These are all operations which I think are purposefully permitted to
> be invoked by the hardware domain only. That's where all the devices
> live when they're not passed through to guests.
> 
> Jan
>
Jan Beulich March 4, 2022, 7:02 a.m. UTC | #8
On 03.03.2022 18:14, Alex Olson wrote:
> I wasn't sure of the distinction between hardware domain and control domain for these commands, but they appear to be blocked at the moment when dom0 executes them, including a lot at boot.  Are you suggesting to use is_hardware_domain(currd) instead in my diff?    
> 
> Or should the hardware domain always be able to execute any physdev op command? (such as to bypass the switch statement entirely)

No, certainly not. It was on purpose to restrict PVH Dom0. Only PV
Dom0 is supposed to be able to access all sub-ops.

> It looks like hvm_physdev_op() is the only real caller of do_physdev_op(), and  several other commands (besides the ones in the diff below) are also being blocked by the default case of hvm_physdev_op.
> 
> PHYSDEVOP_pirq_eoi_gmfn_v2
> PHYSDEVOP_pirq_eoi_gmfn_v1
> PHYSDEVOP_IRQ_UNMASK_NOTIFY // legacy?
> PHYSDEVOP_apic_read
> PHYSDEVOP_apic_write
> PHYSDEVOP_alloc_irq_vector
> PHYSDEVOP_set_iopl
> PHYSDEVOP_set_iobitmap
> PHYSDEVOP_restore_msi
> PHYSDEVOP_restore_msi_ext
> PHYSDEVOP_setup_gsi
> PHYSDEVOP_get_free_pirq
> PHYSDEVOP_dbgp_op
> 
> Thanks
> 
> -Alex

Also - please don't top-post.

Jan

> On Thu, 2022-03-03 at 17:47 +0100, Jan Beulich wrote:
>> On 03.03.2022 17:45, Alex Olson wrote:
>>> --- a/xen/arch/x86/hvm/hypercall.c
>>> +++ b/xen/arch/x86/hvm/hypercall.c
>>> @@ -84,6 +84,17 @@ static long hvm_physdev_op(int cmd,
>>> XEN_GUEST_HANDLE_PARAM(void) arg)
>>>  
>>>      switch ( cmd )
>>>      {
>>> +
>>> +    case PHYSDEVOP_manage_pci_add:
>>> +    case PHYSDEVOP_manage_pci_remove:
>>> +    case PHYSDEVOP_pci_device_add:
>>> +    case PHYSDEVOP_pci_device_remove:
>>> +    case PHYSDEVOP_manage_pci_add_ext:
>>> +    case PHYSDEVOP_prepare_msix:
>>> +    case PHYSDEVOP_release_msix:
>>> +        if ( is_control_domain(currd) )
>>> +            break;
>>
>> These are all operations which I think are purposefully permitted to
>> be invoked by the hardware domain only. That's where all the devices
>> live when they're not passed through to guests.
>>
>> Jan
>>
>
Roger Pau Monné March 4, 2022, 11:47 a.m. UTC | #9
On Thu, Mar 03, 2022 at 05:47:48PM +0100, Jan Beulich wrote:
> On 03.03.2022 17:45, Alex Olson wrote:
> > --- a/xen/arch/x86/hvm/hypercall.c
> > +++ b/xen/arch/x86/hvm/hypercall.c
> > @@ -84,6 +84,17 @@ static long hvm_physdev_op(int cmd,
> > XEN_GUEST_HANDLE_PARAM(void) arg)
> >  
> >      switch ( cmd )
> >      {
> > +
> > +    case PHYSDEVOP_manage_pci_add:
> > +    case PHYSDEVOP_manage_pci_remove:
> > +    case PHYSDEVOP_pci_device_add:
> > +    case PHYSDEVOP_pci_device_remove:

The add/remove options are already available to a PVH hardware
domain.

> > +    case PHYSDEVOP_manage_pci_add_ext:

This shouldn't be needed in principle for a PVH hardware domain, as
the plan it to emulate accesses to the SR-IOV capability and detect
VFs that way.

> > +    case PHYSDEVOP_prepare_msix:
> > +    case PHYSDEVOP_release_msix:

Those two are likely fine to use for a PVH hardware domain (not the
control domain). AFAICT they shouldn't interact badly with vPCI.

> > +        if ( is_control_domain(currd) )
> > +            break;
> 
> These are all operations which I think are purposefully permitted to
> be invoked by the hardware domain only. That's where all the devices
> live when they're not passed through to guests.

Indeed. I think it's only the {prepare,release}_msix operations that
needs adding (but would need confirmation they actually work as
intended in a PVH setup).

Thanks, Roger.
diff mbox series

Patch

diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 030243810e..9128e4d025 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -87,6 +87,13 @@  static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
     {
     case PHYSDEVOP_map_pirq:
     case PHYSDEVOP_unmap_pirq:
+        /*
+         * Control domain (and domains controlling others) need to use
+         * PHYSDEVOP_{un,}map_pirq in order to setup interrupts for passthrough
+         * devices on behalf of other guests.
+         */
+        if ( is_control_domain(currd) || currd->target )
+            break;
     case PHYSDEVOP_eoi:
     case PHYSDEVOP_irq_status_query:
     case PHYSDEVOP_get_free_pirq: