Message ID | 20220303103057.49181-4-roger.pau@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/hvm: PIRQ related cleanup and a fix | expand |
On 03/03/2022 10:30, Roger Pau Monne wrote: > Control domains (including domains having control over a single other > guest) need access to PHYSDEVOP_{un,}map_pirq in order to setup > bindings of interrupts from devices assigned to the controlled guest. > > As such relax the check for HVM based guests and allow the usage of > the hypercalls for any control domains. Note that further safety > checks will be performed in order to assert that the current domain > has the right permissions against the target of the hypercall. > > Reported-by: Alex Olson <this.is.a0lson@gmail.com> > Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > xen/arch/x86/hvm/hypercall.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c > index 030243810e..9128e4d025 100644 > --- a/xen/arch/x86/hvm/hypercall.c > +++ b/xen/arch/x86/hvm/hypercall.c > @@ -87,6 +87,13 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) > { > case PHYSDEVOP_map_pirq: > case PHYSDEVOP_unmap_pirq: > + /* > + * Control domain (and domains controlling others) need to use > + * PHYSDEVOP_{un,}map_pirq in order to setup interrupts for passthrough > + * devices on behalf of other guests. > + */ > + if ( is_control_domain(currd) || currd->target ) > + break; Hmm. In a split control/hardware domain model, then qemu is in the hardware domain rather than the control domain. I suspect this wants extending with || is_hardware_domain(currd). Also, the sentence about later safety checks really ought to be in this source comment too. ~Andrew
On 03.03.2022 11:45, Andrew Cooper wrote: > On 03/03/2022 10:30, Roger Pau Monne wrote: >> --- a/xen/arch/x86/hvm/hypercall.c >> +++ b/xen/arch/x86/hvm/hypercall.c >> @@ -87,6 +87,13 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) >> { >> case PHYSDEVOP_map_pirq: >> case PHYSDEVOP_unmap_pirq: >> + /* >> + * Control domain (and domains controlling others) need to use >> + * PHYSDEVOP_{un,}map_pirq in order to setup interrupts for passthrough >> + * devices on behalf of other guests. >> + */ >> + if ( is_control_domain(currd) || currd->target ) >> + break; > > Hmm. In a split control/hardware domain model, then qemu is in the > hardware domain rather than the control domain. Interesting. I would have expected it to be the other way around, with qemu for domains with pass-through devices living in a stubdom. Jan > I suspect this wants > extending with || is_hardware_domain(currd). > > Also, the sentence about later safety checks really ought to be in this > source comment too. > > ~Andrew
On Thu, Mar 03, 2022 at 10:45:54AM +0000, Andrew Cooper wrote: > On 03/03/2022 10:30, Roger Pau Monne wrote: > > Control domains (including domains having control over a single other > > guest) need access to PHYSDEVOP_{un,}map_pirq in order to setup > > bindings of interrupts from devices assigned to the controlled guest. > > > > As such relax the check for HVM based guests and allow the usage of > > the hypercalls for any control domains. Note that further safety > > checks will be performed in order to assert that the current domain > > has the right permissions against the target of the hypercall. > > > > Reported-by: Alex Olson <this.is.a0lson@gmail.com> > > Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com> > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > --- > > xen/arch/x86/hvm/hypercall.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c > > index 030243810e..9128e4d025 100644 > > --- a/xen/arch/x86/hvm/hypercall.c > > +++ b/xen/arch/x86/hvm/hypercall.c > > @@ -87,6 +87,13 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) > > { > > case PHYSDEVOP_map_pirq: > > case PHYSDEVOP_unmap_pirq: > > + /* > > + * Control domain (and domains controlling others) need to use > > + * PHYSDEVOP_{un,}map_pirq in order to setup interrupts for passthrough > > + * devices on behalf of other guests. > > + */ > > + if ( is_control_domain(currd) || currd->target ) > > + break; > > Hmm. In a split control/hardware domain model, then qemu is in the > hardware domain rather than the control domain. I suspect this wants > extending with || is_hardware_domain(currd). The binding of GSIs is exclusively done by the control domain because those are static and known at domain creation. The mapping and binding of MSI interrupts is however done by QEMU at runtime, so it needs extending to the hardware domain. However just extending here won't be enough: we would also need to modify xsm_default_action, as currently XSM_DM_PRIV will only be allowed if src->target == target or is_control_domain(src). diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h index 58afc1d589..ac40a24a22 100644 --- a/xen/include/xsm/dummy.h +++ b/xen/include/xsm/dummy.h @@ -88,7 +88,8 @@ static always_inline int xsm_default_action( } /* fall through */ case XSM_DM_PRIV: - if ( target && evaluate_nospec(src->target == target) ) + if ( is_hardware_domain(src) || + (target && evaluate_nospec(src->target == target)) ) return 0; /* fall through */ case XSM_PRIV: That would however also give the hardware domain access to XSM_TARGET and XSM_XS_PRIV operations that where previously forbidden. Or do we just require people with split control/hardware domain model to also use a different XSM policy? > Also, the sentence about later safety checks really ought to be in this > source comment too. Will add. Thanks, Roger.
On 03.03.2022 12:40, Roger Pau Monné wrote: > Or do we just require people with split control/hardware domain model > to also use a different XSM policy? I've been under the impression that this is the intended model, yes. Jan
Hi Roger, Thanks for the patches. In trying them out, I found some other PHYSDEVOP commands that were being blocked by the "default" case and were being failed with -ENOSYS... Would something like the change below make sense? Or is defaulting to failure incorrect? (I saw denials for the "add" commands, and also added the remove/release commands for symmetry). With this change, I was able to achieve a functional virtual function passed through to a HVM domain with PVH dom0. Thanks -Alex diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c index b8becab475..6abaa626a3 100644 --- a/xen/arch/x86/hvm/hypercall.c +++ b/xen/arch/x86/hvm/hypercall.c @@ -84,6 +84,17 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) switch ( cmd ) { + + case PHYSDEVOP_manage_pci_add: + case PHYSDEVOP_manage_pci_remove: + case PHYSDEVOP_pci_device_add: + case PHYSDEVOP_pci_device_remove: + case PHYSDEVOP_manage_pci_add_ext: + case PHYSDEVOP_prepare_msix: + case PHYSDEVOP_release_msix: + if ( is_control_domain(currd) ) + break; + case PHYSDEVOP_map_pirq: case PHYSDEVOP_unmap_pirq: /* On Thu, 2022-03-03 at 11:30 +0100, Roger Pau Monne wrote: > Control domains (including domains having control over a single other > guest) need access to PHYSDEVOP_{un,}map_pirq in order to setup > bindings of interrupts from devices assigned to the controlled guest. > > As such relax the check for HVM based guests and allow the usage of > the hypercalls for any control domains. Note that further safety > checks will be performed in order to assert that the current domain > has the right permissions against the target of the hypercall. > > Reported-by: Alex Olson <this.is.a0lson@gmail.com> > Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > xen/arch/x86/hvm/hypercall.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c > index 030243810e..9128e4d025 100644 > --- a/xen/arch/x86/hvm/hypercall.c > +++ b/xen/arch/x86/hvm/hypercall.c > @@ -87,6 +87,13 @@ static long hvm_physdev_op(int cmd, > XEN_GUEST_HANDLE_PARAM(void) arg) > { > case PHYSDEVOP_map_pirq: > case PHYSDEVOP_unmap_pirq: > + /* > + * Control domain (and domains controlling others) need to use > + * PHYSDEVOP_{un,}map_pirq in order to setup interrupts for > passthrough > + * devices on behalf of other guests. > + */ > + if ( is_control_domain(currd) || currd->target ) > + break; > case PHYSDEVOP_eoi: > case PHYSDEVOP_irq_status_query: > case PHYSDEVOP_get_free_pirq:
On 03.03.2022 17:45, Alex Olson wrote: > --- a/xen/arch/x86/hvm/hypercall.c > +++ b/xen/arch/x86/hvm/hypercall.c > @@ -84,6 +84,17 @@ static long hvm_physdev_op(int cmd, > XEN_GUEST_HANDLE_PARAM(void) arg) > > switch ( cmd ) > { > + > + case PHYSDEVOP_manage_pci_add: > + case PHYSDEVOP_manage_pci_remove: > + case PHYSDEVOP_pci_device_add: > + case PHYSDEVOP_pci_device_remove: > + case PHYSDEVOP_manage_pci_add_ext: > + case PHYSDEVOP_prepare_msix: > + case PHYSDEVOP_release_msix: > + if ( is_control_domain(currd) ) > + break; These are all operations which I think are purposefully permitted to be invoked by the hardware domain only. That's where all the devices live when they're not passed through to guests. Jan
I wasn't sure of the distinction between hardware domain and control domain for these commands, but they appear to be blocked at the moment when dom0 executes them, including a lot at boot. Are you suggesting to use is_hardware_domain(currd) instead in my diff? Or should the hardware domain always be able to execute any physdev op command? (such as to bypass the switch statement entirely) It looks like hvm_physdev_op() is the only real caller of do_physdev_op(), and several other commands (besides the ones in the diff below) are also being blocked by the default case of hvm_physdev_op. PHYSDEVOP_pirq_eoi_gmfn_v2 PHYSDEVOP_pirq_eoi_gmfn_v1 PHYSDEVOP_IRQ_UNMASK_NOTIFY // legacy? PHYSDEVOP_apic_read PHYSDEVOP_apic_write PHYSDEVOP_alloc_irq_vector PHYSDEVOP_set_iopl PHYSDEVOP_set_iobitmap PHYSDEVOP_restore_msi PHYSDEVOP_restore_msi_ext PHYSDEVOP_setup_gsi PHYSDEVOP_get_free_pirq PHYSDEVOP_dbgp_op Thanks -Alex On Thu, 2022-03-03 at 17:47 +0100, Jan Beulich wrote: > On 03.03.2022 17:45, Alex Olson wrote: > > --- a/xen/arch/x86/hvm/hypercall.c > > +++ b/xen/arch/x86/hvm/hypercall.c > > @@ -84,6 +84,17 @@ static long hvm_physdev_op(int cmd, > > XEN_GUEST_HANDLE_PARAM(void) arg) > > > > switch ( cmd ) > > { > > + > > + case PHYSDEVOP_manage_pci_add: > > + case PHYSDEVOP_manage_pci_remove: > > + case PHYSDEVOP_pci_device_add: > > + case PHYSDEVOP_pci_device_remove: > > + case PHYSDEVOP_manage_pci_add_ext: > > + case PHYSDEVOP_prepare_msix: > > + case PHYSDEVOP_release_msix: > > + if ( is_control_domain(currd) ) > > + break; > > These are all operations which I think are purposefully permitted to > be invoked by the hardware domain only. That's where all the devices > live when they're not passed through to guests. > > Jan >
On 03.03.2022 18:14, Alex Olson wrote: > I wasn't sure of the distinction between hardware domain and control domain for these commands, but they appear to be blocked at the moment when dom0 executes them, including a lot at boot. Are you suggesting to use is_hardware_domain(currd) instead in my diff? > > Or should the hardware domain always be able to execute any physdev op command? (such as to bypass the switch statement entirely) No, certainly not. It was on purpose to restrict PVH Dom0. Only PV Dom0 is supposed to be able to access all sub-ops. > It looks like hvm_physdev_op() is the only real caller of do_physdev_op(), and several other commands (besides the ones in the diff below) are also being blocked by the default case of hvm_physdev_op. > > PHYSDEVOP_pirq_eoi_gmfn_v2 > PHYSDEVOP_pirq_eoi_gmfn_v1 > PHYSDEVOP_IRQ_UNMASK_NOTIFY // legacy? > PHYSDEVOP_apic_read > PHYSDEVOP_apic_write > PHYSDEVOP_alloc_irq_vector > PHYSDEVOP_set_iopl > PHYSDEVOP_set_iobitmap > PHYSDEVOP_restore_msi > PHYSDEVOP_restore_msi_ext > PHYSDEVOP_setup_gsi > PHYSDEVOP_get_free_pirq > PHYSDEVOP_dbgp_op > > Thanks > > -Alex Also - please don't top-post. Jan > On Thu, 2022-03-03 at 17:47 +0100, Jan Beulich wrote: >> On 03.03.2022 17:45, Alex Olson wrote: >>> --- a/xen/arch/x86/hvm/hypercall.c >>> +++ b/xen/arch/x86/hvm/hypercall.c >>> @@ -84,6 +84,17 @@ static long hvm_physdev_op(int cmd, >>> XEN_GUEST_HANDLE_PARAM(void) arg) >>> >>> switch ( cmd ) >>> { >>> + >>> + case PHYSDEVOP_manage_pci_add: >>> + case PHYSDEVOP_manage_pci_remove: >>> + case PHYSDEVOP_pci_device_add: >>> + case PHYSDEVOP_pci_device_remove: >>> + case PHYSDEVOP_manage_pci_add_ext: >>> + case PHYSDEVOP_prepare_msix: >>> + case PHYSDEVOP_release_msix: >>> + if ( is_control_domain(currd) ) >>> + break; >> >> These are all operations which I think are purposefully permitted to >> be invoked by the hardware domain only. That's where all the devices >> live when they're not passed through to guests. >> >> Jan >> >
On Thu, Mar 03, 2022 at 05:47:48PM +0100, Jan Beulich wrote: > On 03.03.2022 17:45, Alex Olson wrote: > > --- a/xen/arch/x86/hvm/hypercall.c > > +++ b/xen/arch/x86/hvm/hypercall.c > > @@ -84,6 +84,17 @@ static long hvm_physdev_op(int cmd, > > XEN_GUEST_HANDLE_PARAM(void) arg) > > > > switch ( cmd ) > > { > > + > > + case PHYSDEVOP_manage_pci_add: > > + case PHYSDEVOP_manage_pci_remove: > > + case PHYSDEVOP_pci_device_add: > > + case PHYSDEVOP_pci_device_remove: The add/remove options are already available to a PVH hardware domain. > > + case PHYSDEVOP_manage_pci_add_ext: This shouldn't be needed in principle for a PVH hardware domain, as the plan it to emulate accesses to the SR-IOV capability and detect VFs that way. > > + case PHYSDEVOP_prepare_msix: > > + case PHYSDEVOP_release_msix: Those two are likely fine to use for a PVH hardware domain (not the control domain). AFAICT they shouldn't interact badly with vPCI. > > + if ( is_control_domain(currd) ) > > + break; > > These are all operations which I think are purposefully permitted to > be invoked by the hardware domain only. That's where all the devices > live when they're not passed through to guests. Indeed. I think it's only the {prepare,release}_msix operations that needs adding (but would need confirmation they actually work as intended in a PVH setup). Thanks, Roger.
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c index 030243810e..9128e4d025 100644 --- a/xen/arch/x86/hvm/hypercall.c +++ b/xen/arch/x86/hvm/hypercall.c @@ -87,6 +87,13 @@ static long hvm_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) { case PHYSDEVOP_map_pirq: case PHYSDEVOP_unmap_pirq: + /* + * Control domain (and domains controlling others) need to use + * PHYSDEVOP_{un,}map_pirq in order to setup interrupts for passthrough + * devices on behalf of other guests. + */ + if ( is_control_domain(currd) || currd->target ) + break; case PHYSDEVOP_eoi: case PHYSDEVOP_irq_status_query: case PHYSDEVOP_get_free_pirq:
Control domains (including domains having control over a single other guest) need access to PHYSDEVOP_{un,}map_pirq in order to setup bindings of interrupts from devices assigned to the controlled guest. As such relax the check for HVM based guests and allow the usage of the hypercalls for any control domains. Note that further safety checks will be performed in order to assert that the current domain has the right permissions against the target of the hypercall. Reported-by: Alex Olson <this.is.a0lson@gmail.com> Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- xen/arch/x86/hvm/hypercall.c | 7 +++++++ 1 file changed, 7 insertions(+)