Message ID | 20220120152319.7448-3-roger.pau@citrix.com (mailing list archive)
State      | New, archived
Series     | x86/hvm: add support for extended destination ID
On 20.01.2022 16:23, Roger Pau Monne wrote:
> Both QEMU/KVM and HyperV support using bits 11:5 from the MSI address
> field in order to store the high part of the target APIC ID. This
> allows expanding the maximum APIC ID usable without interrupt
> remapping support from 255 to 32768.
>
> Note the interface used by QEMU for emulated devices (via the
> XEN_DMOP_inject_msi hypercall) already passes both the address and
> data fields into Xen for processing, so there's no need for any change
> to QEMU there.
>
> However for PCI passthrough devices QEMU uses the
> XEN_DOMCTL_bind_pt_irq hypercall which does need an addition to the
> gflags field in order to pass the high bits of the APIC destination
> ID.
>
> Introduce a new CPUID flag to signal the support for the feature. The
> introduced flag covers both the support for extended ID for the
> IO-APIC RTE and the MSI address registers. Such flag is currently only
> exposed when the domain is using vPCI (ie: a PVH dom0).

Because of also covering the IO-APIC side, I think the CPUID aspect of
this really wants splitting into a 3rd patch. That way the MSI and
IO-APIC parts could in principle go in independently, and only the
CPUID one needs to remain at the tail.

> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -66,7 +66,7 @@ static void vmsi_inj_irq(
>
>  int vmsi_deliver(
>      struct domain *d, int vector,
> -    uint8_t dest, uint8_t dest_mode,
> +    unsigned int dest, unsigned int dest_mode,

If you change the type of dest_mode, then to "bool" please - see its
only call site.

> @@ -123,7 +125,8 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
>  }
>
>  /* Return value, -1 : multi-dests, non-negative value: dest_vcpu_id */
> -int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode)
> +int hvm_girq_dest_2_vcpu_id(struct domain *d, unsigned int dest,
> +                            unsigned int dest_mode)

Same here then.
> --- a/xen/arch/x86/include/asm/msi.h
> +++ b/xen/arch/x86/include/asm/msi.h
> @@ -54,6 +54,7 @@
>  #define MSI_ADDR_DEST_ID_SHIFT    12
>  #define MSI_ADDR_DEST_ID_MASK     0x00ff000
>  #define MSI_ADDR_DEST_ID(dest)    (((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
> +#define MSI_ADDR_EXT_DEST_ID_MASK 0x0000fe0

Especially the immediately preceding macro now becomes kind of stale.

> --- a/xen/drivers/passthrough/x86/hvm.c
> +++ b/xen/drivers/passthrough/x86/hvm.c
> @@ -269,7 +269,7 @@ int pt_irq_create_bind(
>      {
>      case PT_IRQ_TYPE_MSI:
>      {
> -        uint8_t dest, delivery_mode;
> +        unsigned int dest, delivery_mode;
>          bool dest_mode;

If you touch delivery_mode's type, wouldn't that better become bool?

> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -588,6 +588,7 @@ struct xen_domctl_bind_pt_irq {
>  #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
>  #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
>  #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
> +#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000

I'm not convinced it is a good idea to limit the overall destination
ID width to 15 bits here - at the interface level we could as well
permit more bits right away; the implementation would reject too high
a value, of course. Not only with this I further wonder whether the
field shouldn't be unsplit while extending it. You won't get away
without bumping XEN_DOMCTL_INTERFACE_VERSION anyway (unless it was
bumped already for 4.17) since afaics the unused bits of this field
previously weren't checked for being zero. We could easily have 8
bits vector, 16 bits flags, and 32 bits destination ID in the struct.
And there would then still be 8 unused bits (which from now on we
ought to check for being zero).

Jan
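The encoding under review splits a 15-bit destination ID across two MSI address fields: bits 7:0 go in the classic destination ID field (address bits 19:12) and bits 14:8 in the extended field (address bits 11:5). A minimal sketch of that round trip, using the mask values quoted above; `MASK_EXTR`/`MASK_INSR` are reimplemented locally for illustration (Xen has its own versions elsewhere in the tree), and `encode_dest_id()`/`decode_dest_id()` are hypothetical helpers, not functions from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for Xen's MASK_EXTR/MASK_INSR (GCC/Clang builtin). */
#define MASK_EXTR(v, m) (((v) & (m)) >> __builtin_ctz(m))
#define MASK_INSR(v, m) (((v) << __builtin_ctz(m)) & (m))

/* Mask values copied from the quoted msi.h hunk. */
#define MSI_ADDR_DEST_ID_MASK     0x00ff000  /* address bits 19:12 */
#define MSI_ADDR_EXT_DEST_ID_MASK 0x0000fe0  /* address bits 11:5  */

/* Hypothetical helper: compose the destination ID fields of an MSI
 * address from a 15-bit ID (low 8 bits classic, high 7 bits extended). */
static uint32_t encode_dest_id(unsigned int dest)
{
    return MASK_INSR(dest & 0xff, MSI_ADDR_DEST_ID_MASK) |
           MASK_INSR(dest >> 8, MSI_ADDR_EXT_DEST_ID_MASK);
}

/* Hypothetical helper mirroring the decode done in the patch:
 * extended field supplies bits 14:8, classic field bits 7:0. */
static unsigned int decode_dest_id(uint32_t addr)
{
    return (MASK_EXTR(addr, MSI_ADDR_EXT_DEST_ID_MASK) << 8) |
           MASK_EXTR(addr, MSI_ADDR_DEST_ID_MASK);
}
```

The 7-bit extended field plus the 8-bit classic field give the 15 bits (32768 IDs) the cover letter mentions.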
On Mon, 2022-01-24 at 14:47 +0100, Jan Beulich wrote:
> Because of also covering the IO-APIC side, I think the CPUID aspect of
> this really wants splitting into a 3rd patch. That way the MSI and
> IO-APIC parts could in principle go in independently, and only the
> CPUID one needs to remain at the tail.

HPET can generate MSIs directly too.
On Wed, Jan 26, 2022 at 01:54:26PM +0000, David Woodhouse wrote:
> On Mon, 2022-01-24 at 14:47 +0100, Jan Beulich wrote:
> > Because of also covering the IO-APIC side, I think the CPUID aspect of
> > this really wants splitting into a 3rd patch. That way the MSI and
> > IO-APIC parts could in principle go in independently, and only the
> > CPUID one needs to remain at the tail.
>
> HPET can generate MSIs directly too.

Indeed, but the emulated one we expose to HVM guests doesn't support
FSB.

Thanks, Roger.
On Mon, Jan 24, 2022 at 02:47:58PM +0100, Jan Beulich wrote:
> On 20.01.2022 16:23, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/include/asm/msi.h
> > +++ b/xen/arch/x86/include/asm/msi.h
> > @@ -54,6 +54,7 @@
> >  #define MSI_ADDR_DEST_ID_SHIFT    12
> >  #define MSI_ADDR_DEST_ID_MASK     0x00ff000
> >  #define MSI_ADDR_DEST_ID(dest)    (((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
> > +#define MSI_ADDR_EXT_DEST_ID_MASK 0x0000fe0
>
> Especially the immediately preceding macro now becomes kind of stale.

Hm, I'm not so sure about that. We could expand the macro to place the
high bits in dest at bits 11:4 of the resulting address. However that
macro (MSI_ADDR_DEST_ID) is only used by Xen to compose its own
messages, so until we add support for the hypervisor itself to use the
extended destination ID mode there's not much point in modifying the
macro IMO.

> > --- a/xen/drivers/passthrough/x86/hvm.c
> > +++ b/xen/drivers/passthrough/x86/hvm.c
> > @@ -269,7 +269,7 @@ int pt_irq_create_bind(
> >      {
> >      case PT_IRQ_TYPE_MSI:
> >      {
> > -        uint8_t dest, delivery_mode;
> > +        unsigned int dest, delivery_mode;
> >          bool dest_mode;
>
> If you touch delivery_mode's type, wouldn't that better become bool?
>
> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > @@ -588,6 +588,7 @@ struct xen_domctl_bind_pt_irq {
> >  #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
> >  #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
> >  #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
> > +#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000
>
> I'm not convinced it is a good idea to limit the overall destination
> ID width to 15 bits here - at the interface level we could as well
> permit more bits right away; the implementation would reject too high
> a value, of course. Not only with this I further wonder whether the
> field shouldn't be unsplit while extending it. You won't get away
> without bumping XEN_DOMCTL_INTERFACE_VERSION anyway (unless it was
> bumped already for 4.17) since afaics the unused bits of this field
> previously weren't checked for being zero. We could easily have 8
> bits vector, 16 bits flags, and 32 bits destination ID in the struct.
> And there would then still be 8 unused bits (which from now on we
> ought to check for being zero).

So I've made gflags a 64bit field, used the high 32bits for the
destination ID, and the low ones for flags. I've left gvec as a
separate field in the struct, as I don't want to require a
modification to QEMU, just a rebuild against the updated headers will
be enough.

I've been wondering about this interface though
(xen_domctl_bind_pt_irq), and it would seem better to just pass the
raw MSI address and data fields from the guest and let Xen do the
decoding. This however is not trivial to do as we would need to keep
the previous interface anyway as it's used by QEMU. Maybe we could
have some kind of union between a pair of address and data fields and
a gflags one that would match the native layout, but as said not
trivial (and would require using anonymous unions which I'm not sure
are accepted even for domctls in the public headers).

Thanks, Roger.
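The 64-bit gflags layout described above (destination ID in the upper 32 bits, flags in the lower ones) was a proposal at this point in the thread and, as later messages show, was ultimately dropped for ABI reasons. A rough sketch of the split being proposed; all helper names here are hypothetical, purely for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packing for the proposed (never committed) 64-bit
 * gflags field: destination ID in bits 63:32, flag bits in 31:0. */
static uint64_t pack_gflags64(uint32_t flags, uint32_t dest_id)
{
    return ((uint64_t)dest_id << 32) | flags;
}

/* Recover the destination ID from the upper half. */
static uint32_t gflags64_dest(uint64_t gflags)
{
    return gflags >> 32;
}

/* Recover the flag bits from the lower half. */
static uint32_t gflags64_flags(uint64_t gflags)
{
    return (uint32_t)gflags;
}
```

Under this layout a full 32-bit destination ID would fit, so no mask like XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK would be needed at the interface level.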
On 04.02.2022 10:23, Roger Pau Monné wrote:
> On Mon, Jan 24, 2022 at 02:47:58PM +0100, Jan Beulich wrote:
>> On 20.01.2022 16:23, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/include/asm/msi.h
>>> +++ b/xen/arch/x86/include/asm/msi.h
>>> @@ -54,6 +54,7 @@
>>>  #define MSI_ADDR_DEST_ID_SHIFT    12
>>>  #define MSI_ADDR_DEST_ID_MASK     0x00ff000
>>>  #define MSI_ADDR_DEST_ID(dest)    (((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
>>> +#define MSI_ADDR_EXT_DEST_ID_MASK 0x0000fe0
>>
>> Especially the immediately preceding macro now becomes kind of stale.
>
> Hm, I'm not so sure about that. We could expand the macro to place the
> high bits in dest at bits 11:4 of the resulting address. However that
> macro (MSI_ADDR_DEST_ID) is only used by Xen to compose its own
> messages, so until we add support for the hypervisor itself to use the
> extended destination ID mode there's not much point in modifying the
> macro IMO.

Well, this is all unhelpful considering the different form of extended
ID in Intel's doc. At least by way of a comment things need clarifying
and potential pitfalls need pointing out imo.

>>> --- a/xen/include/public/domctl.h
>>> +++ b/xen/include/public/domctl.h
>>> @@ -588,6 +588,7 @@ struct xen_domctl_bind_pt_irq {
>>>  #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
>>>  #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
>>>  #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
>>> +#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000
>>
>> I'm not convinced it is a good idea to limit the overall destination
>> ID width to 15 bits here - at the interface level we could as well
>> permit more bits right away; the implementation would reject too high
>> a value, of course. Not only with this I further wonder whether the
>> field shouldn't be unsplit while extending it. You won't get away
>> without bumping XEN_DOMCTL_INTERFACE_VERSION anyway (unless it was
>> bumped already for 4.17) since afaics the unused bits of this field
>> previously weren't checked for being zero. We could easily have 8
>> bits vector, 16 bits flags, and 32 bits destination ID in the struct.
>> And there would then still be 8 unused bits (which from now on we
>> ought to check for being zero).
>
> So I've made gflags a 64bit field, used the high 32bits for the
> destination ID, and the low ones for flags. I've left gvec as a
> separate field in the struct, as I don't want to require a
> modification to QEMU, just a rebuild against the updated headers will
> be enough.

Hmm, wait - if qemu uses this without going through a suitable
abstraction in at least libxc, then we cannot _ever_ change this
interface: If a rebuild was required, old qemu binaries would
stop working with newer Xen. If such a dependency indeed exists,
we'll need a prominent warning comment in the public header.

Jan

> I've been wondering about this interface though
> (xen_domctl_bind_pt_irq), and it would seem better to just pass the
> raw MSI address and data fields from the guest and let Xen do the
> decoding. This however is not trivial to do as we would need to keep
> the previous interface anyway as it's used by QEMU. Maybe we could
> have some kind of union between a pair of address and data fields and
> a gflags one that would match the native layout, but as said not
> trivial (and would require using anonymous unions which I'm not sure
> are accepted even for domctls in the public headers).
>
> Thanks, Roger.
On Fri, Feb 04, 2022 at 10:30:54AM +0100, Jan Beulich wrote:
> On 04.02.2022 10:23, Roger Pau Monné wrote:
> > On Mon, Jan 24, 2022 at 02:47:58PM +0100, Jan Beulich wrote:
> >> On 20.01.2022 16:23, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/include/asm/msi.h
> >>> +++ b/xen/arch/x86/include/asm/msi.h
> >>> @@ -54,6 +54,7 @@
> >>>  #define MSI_ADDR_DEST_ID_SHIFT    12
> >>>  #define MSI_ADDR_DEST_ID_MASK     0x00ff000
> >>>  #define MSI_ADDR_DEST_ID(dest)    (((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
> >>> +#define MSI_ADDR_EXT_DEST_ID_MASK 0x0000fe0
> >>
> >> Especially the immediately preceding macro now becomes kind of stale.
> >
> > Hm, I'm not so sure about that. We could expand the macro to place the
> > high bits in dest at bits 11:4 of the resulting address. However that
> > macro (MSI_ADDR_DEST_ID) is only used by Xen to compose its own
> > messages, so until we add support for the hypervisor itself to use the
> > extended destination ID mode there's not much point in modifying the
> > macro IMO.
>
> Well, this is all unhelpful considering the different form of extended
> ID in Intel's doc. At least by way of a comment things need clarifying
> and potential pitfalls need pointing out imo.

Sure, will add some comments there.

> >>> --- a/xen/include/public/domctl.h
> >>> +++ b/xen/include/public/domctl.h
> >>> @@ -588,6 +588,7 @@ struct xen_domctl_bind_pt_irq {
> >>>  #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
> >>>  #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
> >>>  #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
> >>> +#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000
> >>
> >> I'm not convinced it is a good idea to limit the overall destination
> >> ID width to 15 bits here - at the interface level we could as well
> >> permit more bits right away; the implementation would reject too high
> >> a value, of course. Not only with this I further wonder whether the
> >> field shouldn't be unsplit while extending it. You won't get away
> >> without bumping XEN_DOMCTL_INTERFACE_VERSION anyway (unless it was
> >> bumped already for 4.17) since afaics the unused bits of this field
> >> previously weren't checked for being zero. We could easily have 8
> >> bits vector, 16 bits flags, and 32 bits destination ID in the struct.
> >> And there would then still be 8 unused bits (which from now on we
> >> ought to check for being zero).
> >
> > So I've made gflags a 64bit field, used the high 32bits for the
> > destination ID, and the low ones for flags. I've left gvec as a
> > separate field in the struct, as I don't want to require a
> > modification to QEMU, just a rebuild against the updated headers will
> > be enough.
>
> Hmm, wait - if qemu uses this without going through a suitable
> abstraction in at least libxc, then we cannot _ever_ change this
> interface: If a rebuild was required, old qemu binaries would
> stop working with newer Xen. If such a dependency indeed exists,
> we'll need a prominent warning comment in the public header.

Hm, it's bad. The xc_domain_update_msi_irq interface uses a gflags
parameter that's the gflags parameter of xen_domctl_bind_pt_irq. Which
is even worse because it's not using the mask definitions from
domctl.h, but rather a copy of them named XEN_PT_GFLAGS_* that are
hardcoded in xen_pt_msi.c in QEMU code.

So we can likely expand the layout of gflags, but moving fields is not
an option. I think my original proposal of adding a
XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK mask is the less bad option until
we add a new stable interface for device passthrough for QEMU.

Thanks, Roger.
On 04.02.2022 10:54, Roger Pau Monné wrote:
> On Fri, Feb 04, 2022 at 10:30:54AM +0100, Jan Beulich wrote:
>> On 04.02.2022 10:23, Roger Pau Monné wrote:
>>> On Mon, Jan 24, 2022 at 02:47:58PM +0100, Jan Beulich wrote:
>>>> On 20.01.2022 16:23, Roger Pau Monne wrote:
>>>>> --- a/xen/arch/x86/include/asm/msi.h
>>>>> +++ b/xen/arch/x86/include/asm/msi.h
>>>>> @@ -54,6 +54,7 @@
>>>>>  #define MSI_ADDR_DEST_ID_SHIFT    12
>>>>>  #define MSI_ADDR_DEST_ID_MASK     0x00ff000
>>>>>  #define MSI_ADDR_DEST_ID(dest)    (((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
>>>>> +#define MSI_ADDR_EXT_DEST_ID_MASK 0x0000fe0
>>>>
>>>> Especially the immediately preceding macro now becomes kind of stale.
>>>
>>> Hm, I'm not so sure about that. We could expand the macro to place the
>>> high bits in dest at bits 11:4 of the resulting address. However that
>>> macro (MSI_ADDR_DEST_ID) is only used by Xen to compose its own
>>> messages, so until we add support for the hypervisor itself to use the
>>> extended destination ID mode there's not much point in modifying the
>>> macro IMO.
>>
>> Well, this is all unhelpful considering the different form of extended
>> ID in Intel's doc. At least by way of a comment things need clarifying
>> and potential pitfalls need pointing out imo.
>
> Sure, will add some comments there.
>
>>>>> --- a/xen/include/public/domctl.h
>>>>> +++ b/xen/include/public/domctl.h
>>>>> @@ -588,6 +588,7 @@ struct xen_domctl_bind_pt_irq {
>>>>>  #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
>>>>>  #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
>>>>>  #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
>>>>> +#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000
>>>>
>>>> I'm not convinced it is a good idea to limit the overall destination
>>>> ID width to 15 bits here - at the interface level we could as well
>>>> permit more bits right away; the implementation would reject too high
>>>> a value, of course. Not only with this I further wonder whether the
>>>> field shouldn't be unsplit while extending it. You won't get away
>>>> without bumping XEN_DOMCTL_INTERFACE_VERSION anyway (unless it was
>>>> bumped already for 4.17) since afaics the unused bits of this field
>>>> previously weren't checked for being zero. We could easily have 8
>>>> bits vector, 16 bits flags, and 32 bits destination ID in the struct.
>>>> And there would then still be 8 unused bits (which from now on we
>>>> ought to check for being zero).
>>>
>>> So I've made gflags a 64bit field, used the high 32bits for the
>>> destination ID, and the low ones for flags. I've left gvec as a
>>> separate field in the struct, as I don't want to require a
>>> modification to QEMU, just a rebuild against the updated headers will
>>> be enough.
>>
>> Hmm, wait - if qemu uses this without going through a suitable
>> abstraction in at least libxc, then we cannot _ever_ change this
>> interface: If a rebuild was required, old qemu binaries would
>> stop working with newer Xen. If such a dependency indeed exists,
>> we'll need a prominent warning comment in the public header.
>
> Hm, it's bad. The xc_domain_update_msi_irq interface uses a gflags
> parameter that's the gflags parameter of xen_domctl_bind_pt_irq. Which
> is even worse because it's not using the mask definitions from
> domctl.h, but rather a copy of them named XEN_PT_GFLAGS_* that are
> hardcoded in xen_pt_msi.c in QEMU code.
>
> So we can likely expand the layout of gflags, but moving fields is not
> an option. I think my original proposal of adding a
> XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK mask is the less bad option until
> we add a new stable interface for device passthrough for QEMU.

Given the observations - yeah, not much of a choice left.

Jan
diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index 52aae4565f..b9b5182369 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -383,7 +383,8 @@ int hvm_set_pci_link_route(struct domain *d, u8 link, u8 isa_irq)
 int hvm_inject_msi(struct domain *d, uint64_t addr, uint32_t data)
 {
     uint32_t tmp = (uint32_t) addr;
-    uint8_t dest = (tmp & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
+    unsigned int dest = (MASK_EXTR(tmp, MSI_ADDR_EXT_DEST_ID_MASK) << 8) |
+                        MASK_EXTR(tmp, MSI_ADDR_DEST_ID_MASK);
     uint8_t dest_mode = !!(tmp & MSI_ADDR_DESTMODE_MASK);
     uint8_t delivery_mode = (data & MSI_DATA_DELIVERY_MODE_MASK)
                             >> MSI_DATA_DELIVERY_MODE_SHIFT;
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 13e2a190b4..ec0f3bc13f 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -66,7 +66,7 @@ static void vmsi_inj_irq(
 
 int vmsi_deliver(
     struct domain *d, int vector,
-    uint8_t dest, uint8_t dest_mode,
+    unsigned int dest, unsigned int dest_mode,
     uint8_t delivery_mode, uint8_t trig_mode)
 {
     struct vlapic *target;
@@ -107,7 +107,9 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
 {
     uint32_t flags = pirq_dpci->gmsi.gflags;
     int vector = pirq_dpci->gmsi.gvec;
-    uint8_t dest = (uint8_t)flags;
+    unsigned int dest = MASK_EXTR(flags, XEN_DOMCTL_VMSI_X86_DEST_ID_MASK) |
+                        (MASK_EXTR(flags,
+                                   XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK) << 8);
     bool dest_mode = flags & XEN_DOMCTL_VMSI_X86_DM_MASK;
     uint8_t delivery_mode = MASK_EXTR(flags, XEN_DOMCTL_VMSI_X86_DELIV_MASK);
     bool trig_mode = flags & XEN_DOMCTL_VMSI_X86_TRIG_MASK;
@@ -123,7 +125,8 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
 }
 
 /* Return value, -1 : multi-dests, non-negative value: dest_vcpu_id */
-int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode)
+int hvm_girq_dest_2_vcpu_id(struct domain *d, unsigned int dest,
+                            unsigned int dest_mode)
 {
     int dest_vcpu_id = -1, w = 0;
     struct vcpu *v;
@@ -645,6 +648,8 @@ static unsigned int msi_gflags(uint16_t data, uint64_t addr, bool masked)
      */
     return MASK_INSR(MASK_EXTR(addr, MSI_ADDR_DEST_ID_MASK),
                      XEN_DOMCTL_VMSI_X86_DEST_ID_MASK) |
+           MASK_INSR(MASK_EXTR(addr, MSI_ADDR_EXT_DEST_ID_MASK),
+                     XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK) |
            MASK_INSR(MASK_EXTR(addr, MSI_ADDR_REDIRECTION_MASK),
                      XEN_DOMCTL_VMSI_X86_RH_MASK) |
            MASK_INSR(MASK_EXTR(addr, MSI_ADDR_DESTMODE_MASK),
@@ -835,6 +840,7 @@ void vpci_msi_arch_print(const struct vpci_msi *msi)
            msi->data & MSI_DATA_LEVEL_ASSERT ? "" : "de",
            msi->address & MSI_ADDR_DESTMODE_LOGIC ? "log" : "phys",
            msi->address & MSI_ADDR_REDIRECTION_LOWPRI ? "lowest" : "fixed",
+           (MASK_EXTR(msi->address, MSI_ADDR_EXT_DEST_ID_MASK) << 8) |
            MASK_EXTR(msi->address, MSI_ADDR_DEST_ID_MASK),
            msi->arch.pirq);
 }
@@ -904,6 +910,7 @@ int vpci_msix_arch_print(const struct vpci_msix *msix)
                    entry->data & MSI_DATA_LEVEL_ASSERT ? "" : "de",
                    entry->addr & MSI_ADDR_DESTMODE_LOGIC ? "log" : "phys",
                    entry->addr & MSI_ADDR_REDIRECTION_LOWPRI ? "lowest" : "fixed",
+                   (MASK_EXTR(entry->addr, MSI_ADDR_EXT_DEST_ID_MASK) << 8) |
                    MASK_EXTR(entry->addr, MSI_ADDR_DEST_ID_MASK),
                    entry->masked, entry->arch.pirq);
         if ( i && !(i % 64) )
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index b26302d9e7..f001b43a21 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -270,11 +270,12 @@ uint64_t hvm_get_guest_time_fixed(const struct vcpu *v, uint64_t at_tsc);
 
 int vmsi_deliver(
     struct domain *d, int vector,
-    uint8_t dest, uint8_t dest_mode,
+    unsigned int dest, unsigned int dest_mode,
     uint8_t delivery_mode, uint8_t trig_mode);
 struct hvm_pirq_dpci;
 void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *);
-int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
+int hvm_girq_dest_2_vcpu_id(struct domain *d, unsigned int dest,
+                            unsigned int dest_mode);
 
 enum hvm_intblk
 hvm_interrupt_blocked(struct vcpu *v, struct hvm_intack intack);
diff --git a/xen/arch/x86/include/asm/msi.h b/xen/arch/x86/include/asm/msi.h
index e228b0f3f3..531b860e42 100644
--- a/xen/arch/x86/include/asm/msi.h
+++ b/xen/arch/x86/include/asm/msi.h
@@ -54,6 +54,7 @@
 #define MSI_ADDR_DEST_ID_SHIFT    12
 #define MSI_ADDR_DEST_ID_MASK     0x00ff000
 #define MSI_ADDR_DEST_ID(dest)    (((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
+#define MSI_ADDR_EXT_DEST_ID_MASK 0x0000fe0
 
 /* MAX fixed pages reserved for mapping MSIX tables. */
 #define FIX_MSIX_MAX_PAGES        512
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 485bd66971..3d2d75978c 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1150,6 +1150,9 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
             res->a |= XEN_HVM_CPUID_DOMID_PRESENT;
         res->c = d->domain_id;
 
+        if ( has_vpci(d) )
+            res->a |= XEN_HVM_CPUID_EXT_DEST_ID;
+
         break;
 
     case 5: /* PV-specific parameters */
diff --git a/xen/drivers/passthrough/x86/hvm.c b/xen/drivers/passthrough/x86/hvm.c
index 351daafdc9..666c4b7757 100644
--- a/xen/drivers/passthrough/x86/hvm.c
+++ b/xen/drivers/passthrough/x86/hvm.c
@@ -269,7 +269,7 @@ int pt_irq_create_bind(
     {
     case PT_IRQ_TYPE_MSI:
     {
-        uint8_t dest, delivery_mode;
+        unsigned int dest, delivery_mode;
         bool dest_mode;
         int dest_vcpu_id;
         const struct vcpu *vcpu;
@@ -345,7 +345,9 @@ int pt_irq_create_bind(
         }
         /* Calculate dest_vcpu_id for MSI-type pirq migration. */
         dest = MASK_EXTR(pirq_dpci->gmsi.gflags,
-                         XEN_DOMCTL_VMSI_X86_DEST_ID_MASK);
+                         XEN_DOMCTL_VMSI_X86_DEST_ID_MASK) |
+               (MASK_EXTR(pirq_dpci->gmsi.gflags,
+                          XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK) << 8);
         dest_mode = pirq_dpci->gmsi.gflags & XEN_DOMCTL_VMSI_X86_DM_MASK;
         delivery_mode = MASK_EXTR(pirq_dpci->gmsi.gflags,
                                   XEN_DOMCTL_VMSI_X86_DELIV_MASK);
@@ -782,7 +784,9 @@ static int _hvm_dpci_msi_eoi(struct domain *d,
          (pirq_dpci->gmsi.gvec == vector) )
     {
         unsigned int dest = MASK_EXTR(pirq_dpci->gmsi.gflags,
-                                      XEN_DOMCTL_VMSI_X86_DEST_ID_MASK);
+                                      XEN_DOMCTL_VMSI_X86_DEST_ID_MASK) |
+                            (MASK_EXTR(pirq_dpci->gmsi.gflags,
+                                       XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK) << 8);
         bool dest_mode = pirq_dpci->gmsi.gflags & XEN_DOMCTL_VMSI_X86_DM_MASK;
 
         if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest,
diff --git a/xen/include/public/arch-x86/cpuid.h b/xen/include/public/arch-x86/cpuid.h
index ce46305bee..49bcc93b6b 100644
--- a/xen/include/public/arch-x86/cpuid.h
+++ b/xen/include/public/arch-x86/cpuid.h
@@ -102,6 +102,12 @@
 #define XEN_HVM_CPUID_IOMMU_MAPPINGS   (1u << 2)
 #define XEN_HVM_CPUID_VCPU_ID_PRESENT  (1u << 3) /* vcpu id is present in EBX */
 #define XEN_HVM_CPUID_DOMID_PRESENT    (1u << 4) /* domid is present in ECX */
+/*
+ * Bits 55:49 from the IO-APIC RTE and bits 11:5 from the MSI address can be
+ * used to store high bits for the Destination ID. This expands the Destination
+ * ID field from 8 to 15 bits, allowing to target APIC IDs up to 32768.
+ */
+#define XEN_HVM_CPUID_EXT_DEST_ID      (1u << 5)
 
 /*
  * Leaf 6 (0x40000x05)
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index b85e6170b0..17ac7ef82b 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -588,6 +588,7 @@ struct xen_domctl_bind_pt_irq {
 #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
 #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
 #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
+#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000
 
         uint64_aligned_t gtable;
     } msi;
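The gflags round trip the diff implements, packing the two MSI address fields into gflags (as in msi_gflags()) and recovering the 15-bit destination (as in pt_irq_create_bind()), can be checked in isolation. This sketch copies the mask values from the diff and reimplements `MASK_EXTR`/`MASK_INSR` locally for illustration; note XEN_DOMCTL_VMSI_X86_DEST_ID_MASK is not shown in the quoted hunks, so its value of 0xff is inferred from the old `(uint8_t)flags` decode:

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for Xen's MASK_EXTR/MASK_INSR (GCC/Clang builtin). */
#define MASK_EXTR(v, m) (((v) & (m)) >> __builtin_ctz(m))
#define MASK_INSR(v, m) (((v) << __builtin_ctz(m)) & (m))

/* MSI address field masks, copied from the diff. */
#define MSI_ADDR_DEST_ID_MASK                0x00ff000
#define MSI_ADDR_EXT_DEST_ID_MASK            0x0000fe0
/* gflags masks; DEST_ID_MASK is inferred, EXT mask is from the diff. */
#define XEN_DOMCTL_VMSI_X86_DEST_ID_MASK     0x0000ff
#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000

/* Destination-ID part of the msi_gflags() encode: address -> gflags. */
static unsigned int gflags_from_addr(uint32_t addr)
{
    return MASK_INSR(MASK_EXTR(addr, MSI_ADDR_DEST_ID_MASK),
                     XEN_DOMCTL_VMSI_X86_DEST_ID_MASK) |
           MASK_INSR(MASK_EXTR(addr, MSI_ADDR_EXT_DEST_ID_MASK),
                     XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK);
}

/* Decode as done in pt_irq_create_bind(): gflags -> 15-bit dest ID. */
static unsigned int dest_from_gflags(unsigned int gflags)
{
    return MASK_EXTR(gflags, XEN_DOMCTL_VMSI_X86_DEST_ID_MASK) |
           (MASK_EXTR(gflags, XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK) << 8);
}
```

For example, an address carrying ID 0x1234 (0x34 in bits 19:12, 0x12 in bits 11:5) packs into gflags with the high 7 bits at 23:17, and decoding recovers 0x1234.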
Both QEMU/KVM and HyperV support using bits 11:5 from the MSI address
field in order to store the high part of the target APIC ID. This
allows expanding the maximum APIC ID usable without interrupt
remapping support from 255 to 32768.

Note the interface used by QEMU for emulated devices (via the
XEN_DMOP_inject_msi hypercall) already passes both the address and
data fields into Xen for processing, so there's no need for any change
to QEMU there.

However for PCI passthrough devices QEMU uses the
XEN_DOMCTL_bind_pt_irq hypercall which does need an addition to the
gflags field in order to pass the high bits of the APIC destination
ID.

Introduce a new CPUID flag to signal the support for the feature. The
introduced flag covers both the support for extended ID for the
IO-APIC RTE and the MSI address registers. Such flag is currently only
exposed when the domain is using vPCI (ie: a PVH dom0).

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/irq.c              |  3 ++-
 xen/arch/x86/hvm/vmsi.c             | 13 ++++++++++---
 xen/arch/x86/include/asm/hvm/hvm.h  |  5 +++--
 xen/arch/x86/include/asm/msi.h      |  1 +
 xen/arch/x86/traps.c                |  3 +++
 xen/drivers/passthrough/x86/hvm.c   | 10 +++++++---
 xen/include/public/arch-x86/cpuid.h |  6 ++++++
 xen/include/public/domctl.h         |  1 +
 8 files changed, 33 insertions(+), 9 deletions(-)