diff mbox series

[2/3] x86/vmsi: add support for extended destination ID in address field

Message ID 20220120152319.7448-3-roger.pau@citrix.com (mailing list archive)
State New, archived
Headers show
Series x86/hvm: add support for extended destination ID | expand

Commit Message

Roger Pau Monné Jan. 20, 2022, 3:23 p.m. UTC
Both QEMU/KVM and HyperV support using bits 11:5 from the MSI address
field in order to store the high part of the target APIC ID. This
allows expanding the maximum APID ID usable without interrupt
remapping support from 255 to 32768.

Note the interface used by QEMU for emulated devices (via the
XEN_DMOP_inject_msi hypercall) already passes both the address and
data fields into Xen for processing, so there's no need for any change
to QEMU there.

However for PCI passthrough devices QEMU uses the
XEN_DOMCTL_bind_pt_irq hypercall which does need an addition to the
gflags field in order to pass the high bits of the APIC destination
ID.

Introduce a new CPUID flag to signal the support for the feature. The
introduced flag covers both the support for extended ID for the
IO-APIC RTE and the MSI address registers. Such flag is currently only
exposed when the domain is using vPCI (ie: a PVH dom0).

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/irq.c              |  3 ++-
 xen/arch/x86/hvm/vmsi.c             | 13 ++++++++++---
 xen/arch/x86/include/asm/hvm/hvm.h  |  5 +++--
 xen/arch/x86/include/asm/msi.h      |  1 +
 xen/arch/x86/traps.c                |  3 +++
 xen/drivers/passthrough/x86/hvm.c   | 10 +++++++---
 xen/include/public/arch-x86/cpuid.h |  6 ++++++
 xen/include/public/domctl.h         |  1 +
 8 files changed, 33 insertions(+), 9 deletions(-)

Comments

Jan Beulich Jan. 24, 2022, 1:47 p.m. UTC | #1
On 20.01.2022 16:23, Roger Pau Monne wrote:
> Both QEMU/KVM and HyperV support using bits 11:5 from the MSI address
> field in order to store the high part of the target APIC ID. This
> allows expanding the maximum APID ID usable without interrupt
> remapping support from 255 to 32768.
> 
> Note the interface used by QEMU for emulated devices (via the
> XEN_DMOP_inject_msi hypercall) already passes both the address and
> data fields into Xen for processing, so there's no need for any change
> to QEMU there.
> 
> However for PCI passthrough devices QEMU uses the
> XEN_DOMCTL_bind_pt_irq hypercall which does need an addition to the
> gflags field in order to pass the high bits of the APIC destination
> ID.
> 
> Introduce a new CPUID flag to signal the support for the feature. The
> introduced flag covers both the support for extended ID for the
> IO-APIC RTE and the MSI address registers. Such flag is currently only
> exposed when the domain is using vPCI (ie: a PVH dom0).

Because of also covering the IO-APIC side, I think the CPUID aspect of
this really wants splitting into a 3rd patch. That way the MSI and
IO-APIC parts could in principle go in independently, and only the
CPUID one needs to remain at the tail.

> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -66,7 +66,7 @@ static void vmsi_inj_irq(
>  
>  int vmsi_deliver(
>      struct domain *d, int vector,
> -    uint8_t dest, uint8_t dest_mode,
> +    unsigned int dest, unsigned int dest_mode,

If you change the type of dest_mode, then to "bool" please - see its
only call site.

> @@ -123,7 +125,8 @@ void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
>  }
>  
>  /* Return value, -1 : multi-dests, non-negative value: dest_vcpu_id */
> -int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode)
> +int hvm_girq_dest_2_vcpu_id(struct domain *d, unsigned int dest,
> +                            unsigned int dest_mode)

Same here then.

> --- a/xen/arch/x86/include/asm/msi.h
> +++ b/xen/arch/x86/include/asm/msi.h
> @@ -54,6 +54,7 @@
>  #define MSI_ADDR_DEST_ID_SHIFT		12
>  #define	 MSI_ADDR_DEST_ID_MASK		0x00ff000
>  #define  MSI_ADDR_DEST_ID(dest)		(((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
> +#define	 MSI_ADDR_EXT_DEST_ID_MASK	0x0000fe0

Especially the immediately preceding macro now becomes kind of stale.

> --- a/xen/drivers/passthrough/x86/hvm.c
> +++ b/xen/drivers/passthrough/x86/hvm.c
> @@ -269,7 +269,7 @@ int pt_irq_create_bind(
>      {
>      case PT_IRQ_TYPE_MSI:
>      {
> -        uint8_t dest, delivery_mode;
> +        unsigned int dest, delivery_mode;
>          bool dest_mode;

If you touch delivery_mode's type, wouldn't that better become bool?

> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -588,6 +588,7 @@ struct xen_domctl_bind_pt_irq {
>  #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
>  #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
>  #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
> +#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000

I'm not convinced it is a good idea to limit the overall destination
ID width to 15 bits here - at the interface level we could as well
permit more bits right away; the implementation would reject too high
a value, of course. Not only with this I further wonder whether the
field shouldn't be unsplit while extending it. You won't get away
without bumping XEN_DOMCTL_INTERFACE_VERSION anyway (unless it was
bumped already for 4.17) since afaics the unused bits of this field
previously weren't checked for being zero. We could easily have 8
bits vector, 16 bits flags, and 32 bits destination ID in the struct.
And there would then still be 8 unused bits (which from now on we
ought to check for being zero).

Jan
David Woodhouse Jan. 26, 2022, 1:54 p.m. UTC | #2
On Mon, 2022-01-24 at 14:47 +0100, Jan Beulich wrote:
> Because of also covering the IO-APIC side, I think the CPUID aspect of
> this really wants splitting into a 3rd patch. That way the MSI and
> IO-APIC parts could in principle go in independently, and only the
> CPUID one needs to remain at the tail.

HPET can generate MSIs directly too.
Roger Pau Monné Jan. 26, 2022, 2:22 p.m. UTC | #3
On Wed, Jan 26, 2022 at 01:54:26PM +0000, David Woodhouse wrote:
> On Mon, 2022-01-24 at 14:47 +0100, Jan Beulich wrote:
> > Because of also covering the IO-APIC side, I think the CPUID aspect of
> > this really wants splitting into a 3rd patch. That way the MSI and
> > IO-APIC parts could in principle go in independently, and only the
> > CPUID one needs to remain at the tail.
> 
> HPET can generate MSIs directly too.

Indeed, but the emulated one we expose to HVM guests doesn't support
FSB.

Thanks, Roger.
Roger Pau Monné Feb. 4, 2022, 9:23 a.m. UTC | #4
On Mon, Jan 24, 2022 at 02:47:58PM +0100, Jan Beulich wrote:
> On 20.01.2022 16:23, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/include/asm/msi.h
> > +++ b/xen/arch/x86/include/asm/msi.h
> > @@ -54,6 +54,7 @@
> >  #define MSI_ADDR_DEST_ID_SHIFT		12
> >  #define	 MSI_ADDR_DEST_ID_MASK		0x00ff000
> >  #define  MSI_ADDR_DEST_ID(dest)		(((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
> > +#define	 MSI_ADDR_EXT_DEST_ID_MASK	0x0000fe0
> 
> Especially the immediately preceding macro now becomes kind of stale.

Hm, I'm not so sure about that. We could expand the macro to place the
high bits in dest at bits 11:4 of the resulting address. However that
macro (MSI_ADDR_DEST_ID) is only used by Xen to compose its own
messages, so until we add support for the hypervisor itself to use the
extended destination ID mode there's not much point in modifying the
macro IMO.

> 
> > --- a/xen/drivers/passthrough/x86/hvm.c
> > +++ b/xen/drivers/passthrough/x86/hvm.c
> > @@ -269,7 +269,7 @@ int pt_irq_create_bind(
> >      {
> >      case PT_IRQ_TYPE_MSI:
> >      {
> > -        uint8_t dest, delivery_mode;
> > +        unsigned int dest, delivery_mode;
> >          bool dest_mode;
> 
> If you touch delivery_mode's type, wouldn't that better become bool?
> 
> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > @@ -588,6 +588,7 @@ struct xen_domctl_bind_pt_irq {
> >  #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
> >  #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
> >  #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
> > +#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000
> 
> I'm not convinced it is a good idea to limit the overall destination
> ID width to 15 bits here - at the interface level we could as well
> permit more bits right away; the implementation would reject too high
> a value, of course. Not only with this I further wonder whether the
> field shouldn't be unsplit while extending it. You won't get away
> without bumping XEN_DOMCTL_INTERFACE_VERSION anyway (unless it was
> bumped already for 4.17) since afaics the unused bits of this field
> previously weren't checked for being zero. We could easily have 8
> bits vector, 16 bits flags, and 32 bits destination ID in the struct.
> And there would then still be 8 unused bits (which from now on we
> ought to check for being zero).

So I've made gflags a 64bit field, used the high 32bits for the
destination ID, and the low ones for flags. I've left gvec as a
separate field in the struct, as I don't want to require a
modification to QEMU, just a rebuild against the updated headers will
be enough.

I've been wondering about this interface though
(xen_domctl_bind_pt_irq), and it would seem better to just pass the
raw MSI address and data fields from the guest and let Xen do the
decoding. This however is not trivial to do as we would need to keep
the previous interface anyway as it's used by QEMU. Maybe we could
have some kind of union between a pair of address and data fields and
a gflags one that would match the native layout, but as said not
trivial (and would require using anonymous unions which I'm not sure
are accepted even for domctls in the public headers).

Thanks, Roger.
Jan Beulich Feb. 4, 2022, 9:30 a.m. UTC | #5
On 04.02.2022 10:23, Roger Pau Monné wrote:
> On Mon, Jan 24, 2022 at 02:47:58PM +0100, Jan Beulich wrote:
>> On 20.01.2022 16:23, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/include/asm/msi.h
>>> +++ b/xen/arch/x86/include/asm/msi.h
>>> @@ -54,6 +54,7 @@
>>>  #define MSI_ADDR_DEST_ID_SHIFT		12
>>>  #define	 MSI_ADDR_DEST_ID_MASK		0x00ff000
>>>  #define  MSI_ADDR_DEST_ID(dest)		(((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
>>> +#define	 MSI_ADDR_EXT_DEST_ID_MASK	0x0000fe0
>>
>> Especially the immediately preceding macro now becomes kind of stale.
> 
> Hm, I'm not so sure about that. We could expand the macro to place the
> high bits in dest at bits 11:4 of the resulting address. However that
> macro (MSI_ADDR_DEST_ID) is only used by Xen to compose its own
> messages, so until we add support for the hypervisor itself to use the
> extended destination ID mode there's not much point in modifying the
> macro IMO.

Well, this is all unhelpful considering the different form of extended
ID in Intel's doc. At least by way of a comment things need clarifying
and potential pitfalls need pointing out imo.

>>> --- a/xen/include/public/domctl.h
>>> +++ b/xen/include/public/domctl.h
>>> @@ -588,6 +588,7 @@ struct xen_domctl_bind_pt_irq {
>>>  #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
>>>  #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
>>>  #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
>>> +#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000
>>
>> I'm not convinced it is a good idea to limit the overall destination
>> ID width to 15 bits here - at the interface level we could as well
>> permit more bits right away; the implementation would reject too high
>> a value, of course. Not only with this I further wonder whether the
>> field shouldn't be unsplit while extending it. You won't get away
>> without bumping XEN_DOMCTL_INTERFACE_VERSION anyway (unless it was
>> bumped already for 4.17) since afaics the unused bits of this field
>> previously weren't checked for being zero. We could easily have 8
>> bits vector, 16 bits flags, and 32 bits destination ID in the struct.
>> And there would then still be 8 unused bits (which from now on we
>> ought to check for being zero).
> 
> So I've made gflags a 64bit field, used the high 32bits for the
> destination ID, and the low ones for flags. I've left gvec as a
> separate field in the struct, as I don't want to require a
> modification to QEMU, just a rebuild against the updated headers will
> be enough.

Hmm, wait - if qemu uses this without going through a suitable
abstraction in at least libxc, then we cannot _ever_ change this
interface: If a rebuild was required, old qemu binaries would
stop working with newer Xen. If such a dependency indeed exists,
we'll need a prominent warning comment in the public header.

Jan

> I've been wondering about this interface though
> (xen_domctl_bind_pt_irq), and it would seem better to just pass the
> raw MSI address and data fields from the guest and let Xen do the
> decoding. This however is not trivial to do as we would need to keep
> the previous interface anyway as it's used by QEMU. Maybe we could
> have some kind of union between a pair of address and data fields and
> a gflags one that would match the native layout, but as said not
> trivial (and would require using anonymous unions which I'm not sure
> are accepted even for domctls in the public headers).
> 
> Thanks, Roger.
>
Roger Pau Monné Feb. 4, 2022, 9:54 a.m. UTC | #6
On Fri, Feb 04, 2022 at 10:30:54AM +0100, Jan Beulich wrote:
> On 04.02.2022 10:23, Roger Pau Monné wrote:
> > On Mon, Jan 24, 2022 at 02:47:58PM +0100, Jan Beulich wrote:
> >> On 20.01.2022 16:23, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/include/asm/msi.h
> >>> +++ b/xen/arch/x86/include/asm/msi.h
> >>> @@ -54,6 +54,7 @@
> >>>  #define MSI_ADDR_DEST_ID_SHIFT		12
> >>>  #define	 MSI_ADDR_DEST_ID_MASK		0x00ff000
> >>>  #define  MSI_ADDR_DEST_ID(dest)		(((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
> >>> +#define	 MSI_ADDR_EXT_DEST_ID_MASK	0x0000fe0
> >>
> >> Especially the immediately preceding macro now becomes kind of stale.
> > 
> > Hm, I'm not so sure about that. We could expand the macro to place the
> > high bits in dest at bits 11:4 of the resulting address. However that
> > macro (MSI_ADDR_DEST_ID) is only used by Xen to compose its own
> > messages, so until we add support for the hypervisor itself to use the
> > extended destination ID mode there's not much point in modifying the
> > macro IMO.
> 
> Well, this is all unhelpful considering the different form of extended
> ID in Intel's doc. At least by way of a comment things need clarifying
> and potential pitfalls need pointing out imo.

Sure, will add some comments there.

> >>> --- a/xen/include/public/domctl.h
> >>> +++ b/xen/include/public/domctl.h
> >>> @@ -588,6 +588,7 @@ struct xen_domctl_bind_pt_irq {
> >>>  #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
> >>>  #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
> >>>  #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
> >>> +#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000
> >>
> >> I'm not convinced it is a good idea to limit the overall destination
> >> ID width to 15 bits here - at the interface level we could as well
> >> permit more bits right away; the implementation would reject too high
> >> a value, of course. Not only with this I further wonder whether the
> >> field shouldn't be unsplit while extending it. You won't get away
> >> without bumping XEN_DOMCTL_INTERFACE_VERSION anyway (unless it was
> >> bumped already for 4.17) since afaics the unused bits of this field
> >> previously weren't checked for being zero. We could easily have 8
> >> bits vector, 16 bits flags, and 32 bits destination ID in the struct.
> >> And there would then still be 8 unused bits (which from now on we
> >> ought to check for being zero).
> > 
> > So I've made gflags a 64bit field, used the high 32bits for the
> > destination ID, and the low ones for flags. I've left gvec as a
> > separate field in the struct, as I don't want to require a
> > modification to QEMU, just a rebuild against the updated headers will
> > be enough.
> 
> Hmm, wait - if qemu uses this without going through a suitable
> abstraction in at least libxc, then we cannot _ever_ change this
> interface: If a rebuild was required, old qemu binaries would
> stop working with newer Xen. If such a dependency indeed exists,
> we'll need a prominent warning comment in the public header.

Hm, it's bad. The xc_domain_update_msi_irq interface uses a gflags
parameter that's the gflags parameter of xen_domctl_bind_pt_irq. Which
is even worse because it's not using the mask definitions from
domctl.h, but rather a copy of them named XEN_PT_GFLAGS_* that are
hardcoded in xen_pt_msi.c in QEMU code.

So we can likely expand the layout of gflags, but moving fields is not
an option. I think my original proposal of adding a
XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK mask is the less bad option until
we add a new stable interface for device passthrough for QEMU.

Thanks, Roger.
Jan Beulich Feb. 4, 2022, 10:20 a.m. UTC | #7
On 04.02.2022 10:54, Roger Pau Monné wrote:
> On Fri, Feb 04, 2022 at 10:30:54AM +0100, Jan Beulich wrote:
>> On 04.02.2022 10:23, Roger Pau Monné wrote:
>>> On Mon, Jan 24, 2022 at 02:47:58PM +0100, Jan Beulich wrote:
>>>> On 20.01.2022 16:23, Roger Pau Monne wrote:
>>>>> --- a/xen/arch/x86/include/asm/msi.h
>>>>> +++ b/xen/arch/x86/include/asm/msi.h
>>>>> @@ -54,6 +54,7 @@
>>>>>  #define MSI_ADDR_DEST_ID_SHIFT		12
>>>>>  #define	 MSI_ADDR_DEST_ID_MASK		0x00ff000
>>>>>  #define  MSI_ADDR_DEST_ID(dest)		(((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
>>>>> +#define	 MSI_ADDR_EXT_DEST_ID_MASK	0x0000fe0
>>>>
>>>> Especially the immediately preceding macro now becomes kind of stale.
>>>
>>> Hm, I'm not so sure about that. We could expand the macro to place the
>>> high bits in dest at bits 11:4 of the resulting address. However that
>>> macro (MSI_ADDR_DEST_ID) is only used by Xen to compose its own
>>> messages, so until we add support for the hypervisor itself to use the
>>> extended destination ID mode there's not much point in modifying the
>>> macro IMO.
>>
>> Well, this is all unhelpful considering the different form of extended
>> ID in Intel's doc. At least by way of a comment things need clarifying
>> and potential pitfalls need pointing out imo.
> 
> Sure, will add some comments there.
> 
>>>>> --- a/xen/include/public/domctl.h
>>>>> +++ b/xen/include/public/domctl.h
>>>>> @@ -588,6 +588,7 @@ struct xen_domctl_bind_pt_irq {
>>>>>  #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
>>>>>  #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
>>>>>  #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
>>>>> +#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000
>>>>
>>>> I'm not convinced it is a good idea to limit the overall destination
>>>> ID width to 15 bits here - at the interface level we could as well
>>>> permit more bits right away; the implementation would reject too high
>>>> a value, of course. Not only with this I further wonder whether the
>>>> field shouldn't be unsplit while extending it. You won't get away
>>>> without bumping XEN_DOMCTL_INTERFACE_VERSION anyway (unless it was
>>>> bumped already for 4.17) since afaics the unused bits of this field
>>>> previously weren't checked for being zero. We could easily have 8
>>>> bits vector, 16 bits flags, and 32 bits destination ID in the struct.
>>>> And there would then still be 8 unused bits (which from now on we
>>>> ought to check for being zero).
>>>
>>> So I've made gflags a 64bit field, used the high 32bits for the
>>> destination ID, and the low ones for flags. I've left gvec as a
>>> separate field in the struct, as I don't want to require a
>>> modification to QEMU, just a rebuild against the updated headers will
>>> be enough.
>>
>> Hmm, wait - if qemu uses this without going through a suitable
>> abstraction in at least libxc, then we cannot _ever_ change this
>> interface: If a rebuild was required, old qemu binaries would
>> stop working with newer Xen. If such a dependency indeed exists,
>> we'll need a prominent warning comment in the public header.
> 
> Hm, it's bad. The xc_domain_update_msi_irq interface uses a gflags
> parameter that's the gflags parameter of xen_domctl_bind_pt_irq. Which
> is even worse because it's not using the mask definitions from
> domctl.h, but rather a copy of them named XEN_PT_GFLAGS_* that are
> hardcoded in xen_pt_msi.c in QEMU code.
> 
> So we can likely expand the layout of gflags, but moving fields is not
> an option. I think my original proposal of adding a
> XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK mask is the less bad option until
> we add a new stable interface for device passthrough for QEMU.

Given the observations - yeah, not much of a choice left.

Jan
diff mbox series

Patch

diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index 52aae4565f..b9b5182369 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -383,7 +383,8 @@  int hvm_set_pci_link_route(struct domain *d, u8 link, u8 isa_irq)
 int hvm_inject_msi(struct domain *d, uint64_t addr, uint32_t data)
 {
     uint32_t tmp = (uint32_t) addr;
-    uint8_t  dest = (tmp & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
+    unsigned int dest = (MASK_EXTR(tmp, MSI_ADDR_EXT_DEST_ID_MASK) << 8) |
+                        MASK_EXTR(tmp, MSI_ADDR_DEST_ID_MASK);
     uint8_t  dest_mode = !!(tmp & MSI_ADDR_DESTMODE_MASK);
     uint8_t  delivery_mode = (data & MSI_DATA_DELIVERY_MODE_MASK)
         >> MSI_DATA_DELIVERY_MODE_SHIFT;
diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 13e2a190b4..ec0f3bc13f 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -66,7 +66,7 @@  static void vmsi_inj_irq(
 
 int vmsi_deliver(
     struct domain *d, int vector,
-    uint8_t dest, uint8_t dest_mode,
+    unsigned int dest, unsigned int dest_mode,
     uint8_t delivery_mode, uint8_t trig_mode)
 {
     struct vlapic *target;
@@ -107,7 +107,9 @@  void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
 {
     uint32_t flags = pirq_dpci->gmsi.gflags;
     int vector = pirq_dpci->gmsi.gvec;
-    uint8_t dest = (uint8_t)flags;
+    unsigned int dest = MASK_EXTR(flags, XEN_DOMCTL_VMSI_X86_DEST_ID_MASK) |
+                        (MASK_EXTR(flags,
+                                   XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK) << 8);
     bool dest_mode = flags & XEN_DOMCTL_VMSI_X86_DM_MASK;
     uint8_t delivery_mode = MASK_EXTR(flags, XEN_DOMCTL_VMSI_X86_DELIV_MASK);
     bool trig_mode = flags & XEN_DOMCTL_VMSI_X86_TRIG_MASK;
@@ -123,7 +125,8 @@  void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *pirq_dpci)
 }
 
 /* Return value, -1 : multi-dests, non-negative value: dest_vcpu_id */
-int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode)
+int hvm_girq_dest_2_vcpu_id(struct domain *d, unsigned int dest,
+                            unsigned int dest_mode)
 {
     int dest_vcpu_id = -1, w = 0;
     struct vcpu *v;
@@ -645,6 +648,8 @@  static unsigned int msi_gflags(uint16_t data, uint64_t addr, bool masked)
      */
     return MASK_INSR(MASK_EXTR(addr, MSI_ADDR_DEST_ID_MASK),
                      XEN_DOMCTL_VMSI_X86_DEST_ID_MASK) |
+           MASK_INSR(MASK_EXTR(addr, MSI_ADDR_EXT_DEST_ID_MASK),
+                     XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK) |
            MASK_INSR(MASK_EXTR(addr, MSI_ADDR_REDIRECTION_MASK),
                      XEN_DOMCTL_VMSI_X86_RH_MASK) |
            MASK_INSR(MASK_EXTR(addr, MSI_ADDR_DESTMODE_MASK),
@@ -835,6 +840,7 @@  void vpci_msi_arch_print(const struct vpci_msi *msi)
            msi->data & MSI_DATA_LEVEL_ASSERT ? "" : "de",
            msi->address & MSI_ADDR_DESTMODE_LOGIC ? "log" : "phys",
            msi->address & MSI_ADDR_REDIRECTION_LOWPRI ? "lowest" : "fixed",
+           (MASK_EXTR(msi->address, MSI_ADDR_EXT_DEST_ID_MASK) << 8) |
            MASK_EXTR(msi->address, MSI_ADDR_DEST_ID_MASK),
            msi->arch.pirq);
 }
@@ -904,6 +910,7 @@  int vpci_msix_arch_print(const struct vpci_msix *msix)
                entry->data & MSI_DATA_LEVEL_ASSERT ? "" : "de",
                entry->addr & MSI_ADDR_DESTMODE_LOGIC ? "log" : "phys",
                entry->addr & MSI_ADDR_REDIRECTION_LOWPRI ? "lowest" : "fixed",
+               (MASK_EXTR(entry->addr, MSI_ADDR_EXT_DEST_ID_MASK) << 8) |
                MASK_EXTR(entry->addr, MSI_ADDR_DEST_ID_MASK),
                entry->masked, entry->arch.pirq);
         if ( i && !(i % 64) )
diff --git a/xen/arch/x86/include/asm/hvm/hvm.h b/xen/arch/x86/include/asm/hvm/hvm.h
index b26302d9e7..f001b43a21 100644
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -270,11 +270,12 @@  uint64_t hvm_get_guest_time_fixed(const struct vcpu *v, uint64_t at_tsc);
 
 int vmsi_deliver(
     struct domain *d, int vector,
-    uint8_t dest, uint8_t dest_mode,
+    unsigned int dest, unsigned int dest_mode,
     uint8_t delivery_mode, uint8_t trig_mode);
 struct hvm_pirq_dpci;
 void vmsi_deliver_pirq(struct domain *d, const struct hvm_pirq_dpci *);
-int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
+int hvm_girq_dest_2_vcpu_id(struct domain *d, unsigned int dest,
+                            unsigned int dest_mode);
 
 enum hvm_intblk
 hvm_interrupt_blocked(struct vcpu *v, struct hvm_intack intack);
diff --git a/xen/arch/x86/include/asm/msi.h b/xen/arch/x86/include/asm/msi.h
index e228b0f3f3..531b860e42 100644
--- a/xen/arch/x86/include/asm/msi.h
+++ b/xen/arch/x86/include/asm/msi.h
@@ -54,6 +54,7 @@ 
 #define MSI_ADDR_DEST_ID_SHIFT		12
 #define	 MSI_ADDR_DEST_ID_MASK		0x00ff000
 #define  MSI_ADDR_DEST_ID(dest)		(((dest) << MSI_ADDR_DEST_ID_SHIFT) & MSI_ADDR_DEST_ID_MASK)
+#define	 MSI_ADDR_EXT_DEST_ID_MASK	0x0000fe0
 
 /* MAX fixed pages reserved for mapping MSIX tables. */
 #define FIX_MSIX_MAX_PAGES              512
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 485bd66971..3d2d75978c 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1150,6 +1150,9 @@  void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
         res->a |= XEN_HVM_CPUID_DOMID_PRESENT;
         res->c = d->domain_id;
 
+        if ( has_vpci(d) )
+            res->a |= XEN_HVM_CPUID_EXT_DEST_ID;
+
         break;
 
     case 5: /* PV-specific parameters */
diff --git a/xen/drivers/passthrough/x86/hvm.c b/xen/drivers/passthrough/x86/hvm.c
index 351daafdc9..666c4b7757 100644
--- a/xen/drivers/passthrough/x86/hvm.c
+++ b/xen/drivers/passthrough/x86/hvm.c
@@ -269,7 +269,7 @@  int pt_irq_create_bind(
     {
     case PT_IRQ_TYPE_MSI:
     {
-        uint8_t dest, delivery_mode;
+        unsigned int dest, delivery_mode;
         bool dest_mode;
         int dest_vcpu_id;
         const struct vcpu *vcpu;
@@ -345,7 +345,9 @@  int pt_irq_create_bind(
         }
         /* Calculate dest_vcpu_id for MSI-type pirq migration. */
         dest = MASK_EXTR(pirq_dpci->gmsi.gflags,
-                         XEN_DOMCTL_VMSI_X86_DEST_ID_MASK);
+                         XEN_DOMCTL_VMSI_X86_DEST_ID_MASK) |
+               (MASK_EXTR(pirq_dpci->gmsi.gflags,
+                         XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK) << 8);
         dest_mode = pirq_dpci->gmsi.gflags & XEN_DOMCTL_VMSI_X86_DM_MASK;
         delivery_mode = MASK_EXTR(pirq_dpci->gmsi.gflags,
                                   XEN_DOMCTL_VMSI_X86_DELIV_MASK);
@@ -782,7 +784,9 @@  static int _hvm_dpci_msi_eoi(struct domain *d,
          (pirq_dpci->gmsi.gvec == vector) )
     {
         unsigned int dest = MASK_EXTR(pirq_dpci->gmsi.gflags,
-                                      XEN_DOMCTL_VMSI_X86_DEST_ID_MASK);
+                                      XEN_DOMCTL_VMSI_X86_DEST_ID_MASK) |
+                            (MASK_EXTR(pirq_dpci->gmsi.gflags,
+                                XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK) << 8);
         bool dest_mode = pirq_dpci->gmsi.gflags & XEN_DOMCTL_VMSI_X86_DM_MASK;
 
         if ( vlapic_match_dest(vcpu_vlapic(current), NULL, 0, dest,
diff --git a/xen/include/public/arch-x86/cpuid.h b/xen/include/public/arch-x86/cpuid.h
index ce46305bee..49bcc93b6b 100644
--- a/xen/include/public/arch-x86/cpuid.h
+++ b/xen/include/public/arch-x86/cpuid.h
@@ -102,6 +102,12 @@ 
 #define XEN_HVM_CPUID_IOMMU_MAPPINGS   (1u << 2)
 #define XEN_HVM_CPUID_VCPU_ID_PRESENT  (1u << 3) /* vcpu id is present in EBX */
 #define XEN_HVM_CPUID_DOMID_PRESENT    (1u << 4) /* domid is present in ECX */
+/*
+ * Bits 55:49 from the IO-APIC RTE and bits 11:5 from the MSI address can be
+ * used to store high bits for the Destination ID. This expands the Destination
+ * ID field from 8 to 15 bits, allowing to target APIC IDs up 32768.
+ */
+#define XEN_HVM_CPUID_EXT_DEST_ID      (1u << 5)
 
 /*
  * Leaf 6 (0x40000x05)
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index b85e6170b0..17ac7ef82b 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -588,6 +588,7 @@  struct xen_domctl_bind_pt_irq {
 #define XEN_DOMCTL_VMSI_X86_DELIV_MASK   0x007000
 #define XEN_DOMCTL_VMSI_X86_TRIG_MASK    0x008000
 #define XEN_DOMCTL_VMSI_X86_UNMASKED     0x010000
+#define XEN_DOMCTL_VMSI_X86_EXT_DEST_ID_MASK 0xfe0000
 
             uint64_aligned_t gtable;
         } msi;