
[V3] x86/HVM: Introduce struct hvm_pi_ops

Message ID 1484196428-8899-1-git-send-email-suravee.suthikulpanit@amd.com (mailing list archive)
State New, archived

Commit Message

Suthikulpanit, Suravee Jan. 12, 2017, 4:47 a.m. UTC
The function pointers currently kept in struct vmx_domain for managing HVM
posted interrupts can also be used by SVM AVIC. Therefore, this patch
introduces struct hvm_pi_ops in struct hvm_domain to hold them.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
---
Changes from V2:
    * Remove "pi_" prefix from the function pointers.
    * Reword and move VMX-specific description.

NOTE: I have separated this patch out from the AMD AVIC patch series
      since this mainly affects HVM and VMX code.

 xen/arch/x86/hvm/vmx/vmx.c         | 73 +++++++++++++++++++++++++++++---------
 xen/include/asm-x86/hvm/domain.h   | 34 ++++++++++++++++++
 xen/include/asm-x86/hvm/hvm.h      |  4 +--
 xen/include/asm-x86/hvm/vmx/vmcs.h | 59 ------------------------------
 4 files changed, 93 insertions(+), 77 deletions(-)
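
[Editorial note: to illustrate the point of the refactoring, here is a hedged
sketch of how an SVM AVIC implementation could later install its own callbacks
into the same hvm_domain.pi_ops slots; every svm_avic_* identifier below is a
hypothetical placeholder, not part of this patch or of any posted series.]

    /*
     * Illustrative sketch only -- all svm_avic_* names are hypothetical
     * placeholders.  With the hooks living in hvm_domain.pi_ops rather
     * than in struct vmx_domain, SVM AVIC code can assign its own
     * handlers the same way vmx_pi_hooks_assign() does below.
     */
    static void svm_avic_hooks_assign(struct domain *d)
    {
        /* Guard condition is a placeholder for whatever AVIC checks apply. */
        if ( !has_hvm_container_domain(d) )
            return;

        d->arch.hvm_domain.pi_ops.vcpu_block  = svm_avic_vcpu_block;
        d->arch.hvm_domain.pi_ops.switch_from = svm_avic_ctxt_switch_from;
        d->arch.hvm_domain.pi_ops.switch_to   = svm_avic_ctxt_switch_to;
        d->arch.hvm_domain.pi_ops.do_resume   = svm_avic_do_resume;
    }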

Comments

Jan Beulich Jan. 12, 2017, 12:37 p.m. UTC | #1
>>> On 12.01.17 at 05:47, <suravee.suthikulpanit@amd.com> wrote:
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -72,6 +72,38 @@ struct hvm_ioreq_server {
>      bool_t                 bufioreq_atomic;
>  };
>  
> +/*
> + * This structure defines function hooks to support hardware-assisted
> + * virtual interrupt delivery to guest. (e.g. VMX PI and SVM AVIC).
> + *
> + * The way we implement this now is by looking at what needs to happen on
> + * the following runstate transitions:
> + *
> + * A: runnable -> running  : switch_to
> + * B: running  -> runnable : switch_from

I don't think this is accurate - switch_from may be used when
transitioning to states other than runnable. Both hooks really
are context switch rather than run state related afaict.

> + * C: running  -> blocked  : vcpu_block
> + * D: blocked  -> runnable : do_resume

This one is, I think, similarly not so much run state related, but
instead used when a vCPU is about to enter guest context again.
The descriptions following this are more accurate. I note that
you've cloned this from the original, VMX specific description, but
I wonder whether that wouldn't better be corrected in this regard
at once. Otoh you're not making things worse, so if others are
in favor of it going in as is, I'd be fine too.

Other than that I'm fine with the patch, albeit I'd like the VMX
maintainers to specifically take a close look at the new placement
of the VMX specific comment.

Jan
Tian, Kevin Jan. 16, 2017, 2:13 a.m. UTC | #2
> From: Suravee Suthikulpanit [mailto:suravee.suthikulpanit@amd.com]
> Sent: Thursday, January 12, 2017 12:47 PM
> 
> The current function pointers in struct vmx_domain for managing hvm
> posted interrupt can be used also by SVM AVIC. Therefore, this patch
> introduces the struct hvm_pi_ops in the struct hvm_domain to hold them.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> ---
> Changes from V2:
>     * Remove "pi_" prefix from the function pointers.
>     * Reword and move VMX-specific description.
> 
> NOTE: I have separated this patch out from the AMD AVIC patch series
>       since this mainly affects HVM and VMX code.
> 
>  xen/arch/x86/hvm/vmx/vmx.c         | 73 +++++++++++++++++++++++++++++---------
>  xen/include/asm-x86/hvm/domain.h   | 34 ++++++++++++++++++
>  xen/include/asm-x86/hvm/hvm.h      |  4 +--
>  xen/include/asm-x86/hvm/vmx/vmcs.h | 59 ------------------------------
>  4 files changed, 93 insertions(+), 77 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index 7b2c50c..0854e17 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -103,6 +103,47 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
>      spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
>  }
> 
> +/*
> + * To handle posted interrupts correctly, we need to set the following
> + * state:
> + *
> + * * The PI notification vector (NV)
> + * * The PI notification destination processor (NDST)
> + * * The PI "suppress notification" bit (SN)
> + * * The vcpu pi "blocked" list
> + *
> + * VMX implements the runstate transitions as the following:
> + *
> + * A: runnable -> running
> + *  - SN = 0
> + *  - NDST = v->processor
> + *  If a VM is currently running, we want the PI delivered to the guest vcpu
> + *  on the proper pcpu.
> + *
> + * B: running -> runnable
> + *  - SN = 1
> + *
> + * C: running -> blocked
> + *  - SN = 0
> + *  - NV = pi_wakeup_vector
> + *  - Add vcpu to blocked list
> + *  If the vm is blocked, we want the PI delivered to Xen so that it can
> + *  wake it up.
> + *
> + * D: blocked -> runnable
> + *  - SN = 0
> + *  - NV = posted_intr_vector
> + *  - Take vcpu off blocked list
> + *  If the VM is currently either preempted or offline (i.e., not running
> + *  because of some reason other than blocking waiting for an interrupt),
> + *  there's nothing Xen can do -- we want the interrupt pending bit set in
> + *  the guest, but we don't want to bother Xen with an interrupt (SN clear).

I don't know how much of the above is common between VMX and SVM. Can you
post the SVM counterpart here so we can see the possibility of making it a
general HVM description?

If we have to keep the above VMX-specific description, I'd prefer putting
it along with vmx_pi_hooks_assign.

> + *
> + * There's a brief window of time between vmx_intr_assist() and checking
> + * softirqs where if an interrupt comes in it may be lost; so we need Xen
> + * to get an interrupt and raise a softirq so that it will go through the
> + * vmx_intr_assist() path again (SN clear, NV = posted_interrupt).
> + */
>  static void vmx_vcpu_block(struct vcpu *v)
>  {
>      unsigned long flags;
> @@ -204,12 +245,12 @@ void vmx_pi_hooks_assign(struct domain *d)
>      if ( !iommu_intpost || !has_hvm_container_domain(d) )
>          return;
> 
> -    ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> +    ASSERT(!d->arch.hvm_domain.pi_ops.vcpu_block);
> 
> -    d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> -    d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
> -    d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> -    d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> +    d->arch.hvm_domain.pi_ops.vcpu_block = vmx_vcpu_block;
> +    d->arch.hvm_domain.pi_ops.switch_from = vmx_pi_switch_from;
> +    d->arch.hvm_domain.pi_ops.switch_to = vmx_pi_switch_to;
> +    d->arch.hvm_domain.pi_ops.do_resume = vmx_pi_do_resume;
>  }
> 
>  /* This function is called when pcidevs_lock is held */
> @@ -218,12 +259,12 @@ void vmx_pi_hooks_deassign(struct domain *d)
>      if ( !iommu_intpost || !has_hvm_container_domain(d) )
>          return;
> 
> -    ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
> +    ASSERT(d->arch.hvm_domain.pi_ops.vcpu_block);
> 
> -    d->arch.hvm_domain.vmx.vcpu_block = NULL;
> -    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
> -    d->arch.hvm_domain.vmx.pi_switch_to = NULL;
> -    d->arch.hvm_domain.vmx.pi_do_resume = NULL;
> +    d->arch.hvm_domain.pi_ops.vcpu_block = NULL;
> +    d->arch.hvm_domain.pi_ops.switch_from = NULL;
> +    d->arch.hvm_domain.pi_ops.switch_to = NULL;
> +    d->arch.hvm_domain.pi_ops.do_resume = NULL;
>  }
> 
>  static int vmx_domain_initialise(struct domain *d)
> @@ -901,8 +942,8 @@ static void vmx_ctxt_switch_from(struct vcpu *v)
>      vmx_restore_host_msrs();
>      vmx_save_dr(v);
> 
> -    if ( v->domain->arch.hvm_domain.vmx.pi_switch_from )
> -        v->domain->arch.hvm_domain.vmx.pi_switch_from(v);
> +    if ( v->domain->arch.hvm_domain.pi_ops.switch_from )
> +        v->domain->arch.hvm_domain.pi_ops.switch_from(v);
>  }
> 
>  static void vmx_ctxt_switch_to(struct vcpu *v)
> @@ -916,8 +957,8 @@ static void vmx_ctxt_switch_to(struct vcpu *v)
>      vmx_restore_guest_msrs(v);
>      vmx_restore_dr(v);
> 
> -    if ( v->domain->arch.hvm_domain.vmx.pi_switch_to )
> -        v->domain->arch.hvm_domain.vmx.pi_switch_to(v);
> +    if ( v->domain->arch.hvm_domain.pi_ops.switch_to )
> +        v->domain->arch.hvm_domain.pi_ops.switch_to(v);
>  }
> 
> 
> @@ -3963,8 +4004,8 @@ void vmx_vmenter_helper(const struct cpu_user_regs *regs)
>      struct hvm_vcpu_asid *p_asid;
>      bool_t need_flush;
> 
> -    if ( curr->domain->arch.hvm_domain.vmx.pi_do_resume )
> -        curr->domain->arch.hvm_domain.vmx.pi_do_resume(curr);
> +    if ( curr->domain->arch.hvm_domain.pi_ops.do_resume )
> +        curr->domain->arch.hvm_domain.pi_ops.do_resume(curr);
> 
>      if ( !cpu_has_vmx_vpid )
>          goto out;
> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
> index f34d784..f7674cd 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -72,6 +72,38 @@ struct hvm_ioreq_server {
>      bool_t                 bufioreq_atomic;
>  };
> 
> +/*
> + * This structure defines function hooks to support hardware-assisted
> + * virtual interrupt delivery to guest. (e.g. VMX PI and SVM AVIC).
> + *
> + * The way we implement this now is by looking at what needs to happen on
> + * the following runstate transitions:
> + *
> + * A: runnable -> running  : switch_to
> + * B: running  -> runnable : switch_from
> + * C: running  -> blocked  : vcpu_block
> + * D: blocked  -> runnable : do_resume
> + *
> + * For transitions A and B, we add hooks into ctx_switch_{to,from} paths.
> + *
> + * For transition C, we add a new arch hook, arch_vcpu_block(), which is
> + * called from vcpu_block() and vcpu_do_poll().
> + *
> + * For transition D, rather than add an extra arch hook on vcpu_wake, we
> + * add a hook on the vmentry path which checks to see if either of the two
> + * actions need to be taken.
> + *
> + * These hooks only need to be called when the domain in question actually
> + * has a physical device assigned to it, so we set and clear the callbacks
> + * as appropriate when device assignment changes.
> + */
> +struct hvm_pi_ops {
> +    void (*vcpu_block) (struct vcpu *);
> +    void (*switch_from) (struct vcpu *v);
> +    void (*switch_to) (struct vcpu *v);
> +    void (*do_resume) (struct vcpu *v);
> +};
> +
>  struct hvm_domain {
>      /* Guest page range used for non-default ioreq servers */
>      struct {
> @@ -148,6 +180,8 @@ struct hvm_domain {
>          struct list_head list;
>      } write_map;
> 
> +    struct hvm_pi_ops pi_ops;
> +
>      union {
>          struct vmx_domain vmx;
>          struct svm_domain svm;
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index 7e7462e..b1e4c75 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -638,8 +638,8 @@ unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t restore);
>      struct vcpu *v_ = (v);                                      \
>      struct domain *d_ = v_->domain;                             \
>      if ( has_hvm_container_domain(d_) &&                        \
> -         (cpu_has_vmx && d_->arch.hvm_domain.vmx.vcpu_block) )  \
> -        d_->arch.hvm_domain.vmx.vcpu_block(v_);                 \
> +         (d_->arch.hvm_domain.pi_ops.vcpu_block) )          \
> +        d_->arch.hvm_domain.pi_ops.vcpu_block(v_);          \
>  })
> 
>  #endif /* __ASM_X86_HVM_HVM_H__ */
> diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
> index 997f4f5..4ec8b08 100644
> --- a/xen/include/asm-x86/hvm/vmx/vmcs.h
> +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
> @@ -77,65 +77,6 @@ struct vmx_domain {
>      unsigned long apic_access_mfn;
>      /* VMX_DOMAIN_* */
>      unsigned int status;
> -
> -    /*
> -     * To handle posted interrupts correctly, we need to set the following
> -     * state:
> -     *
> -     * * The PI notification vector (NV)
> -     * * The PI notification destination processor (NDST)
> -     * * The PI "suppress notification" bit (SN)
> -     * * The vcpu pi "blocked" list
> -     *
> -     * If a VM is currently running, we want the PI delivered to the guest vcpu
> -     * on the proper pcpu (NDST = v->processor, SN clear).
> -     *
> -     * If the vm is blocked, we want the PI delivered to Xen so that it can
> -     * wake it up  (SN clear, NV = pi_wakeup_vector, vcpu on block list).
> -     *
> -     * If the VM is currently either preempted or offline (i.e., not running
> -     * because of some reason other than blocking waiting for an interrupt),
> -     * there's nothing Xen can do -- we want the interrupt pending bit set in
> -     * the guest, but we don't want to bother Xen with an interrupt (SN clear).
> -     *
> -     * There's a brief window of time between vmx_intr_assist() and checking
> -     * softirqs where if an interrupt comes in it may be lost; so we need Xen
> -     * to get an interrupt and raise a softirq so that it will go through the
> -     * vmx_intr_assist() path again (SN clear, NV = posted_interrupt).
> -     *
> -     * The way we implement this now is by looking at what needs to happen on
> -     * the following runstate transitions:
> -     *
> -     * A: runnable -> running
> -     *  - SN = 0
> -     *  - NDST = v->processor
> -     * B: running -> runnable
> -     *  - SN = 1
> -     * C: running -> blocked
> -     *  - NV = pi_wakeup_vector
> -     *  - Add vcpu to blocked list
> -     * D: blocked -> runnable
> -     *  - NV = posted_intr_vector
> -     *  - Take vcpu off blocked list
> -     *
> -     * For transitions A and B, we add hooks into vmx_ctxt_switch_{from,to}
> -     * paths.
> -     *
> -     * For transition C, we add a new arch hook, arch_vcpu_block(), which is
> -     * called from vcpu_block() and vcpu_do_poll().
> -     *
> -     * For transition D, rather than add an extra arch hook on vcpu_wake, we
> -     * add a hook on the vmentry path which checks to see if either of the two
> -     * actions need to be taken.
> -     *
> -     * These hooks only need to be called when the domain in question actually
> -     * has a physical device assigned to it, so we set and clear the callbacks
> -     * as appropriate when device assignment changes.
> -     */
> -    void (*vcpu_block) (struct vcpu *);
> -    void (*pi_switch_from) (struct vcpu *v);
> -    void (*pi_switch_to) (struct vcpu *v);
> -    void (*pi_do_resume) (struct vcpu *v);
>  };
> 
>  struct pi_desc {
> --
> 1.9.1
Suthikulpanit, Suravee Jan. 16, 2017, 1:38 p.m. UTC | #3
Jan,

I would like to update the following to be more correct.

On 1/12/17 19:37, Jan Beulich wrote:
>>>> On 12.01.17 at 05:47, <suravee.suthikulpanit@amd.com> wrote:
>> --- a/xen/include/asm-x86/hvm/domain.h
>> +++ b/xen/include/asm-x86/hvm/domain.h
>> @@ -72,6 +72,38 @@ struct hvm_ioreq_server {
>>      bool_t                 bufioreq_atomic;
>>  };
>>
>> +/*
>> + * This structure defines function hooks to support hardware-assisted
>> + * virtual interrupt delivery to guest. (e.g. VMX PI and SVM AVIC).
>> + *
>> + * The way we implement this now is by looking at what needs to happen on
>> + * the following runstate transitions:
>> + *
>> + * A: runnable -> running  : switch_to
>> + * B: running  -> runnable : switch_from
>
> I don't think this is accurate - switch_from may be used when
> transitioning to states other than runnable.
> Both hooks really are context switch rather than run state related afaict.
>
>> + * C: running  -> blocked  : vcpu_block
>> + * D: blocked  -> runnable : do_resume
>
> This one is, I think, similarly not so much run state related, but
> instead used when a vCPU is about to enter guest context again.

So, what if I generalize and simplify as follows:

+/*
+ * This structure defines function hooks to support hardware-assisted
+ * virtual interrupt delivery to guest. (e.g. VMX PI and SVM AVIC).
+ *
+ * These hooks are defined by the underlying arch-specific code
+ * as needed. For example:
+ *   - When the domain is enabled with virtual IPI delivery
+ *   - When the domain is enabled with virtual I/O int delivery
+ *     and actually has a physical device assigned .
+ */
+struct hvm_pi_ops {
+    /* We add hooks into ctx_switch_from. */
+    void (*switch_from) (struct vcpu *v);
+
+    /* We add hooks into ctx_switch_to */
+    void (*switch_to) (struct vcpu *v);
+
+    /*
+     * We add a new arch hook, arch_vcpu_block(),
+     * which is called from vcpu_block() and vcpu_do_poll().
+     */
+    void (*vcpu_block) (struct vcpu *);
+
+    /* We add a hook on the vmentry path. */
+    void (*do_resume) (struct vcpu *v);
+};

Thanks,
Suravee
Suthikulpanit, Suravee Jan. 16, 2017, 1:42 p.m. UTC | #4
Kevin,

On 1/16/17 09:13, Tian, Kevin wrote:
>> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
>> index 7b2c50c..0854e17 100644
>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> @@ -103,6 +103,47 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
>>      spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
>>  }
>>
>> +/*
>> + * To handle posted interrupts correctly, we need to set the following
>> + * state:
>> + *
>> + * * The PI notification vector (NV)
>> + * * The PI notification destination processor (NDST)
>> + * * The PI "suppress notification" bit (SN)
>> + * * The vcpu pi "blocked" list
>> + *
>> + * VMX implements the runstate transitions as the following:
>> + *
>> + * A: runnable -> running
>> + *  - SN = 0
>> + *  - NDST = v->processor
>> + *  If a VM is currently running, we want the PI delivered to the guest vcpu
>> + *  on the proper pcpu.
>> + *
>> + * B: running -> runnable
>> + *  - SN = 1
>> + *
>> + * C: running -> blocked
>> + *  - SN = 0
>> + *  - NV = pi_wakeup_vector
>> + *  - Add vcpu to blocked list
>> + *  If the vm is blocked, we want the PI delivered to Xen so that it can
>> + *  wake it up.
>> + *
>> + * D: blocked -> runnable
>> + *  - SN = 0
>> + *  - NV = posted_intr_vector
>> + *  - Take vcpu off blocked list
>> + *  If the VM is currently either preempted or offline (i.e., not running
>> + *  because of some reason other than blocking waiting for an interrupt),
>> + *  there's nothing Xen can do -- we want the interrupt pending bit set in
>> + *  the guest, but we don't want to bother Xen with an interrupt (SN clear).
> I don't know how much is common above between VMX and SVM. Can you
> post the SVM counterpart here so we can see possibility of making it a
> HVM general description?

I'm planning to put the SVM counterpart in the SVM AVIC patch series. For the
change here, I'm trying to keep the description as accurate as possible for VMX.
Please let me know if you would like any of the wording here changed.

>
> If we have to keep above VMX specific description, I'd prefer to putting
> it along with vmx_pi_hooks_assign.

Okay, I'll put it there.

Thanks,
Suravee
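
[Editorial note: a minimal sketch of the placement agreed on above, i.e. the
VMX-specific comment sitting directly above vmx_pi_hooks_assign. The function
body is taken from the patch below; the comment text is abbreviated purely for
illustration, not a proposed final wording.]

    /*
     * To handle posted interrupts correctly, VMX needs to manage the PI
     * notification vector (NV), the notification destination (NDST), the
     * "suppress notification" bit (SN) and the vcpu PI "blocked" list; see
     * the state-transition description earlier in this thread.
     */
    void vmx_pi_hooks_assign(struct domain *d)
    {
        if ( !iommu_intpost || !has_hvm_container_domain(d) )
            return;

        ASSERT(!d->arch.hvm_domain.pi_ops.vcpu_block);

        d->arch.hvm_domain.pi_ops.vcpu_block  = vmx_vcpu_block;
        d->arch.hvm_domain.pi_ops.switch_from = vmx_pi_switch_from;
        d->arch.hvm_domain.pi_ops.switch_to   = vmx_pi_switch_to;
        d->arch.hvm_domain.pi_ops.do_resume   = vmx_pi_do_resume;
    }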
Jan Beulich Jan. 16, 2017, 1:49 p.m. UTC | #5
>>> On 16.01.17 at 14:38, <Suravee.Suthikulpanit@amd.com> wrote:
> So, what if I generalize and simplify as following:
> 
> +/*
> + * This structure defines function hooks to support hardware-assisted
> + * virtual interrupt delivery to guest. (e.g. VMX PI and SVM AVIC).
> + *
> + * These hooks are defined by the underlying arch-specific code
> + * as needed. For example:
> + *   - When the domain is enabled with virtual IPI delivery
> + *   - When the domain is enabled with virtual I/O int delivery
> + *     and actually has a physical device assigned .
> + */
> +struct hvm_pi_ops {
> +    /* We add hooks into ctx_switch_from. */
> +    void (*switch_from) (struct vcpu *v);
> +
> +    /* We add hooks into ctx_switch_to */
> +    void (*switch_to) (struct vcpu *v);
> +
> +    /*
> +     * We add a new arch hook, arch_vcpu_block(),
> +     * which is called from vcpu_block() and vcpu_do_poll().
> +     */
> +    void (*vcpu_block) (struct vcpu *);
> +
> +    /* We add a hook on the vmentry path. */
> +    void (*do_resume) (struct vcpu *v);
> +};

Mostly fine, but please without the "We add" beginnings. Just
"Hook in ..." or some such. And please omit the stray blanks
between closing and opening parentheses.

Jan
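
[Editorial note: for reference, a sketch of what the struct might look like with
Jan's wording suggestions applied ("Hook into ..." phrasing, no stray blank
before the parameter list); the final V4 wording is of course up to the author.]

    /*
     * This structure defines function hooks to support hardware-assisted
     * virtual interrupt delivery to the guest (e.g. VMX PI and SVM AVIC).
     *
     * These hooks are installed by the underlying arch-specific code as
     * needed, e.g. when the domain has a physical device assigned to it.
     */
    struct hvm_pi_ops {
        /* Hook into ctx_switch_from. */
        void (*switch_from)(struct vcpu *v);

        /* Hook into ctx_switch_to. */
        void (*switch_to)(struct vcpu *v);

        /* Hook into arch_vcpu_block(), called from vcpu_block() and vcpu_do_poll(). */
        void (*vcpu_block)(struct vcpu *);

        /* Hook into the vmentry path. */
        void (*do_resume)(struct vcpu *v);
    };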

Patch

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 7b2c50c..0854e17 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -103,6 +103,47 @@  void vmx_pi_per_cpu_init(unsigned int cpu)
     spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
 }
 
+/*
+ * To handle posted interrupts correctly, we need to set the following
+ * state:
+ *
+ * * The PI notification vector (NV)
+ * * The PI notification destination processor (NDST)
+ * * The PI "suppress notification" bit (SN)
+ * * The vcpu pi "blocked" list
+ *
+ * VMX implements the runstate transitions as the following:
+ *
+ * A: runnable -> running
+ *  - SN = 0
+ *  - NDST = v->processor
+ *  If a VM is currently running, we want the PI delivered to the guest vcpu
+ *  on the proper pcpu.
+ *
+ * B: running -> runnable
+ *  - SN = 1
+ *
+ * C: running -> blocked
+ *  - SN = 0
+ *  - NV = pi_wakeup_vector
+ *  - Add vcpu to blocked list
+ *  If the vm is blocked, we want the PI delivered to Xen so that it can
+ *  wake it up.
+ *
+ * D: blocked -> runnable
+ *  - SN = 0
+ *  - NV = posted_intr_vector
+ *  - Take vcpu off blocked list
+ *  If the VM is currently either preempted or offline (i.e., not running
+ *  because of some reason other than blocking waiting for an interrupt),
+ *  there's nothing Xen can do -- we want the interrupt pending bit set in
+ *  the guest, but we don't want to bother Xen with an interrupt (SN clear).
+ *
+ * There's a brief window of time between vmx_intr_assist() and checking
+ * softirqs where if an interrupt comes in it may be lost; so we need Xen
+ * to get an interrupt and raise a softirq so that it will go through the
+ * vmx_intr_assist() path again (SN clear, NV = posted_interrupt).
+ */
 static void vmx_vcpu_block(struct vcpu *v)
 {
     unsigned long flags;
@@ -204,12 +245,12 @@  void vmx_pi_hooks_assign(struct domain *d)
     if ( !iommu_intpost || !has_hvm_container_domain(d) )
         return;
 
-    ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
+    ASSERT(!d->arch.hvm_domain.pi_ops.vcpu_block);
 
-    d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
-    d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
-    d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
-    d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
+    d->arch.hvm_domain.pi_ops.vcpu_block = vmx_vcpu_block;
+    d->arch.hvm_domain.pi_ops.switch_from = vmx_pi_switch_from;
+    d->arch.hvm_domain.pi_ops.switch_to = vmx_pi_switch_to;
+    d->arch.hvm_domain.pi_ops.do_resume = vmx_pi_do_resume;
 }
 
 /* This function is called when pcidevs_lock is held */
@@ -218,12 +259,12 @@  void vmx_pi_hooks_deassign(struct domain *d)
     if ( !iommu_intpost || !has_hvm_container_domain(d) )
         return;
 
-    ASSERT(d->arch.hvm_domain.vmx.vcpu_block);
+    ASSERT(d->arch.hvm_domain.pi_ops.vcpu_block);
 
-    d->arch.hvm_domain.vmx.vcpu_block = NULL;
-    d->arch.hvm_domain.vmx.pi_switch_from = NULL;
-    d->arch.hvm_domain.vmx.pi_switch_to = NULL;
-    d->arch.hvm_domain.vmx.pi_do_resume = NULL;
+    d->arch.hvm_domain.pi_ops.vcpu_block = NULL;
+    d->arch.hvm_domain.pi_ops.switch_from = NULL;
+    d->arch.hvm_domain.pi_ops.switch_to = NULL;
+    d->arch.hvm_domain.pi_ops.do_resume = NULL;
 }
 
 static int vmx_domain_initialise(struct domain *d)
@@ -901,8 +942,8 @@  static void vmx_ctxt_switch_from(struct vcpu *v)
     vmx_restore_host_msrs();
     vmx_save_dr(v);
 
-    if ( v->domain->arch.hvm_domain.vmx.pi_switch_from )
-        v->domain->arch.hvm_domain.vmx.pi_switch_from(v);
+    if ( v->domain->arch.hvm_domain.pi_ops.switch_from )
+        v->domain->arch.hvm_domain.pi_ops.switch_from(v);
 }
 
 static void vmx_ctxt_switch_to(struct vcpu *v)
@@ -916,8 +957,8 @@  static void vmx_ctxt_switch_to(struct vcpu *v)
     vmx_restore_guest_msrs(v);
     vmx_restore_dr(v);
 
-    if ( v->domain->arch.hvm_domain.vmx.pi_switch_to )
-        v->domain->arch.hvm_domain.vmx.pi_switch_to(v);
+    if ( v->domain->arch.hvm_domain.pi_ops.switch_to )
+        v->domain->arch.hvm_domain.pi_ops.switch_to(v);
 }
 
 
@@ -3963,8 +4004,8 @@  void vmx_vmenter_helper(const struct cpu_user_regs *regs)
     struct hvm_vcpu_asid *p_asid;
     bool_t need_flush;
 
-    if ( curr->domain->arch.hvm_domain.vmx.pi_do_resume )
-        curr->domain->arch.hvm_domain.vmx.pi_do_resume(curr);
+    if ( curr->domain->arch.hvm_domain.pi_ops.do_resume )
+        curr->domain->arch.hvm_domain.pi_ops.do_resume(curr);
 
     if ( !cpu_has_vmx_vpid )
         goto out;
diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
index f34d784..f7674cd 100644
--- a/xen/include/asm-x86/hvm/domain.h
+++ b/xen/include/asm-x86/hvm/domain.h
@@ -72,6 +72,38 @@  struct hvm_ioreq_server {
     bool_t                 bufioreq_atomic;
 };
 
+/*
+ * This structure defines function hooks to support hardware-assisted
+ * virtual interrupt delivery to guest. (e.g. VMX PI and SVM AVIC).
+ *
+ * The way we implement this now is by looking at what needs to happen on
+ * the following runstate transitions:
+ *
+ * A: runnable -> running  : switch_to
+ * B: running  -> runnable : switch_from
+ * C: running  -> blocked  : vcpu_block
+ * D: blocked  -> runnable : do_resume
+ *
+ * For transitions A and B, we add hooks into ctx_switch_{to,from} paths.
+ *
+ * For transition C, we add a new arch hook, arch_vcpu_block(), which is
+ * called from vcpu_block() and vcpu_do_poll().
+ *
+ * For transition D, rather than add an extra arch hook on vcpu_wake, we
+ * add a hook on the vmentry path which checks to see if either of the two
+ * actions need to be taken.
+ *
+ * These hooks only need to be called when the domain in question actually
+ * has a physical device assigned to it, so we set and clear the callbacks
+ * as appropriate when device assignment changes.
+ */
+struct hvm_pi_ops {
+    void (*vcpu_block) (struct vcpu *);
+    void (*switch_from) (struct vcpu *v);
+    void (*switch_to) (struct vcpu *v);
+    void (*do_resume) (struct vcpu *v);
+};
+
 struct hvm_domain {
     /* Guest page range used for non-default ioreq servers */
     struct {
@@ -148,6 +180,8 @@  struct hvm_domain {
         struct list_head list;
     } write_map;
 
+    struct hvm_pi_ops pi_ops;
+
     union {
         struct vmx_domain vmx;
         struct svm_domain svm;
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 7e7462e..b1e4c75 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -638,8 +638,8 @@  unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t restore);
     struct vcpu *v_ = (v);                                      \
     struct domain *d_ = v_->domain;                             \
     if ( has_hvm_container_domain(d_) &&                        \
-         (cpu_has_vmx && d_->arch.hvm_domain.vmx.vcpu_block) )  \
-        d_->arch.hvm_domain.vmx.vcpu_block(v_);                 \
+         (d_->arch.hvm_domain.pi_ops.vcpu_block) )          \
+        d_->arch.hvm_domain.pi_ops.vcpu_block(v_);          \
 })
 
 #endif /* __ASM_X86_HVM_HVM_H__ */
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 997f4f5..4ec8b08 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -77,65 +77,6 @@  struct vmx_domain {
     unsigned long apic_access_mfn;
     /* VMX_DOMAIN_* */
     unsigned int status;
-
-    /*
-     * To handle posted interrupts correctly, we need to set the following
-     * state:
-     *
-     * * The PI notification vector (NV)
-     * * The PI notification destination processor (NDST)
-     * * The PI "suppress notification" bit (SN)
-     * * The vcpu pi "blocked" list
-     *
-     * If a VM is currently running, we want the PI delivered to the guest vcpu
-     * on the proper pcpu (NDST = v->processor, SN clear).
-     *
-     * If the vm is blocked, we want the PI delivered to Xen so that it can
-     * wake it up  (SN clear, NV = pi_wakeup_vector, vcpu on block list).
-     *
-     * If the VM is currently either preempted or offline (i.e., not running
-     * because of some reason other than blocking waiting for an interrupt),
-     * there's nothing Xen can do -- we want the interrupt pending bit set in
-     * the guest, but we don't want to bother Xen with an interrupt (SN clear).
-     *
-     * There's a brief window of time between vmx_intr_assist() and checking
-     * softirqs where if an interrupt comes in it may be lost; so we need Xen
-     * to get an interrupt and raise a softirq so that it will go through the
-     * vmx_intr_assist() path again (SN clear, NV = posted_interrupt).
-     *
-     * The way we implement this now is by looking at what needs to happen on
-     * the following runstate transitions:
-     *
-     * A: runnable -> running
-     *  - SN = 0
-     *  - NDST = v->processor
-     * B: running -> runnable
-     *  - SN = 1
-     * C: running -> blocked
-     *  - NV = pi_wakeup_vector
-     *  - Add vcpu to blocked list
-     * D: blocked -> runnable
-     *  - NV = posted_intr_vector
-     *  - Take vcpu off blocked list
-     *
-     * For transitions A and B, we add hooks into vmx_ctxt_switch_{from,to}
-     * paths.
-     *
-     * For transition C, we add a new arch hook, arch_vcpu_block(), which is
-     * called from vcpu_block() and vcpu_do_poll().
-     *
-     * For transition D, rather than add an extra arch hook on vcpu_wake, we
-     * add a hook on the vmentry path which checks to see if either of the two
-     * actions need to be taken.
-     *
-     * These hooks only need to be called when the domain in question actually
-     * has a physical device assigned to it, so we set and clear the callbacks
-     * as appropriate when device assignment changes.
-     */
-    void (*vcpu_block) (struct vcpu *);
-    void (*pi_switch_from) (struct vcpu *v);
-    void (*pi_switch_to) (struct vcpu *v);
-    void (*pi_do_resume) (struct vcpu *v);
 };
 
 struct pi_desc {