Message ID | 1458058456-7345-3-git-send-email-paul.durrant@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Mar 15, 2016 at 04:14:16PM +0000, Paul Durrant wrote: > This patch adds code to enable the APIC assist enlightenment which, > under certain conditions, means that the guest can avoid an EOI of > the local APIC and thereby avoid a VMEXIT. I noticed you deleted the comment having the spec. Would it be better if: 1) It was in the git commit description. 2) As part of the docs directory? Oh and the link is broken. An the link it points to is broken too. So no spec for this, eh? > > Use of the enlightenment by the hypervisor is under control of the > toolstack. > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com> > Cc: Ian Jackson <ian.jackson@eu.citrix.com> > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > Cc: Wei Liu <wei.liu2@citrix.com> > Cc: Keir Fraser <keir@xen.org> > Cc: Jan Beulich <jbeulich@suse.com> > Cc: Andrew Cooper <andrew.cooper3@citrix.com> > --- > docs/man/xl.cfg.pod.5 | 8 ++++++ > tools/libxl/libxl_dom.c | 3 ++ > tools/libxl/libxl_types.idl | 1 + > xen/arch/x86/hvm/viridian.c | 59 +++++++++++++++++++++++++++++++------- > xen/arch/x86/hvm/vlapic.c | 58 +++++++++++++++++++++++++++++++++---- > xen/include/asm-x86/hvm/viridian.h | 5 ++++ > xen/include/public/hvm/params.h | 7 ++++- > 7 files changed, 124 insertions(+), 17 deletions(-) > > diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 > index 56b1117..49acda7 100644 > --- a/docs/man/xl.cfg.pod.5 > +++ b/docs/man/xl.cfg.pod.5 > @@ -1484,6 +1484,14 @@ This set incorporates use of hypercalls for remote TLB flushing. > This enlightenment may improve performance of Windows guests running > on hosts with higher levels of (physical) CPU contention. > > +=item B<apic_assist> > + > +This set incorporates use of the APIC assist page to avoid EOI of What set? Option? > +the local APIC. > +This enlightenment may improve performance of Windows guests, may? How does this square with Intel vAPIC? Could you mention that here? Or in the commit description? And can non-Windows guests use it too? Or is this particular for Windows guests? IF so which ones? > +particularly those running PV drivers that make use of per-vcpu > +event channel upcall vectors. What kind of PV drivers? Anybody? > + > =item B<defaults> > > This is a special value that enables the default set of groups, which > diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c > index b825b98..ee75ad1 100644 > --- a/tools/libxl/libxl_dom.c > +++ b/tools/libxl/libxl_dom.c > @@ -253,6 +253,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, uint32_t domid, > if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_HCALL_REMOTE_TLB_FLUSH)) > mask |= HVMPV_hcall_remote_tlb_flush; > > + if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_APIC_ASSIST)) > + mask |= HVMPV_apic_assist; > + > if (mask != 0 && > xc_hvm_param_set(CTX->xch, > domid, > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl > index 632c009..e3be957 100644 > --- a/tools/libxl/libxl_types.idl > +++ b/tools/libxl/libxl_types.idl > @@ -221,6 +221,7 @@ libxl_viridian_enlightenment = Enumeration("viridian_enlightenment", [ > (2, "time_ref_count"), > (3, "reference_tsc"), > (4, "hcall_remote_tlb_flush"), > + (5, "apic_assist"), > ]) > > libxl_hdtype = Enumeration("hdtype", [ > diff --git a/xen/arch/x86/hvm/viridian.c b/xen/arch/x86/hvm/viridian.c > index c779290..1f73691 100644 > --- a/xen/arch/x86/hvm/viridian.c > +++ b/xen/arch/x86/hvm/viridian.c > @@ -221,16 +221,6 @@ static void initialize_apic_assist(struct vcpu *v) > struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC); > void *va; > > - /* > - * We don't yet make use of the APIC assist page but by setting > - * the CPUID3A_MSR_APIC_ACCESS bit in CPUID leaf 40000003 we are duty > - * bound to support the MSR. We therefore do just enough to keep windows > - * happy. > - * > - * See http://msdn.microsoft.com/en-us/library/ff538657%28VS.85%29.aspx for > - * details of how Windows uses the page. > - */ > - > if ( !page ) > return; > > @@ -251,6 +241,55 @@ static void initialize_apic_assist(struct vcpu *v) > > v->arch.hvm_vcpu.viridian.apic_assist.page = page; > v->arch.hvm_vcpu.viridian.apic_assist.va = va; > + v->arch.hvm_vcpu.viridian.apic_assist.vector = -1; > +} > + > +void viridian_start_apic_assist(struct vcpu *v, int vector) > +{ > + void *va = v->arch.hvm_vcpu.viridian.apic_assist.va; > + > + if ( !(viridian_feature_mask(v->domain) & HVMPV_apic_assist) || > + !va ) > + return; > + > + /* > + * If there is already an assist pending then something has gone > + * wrong and the VM will most likely hang so force a crash now > + * to make the problem clear. > + */ > + if (v->arch.hvm_vcpu.viridian.apic_assist.vector >= 0) I think you need spaces sprinkled here. > + domain_crash(v->domain); > + > + v->arch.hvm_vcpu.viridian.apic_assist.vector = vector; > + *(uint32_t *)va |= 1u; Oh my. What does that do? Why not 0xBADF00D Is that prescriped in the spec? That looks quite unhealthy to do. > +} > + > +bool_t viridian_complete_apic_assist(struct vcpu *v, int *vector) > +{ > + void *va = v->arch.hvm_vcpu.viridian.apic_assist.va; > + > + if ( !(viridian_feature_mask(v->domain) & HVMPV_apic_assist) || > + !va ) > + return 0; > + > + if ( *(uint32_t *)va & 1 ) Is that really part of the spec? > + return 0; /* Interrupt not yet processed by the guest */ > + > + *vector = v->arch.hvm_vcpu.viridian.apic_assist.vector; > + v->arch.hvm_vcpu.viridian.apic_assist.vector = -1; Could you explain that bit please? Why the reset of the vector? Maybe it becomes clear later on. /me grumbles about not spec. > + return 1; > +} > + > +void viridian_abort_apic_assist(struct vcpu *v) > +{ > + void *va = v->arch.hvm_vcpu.viridian.apic_assist.va; > + > + if ( !(viridian_feature_mask(v->domain) & HVMPV_apic_assist) || > + !va ) > + return; This is the third time I see that. Perhaps you can make a function out of it. > + > + *(uint32_t *)va &= ~1u; > + v->arch.hvm_vcpu.viridian.apic_assist.vector = -1; > } > > static void teardown_apic_assist(struct vcpu *v) > diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c > index 01a8430..aac4263 100644 > --- a/xen/arch/x86/hvm/vlapic.c > +++ b/xen/arch/x86/hvm/vlapic.c > @@ -38,6 +38,7 @@ > #include <asm/hvm/support.h> > #include <asm/hvm/vmx/vmx.h> > #include <asm/hvm/nestedhvm.h> > +#include <asm/hvm/viridian.h> > #include <public/hvm/ioreq.h> > #include <public/hvm/params.h> > > @@ -95,6 +96,18 @@ static int vlapic_find_highest_vector(const void *bitmap) > return (fls(word[word_offset*4]) - 1) + (word_offset * 32); > } > > +static int vlapic_find_lowest_vector(const void *bitmap) > +{ > + const uint32_t *word = bitmap; > + unsigned int word_offset; > + > + /* Work forwards through the bitmap (first 32-bit word in every four). */ Would it be easier said: Operate on chunks of 4 bytes? > + for ( word_offset = 0; word_offset < NR_VECTORS / 32; word_offset++) > + if ( word[word_offset * 4] ) > + return (ffs(word[word_offset * 4]) - 1) + (word_offset * 32); > + > + return -1; > +} > > /* > * IRR-specific bitmap update & search routines. > @@ -1157,7 +1170,7 @@ int vlapic_virtual_intr_delivery_enabled(void) > int vlapic_has_pending_irq(struct vcpu *v) > { > struct vlapic *vlapic = vcpu_vlapic(v); > - int irr, isr; > + int irr, vector, isr; vector = -1? > > if ( !vlapic_enabled(vlapic) ) > return -1; > @@ -1170,10 +1183,27 @@ int vlapic_has_pending_irq(struct vcpu *v) > !nestedhvm_vcpu_in_guestmode(v) ) > return irr; > > + /* > + * If APIC assist was used then there may have been no EOI so > + * we need to clear the requisite bit from the ISR here, before > + * comparing with the IRR. > + */ > + if ( viridian_complete_apic_assist(v, &vector) && > + vector != -1 ) > + vlapic_clear_vector(vector, &vlapic->regs->data[APIC_ISR]); > + > isr = vlapic_find_highest_isr(vlapic); > isr = (isr != -1) ? isr : 0; > if ( (isr & 0xf0) >= (irr & 0xf0) ) > + { > + /* > + * There's already a higher priority vector pending so > + * we need to abort any previous APIC assist to ensure there > + * is an EOI. > + */ > + viridian_abort_apic_assist(v); > return -1; > + } > > return irr; > } > @@ -1181,13 +1211,29 @@ int vlapic_has_pending_irq(struct vcpu *v) > int vlapic_ack_pending_irq(struct vcpu *v, int vector, bool_t force_ack) > { > struct vlapic *vlapic = vcpu_vlapic(v); > + int isr; > > - if ( force_ack || !vlapic_virtual_intr_delivery_enabled() ) > - { > - vlapic_set_vector(vector, &vlapic->regs->data[APIC_ISR]); > - vlapic_clear_irr(vector, vlapic); > - } > + if ( !force_ack && > + vlapic_virtual_intr_delivery_enabled() ) > + return 1; > + > + if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) ) > + goto done; > + > + isr = vlapic_find_lowest_vector(&vlapic->regs->data[APIC_ISR]); > + if ( isr >= 0 && isr < vector ) > + goto done; > + > + /* > + * This vector is edge triggered and there are no lower priority > + * vectors pending, so we can use APIC assist to avoid exiting > + * for EOI. > + */ > + viridian_start_apic_assist(v, vector); > > +done: > + vlapic_set_vector(vector, &vlapic->regs->data[APIC_ISR]); > + vlapic_clear_irr(vector, vlapic); > return 1; > } > > diff --git a/xen/include/asm-x86/hvm/viridian.h b/xen/include/asm-x86/hvm/viridian.h > index c60c113..658d46a 100644 > --- a/xen/include/asm-x86/hvm/viridian.h > +++ b/xen/include/asm-x86/hvm/viridian.h > @@ -25,6 +25,7 @@ struct viridian_vcpu > union viridian_apic_assist msr; > struct page_info *page; > void *va; > + int vector; > } apic_assist; > cpumask_var_t flush_cpumask; > }; > @@ -125,6 +126,10 @@ void viridian_time_ref_count_thaw(struct domain *d); > int viridian_vcpu_init(struct vcpu *v); > void viridian_vcpu_deinit(struct vcpu *v); > > +void viridian_start_apic_assist(struct vcpu *v, int vector); > +bool_t viridian_complete_apic_assist(struct vcpu *v, int *vector); > +void viridian_abort_apic_assist(struct vcpu *v); > + > #endif /* __ASM_X86_HVM_VIRIDIAN_H__ */ > > /* > diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h > index 73d4718..e69c72c 100644 > --- a/xen/include/public/hvm/params.h > +++ b/xen/include/public/hvm/params.h > @@ -115,12 +115,17 @@ > #define _HVMPV_hcall_remote_tlb_flush 4 > #define HVMPV_hcall_remote_tlb_flush (1 << _HVMPV_hcall_remote_tlb_flush) > > +/* Use APIC assist */ > +#define _HVMPV_apic_assist 5 > +#define HVMPV_apic_assist (1 << _HVMPV_apic_assist) > + > #define HVMPV_feature_mask \ > (HVMPV_base_freq | \ > HVMPV_no_freq | \ > HVMPV_time_ref_count | \ > HVMPV_reference_tsc | \ > - HVMPV_hcall_remote_tlb_flush) > + HVMPV_hcall_remote_tlb_flush | \ > + HVMPV_apic_assist) > > #endif > > -- > 2.1.4 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
> -----Original Message----- > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com] > Sent: 15 March 2016 23:18 > To: Paul Durrant > Cc: xen-devel@lists.xenproject.org; Wei Liu; Stefano Stabellini; Andrew > Cooper; Ian Jackson; Jan Beulich; Keir (Xen.org) > Subject: Re: [Xen-devel] [PATCH 2/2] x86/hvm/viridian: Enable APIC assist > enlightenment > > On Tue, Mar 15, 2016 at 04:14:16PM +0000, Paul Durrant wrote: > > This patch adds code to enable the APIC assist enlightenment which, > > under certain conditions, means that the guest can avoid an EOI of > > the local APIC and thereby avoid a VMEXIT. > > I noticed you deleted the comment having the spec. > That's because Microsoft killed the link and I couldn't find the spec online any more. But, thankfully, I've now found it again although it has actually taken me weeks to do so. > Would it be better if: > > 1) It was in the git commit description. > 2) As part of the docs directory? > > Oh and the link is broken. An the link it points to is > broken too. > > So no spec for this, eh? > Now I've found it again I'll reference it. > > > > > > Use of the enlightenment by the hypervisor is under control of the > > toolstack. > > > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com> > > Cc: Ian Jackson <ian.jackson@eu.citrix.com> > > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > > Cc: Wei Liu <wei.liu2@citrix.com> > > Cc: Keir Fraser <keir@xen.org> > > Cc: Jan Beulich <jbeulich@suse.com> > > Cc: Andrew Cooper <andrew.cooper3@citrix.com> > > --- > > docs/man/xl.cfg.pod.5 | 8 ++++++ > > tools/libxl/libxl_dom.c | 3 ++ > > tools/libxl/libxl_types.idl | 1 + > > xen/arch/x86/hvm/viridian.c | 59 > +++++++++++++++++++++++++++++++------- > > xen/arch/x86/hvm/vlapic.c | 58 > +++++++++++++++++++++++++++++++++---- > > xen/include/asm-x86/hvm/viridian.h | 5 ++++ > > xen/include/public/hvm/params.h | 7 ++++- > > 7 files changed, 124 insertions(+), 17 deletions(-) > > > > diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 > > index 56b1117..49acda7 100644 > > --- a/docs/man/xl.cfg.pod.5 > > +++ b/docs/man/xl.cfg.pod.5 > > @@ -1484,6 +1484,14 @@ This set incorporates use of hypercalls for > remote TLB flushing. > > This enlightenment may improve performance of Windows guests running > > on hosts with higher levels of (physical) CPU contention. > > > > +=item B<apic_assist> > > + > > +This set incorporates use of the APIC assist page to avoid EOI of > > What set? Option? > Set of viridian enlightenments. Read the pre-amble to the section. > > +the local APIC. > > +This enlightenment may improve performance of Windows guests, > > may? How does this square with Intel vAPIC? > Could you mention that here? Or in the commit description? > Yes, I'll mention that it's clearly somewhat pointless if posted interrupts are in operation and that it has no effect in that case. > And can non-Windows guests use it too? If they adhere to the viridian spec., yes. > Or is this particular for Windows guests? IF so which ones? > > > +particularly those running PV drivers that make use of per-vcpu > > +event channel upcall vectors. > > What kind of PV drivers? Anybody? > Anybody's drivers that care to make use of FIFO event channels and that HVM op. > > + > > =item B<defaults> > > > > This is a special value that enables the default set of groups, which > > diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c > > index b825b98..ee75ad1 100644 > > --- a/tools/libxl/libxl_dom.c > > +++ b/tools/libxl/libxl_dom.c > > @@ -253,6 +253,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, > uint32_t domid, > > if (libxl_bitmap_test(&enlightenments, > LIBXL_VIRIDIAN_ENLIGHTENMENT_HCALL_REMOTE_TLB_FLUSH)) > > mask |= HVMPV_hcall_remote_tlb_flush; > > > > + if (libxl_bitmap_test(&enlightenments, > LIBXL_VIRIDIAN_ENLIGHTENMENT_APIC_ASSIST)) > > + mask |= HVMPV_apic_assist; > > + > > if (mask != 0 && > > xc_hvm_param_set(CTX->xch, > > domid, > > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl > > index 632c009..e3be957 100644 > > --- a/tools/libxl/libxl_types.idl > > +++ b/tools/libxl/libxl_types.idl > > @@ -221,6 +221,7 @@ libxl_viridian_enlightenment = > Enumeration("viridian_enlightenment", [ > > (2, "time_ref_count"), > > (3, "reference_tsc"), > > (4, "hcall_remote_tlb_flush"), > > + (5, "apic_assist"), > > ]) > > > > libxl_hdtype = Enumeration("hdtype", [ > > diff --git a/xen/arch/x86/hvm/viridian.c b/xen/arch/x86/hvm/viridian.c > > index c779290..1f73691 100644 > > --- a/xen/arch/x86/hvm/viridian.c > > +++ b/xen/arch/x86/hvm/viridian.c > > @@ -221,16 +221,6 @@ static void initialize_apic_assist(struct vcpu *v) > > struct page_info *page = get_page_from_gfn(d, gmfn, NULL, > P2M_ALLOC); > > void *va; > > > > - /* > > - * We don't yet make use of the APIC assist page but by setting > > - * the CPUID3A_MSR_APIC_ACCESS bit in CPUID leaf 40000003 we are > duty > > - * bound to support the MSR. We therefore do just enough to keep > windows > > - * happy. > > - * > > - * See http://msdn.microsoft.com/en- > us/library/ff538657%28VS.85%29.aspx for > > - * details of how Windows uses the page. > > - */ > > - > > if ( !page ) > > return; > > > > @@ -251,6 +241,55 @@ static void initialize_apic_assist(struct vcpu *v) > > > > v->arch.hvm_vcpu.viridian.apic_assist.page = page; > > v->arch.hvm_vcpu.viridian.apic_assist.va = va; > > + v->arch.hvm_vcpu.viridian.apic_assist.vector = -1; > > +} > > + > > +void viridian_start_apic_assist(struct vcpu *v, int vector) > > +{ > > + void *va = v->arch.hvm_vcpu.viridian.apic_assist.va; > > + > > + if ( !(viridian_feature_mask(v->domain) & HVMPV_apic_assist) || > > + !va ) > > + return; > > + > > + /* > > + * If there is already an assist pending then something has gone > > + * wrong and the VM will most likely hang so force a crash now > > + * to make the problem clear. > > + */ > > + if (v->arch.hvm_vcpu.viridian.apic_assist.vector >= 0) > > I think you need spaces sprinkled here. > I do indeed. > > + domain_crash(v->domain); > > + > > + v->arch.hvm_vcpu.viridian.apic_assist.vector = vector; > > + *(uint32_t *)va |= 1u; > > Oh my. What does that do? Why not 0xBADF00D > Is that prescriped in the spec? The LSB is the only bit that has defined functionality. The next 31 bits are reserved and the rest of the page is undefined. > > That looks quite unhealthy to do. > Reading the spec. again I do see that the reserved bits are defined to be zero, so it looks like I can use = rather than |=. > > +} > > + > > +bool_t viridian_complete_apic_assist(struct vcpu *v, int *vector) > > +{ > > + void *va = v->arch.hvm_vcpu.viridian.apic_assist.va; > > + > > + if ( !(viridian_feature_mask(v->domain) & HVMPV_apic_assist) || > > + !va ) > > + return 0; > > + > > + if ( *(uint32_t *)va & 1 ) > > Is that really part of the spec? > Yes. > > + return 0; /* Interrupt not yet processed by the guest */ > > + > > + *vector = v->arch.hvm_vcpu.viridian.apic_assist.vector; > > + v->arch.hvm_vcpu.viridian.apic_assist.vector = -1; > > Could you explain that bit please? Why the reset of the vector? > Maybe it becomes clear later on. > > /me grumbles about not spec. > > > + return 1; > > +} > > + > > +void viridian_abort_apic_assist(struct vcpu *v) > > +{ > > + void *va = v->arch.hvm_vcpu.viridian.apic_assist.va; > > + > > + if ( !(viridian_feature_mask(v->domain) & HVMPV_apic_assist) || > > + !va ) > > + return; > > This is the third time I see that. Perhaps you can make a function > out of it. > Yes, or maybe a macro. > > + > > + *(uint32_t *)va &= ~1u; > > + v->arch.hvm_vcpu.viridian.apic_assist.vector = -1; > > } > > > > static void teardown_apic_assist(struct vcpu *v) > > diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c > > index 01a8430..aac4263 100644 > > --- a/xen/arch/x86/hvm/vlapic.c > > +++ b/xen/arch/x86/hvm/vlapic.c > > @@ -38,6 +38,7 @@ > > #include <asm/hvm/support.h> > > #include <asm/hvm/vmx/vmx.h> > > #include <asm/hvm/nestedhvm.h> > > +#include <asm/hvm/viridian.h> > > #include <public/hvm/ioreq.h> > > #include <public/hvm/params.h> > > > > @@ -95,6 +96,18 @@ static int vlapic_find_highest_vector(const void > *bitmap) > > return (fls(word[word_offset*4]) - 1) + (word_offset * 32); > > } > > > > +static int vlapic_find_lowest_vector(const void *bitmap) > > +{ > > + const uint32_t *word = bitmap; > > + unsigned int word_offset; > > + > > + /* Work forwards through the bitmap (first 32-bit word in every four). > */ > > Would it be easier said: > > Operate on chunks of 4 bytes? > I was aping the comment in the 'find highest' function. I'll keep it as it is for consistency. > > + for ( word_offset = 0; word_offset < NR_VECTORS / 32; word_offset++) > > + if ( word[word_offset * 4] ) > > + return (ffs(word[word_offset * 4]) - 1) + (word_offset * 32); > > + > > + return -1; > > +} > > > > /* > > * IRR-specific bitmap update & search routines. > > @@ -1157,7 +1170,7 @@ int vlapic_virtual_intr_delivery_enabled(void) > > int vlapic_has_pending_irq(struct vcpu *v) > > { > > struct vlapic *vlapic = vcpu_vlapic(v); > > - int irr, isr; > > + int irr, vector, isr; > > vector = -1? > No need. vector is not checked unless viridian_complete_apic_assist() returns 1, in which case it is guaranteed to be set. Paul > > > > if ( !vlapic_enabled(vlapic) ) > > return -1; > > @@ -1170,10 +1183,27 @@ int vlapic_has_pending_irq(struct vcpu *v) > > !nestedhvm_vcpu_in_guestmode(v) ) > > return irr; > > > > + /* > > + * If APIC assist was used then there may have been no EOI so > > + * we need to clear the requisite bit from the ISR here, before > > + * comparing with the IRR. > > + */ > > + if ( viridian_complete_apic_assist(v, &vector) && > > + vector != -1 ) > > + vlapic_clear_vector(vector, &vlapic->regs->data[APIC_ISR]); > > + > > isr = vlapic_find_highest_isr(vlapic); > > isr = (isr != -1) ? isr : 0; > > if ( (isr & 0xf0) >= (irr & 0xf0) ) > > + { > > + /* > > + * There's already a higher priority vector pending so > > + * we need to abort any previous APIC assist to ensure there > > + * is an EOI. > > + */ > > + viridian_abort_apic_assist(v); > > return -1; > > + } > > > > return irr; > > } > > @@ -1181,13 +1211,29 @@ int vlapic_has_pending_irq(struct vcpu *v) > > int vlapic_ack_pending_irq(struct vcpu *v, int vector, bool_t force_ack) > > { > > struct vlapic *vlapic = vcpu_vlapic(v); > > + int isr; > > > > - if ( force_ack || !vlapic_virtual_intr_delivery_enabled() ) > > - { > > - vlapic_set_vector(vector, &vlapic->regs->data[APIC_ISR]); > > - vlapic_clear_irr(vector, vlapic); > > - } > > + if ( !force_ack && > > + vlapic_virtual_intr_delivery_enabled() ) > > + return 1; > > + > > + if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) ) > > + goto done; > > + > > + isr = vlapic_find_lowest_vector(&vlapic->regs->data[APIC_ISR]); > > + if ( isr >= 0 && isr < vector ) > > + goto done; > > + > > + /* > > + * This vector is edge triggered and there are no lower priority > > + * vectors pending, so we can use APIC assist to avoid exiting > > + * for EOI. > > + */ > > + viridian_start_apic_assist(v, vector); > > > > +done: > > + vlapic_set_vector(vector, &vlapic->regs->data[APIC_ISR]); > > + vlapic_clear_irr(vector, vlapic); > > return 1; > > } > > > > diff --git a/xen/include/asm-x86/hvm/viridian.h b/xen/include/asm- > x86/hvm/viridian.h > > index c60c113..658d46a 100644 > > --- a/xen/include/asm-x86/hvm/viridian.h > > +++ b/xen/include/asm-x86/hvm/viridian.h > > @@ -25,6 +25,7 @@ struct viridian_vcpu > > union viridian_apic_assist msr; > > struct page_info *page; > > void *va; > > + int vector; > > } apic_assist; > > cpumask_var_t flush_cpumask; > > }; > > @@ -125,6 +126,10 @@ void viridian_time_ref_count_thaw(struct domain > *d); > > int viridian_vcpu_init(struct vcpu *v); > > void viridian_vcpu_deinit(struct vcpu *v); > > > > +void viridian_start_apic_assist(struct vcpu *v, int vector); > > +bool_t viridian_complete_apic_assist(struct vcpu *v, int *vector); > > +void viridian_abort_apic_assist(struct vcpu *v); > > + > > #endif /* __ASM_X86_HVM_VIRIDIAN_H__ */ > > > > /* > > diff --git a/xen/include/public/hvm/params.h > b/xen/include/public/hvm/params.h > > index 73d4718..e69c72c 100644 > > --- a/xen/include/public/hvm/params.h > > +++ b/xen/include/public/hvm/params.h > > @@ -115,12 +115,17 @@ > > #define _HVMPV_hcall_remote_tlb_flush 4 > > #define HVMPV_hcall_remote_tlb_flush (1 << > _HVMPV_hcall_remote_tlb_flush) > > > > +/* Use APIC assist */ > > +#define _HVMPV_apic_assist 5 > > +#define HVMPV_apic_assist (1 << _HVMPV_apic_assist) > > + > > #define HVMPV_feature_mask \ > > (HVMPV_base_freq | \ > > HVMPV_no_freq | \ > > HVMPV_time_ref_count | \ > > HVMPV_reference_tsc | \ > > - HVMPV_hcall_remote_tlb_flush) > > + HVMPV_hcall_remote_tlb_flush | \ > > + HVMPV_apic_assist) > > > > #endif > > > > -- > > 2.1.4 > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xen.org > > http://lists.xen.org/xen-devel
>>> On 16.03.16 at 16:02, <Paul.Durrant@citrix.com> wrote: >> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com] >> Sent: 15 March 2016 23:18 >> On Tue, Mar 15, 2016 at 04:14:16PM +0000, Paul Durrant wrote: >> > + domain_crash(v->domain); >> > + >> > + v->arch.hvm_vcpu.viridian.apic_assist.vector = vector; >> > + *(uint32_t *)va |= 1u; >> >> Oh my. What does that do? Why not 0xBADF00D >> Is that prescriped in the spec? > > The LSB is the only bit that has defined functionality. The next 31 bits are > reserved and the rest of the page is undefined. > >> >> That looks quite unhealthy to do. >> > > Reading the spec. again I do see that the reserved bits are defined to be > zero, so it looks like I can use = rather than |=. Wouldn't that put at risk forward compatibility? Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: 16 March 2016 16:28 > To: Paul Durrant > Cc: Andrew Cooper; Ian Jackson; Stefano Stabellini; Wei Liu; xen- > devel@lists.xenproject.org; Konrad Rzeszutek Wilk; Keir (Xen.org) > Subject: RE: [Xen-devel] [PATCH 2/2] x86/hvm/viridian: Enable APIC assist > enlightenment > > >>> On 16.03.16 at 16:02, <Paul.Durrant@citrix.com> wrote: > >> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com] > >> Sent: 15 March 2016 23:18 > >> On Tue, Mar 15, 2016 at 04:14:16PM +0000, Paul Durrant wrote: > >> > + domain_crash(v->domain); > >> > + > >> > + v->arch.hvm_vcpu.viridian.apic_assist.vector = vector; > >> > + *(uint32_t *)va |= 1u; > >> > >> Oh my. What does that do? Why not 0xBADF00D > >> Is that prescriped in the spec? > > > > The LSB is the only bit that has defined functionality. The next 31 bits are > > reserved and the rest of the page is undefined. > > > >> > >> That looks quite unhealthy to do. > >> > > > > Reading the spec. again I do see that the reserved bits are defined to be > > zero, so it looks like I can use = rather than |=. > > Wouldn't that put at risk forward compatibility? > The spec just says "Reserved to Zero" so it's hard to know what the right thing to do is until further bits are defined. Paul > Jan
>>> On 16.03.16 at 18:23, <Paul.Durrant@citrix.com> wrote: >> From: Jan Beulich [mailto:JBeulich@suse.com] >> Sent: 16 March 2016 16:28 >> >>> On 16.03.16 at 16:02, <Paul.Durrant@citrix.com> wrote: >> >> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com] >> >> Sent: 15 March 2016 23:18 >> >> On Tue, Mar 15, 2016 at 04:14:16PM +0000, Paul Durrant wrote: >> >> > + domain_crash(v->domain); >> >> > + >> >> > + v->arch.hvm_vcpu.viridian.apic_assist.vector = vector; >> >> > + *(uint32_t *)va |= 1u; >> >> >> >> Oh my. What does that do? Why not 0xBADF00D >> >> Is that prescriped in the spec? >> > >> > The LSB is the only bit that has defined functionality. The next 31 bits are >> > reserved and the rest of the page is undefined. >> > >> >> >> >> That looks quite unhealthy to do. >> >> >> > >> > Reading the spec. again I do see that the reserved bits are defined to be >> > zero, so it looks like I can use = rather than |=. >> >> Wouldn't that put at risk forward compatibility? > > The spec just says "Reserved to Zero" so it's hard to know what the right > thing to do is until further bits are defined. The same as with any other reserved bits, I would say: Leave them alone. Jan
diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 index 56b1117..49acda7 100644 --- a/docs/man/xl.cfg.pod.5 +++ b/docs/man/xl.cfg.pod.5 @@ -1484,6 +1484,14 @@ This set incorporates use of hypercalls for remote TLB flushing. This enlightenment may improve performance of Windows guests running on hosts with higher levels of (physical) CPU contention. +=item B<apic_assist> + +This set incorporates use of the APIC assist page to avoid EOI of +the local APIC. +This enlightenment may improve performance of Windows guests, +particularly those running PV drivers that make use of per-vcpu +event channel upcall vectors. + =item B<defaults> This is a special value that enables the default set of groups, which diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index b825b98..ee75ad1 100644 --- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -253,6 +253,9 @@ static int hvm_set_viridian_features(libxl__gc *gc, uint32_t domid, if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_HCALL_REMOTE_TLB_FLUSH)) mask |= HVMPV_hcall_remote_tlb_flush; + if (libxl_bitmap_test(&enlightenments, LIBXL_VIRIDIAN_ENLIGHTENMENT_APIC_ASSIST)) + mask |= HVMPV_apic_assist; + if (mask != 0 && xc_hvm_param_set(CTX->xch, domid, diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index 632c009..e3be957 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -221,6 +221,7 @@ libxl_viridian_enlightenment = Enumeration("viridian_enlightenment", [ (2, "time_ref_count"), (3, "reference_tsc"), (4, "hcall_remote_tlb_flush"), + (5, "apic_assist"), ]) libxl_hdtype = Enumeration("hdtype", [ diff --git a/xen/arch/x86/hvm/viridian.c b/xen/arch/x86/hvm/viridian.c index c779290..1f73691 100644 --- a/xen/arch/x86/hvm/viridian.c +++ b/xen/arch/x86/hvm/viridian.c @@ -221,16 +221,6 @@ static void initialize_apic_assist(struct vcpu *v) struct page_info *page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC); void *va; - /* - * We don't yet make use of the APIC assist page but by setting - * the CPUID3A_MSR_APIC_ACCESS bit in CPUID leaf 40000003 we are duty - * bound to support the MSR. We therefore do just enough to keep windows - * happy. - * - * See http://msdn.microsoft.com/en-us/library/ff538657%28VS.85%29.aspx for - * details of how Windows uses the page. - */ - if ( !page ) return; @@ -251,6 +241,55 @@ static void initialize_apic_assist(struct vcpu *v) v->arch.hvm_vcpu.viridian.apic_assist.page = page; v->arch.hvm_vcpu.viridian.apic_assist.va = va; + v->arch.hvm_vcpu.viridian.apic_assist.vector = -1; +} + +void viridian_start_apic_assist(struct vcpu *v, int vector) +{ + void *va = v->arch.hvm_vcpu.viridian.apic_assist.va; + + if ( !(viridian_feature_mask(v->domain) & HVMPV_apic_assist) || + !va ) + return; + + /* + * If there is already an assist pending then something has gone + * wrong and the VM will most likely hang so force a crash now + * to make the problem clear. + */ + if (v->arch.hvm_vcpu.viridian.apic_assist.vector >= 0) + domain_crash(v->domain); + + v->arch.hvm_vcpu.viridian.apic_assist.vector = vector; + *(uint32_t *)va |= 1u; +} + +bool_t viridian_complete_apic_assist(struct vcpu *v, int *vector) +{ + void *va = v->arch.hvm_vcpu.viridian.apic_assist.va; + + if ( !(viridian_feature_mask(v->domain) & HVMPV_apic_assist) || + !va ) + return 0; + + if ( *(uint32_t *)va & 1 ) + return 0; /* Interrupt not yet processed by the guest */ + + *vector = v->arch.hvm_vcpu.viridian.apic_assist.vector; + v->arch.hvm_vcpu.viridian.apic_assist.vector = -1; + return 1; +} + +void viridian_abort_apic_assist(struct vcpu *v) +{ + void *va = v->arch.hvm_vcpu.viridian.apic_assist.va; + + if ( !(viridian_feature_mask(v->domain) & HVMPV_apic_assist) || + !va ) + return; + + *(uint32_t *)va &= ~1u; + v->arch.hvm_vcpu.viridian.apic_assist.vector = -1; } static void teardown_apic_assist(struct vcpu *v) diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c index 01a8430..aac4263 100644 --- a/xen/arch/x86/hvm/vlapic.c +++ b/xen/arch/x86/hvm/vlapic.c @@ -38,6 +38,7 @@ #include <asm/hvm/support.h> #include <asm/hvm/vmx/vmx.h> #include <asm/hvm/nestedhvm.h> +#include <asm/hvm/viridian.h> #include <public/hvm/ioreq.h> #include <public/hvm/params.h> @@ -95,6 +96,18 @@ static int vlapic_find_highest_vector(const void *bitmap) return (fls(word[word_offset*4]) - 1) + (word_offset * 32); } +static int vlapic_find_lowest_vector(const void *bitmap) +{ + const uint32_t *word = bitmap; + unsigned int word_offset; + + /* Work forwards through the bitmap (first 32-bit word in every four). */ + for ( word_offset = 0; word_offset < NR_VECTORS / 32; word_offset++) + if ( word[word_offset * 4] ) + return (ffs(word[word_offset * 4]) - 1) + (word_offset * 32); + + return -1; +} /* * IRR-specific bitmap update & search routines. @@ -1157,7 +1170,7 @@ int vlapic_virtual_intr_delivery_enabled(void) int vlapic_has_pending_irq(struct vcpu *v) { struct vlapic *vlapic = vcpu_vlapic(v); - int irr, isr; + int irr, vector, isr; if ( !vlapic_enabled(vlapic) ) return -1; @@ -1170,10 +1183,27 @@ int vlapic_has_pending_irq(struct vcpu *v) !nestedhvm_vcpu_in_guestmode(v) ) return irr; + /* + * If APIC assist was used then there may have been no EOI so + * we need to clear the requisite bit from the ISR here, before + * comparing with the IRR. + */ + if ( viridian_complete_apic_assist(v, &vector) && + vector != -1 ) + vlapic_clear_vector(vector, &vlapic->regs->data[APIC_ISR]); + isr = vlapic_find_highest_isr(vlapic); isr = (isr != -1) ? isr : 0; if ( (isr & 0xf0) >= (irr & 0xf0) ) + { + /* + * There's already a higher priority vector pending so + * we need to abort any previous APIC assist to ensure there + * is an EOI. + */ + viridian_abort_apic_assist(v); return -1; + } return irr; } @@ -1181,13 +1211,29 @@ int vlapic_has_pending_irq(struct vcpu *v) int vlapic_ack_pending_irq(struct vcpu *v, int vector, bool_t force_ack) { struct vlapic *vlapic = vcpu_vlapic(v); + int isr; - if ( force_ack || !vlapic_virtual_intr_delivery_enabled() ) - { - vlapic_set_vector(vector, &vlapic->regs->data[APIC_ISR]); - vlapic_clear_irr(vector, vlapic); - } + if ( !force_ack && + vlapic_virtual_intr_delivery_enabled() ) + return 1; + + if ( vlapic_test_vector(vector, &vlapic->regs->data[APIC_TMR]) ) + goto done; + + isr = vlapic_find_lowest_vector(&vlapic->regs->data[APIC_ISR]); + if ( isr >= 0 && isr < vector ) + goto done; + + /* + * This vector is edge triggered and there are no lower priority + * vectors pending, so we can use APIC assist to avoid exiting + * for EOI. + */ + viridian_start_apic_assist(v, vector); +done: + vlapic_set_vector(vector, &vlapic->regs->data[APIC_ISR]); + vlapic_clear_irr(vector, vlapic); return 1; } diff --git a/xen/include/asm-x86/hvm/viridian.h b/xen/include/asm-x86/hvm/viridian.h index c60c113..658d46a 100644 --- a/xen/include/asm-x86/hvm/viridian.h +++ b/xen/include/asm-x86/hvm/viridian.h @@ -25,6 +25,7 @@ struct viridian_vcpu union viridian_apic_assist msr; struct page_info *page; void *va; + int vector; } apic_assist; cpumask_var_t flush_cpumask; }; @@ -125,6 +126,10 @@ void viridian_time_ref_count_thaw(struct domain *d); int viridian_vcpu_init(struct vcpu *v); void viridian_vcpu_deinit(struct vcpu *v); +void viridian_start_apic_assist(struct vcpu *v, int vector); +bool_t viridian_complete_apic_assist(struct vcpu *v, int *vector); +void viridian_abort_apic_assist(struct vcpu *v); + #endif /* __ASM_X86_HVM_VIRIDIAN_H__ */ /* diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h index 73d4718..e69c72c 100644 --- a/xen/include/public/hvm/params.h +++ b/xen/include/public/hvm/params.h @@ -115,12 +115,17 @@ #define _HVMPV_hcall_remote_tlb_flush 4 #define HVMPV_hcall_remote_tlb_flush (1 << _HVMPV_hcall_remote_tlb_flush) +/* Use APIC assist */ +#define _HVMPV_apic_assist 5 +#define HVMPV_apic_assist (1 << _HVMPV_apic_assist) + #define HVMPV_feature_mask \ (HVMPV_base_freq | \ HVMPV_no_freq | \ HVMPV_time_ref_count | \ HVMPV_reference_tsc | \ - HVMPV_hcall_remote_tlb_flush) + HVMPV_hcall_remote_tlb_flush | \ + HVMPV_apic_assist) #endif
This patch adds code to enable the APIC assist enlightenment which, under certain conditions, means that the guest can avoid an EOI of the local APIC and thereby avoid a VMEXIT. Use of the enlightenment by the hypervisor is under control of the toolstack. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Keir Fraser <keir@xen.org> Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> --- docs/man/xl.cfg.pod.5 | 8 ++++++ tools/libxl/libxl_dom.c | 3 ++ tools/libxl/libxl_types.idl | 1 + xen/arch/x86/hvm/viridian.c | 59 +++++++++++++++++++++++++++++++------- xen/arch/x86/hvm/vlapic.c | 58 +++++++++++++++++++++++++++++++++---- xen/include/asm-x86/hvm/viridian.h | 5 ++++ xen/include/public/hvm/params.h | 7 ++++- 7 files changed, 124 insertions(+), 17 deletions(-)