Message ID | 1346429448.2823.1.camel@offbook (mailing list archive)
---|---
State | New, archived
On Fri, Aug 31, 2012 at 06:10:48PM +0200, Davidlohr Bueso wrote:
> For processors that support VPIDs we should invalidate the page table entry
> specified by the linear address. For this purpose add support for individual
> address invalidations.

Not necessary - a single-context invalidation is performed through
KVM_REQ_TLB_FLUSH.

Single-context: if the INVVPID type is 1, the logical processor invalidates
all linear mappings and combined mappings associated with the VPID specified
in the INVVPID descriptor.
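For reference, the single-context flush Marcelo describes bottoms out in an
INVVPID with type 1 and the vCPU's VPID in the descriptor. A simplified
sketch modeled on __invvpid() and vpid_sync_vcpu_single() in
arch/x86/kvm/vmx.c (the exception fixup around the asm is elided here):

static inline void __invvpid(int ext, u16 vpid, gva_t gva)
{
	struct {
		u64 vpid : 16;
		u64 rsvd : 48;
		u64 gva;	/* linear address; consulted only by type 0 */
	} operand = { vpid, 0, gva };

	/* ASM_VMX_INVVPID encodes the instruction: invalidation type in
	 * %ecx, descriptor at (%rax) */
	asm volatile (ASM_VMX_INVVPID
		      : : "a"(&operand), "c"(ext) : "cc", "memory");
}

static inline void vpid_sync_vcpu_single(struct vcpu_vmx *vmx)
{
	if (vmx->vpid == 0)	/* vpid 0 is reserved for the host */
		return;

	if (cpu_has_vmx_invvpid_single())
		__invvpid(VMX_VPID_EXTENT_SINGLE_CONTEXT, vmx->vpid, 0);
}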
On Fri, 2012-08-31 at 14:37 -0300, Marcelo Tosatti wrote:
> On Fri, Aug 31, 2012 at 06:10:48PM +0200, Davidlohr Bueso wrote:
> > For processors that support VPIDs we should invalidate the page table entry
> > specified by the linear address. For this purpose add support for individual
> > address invalidations.
>
> Not necessary - a single-context invalidation is performed through
> KVM_REQ_TLB_FLUSH.

Since vpid_sync_context() supports both single- and all-context vpid
invalidations, wouldn't it make sense to also add individual-address ones,
supporting further granularity?

> Single-context: if the INVVPID type is 1, the logical processor invalidates
> all linear mappings and combined mappings associated with the VPID specified
> in the INVVPID descriptor.
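For context, the dispatch Davidlohr is referring to picks the narrowest
flush the hardware advertises - single-context when available, otherwise
all-context. A sketch of vpid_sync_context() and its global fallback as
they looked in vmx.c around this time:

static inline void vpid_sync_vcpu_global(void)
{
	if (cpu_has_vmx_invvpid_global())
		__invvpid(VMX_VPID_EXTENT_ALL_CONTEXT, 0, 0);
}

static inline void vpid_sync_context(struct vcpu_vmx *vmx)
{
	/* prefer the narrower single-context flush when available */
	if (cpu_has_vmx_invvpid_single())
		vpid_sync_vcpu_single(vmx);
	else
		vpid_sync_vcpu_global();
}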
On 09/03/2012 02:27 AM, Davidlohr Bueso wrote:
> On Fri, 2012-08-31 at 14:37 -0300, Marcelo Tosatti wrote:
>> On Fri, Aug 31, 2012 at 06:10:48PM +0200, Davidlohr Bueso wrote:
>> > For processors that support VPIDs we should invalidate the page table entry
>> > specified by the linear address. For this purpose add support for individual
>> > address invalidations.
>>
>> Not necessary - a single-context invalidation is performed through
>> KVM_REQ_TLB_FLUSH.
>
> Since vpid_sync_context() supports both single- and all-context vpid
> invalidations, wouldn't it make sense to also add individual-address ones,
> supporting further granularity?

It might. Do you have benchmarks supporting this?
On Mon, 2012-09-03 at 12:11 +0300, Avi Kivity wrote:
> On 09/03/2012 02:27 AM, Davidlohr Bueso wrote:
> > On Fri, 2012-08-31 at 14:37 -0300, Marcelo Tosatti wrote:
> >> On Fri, Aug 31, 2012 at 06:10:48PM +0200, Davidlohr Bueso wrote:
> >> > For processors that support VPIDs we should invalidate the page table entry
> >> > specified by the linear address. For this purpose add support for individual
> >> > address invalidations.
> >>
> >> Not necessary - a single-context invalidation is performed through
> >> KVM_REQ_TLB_FLUSH.
> >
> > Since vpid_sync_context() supports both single- and all-context vpid
> > invalidations, wouldn't it make sense to also add individual-address ones,
> > supporting further granularity?
>
> It might. Do you have benchmarks supporting this?

I ran two benchmarks: Java Dacapo[1] Sunflow (renders a set of images using
ray tracing) and a vanilla 3.2 kernel build (once with 1 job, once with -j8).

The host is an Intel i7-2635QM (4 cores + HT) with 4GB RAM, running Linus's
latest tree and only standard system daemons. For KVM I disabled EPT. The
guest is a 64-bit, 4-core, 4GB RAM machine running Linux 3.2 (Debian) and
only the benchmark.

All results are the mean of 5 runs, measured with time(1).

Dacapo without individual addr invvpid:
real    1m25.406s
user    4m59.315s
sys     1m25.406s

Dacapo with individual addr invvpid:
real    1m4.421s
user    3m47.150s
sys     0m1.592s

--

vanilla kernel build without individual addr invvpid:
real    16m42.571s
user    13m28.975s
sys     2m54.487s

vanilla kernel build with individual addr invvpid:
real    15m45.789s
user    12m25.691s
sys     2m44.806s

--

vanilla kernel build (-j8) without individual addr invvpid:
real    10m32.276s
user    33m47.687s
sys     5m37.725s

vanilla kernel build (-j8) with individual addr invvpid:
real    8m29.789s
user    28m12.850s
sys     4m34.353s

In all cases, individual-address invalidation outperforms single-context
invalidation in wall-clock time. Comments?

[1] http://dacapobench.org/

Thanks,
Davidlohr
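For reference, the wall-clock deltas above work out to roughly 25% for
Dacapo (85.4s vs 64.4s), 6% for the single-job kernel build (1002.6s vs
945.8s), and 19% for the -j8 build (632.3s vs 509.8s).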
On 09/06/2012 12:54 AM, Davidlohr Bueso wrote:
> On Mon, 2012-09-03 at 12:11 +0300, Avi Kivity wrote:
>> On 09/03/2012 02:27 AM, Davidlohr Bueso wrote:
>> > On Fri, 2012-08-31 at 14:37 -0300, Marcelo Tosatti wrote:
>> >> On Fri, Aug 31, 2012 at 06:10:48PM +0200, Davidlohr Bueso wrote:
>> >> > For processors that support VPIDs we should invalidate the page table entry
>> >> > specified by the linear address. For this purpose add support for individual
>> >> > address invalidations.
>> >>
>> >> Not necessary - a single-context invalidation is performed through
>> >> KVM_REQ_TLB_FLUSH.
>> >
>> > Since vpid_sync_context() supports both single- and all-context vpid
>> > invalidations, wouldn't it make sense to also add individual-address ones,
>> > supporting further granularity?
>>
>> It might. Do you have benchmarks supporting this?
>
> I ran two benchmarks: Java Dacapo[1] Sunflow (renders a set of images using
> ray tracing) and a vanilla 3.2 kernel build (once with 1 job, once with -j8).
>
> The host is an Intel i7-2635QM (4 cores + HT) with 4GB RAM, running Linus's
> latest tree and only standard system daemons. For KVM I disabled EPT.

That's not very interesting. On all real machines, if you have VPID you also
have EPT; Intel is unlikely to produce a processor with one but not the other.

> The guest is a 64-bit, 4-core, 4GB RAM machine running Linux 3.2 (Debian)
> and only the benchmark.
>
> All results are the mean of 5 runs, measured with time(1).

The results are impressive, but lack real-world relevance. Individual-address
invalidation isn't very useful with EPT, since we let the guest issue INVLPG
itself and otherwise don't bother with guest page tables. An individual-address
INVEPT would probably be more useful, but there is no such instruction variant.
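To illustrate Avi's point: with EPT enabled, KVM does not intercept INVLPG
at all, so handle_invlpg() - the only caller of the new individual-address
flush in this patch - never runs. A sketch of the relevant fragment of the
exec-controls setup in vmx.c:

	if (enable_ept)
		/*
		 * The guest maintains its own page tables and TLB;
		 * no need to exit on CR3 writes or INVLPG.
		 */
		exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
				  CPU_BASED_CR3_STORE_EXITING |
				  CPU_BASED_INVLPG_EXITING);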
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 74fcb96..20abb18 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -393,6 +393,7 @@ enum vmcs_field {
 #define IDENTITY_PAGETABLE_PRIVATE_MEMSLOT	(KVM_MEMORY_SLOTS + 2)
 
 #define VMX_NR_VPIDS				(1 << 16)
+#define VMX_VPID_EXTENT_INDIVIDUAL_ADDR		0
 #define VMX_VPID_EXTENT_SINGLE_CONTEXT		1
 #define VMX_VPID_EXTENT_ALL_CONTEXT		2
 
@@ -406,12 +407,13 @@ enum vmcs_field {
 #define VMX_EPTP_WB_BIT				(1ull << 14)
 #define VMX_EPT_2MB_PAGE_BIT			(1ull << 16)
 #define VMX_EPT_1GB_PAGE_BIT			(1ull << 17)
-#define VMX_EPT_AD_BIT				    (1ull << 21)
+#define VMX_EPT_AD_BIT				(1ull << 21)
 #define VMX_EPT_EXTENT_INDIVIDUAL_BIT		(1ull << 24)
 #define VMX_EPT_EXTENT_CONTEXT_BIT		(1ull << 25)
 #define VMX_EPT_EXTENT_GLOBAL_BIT		(1ull << 26)
 
-#define VMX_VPID_EXTENT_SINGLE_CONTEXT_BIT	(1ull << 9) /* (41 - 32) */
+#define VMX_VPID_EXTENT_INDIVIDUAL_ADDR_BIT	(1ull << 8) /* (40 - 32) */
+#define VMX_VPID_EXTENT_SINGLE_CONTEXT_BIT	(1ull << 9) /* (41 - 32) */
 #define VMX_VPID_EXTENT_GLOBAL_CONTEXT_BIT	(1ull << 10) /* (42 - 32) */
 
 #define VMX_EPT_DEFAULT_GAW			3
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c00f03d..d87b22c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -816,6 +816,11 @@ static inline bool cpu_has_vmx_invept_global(void)
 	return vmx_capability.ept & VMX_EPT_EXTENT_GLOBAL_BIT;
 }
 
+static inline bool cpu_has_vmx_invvpid_individual_addr(void)
+{
+	return vmx_capability.vpid & VMX_VPID_EXTENT_INDIVIDUAL_ADDR_BIT;
+}
+
 static inline bool cpu_has_vmx_invvpid_single(void)
 {
 	return vmx_capability.vpid & VMX_VPID_EXTENT_SINGLE_CONTEXT_BIT;
@@ -1011,6 +1016,15 @@ static void loaded_vmcs_clear(struct loaded_vmcs *loaded_vmcs)
 			 loaded_vmcs->cpu, __loaded_vmcs_clear, loaded_vmcs, 1);
 }
 
+static inline void vpid_sync_vcpu_individual_addr(struct vcpu_vmx *vmx, gpa_t gpa)
+{
+	if (vmx->vpid == 0)
+		return;
+
+	if (cpu_has_vmx_invvpid_individual_addr())
+		__invvpid(VMX_VPID_EXTENT_INDIVIDUAL_ADDR, vmx->vpid, gpa);
+}
+
 static inline void vpid_sync_vcpu_single(struct vcpu_vmx *vmx)
 {
 	if (vmx->vpid == 0)
@@ -4719,6 +4733,7 @@ static int handle_invlpg(struct kvm_vcpu *vcpu)
 	unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
 
 	kvm_mmu_invlpg(vcpu, exit_qualification);
+	vpid_sync_vcpu_individual_addr(to_vmx(vcpu), exit_qualification);
 	skip_emulated_instruction(vcpu);
 	return 1;
 }
For processors that support VPIDs we should invalidate the page table entry
specified by the linear address. For this purpose add support for individual
address invalidations.

Signed-off-by: Davidlohr Bueso <dave@gnu.org>
---
 arch/x86/include/asm/vmx.h |    6 ++++--
 arch/x86/kvm/vmx.c         |   15 +++++++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)