Message ID | 1482184705-127401-2-git-send-email-dmatlack@google.com (mailing list archive)
---|---
State | New, archived
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 58995fd9..de55653 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1713,7 +1713,7 @@ static void kvm_mmu_free_page(struct kvm_mmu_page *sp)
>
>  static unsigned kvm_page_table_hashfn(gfn_t gfn)
>  {
> -	return gfn & ((1 << KVM_MMU_HASH_SHIFT) - 1);
> +	return hash_64(gfn, KVM_MMU_HASH_SHIFT);
>  }
>
>  static void mmu_page_add_parent_pte(struct kvm_vcpu *vcpu,

hash_64() might be more expensive to calculate, as it involves one multiplication. Not sure if that might be a problem.

Looks good to me!
On 19/12/2016 22:58, David Matlack wrote:
> When using two-dimensional paging, the mmu_page_hash (which provides
> lookups for existing kvm_mmu_page structs) becomes imbalanced, with
> too many collisions in buckets 0 and 512.

Ouch, nice catch. I'll apply this patch tomorrow since it's pretty much independent of patch 1. If you want to send out a revision of the statistics patch, please do.

Thanks,

Paolo

> This has been seen to cause
> mmu_lock to be held for multiple milliseconds in kvm_mmu_get_page on
> VMs with a large amount of RAM mapped with 4K pages.
>
> The current hash function uses the lower 10 bits of gfn to index into
> mmu_page_hash. When doing shadow paging, gfn is the address of the
> guest page table being shadowed. These tables are 4K-aligned, which
> makes the low bits of gfn a good hash. However, with two-dimensional
> paging, no guest page tables are being shadowed, so gfn is the base
> address that is mapped by the table. Thus page tables (level=1) have
> a 2MB aligned gfn, page directories (level=2) have a 1GB aligned gfn,
> etc. This means hashes will only differ in their 10th bit.
>
> hash_64() provides a better hash. For example, on a VM with ~200G
> (99458 direct=1 kvm_mmu_page structs):
>
>     hash            max_mmu_page_hash_collisions
>     --------------------------------------------
>     low 10 bits                            49847
>     hash_64                                  105
>     perfect                                   97
>
> While we're changing the hash, increase the table size by 4x to better
> support large VMs (further reduces number of collisions in 200G VM to
> 29).
>
> Note that hash_64() does not provide a good distribution prior to commit
> ef703f49a6c5 ("Eliminate bad hash multipliers from hash_32() and
> hash_64()").
>
> Signed-off-by: David Matlack <dmatlack@google.com>
> Change-Id: I5aa6b13c834722813c6cca46b8b1ed6f53368ade
> ---
>  arch/x86/include/asm/kvm_host.h | 2 +-
>  arch/x86/kvm/mmu.c              | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 8ba0d64..5962e4bc 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -115,7 +115,7 @@ static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level)
>
>  #define KVM_PERMILLE_MMU_PAGES 20
>  #define KVM_MIN_ALLOC_MMU_PAGES 64
> -#define KVM_MMU_HASH_SHIFT 10
> +#define KVM_MMU_HASH_SHIFT 12
>  #define KVM_NUM_MMU_PAGES (1 << KVM_MMU_HASH_SHIFT)
>  #define KVM_MIN_FREE_MMU_PAGES 5
>  #define KVM_REFILL_PAGES 25
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 58995fd9..de55653 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1713,7 +1713,7 @@ static void kvm_mmu_free_page(struct kvm_mmu_page *sp)
>
>  static unsigned kvm_page_table_hashfn(gfn_t gfn)
>  {
> -	return gfn & ((1 << KVM_MMU_HASH_SHIFT) - 1);
> +	return hash_64(gfn, KVM_MMU_HASH_SHIFT);
>  }
>
>  static void mmu_page_add_parent_pte(struct kvm_vcpu *vcpu,

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Dec 20, 2016 at 8:57 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 19/12/2016 22:58, David Matlack wrote:
>> When using two-dimensional paging, the mmu_page_hash (which provides
>> lookups for existing kvm_mmu_page structs) becomes imbalanced, with
>> too many collisions in buckets 0 and 512.
>
> Ouch, nice catch. I'll apply this patch tomorrow since it's pretty much
> independent of patch 1. If you want to send out a revision of the
> statistics patch, please do.

SG. I'll send v2 of the stats patch by itself. Thanks!
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8ba0d64..5962e4bc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -115,7 +115,7 @@ static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level)
 
 #define KVM_PERMILLE_MMU_PAGES 20
 #define KVM_MIN_ALLOC_MMU_PAGES 64
-#define KVM_MMU_HASH_SHIFT 10
+#define KVM_MMU_HASH_SHIFT 12
 #define KVM_NUM_MMU_PAGES (1 << KVM_MMU_HASH_SHIFT)
 #define KVM_MIN_FREE_MMU_PAGES 5
 #define KVM_REFILL_PAGES 25
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 58995fd9..de55653 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1713,7 +1713,7 @@ static void kvm_mmu_free_page(struct kvm_mmu_page *sp)
 
 static unsigned kvm_page_table_hashfn(gfn_t gfn)
 {
-	return gfn & ((1 << KVM_MMU_HASH_SHIFT) - 1);
+	return hash_64(gfn, KVM_MMU_HASH_SHIFT);
 }
 
 static void mmu_page_add_parent_pte(struct kvm_vcpu *vcpu,
When using two-dimensional paging, the mmu_page_hash (which provides
lookups for existing kvm_mmu_page structs) becomes imbalanced, with
too many collisions in buckets 0 and 512. This has been seen to cause
mmu_lock to be held for multiple milliseconds in kvm_mmu_get_page on
VMs with a large amount of RAM mapped with 4K pages.

The current hash function uses the lower 10 bits of gfn to index into
mmu_page_hash. When doing shadow paging, gfn is the address of the
guest page table being shadowed. These tables are 4K-aligned, which
makes the low bits of gfn a good hash. However, with two-dimensional
paging, no guest page tables are being shadowed, so gfn is the base
address that is mapped by the table. Thus page tables (level=1) have
a 2MB aligned gfn, page directories (level=2) have a 1GB aligned gfn,
etc. This means hashes will only differ in their 10th bit.

hash_64() provides a better hash. For example, on a VM with ~200G
(99458 direct=1 kvm_mmu_page structs):

    hash            max_mmu_page_hash_collisions
    --------------------------------------------
    low 10 bits                            49847
    hash_64                                  105
    perfect                                   97

While we're changing the hash, increase the table size by 4x to better
support large VMs (further reduces number of collisions in 200G VM to
29).

Note that hash_64() does not provide a good distribution prior to commit
ef703f49a6c5 ("Eliminate bad hash multipliers from hash_32() and
hash_64()").

Signed-off-by: David Matlack <dmatlack@google.com>
Change-Id: I5aa6b13c834722813c6cca46b8b1ed6f53368ade
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/mmu.c              | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)