Message ID | 20210204221959.232582-1-bgardon@google.com (mailing list archive)
---|---
State | New, archived |
Series | KVM: VMX: Optimize flushing the PML buffer
Hi, Ben,

On Thu, Feb 04, 2021 at 02:19:59PM -0800, Ben Gardon wrote:
> The average time for each run demonstrated a strange bimodal distribution,
> with clusters around 2 seconds and 2.5 seconds. This may have been a
> result of vCPU migration between NUMA nodes.

Have you thought about using numactl or similar technique to verify your idea
(force both vcpu threads binding, and memory allocations)?

From the numbers it already shows improvements indeed, but just curious since
you raised this up. :)

> @@ -5707,13 +5708,18 @@ static void vmx_flush_pml_buffer(struct kvm_vcpu *vcpu)
>  	else
>  		pml_idx++;
>
> +	memslots = kvm_vcpu_memslots(vcpu);
> +
>  	pml_buf = page_address(vmx->pml_pg);
>  	for (; pml_idx < PML_ENTITY_NUM; pml_idx++) {
> +		struct kvm_memory_slot *memslot;
>  		u64 gpa;
>
>  		gpa = pml_buf[pml_idx];
>  		WARN_ON(gpa & (PAGE_SIZE - 1));
> -		kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
> +
> +		memslot = __gfn_to_memslot(memslots, gpa >> PAGE_SHIFT);
> +		mark_page_dirty_in_slot(vcpu->kvm, memslot, gpa >> PAGE_SHIFT);

Since at it: make "gpa >> PAGE_SHIFT" a temp var too?

Thanks,
On Thu, Feb 4, 2021 at 2:51 PM Peter Xu <peterx@redhat.com> wrote:
>
> Hi, Ben,
>
> On Thu, Feb 04, 2021 at 02:19:59PM -0800, Ben Gardon wrote:
> > The average time for each run demonstrated a strange bimodal distribution,
> > with clusters around 2 seconds and 2.5 seconds. This may have been a
> > result of vCPU migration between NUMA nodes.
>
> Have you thought about using numactl or similar technique to verify your idea
> (force both vcpu threads binding, and memory allocations)?
>
> From the numbers it already shows improvements indeed, but just curious since
> you raised this up. :)

Frustratingly, the test machines I have don't have numactl installed but
I've been meaning to add cpu pinning to the selftests perf tests anyway,
so maybe this is a good reason to do it.

> > @@ -5707,13 +5708,18 @@ static void vmx_flush_pml_buffer(struct kvm_vcpu *vcpu)
> >  	else
> >  		pml_idx++;
> >
> > +	memslots = kvm_vcpu_memslots(vcpu);
> > +
> >  	pml_buf = page_address(vmx->pml_pg);
> >  	for (; pml_idx < PML_ENTITY_NUM; pml_idx++) {
> > +		struct kvm_memory_slot *memslot;
> >  		u64 gpa;
> >
> >  		gpa = pml_buf[pml_idx];
> >  		WARN_ON(gpa & (PAGE_SIZE - 1));
> > -		kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
> > +
> > +		memslot = __gfn_to_memslot(memslots, gpa >> PAGE_SHIFT);
> > +		mark_page_dirty_in_slot(vcpu->kvm, memslot, gpa >> PAGE_SHIFT);
>
> Since at it: make "gpa >> PAGE_SHIFT" a temp var too?

That's a good idea, I'll try it.

>
> Thanks,
>
> --
> Peter Xu
>
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cc60b1fc3ee7..46c54802dfdb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5692,6 +5692,7 @@ static void vmx_destroy_pml_buffer(struct vcpu_vmx *vmx)
 static void vmx_flush_pml_buffer(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	struct kvm_memslots *memslots;
 	u64 *pml_buf;
 	u16 pml_idx;
@@ -5707,13 +5708,18 @@ static void vmx_flush_pml_buffer(struct kvm_vcpu *vcpu)
 	else
 		pml_idx++;
 
+	memslots = kvm_vcpu_memslots(vcpu);
+
 	pml_buf = page_address(vmx->pml_pg);
 	for (; pml_idx < PML_ENTITY_NUM; pml_idx++) {
+		struct kvm_memory_slot *memslot;
 		u64 gpa;
 
 		gpa = pml_buf[pml_idx];
 		WARN_ON(gpa & (PAGE_SIZE - 1));
-		kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
+
+		memslot = __gfn_to_memslot(memslots, gpa >> PAGE_SHIFT);
+		mark_page_dirty_in_slot(vcpu->kvm, memslot, gpa >> PAGE_SHIFT);
 	}
 
 	/* reset PML index */