Message ID | 20240724011037.3671523-9-jthoughton@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm: multi-gen LRU: Walk secondary MMU page tables while aging | expand |
On 2024-07-24 01:10 AM, James Houghton wrote:
> Optimize both kvm_age_gfn and kvm_test_age_gfn's interaction with the

nit: Use () when referring to functions.

> shadow MMU by, rather than checking if our memslot has rmaps, check if
> there are any indirect_shadow_pages at all.

What is optimized by checking indirect_shadow_pages instead of
have_rmaps and what's the benefit? Smells like a premature optimization.

>
> Also, for kvm_test_age_gfn, reorder the TDP MMU check to be first. If we
> find that the range is young, we do not need to check the shadow MMU.

This should be a separate commit since it's a logically distinct change
and no dependency on the other change in this commit (other than both
touch the same function).

Splitting the commits up will also make it easier to write more specific
short logs (instead of "optimize a little bit" :)

Also, the commit re-orders kvm_age_gfn() as well but the commit message
only mentions kvm_test_age_gfn(). No objection to keeping the two
functions consistent but it should be called out in the commit message.

>
> Signed-off-by: James Houghton <jthoughton@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 7b93ce8f0680..919d59385f89 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1629,19 +1629,24 @@ static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot,
>  	__rmap_add(vcpu->kvm, cache, slot, spte, gfn, access);
>  }
>
> +static bool kvm_has_shadow_mmu_sptes(struct kvm *kvm)
> +{
> +	return !tdp_mmu_enabled || READ_ONCE(kvm->arch.indirect_shadow_pages);
> +}
> +
>  bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>  {
>  	bool young = false;
>
> -	if (kvm_memslots_have_rmaps(kvm)) {
> +	if (tdp_mmu_enabled)
> +		young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
> +
> +	if (kvm_has_shadow_mmu_sptes(kvm)) {
>  		write_lock(&kvm->mmu_lock);
>  		young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
>  		write_unlock(&kvm->mmu_lock);
>  	}
>
> -	if (tdp_mmu_enabled)
> -		young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
> -
>  	return young;
>  }
>
> @@ -1649,15 +1654,15 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>  {
>  	bool young = false;
>
> -	if (kvm_memslots_have_rmaps(kvm)) {
> +	if (tdp_mmu_enabled)
> +		young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
> +
> +	if (!young && kvm_has_shadow_mmu_sptes(kvm)) {

nit: A short comment here might be helpful to explain why young is
checked.

>  		write_lock(&kvm->mmu_lock);
>  		young = kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
>  		write_unlock(&kvm->mmu_lock);
>  	}
>
> -	if (tdp_mmu_enabled)
> -		young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
> -
>  	return young;
>  }
>
> --
> 2.46.0.rc1.232.g9752f9e123-goog
>
On Thu, Jul 25, 2024, David Matlack wrote:
> On 2024-07-24 01:10 AM, James Houghton wrote:
> > Optimize both kvm_age_gfn and kvm_test_age_gfn's interaction with the
>
> nit: Use () when referring to functions.
>
> > shadow MMU by, rather than checking if our memslot has rmaps, check if
> > there are any indirect_shadow_pages at all.
>
> What is optimized by checking indirect_shadow_pages instead of
> have_rmaps and what's the benefit? Smells like a premature optimization.

Checking indirect_shadow_pages avoids taking mmu_lock for write when KVM
doesn't currently have shadow MMU pages, but did at some point in the
past, whereas kvm_memslots_have_rmaps() is sticky and will return true
forever.

> > Also, for kvm_test_age_gfn, reorder the TDP MMU check to be first. If we
> > find that the range is young, we do not need to check the shadow MMU.
>
> This should be a separate commit since it's a logically distinct change
> and no dependency on the other change in this commit (other than both
> touch the same function).
>
> Splitting the commits up will also make it easier to write more specific
> short logs (instead of "optimize a little bit" :)

+1. Especially code movement and refactoring, e.g. factoring out
tdp_mmu_clear_spte_bits_atomic() would ideally be in a standalone patch
that's dead simple to review.
On Fri, Aug 16, 2024 at 6:00 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Jul 25, 2024, David Matlack wrote:
> > On 2024-07-24 01:10 AM, James Houghton wrote:
> > > Optimize both kvm_age_gfn and kvm_test_age_gfn's interaction with the
> >
> > nit: Use () when referring to functions.
> >
> > > shadow MMU by, rather than checking if our memslot has rmaps, check if
> > > there are any indirect_shadow_pages at all.
> >
> > What is optimized by checking indirect_shadow_pages instead of
> > have_rmaps and what's the benefit? Smells like a premature optimization.
>
> Checking indirect_shadow_pages avoids taking mmu_lock for write when KVM doesn't
> currently have shadow MMU pages, but did at some point in the past, whereas
> kvm_memslots_have_rmaps() is sticky and will return true forever.

Thanks for the clear explanation.

> > > Also, for kvm_test_age_gfn, reorder the TDP MMU check to be first. If we
> > > find that the range is young, we do not need to check the shadow MMU.
> >
> > This should be a separate commit since it's a logically distinct change
> > and no dependency on the other change in this commit (other than both
> > touch the same function).

Done.

> > Splitting the commits up will also make it easier to write more specific
> > short logs (instead of "optimize a little bit" :)
>
> +1. Especially code movement and refactoring, e.g. factoring out
> tdp_mmu_clear_spte_bits_atomic() would ideally be in a standalone patch that's
> dead simple to review.

I have now split out the creation of tdp_mmu_clear_spte_bits_atomic()
into its own patch. Though I'm not entirely convinced splitting out
every refactor like that is always a good thing.
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7b93ce8f0680..919d59385f89 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1629,19 +1629,24 @@ static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot,
 	__rmap_add(vcpu->kvm, cache, slot, spte, gfn, access);
 }
 
+static bool kvm_has_shadow_mmu_sptes(struct kvm *kvm)
+{
+	return !tdp_mmu_enabled || READ_ONCE(kvm->arch.indirect_shadow_pages);
+}
+
 bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	bool young = false;
 
-	if (kvm_memslots_have_rmaps(kvm)) {
+	if (tdp_mmu_enabled)
+		young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
+
+	if (kvm_has_shadow_mmu_sptes(kvm)) {
 		write_lock(&kvm->mmu_lock);
 		young = kvm_handle_gfn_range(kvm, range, kvm_age_rmap);
 		write_unlock(&kvm->mmu_lock);
 	}
 
-	if (tdp_mmu_enabled)
-		young |= kvm_tdp_mmu_age_gfn_range(kvm, range);
-
 	return young;
 }
 
@@ -1649,15 +1654,15 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
 	bool young = false;
 
-	if (kvm_memslots_have_rmaps(kvm)) {
+	if (tdp_mmu_enabled)
+		young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
+
+	if (!young && kvm_has_shadow_mmu_sptes(kvm)) {
 		write_lock(&kvm->mmu_lock);
 		young = kvm_handle_gfn_range(kvm, range, kvm_test_age_rmap);
 		write_unlock(&kvm->mmu_lock);
 	}
 
-	if (tdp_mmu_enabled)
-		young |= kvm_tdp_mmu_test_age_gfn(kvm, range);
-
 	return young;
 }
Optimize both kvm_age_gfn and kvm_test_age_gfn's interaction with the
shadow MMU by, rather than checking if our memslot has rmaps, check if
there are any indirect_shadow_pages at all.

Also, for kvm_test_age_gfn, reorder the TDP MMU check to be first. If we
find that the range is young, we do not need to check the shadow MMU.

Signed-off-by: James Houghton <jthoughton@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)