Message ID | 20211011204418.162846-1-dmatlack@google.com (mailing list archive)
---|---
State | New, archived
Series | KVM: x86/mmu: Rename slot_handle_leaf to slot_handle_level_4k
On Mon, Oct 11, 2021 at 1:44 PM David Matlack <dmatlack@google.com> wrote:
>
> slot_handle_leaf is a misnomer because it only operates on 4K SPTEs
> whereas "leaf" is used to describe any valid terminal SPTE (4K or
> large page). Rename slot_handle_leaf to slot_handle_level_4k to
> avoid confusion.
>
> Making this change makes it more obvious there is a benign discrepancy
> between the legacy MMU and the TDP MMU when it comes to dirty logging.
> The legacy MMU only operates on 4K SPTEs when zapping for collapsing
> and when clearing D-bits. The TDP MMU, on the other hand, operates on
> SPTEs on all levels. The TDP MMU behavior is technically overkill but
> not incorrect. So opportunistically add comments to explain the
> difference.

Note that at least in the zapping case when disabling dirty logging, the
TDP MMU will still only zap pages if they're mapped smaller than the
highest granularity they could be. As a result it uses a slower check,
but shouldn't be doing many (if any) extra zaps.

>
> Signed-off-by: David Matlack <dmatlack@google.com>

Reviewed-by: Ben Gardon <bgardon@google.com>

> ---
>  arch/x86/kvm/mmu/mmu.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 24a9f4c3f5e7..f00644e79ef5 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5382,8 +5382,8 @@ slot_handle_level(struct kvm *kvm, const struct kvm_memory_slot *memslot,
>  }
>
>  static __always_inline bool
> -slot_handle_leaf(struct kvm *kvm, const struct kvm_memory_slot *memslot,
> -		 slot_level_handler fn, bool flush_on_yield)
> +slot_handle_level_4k(struct kvm *kvm, const struct kvm_memory_slot *memslot,
> +		     slot_level_handler fn, bool flush_on_yield)
>  {
>  	return slot_handle_level(kvm, memslot, fn, PG_LEVEL_4K,
>  				 PG_LEVEL_4K, flush_on_yield);
> @@ -5772,7 +5772,12 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
>
>  	if (kvm_memslots_have_rmaps(kvm)) {
>  		write_lock(&kvm->mmu_lock);
> -		flush = slot_handle_leaf(kvm, slot, kvm_mmu_zap_collapsible_spte, true);
> +		/*
> +		 * Strictly speaking only 4k SPTEs need to be zapped because
> +		 * KVM never creates intermediate 2m mappings when performing
> +		 * dirty logging.
> +		 */
> +		flush = slot_handle_level_4k(kvm, slot, kvm_mmu_zap_collapsible_spte, true);
>  		if (flush)
>  			kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
>  		write_unlock(&kvm->mmu_lock);
> @@ -5809,8 +5814,11 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
>
>  	if (kvm_memslots_have_rmaps(kvm)) {
>  		write_lock(&kvm->mmu_lock);
> -		flush = slot_handle_leaf(kvm, memslot, __rmap_clear_dirty,
> -					 false);
> +		/*
> +		 * Strictly speaking only 4k SPTEs need to be cleared because
> +		 * KVM always performs dirty logging at a 4k granularity.
> +		 */
> +		flush = slot_handle_level_4k(kvm, memslot, __rmap_clear_dirty, false);
>  		write_unlock(&kvm->mmu_lock);
>  	}
>
> --
> 2.33.0.882.g93a45727a2-goog
On Mon, Oct 11, 2021 at 2:07 PM Ben Gardon <bgardon@google.com> wrote:
>
> On Mon, Oct 11, 2021 at 1:44 PM David Matlack <dmatlack@google.com> wrote:
> >
> > slot_handle_leaf is a misnomer because it only operates on 4K SPTEs
> > whereas "leaf" is used to describe any valid terminal SPTE (4K or
> > large page). Rename slot_handle_leaf to slot_handle_level_4k to
> > avoid confusion.
> >
> > Making this change makes it more obvious there is a benign discrepancy
> > between the legacy MMU and the TDP MMU when it comes to dirty logging.
> > The legacy MMU only operates on 4K SPTEs when zapping for collapsing
> > and when clearing D-bits. The TDP MMU, on the other hand, operates on
> > SPTEs on all levels. The TDP MMU behavior is technically overkill but
> > not incorrect. So opportunistically add comments to explain the
> > difference.
>
> Note that at least in the zapping case when disabling dirty logging,
> the TDP MMU will still only zap pages if they're mapped smaller than
> the highest granularity they could be. As a result it uses a slower
> check, but shouldn't be doing many (if any) extra zaps.

Agreed. The legacy MMU implementation relies on the fact that
collapsible 2M SPTEs are never generated by dirty logging so it only
needs to check 4K SPTEs.

The TDP MMU implementation is actually more robust, since it checks
every SPTE for collapsibility. The only reason it would be doing extra
zaps is if something other than dirty logging can cause an SPTE to be
collapsible. (HugePage NX comes to mind.)
On Mon, Oct 11, 2021 at 2:25 PM David Matlack <dmatlack@google.com> wrote:
>
> On Mon, Oct 11, 2021 at 2:07 PM Ben Gardon <bgardon@google.com> wrote:
> >
> > Note that at least in the zapping case when disabling dirty logging,
> > the TDP MMU will still only zap pages if they're mapped smaller than
> > the highest granularity they could be. As a result it uses a slower
> > check, but shouldn't be doing many (if any) extra zaps.
>
> Agreed. The legacy MMU implementation relies on the fact that
> collapsible 2M SPTEs are never generated by dirty logging so it only
> needs to check 4K SPTEs.
>
> The TDP MMU implementation is actually more robust, since it checks
> every SPTE for collapsibility. The only reason it would be doing extra
> zaps is if something other than dirty logging can cause an SPTE to be
> collapsible. (HugePage NX comes to mind.)

Ah but HugePage NX does not create 2M SPTEs so this wouldn't actually
result in extra zaps.
On Mon, Oct 11, 2021, David Matlack wrote:
> On Mon, Oct 11, 2021 at 2:07 PM Ben Gardon <bgardon@google.com> wrote:
> >
> > On Mon, Oct 11, 2021 at 1:44 PM David Matlack <dmatlack@google.com> wrote:
> > >
> > > slot_handle_leaf is a misnomer because it only operates on 4K SPTEs
> > > whereas "leaf" is used to describe any valid terminal SPTE (4K or
> > > large page). Rename slot_handle_leaf to slot_handle_level_4k to
> > > avoid confusion.
> > >
> > > Making this change makes it more obvious there is a benign discrepancy
> > > between the legacy MMU and the TDP MMU when it comes to dirty logging.
> > > The legacy MMU only operates on 4K SPTEs when zapping for collapsing
> > > and when clearing D-bits. The TDP MMU, on the other hand, operates on
> > > SPTEs on all levels. The TDP MMU behavior is technically overkill but
> > > not incorrect. So opportunistically add comments to explain the
> > > difference.
> >
> > Note that at least in the zapping case when disabling dirty logging,
> > the TDP MMU will still only zap pages if they're mapped smaller than
> > the highest granularity they could be. As a result it uses a slower
> > check, but shouldn't be doing many (if any) extra zaps.

FWIW, the legacy MMU now has the same guard against spurious zaps.

> Agreed. The legacy MMU implementation relies on the fact that
> collapsible 2M SPTEs are never generated by dirty logging so it only
> needs to check 4K SPTEs.

I think it makes sense to send a v2 to further clarify the TDP MMU with
these details, e.g. expand the last sentence to something like

  The TDP MMU behavior of zapping SPTEs at all levels is technically
  overkill for its current dirty logging implementation, which always
  demotes to 4K SPTEs, but both the TDP MMU and legacy MMU zap if and
  only if the SPTE can be replaced by a larger page, i.e. will not
  spuriously zap 2m (or larger) SPTEs.
> > > @@ -5772,7 +5772,12 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
> > >
> > >  	if (kvm_memslots_have_rmaps(kvm)) {
> > >  		write_lock(&kvm->mmu_lock);
> > > -		flush = slot_handle_leaf(kvm, slot, kvm_mmu_zap_collapsible_spte, true);
> > > +		/*
> > > +		 * Strictly speaking only 4k SPTEs need to be zapped because
> > > +		 * KVM never creates intermediate 2m mappings when performing
> > > +		 * dirty logging.

And then also tweak this comment to clarify that the _legacy_ MMU never
creates 2m SPTEs when dirty logging, so that the comment doesn't become
incorrect if KVM ever supports 2mb+ granularity in the TDP MMU but not
the legacy MMU, e.g.

	/*
	 * Zap only 4k SPTEs, the legacy MMU only supports dirty
	 * logging at 4k granularity, i.e. doesn't create 2m
	 * SPTEs when dirty logging is enabled.
	 */

> > > +		 */
> > > +		flush = slot_handle_level_4k(kvm, slot, kvm_mmu_zap_collapsible_spte, true);
> > >  		if (flush)
> > >  			kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
> > >  		write_unlock(&kvm->mmu_lock);
> > > @@ -5809,8 +5814,11 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
> > >
> > >  	if (kvm_memslots_have_rmaps(kvm)) {
> > >  		write_lock(&kvm->mmu_lock);
> > > -		flush = slot_handle_leaf(kvm, memslot, __rmap_clear_dirty,
> > > -					 false);
> > > +		/*
> > > +		 * Strictly speaking only 4k SPTEs need to be cleared because
> > > +		 * KVM always performs dirty logging at a 4k granularity.

And "s/KVM/the legacy MMU" here as well.

> > > +		 */
> > > +		flush = slot_handle_level_4k(kvm, memslot, __rmap_clear_dirty, false);
> > >  		write_unlock(&kvm->mmu_lock);
> > >  	}
> > >
> > > --
> > > 2.33.0.882.g93a45727a2-goog
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 24a9f4c3f5e7..f00644e79ef5 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5382,8 +5382,8 @@ slot_handle_level(struct kvm *kvm, const struct kvm_memory_slot *memslot,
 }
 
 static __always_inline bool
-slot_handle_leaf(struct kvm *kvm, const struct kvm_memory_slot *memslot,
-		 slot_level_handler fn, bool flush_on_yield)
+slot_handle_level_4k(struct kvm *kvm, const struct kvm_memory_slot *memslot,
+		     slot_level_handler fn, bool flush_on_yield)
 {
 	return slot_handle_level(kvm, memslot, fn, PG_LEVEL_4K,
 				 PG_LEVEL_4K, flush_on_yield);
@@ -5772,7 +5772,12 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
 
 	if (kvm_memslots_have_rmaps(kvm)) {
 		write_lock(&kvm->mmu_lock);
-		flush = slot_handle_leaf(kvm, slot, kvm_mmu_zap_collapsible_spte, true);
+		/*
+		 * Strictly speaking only 4k SPTEs need to be zapped because
+		 * KVM never creates intermediate 2m mappings when performing
+		 * dirty logging.
+		 */
+		flush = slot_handle_level_4k(kvm, slot, kvm_mmu_zap_collapsible_spte, true);
 		if (flush)
 			kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
 		write_unlock(&kvm->mmu_lock);
@@ -5809,8 +5814,11 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
 
 	if (kvm_memslots_have_rmaps(kvm)) {
 		write_lock(&kvm->mmu_lock);
-		flush = slot_handle_leaf(kvm, memslot, __rmap_clear_dirty,
-					 false);
+		/*
+		 * Strictly speaking only 4k SPTEs need to be cleared because
+		 * KVM always performs dirty logging at a 4k granularity.
+		 */
+		flush = slot_handle_level_4k(kvm, memslot, __rmap_clear_dirty, false);
 		write_unlock(&kvm->mmu_lock);
 	}
slot_handle_leaf is a misnomer because it only operates on 4K SPTEs
whereas "leaf" is used to describe any valid terminal SPTE (4K or
large page). Rename slot_handle_leaf to slot_handle_level_4k to
avoid confusion.

Making this change makes it more obvious there is a benign discrepancy
between the legacy MMU and the TDP MMU when it comes to dirty logging.
The legacy MMU only operates on 4K SPTEs when zapping for collapsing
and when clearing D-bits. The TDP MMU, on the other hand, operates on
SPTEs on all levels. The TDP MMU behavior is technically overkill but
not incorrect. So opportunistically add comments to explain the
difference.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)