[16/23] KVM: x86/mmu: Zap collapsible SPTEs at all levels in the shadow MMU

Message ID 20220203010051.2813563-17-dmatlack@google.com (mailing list archive)
State New, archived
Series Extend Eager Page Splitting to the shadow MMU

Commit Message

David Matlack Feb. 3, 2022, 1 a.m. UTC
Currently KVM only zaps collapsible 4KiB SPTEs in the shadow MMU (i.e.
in the rmap). This leads to correct behavior because KVM never creates
intermediate huge pages during dirty logging. For example, a 1GiB page
is never partially split to a 2MiB page.

However, this behavior will stop being correct once the shadow MMU
participates in eager page splitting, which can in fact leave behind
partially split huge pages. In preparation for that change, make the
shadow MMU iterate over all levels when zapping collapsible SPTEs.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
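
The core of the change is simply widening the level range handed to the
rmap walker: the old slot_handle_level_4k() path visited only the 4KiB
rmaps, while the new call visits every level up to the largest huge page.
The standalone sketch below (not kernel code; the level constants are
assumptions mirroring arch/x86, where PG_LEVEL_4K=1, PG_LEVEL_2M=2 and
PG_LEVEL_1G=3) just prints the two ranges to make that concrete:

/* levels.c: standalone illustration of the before/after level ranges. */
#include <stdio.h>

/* Assumed values, mirroring arch/x86's enum pg_level. */
enum pg_level { PG_LEVEL_NONE, PG_LEVEL_4K, PG_LEVEL_2M, PG_LEVEL_1G };
#define KVM_MAX_HUGEPAGE_LEVEL PG_LEVEL_1G

static void show_range(const char *label, int start_level, int end_level)
{
	printf("%s walks levels:", label);
	for (int level = start_level; level <= end_level; level++)
		printf(" %d", level);
	printf("\n");
}

int main(void)
{
	/* Before: slot_handle_level_4k() only ever walks the 4KiB rmaps. */
	show_range("before", PG_LEVEL_4K, PG_LEVEL_4K);

	/* After: also walk the 2MiB and 1GiB rmaps. */
	show_range("after", PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL);
	return 0;
}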

Comments

Ben Gardon Feb. 28, 2022, 8:39 p.m. UTC | #1
On Wed, Feb 2, 2022 at 5:02 PM David Matlack <dmatlack@google.com> wrote:
>
> Currently KVM only zaps collapsible 4KiB SPTEs in the shadow MMU (i.e.
> in the rmap). This leads to correct behavior because KVM never creates
> intermediate huge pages during dirty logging. For example, a 1GiB page
> is never partially split to a 2MiB page.
>
> However, this behavior will stop being correct once the shadow MMU
> participates in eager page splitting, which can in fact leave behind
> partially split huge pages. In preparation for that change, make the
> shadow MMU iterate over all levels when zapping collapsible SPTEs.
>
> No functional change intended.
>

Reviewed-by: Ben Gardon <bgardon@google.com>

> Signed-off-by: David Matlack <dmatlack@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index e2306a39526a..99ad7cc8683f 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6038,18 +6038,25 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
>         return need_tlb_flush;
>  }
>
> +static void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
> +                                          const struct kvm_memory_slot *slot)
> +{
> +       bool flush;
> +
> +       flush = slot_handle_level(kvm, slot, kvm_mmu_zap_collapsible_spte,
> +                                 PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL, true);

The max level here only needs to be 2M since a 1G page wouldn't be
split. I think the upper limit can be lowered to
KVM_MAX_HUGEPAGE_LEVEL - 1.
Not a significant performance difference though.

> +
> +       if (flush)
> +               kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
> +
> +}
> +
>  void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
>                                    const struct kvm_memory_slot *slot)
>  {
>         if (kvm_memslots_have_rmaps(kvm)) {
>                 write_lock(&kvm->mmu_lock);
> -               /*
> -                * Zap only 4k SPTEs since the legacy MMU only supports dirty
> -                * logging at a 4k granularity and never creates collapsible
> -                * 2m SPTEs during dirty logging.
> -                */
> -               if (slot_handle_level_4k(kvm, slot, kvm_mmu_zap_collapsible_spte, true))
> -                       kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
> +               kvm_rmap_zap_collapsible_sptes(kvm, slot);
>                 write_unlock(&kvm->mmu_lock);
>         }
>
> --
> 2.35.0.rc2.247.g8bbb082509-goog
>
David Matlack March 3, 2022, 7:42 p.m. UTC | #2
On Mon, Feb 28, 2022 at 12:40 PM Ben Gardon <bgardon@google.com> wrote:
>
> On Wed, Feb 2, 2022 at 5:02 PM David Matlack <dmatlack@google.com> wrote:
> >
> > Currently KVM only zaps collapsible 4KiB SPTEs in the shadow MMU (i.e.
> > in the rmap). This leads to correct behavior because KVM never creates
> > intermediate huge pages during dirty logging. For example, a 1GiB page
> > is never partially split to a 2MiB page.
> >
> > However, this behavior will stop being correct once the shadow MMU
> > participates in eager page splitting, which can in fact leave behind
> > partially split huge pages. In preparation for that change, make the
> > shadow MMU iterate over all levels when zapping collapsible SPTEs.
> >
> > No functional change intended.
> >
>
> Reviewed-by: Ben Gardon <bgardon@google.com>
>
> > Signed-off-by: David Matlack <dmatlack@google.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 21 ++++++++++++++-------
> >  1 file changed, 14 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index e2306a39526a..99ad7cc8683f 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -6038,18 +6038,25 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
> >         return need_tlb_flush;
> >  }
> >
> > +static void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
> > +                                          const struct kvm_memory_slot *slot)
> > +{
> > +       bool flush;
> > +
> > +       flush = slot_handle_level(kvm, slot, kvm_mmu_zap_collapsible_spte,
> > +                                 PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL, true);
>
> The max level here only needs to be 2M since a 1G page wouldn't be
> split. I think the upper limit can be lowered to
> KVM_MAX_HUGEPAGE_LEVEL - 1.
> Not a significant performance difference though.

Good point. There's no reason to look at huge pages that are already
mapped at the maximum possible level.

>
> > +
> > +       if (flush)
> > +               kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
> > +
> > +}
> > +
> >  void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
> >                                    const struct kvm_memory_slot *slot)
> >  {
> >         if (kvm_memslots_have_rmaps(kvm)) {
> >                 write_lock(&kvm->mmu_lock);
> > -               /*
> > -                * Zap only 4k SPTEs since the legacy MMU only supports dirty
> > -                * logging at a 4k granularity and never creates collapsible
> > -                * 2m SPTEs during dirty logging.
> > -                */
> > -               if (slot_handle_level_4k(kvm, slot, kvm_mmu_zap_collapsible_spte, true))
> > -                       kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
> > +               kvm_rmap_zap_collapsible_sptes(kvm, slot);
> >                 write_unlock(&kvm->mmu_lock);
> >         }
> >
> > --
> > 2.35.0.rc2.247.g8bbb082509-goog
> >
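
For reference, the narrowing discussed above would amount to lowering the
upper bound of the walk by one level: on x86, KVM_MAX_HUGEPAGE_LEVEL - 1
is the 2MiB level, and an SPTE already mapped at the largest huge-page
size has nothing bigger to collapse into. A sketch of the adjusted call
(illustrative only, not taken from a later revision of the series):

	/*
	 * SPTEs already mapped at the largest huge-page size cannot be
	 * collapsed into anything bigger, so the walk can stop one level
	 * below KVM_MAX_HUGEPAGE_LEVEL (2MiB on x86).
	 */
	flush = slot_handle_level(kvm, slot, kvm_mmu_zap_collapsible_spte,
				  PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL - 1, true);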

Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e2306a39526a..99ad7cc8683f 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6038,18 +6038,25 @@  static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 	return need_tlb_flush;
 }
 
+static void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm,
+					   const struct kvm_memory_slot *slot)
+{
+	bool flush;
+
+	flush = slot_handle_level(kvm, slot, kvm_mmu_zap_collapsible_spte,
+				  PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL, true);
+
+	if (flush)
+		kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
+
+}
+
 void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
 				   const struct kvm_memory_slot *slot)
 {
 	if (kvm_memslots_have_rmaps(kvm)) {
 		write_lock(&kvm->mmu_lock);
-		/*
-		 * Zap only 4k SPTEs since the legacy MMU only supports dirty
-		 * logging at a 4k granularity and never creates collapsible
-		 * 2m SPTEs during dirty logging.
-		 */
-		if (slot_handle_level_4k(kvm, slot, kvm_mmu_zap_collapsible_spte, true))
-			kvm_arch_flush_remote_tlbs_memslot(kvm, slot);
+		kvm_rmap_zap_collapsible_sptes(kvm, slot);
 		write_unlock(&kvm->mmu_lock);
 	}