diff mbox series

[1/2] KVM: x86/mmu: Don't rely on page-track mechanism to flush on memslot change

Message ID 20221110014821.1548347-2-seanjc@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86/mmu: Use page-track only for... page tracking | expand

Commit Message

Sean Christopherson Nov. 10, 2022, 1:48 a.m. UTC
Call kvm_mmu_zap_all_fast() directly when flushing a memslot instead of
bounding through the page-track mechanism.  KVM (unfortunately) needs to
zap and flush all page tables on memslot DELETE/MOVE irrespective of
whether KVM is shadowing guest page tables.

This will allow changing KVM to register a page-track notifier on the
first shadow root allocation, and will also allow deleting the misguided
kvm_page_track_flush_slot() hook itself once KVM-GT also moves to a
different method for reacting to memslot changes.

No functional change intended.

Cc: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/mmu/mmu.c          | 10 +---------
 arch/x86/kvm/x86.c              |  2 ++
 3 files changed, 4 insertions(+), 9 deletions(-)

Comments

Yan Zhao Nov. 10, 2022, 2:27 a.m. UTC | #1
On Thu, Nov 10, 2022 at 01:48:20AM +0000, Sean Christopherson wrote:
> Call kvm_mmu_zap_all_fast() directly when flushing a memslot instead of
> bounding through the page-track mechanism.  KVM (unfortunately) needs to
> zap and flush all page tables on memslot DELETE/MOVE irrespective of
> whether KVM is shadowing guest page tables.
> 
> This will allow changing KVM to register a page-track notifier on the
> first shadow root allocation, and will also allow deleting the misguided
> kvm_page_track_flush_slot() hook itself once KVM-GT also moves to a
> different method for reacting to memslot changes.
>
<...>
> @@ -6021,7 +6014,6 @@ int kvm_mmu_init_vm(struct kvm *kvm)
>  		return r;
>  
>  	node->track_write = kvm_mmu_pte_write;
> -	node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot;
>  	kvm_page_track_register_notifier(kvm, node);
>  
>  	kvm->arch.split_page_header_cache.kmem_cache = mmu_page_header_cache;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e46e458c5b08..5da86fe3c113 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12550,6 +12550,8 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
>  void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
>  				   struct kvm_memory_slot *slot)
>  {
> +	kvm_mmu_zap_all_fast(kvm);
> +
>  	kvm_page_track_flush_slot(kvm, slot);
Could we move this kvm_page_track_flush_slot() to right before
kvm_commit_memory_region()?
As KVM now does not need track_flush_slot any more and kvmgt is the only user
to track_flush_slot, we can rename it to track_slot_changed to notify
the new/deleted/moved slot.
Do you think it's good?

Thanks
Yan


>  }
>  
> -- 
> 2.38.1.431.g37b22c650d-goog
>
Sean Christopherson Nov. 10, 2022, 5:08 p.m. UTC | #2
On Thu, Nov 10, 2022, Yan Zhao wrote:
> On Thu, Nov 10, 2022 at 01:48:20AM +0000, Sean Christopherson wrote:
> > Call kvm_mmu_zap_all_fast() directly when flushing a memslot instead of
> > bounding through the page-track mechanism.  KVM (unfortunately) needs to
> > zap and flush all page tables on memslot DELETE/MOVE irrespective of
> > whether KVM is shadowing guest page tables.
> > 
> > This will allow changing KVM to register a page-track notifier on the
> > first shadow root allocation, and will also allow deleting the misguided
> > kvm_page_track_flush_slot() hook itself once KVM-GT also moves to a
> > different method for reacting to memslot changes.
> >
> <...>
> > @@ -6021,7 +6014,6 @@ int kvm_mmu_init_vm(struct kvm *kvm)
> >  		return r;
> >  
> >  	node->track_write = kvm_mmu_pte_write;
> > -	node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot;
> >  	kvm_page_track_register_notifier(kvm, node);
> >  
> >  	kvm->arch.split_page_header_cache.kmem_cache = mmu_page_header_cache;
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index e46e458c5b08..5da86fe3c113 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -12550,6 +12550,8 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
> >  void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
> >  				   struct kvm_memory_slot *slot)
> >  {
> > +	kvm_mmu_zap_all_fast(kvm);
> > +
> >  	kvm_page_track_flush_slot(kvm, slot);
> Could we move this kvm_page_track_flush_slot() to right before
> kvm_commit_memory_region()?

More or less.  The page-track stuff is x86-specific, just move it into x86's
kvm_arch_commit_memory_region().

> As KVM now does not need track_flush_slot any more and kvmgt is the only user
> to track_flush_slot, we can rename it to track_slot_changed to notify
> the new/deleted/moved slot.
> Do you think it's good?

Given that KVM/KVM-GT have never propery supported the MOVE case, and (IIUC) that
there's no danger to the kernel if KVM-GT fails to write-protect a moved memslot,
I would say just change the hook to ->remove_memslot().  I.e. even if the memslot
is being moved, simply notify KVM-GT that the old memslot is being removed.

E.g.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5a2821ca03b8..437e3832e377 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12566,6 +12566,9 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
                                const struct kvm_memory_slot *new,
                                enum kvm_mr_change change)
 {
+       if (change == KVM_MR_DELETE || change == KVM_MR_MOVE)
+               kvm_page_track_remove_memslot(kvm, old);
+
        if (!kvm->arch.n_requested_mmu_pages &&
            (change == KVM_MR_CREATE || change == KVM_MR_DELETE)) {
                unsigned long nr_mmu_pages;
Yan Zhao Nov. 11, 2022, 2:18 a.m. UTC | #3
On Thu, Nov 10, 2022 at 05:08:30PM +0000, Sean Christopherson wrote:
> On Thu, Nov 10, 2022, Yan Zhao wrote:
> > On Thu, Nov 10, 2022 at 01:48:20AM +0000, Sean Christopherson wrote:
> > > Call kvm_mmu_zap_all_fast() directly when flushing a memslot instead of
> > > bounding through the page-track mechanism.  KVM (unfortunately) needs to
> > > zap and flush all page tables on memslot DELETE/MOVE irrespective of
> > > whether KVM is shadowing guest page tables.
> > > 
> > > This will allow changing KVM to register a page-track notifier on the
> > > first shadow root allocation, and will also allow deleting the misguided
> > > kvm_page_track_flush_slot() hook itself once KVM-GT also moves to a
> > > different method for reacting to memslot changes.
> > >
> > <...>
> > > @@ -6021,7 +6014,6 @@ int kvm_mmu_init_vm(struct kvm *kvm)
> > >  		return r;
> > >  
> > >  	node->track_write = kvm_mmu_pte_write;
> > > -	node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot;
> > >  	kvm_page_track_register_notifier(kvm, node);
> > >  
> > >  	kvm->arch.split_page_header_cache.kmem_cache = mmu_page_header_cache;
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index e46e458c5b08..5da86fe3c113 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -12550,6 +12550,8 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
> > >  void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
> > >  				   struct kvm_memory_slot *slot)
> > >  {
> > > +	kvm_mmu_zap_all_fast(kvm);
> > > +
> > >  	kvm_page_track_flush_slot(kvm, slot);
> > Could we move this kvm_page_track_flush_slot() to right before
> > kvm_commit_memory_region()?
> 
> More or less.  The page-track stuff is x86-specific, just move it into x86's
> kvm_arch_commit_memory_region().
> 
> > As KVM now does not need track_flush_slot any more and kvmgt is the only user
> > to track_flush_slot, we can rename it to track_slot_changed to notify
> > the new/deleted/moved slot.
> > Do you think it's good?
> 
> Given that KVM/KVM-GT have never propery supported the MOVE case, and (IIUC) that
> there's no danger to the kernel if KVM-GT fails to write-protect a moved memslot,
> I would say just change the hook to ->remove_memslot().  I.e. even if the memslot
> is being moved, simply notify KVM-GT that the old memslot is being removed.
>
I think it should be good.
We can refine the support of MOVE later after it really happens.
Will do it by following your suggestions and based on this series :)

Thanks
Yan

> E.g.
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 5a2821ca03b8..437e3832e377 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12566,6 +12566,9 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>                                 const struct kvm_memory_slot *new,
>                                 enum kvm_mr_change change)
>  {
> +       if (change == KVM_MR_DELETE || change == KVM_MR_MOVE)
> +               kvm_page_track_remove_memslot(kvm, old);
> +
>         if (!kvm->arch.n_requested_mmu_pages &&
>             (change == KVM_MR_CREATE || change == KVM_MR_DELETE)) {
>                 unsigned long nr_mmu_pages;
diff mbox series

Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 81114a376c4e..382cfffb7e6c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1765,6 +1765,7 @@  void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
 void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,
 				   const struct kvm_memory_slot *memslot);
 void kvm_mmu_zap_all(struct kvm *kvm);
+void kvm_mmu_zap_all_fast(struct kvm *kvm);
 void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen);
 void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned long kvm_nr_mmu_pages);
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 93c389eaf471..0a5ae07a190e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5943,7 +5943,7 @@  static void kvm_zap_obsolete_pages(struct kvm *kvm)
  * not use any resource of the being-deleted slot or all slots
  * after calling the function.
  */
-static void kvm_mmu_zap_all_fast(struct kvm *kvm)
+void kvm_mmu_zap_all_fast(struct kvm *kvm)
 {
 	lockdep_assert_held(&kvm->slots_lock);
 
@@ -5999,13 +5999,6 @@  static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm)
 	return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages));
 }
 
-static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
-			struct kvm_memory_slot *slot,
-			struct kvm_page_track_notifier_node *node)
-{
-	kvm_mmu_zap_all_fast(kvm);
-}
-
 int kvm_mmu_init_vm(struct kvm *kvm)
 {
 	struct kvm_page_track_notifier_node *node = &kvm->arch.mmu_sp_tracker;
@@ -6021,7 +6014,6 @@  int kvm_mmu_init_vm(struct kvm *kvm)
 		return r;
 
 	node->track_write = kvm_mmu_pte_write;
-	node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot;
 	kvm_page_track_register_notifier(kvm, node);
 
 	kvm->arch.split_page_header_cache.kmem_cache = mmu_page_header_cache;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e46e458c5b08..5da86fe3c113 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12550,6 +12550,8 @@  void kvm_arch_flush_shadow_all(struct kvm *kvm)
 void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
 				   struct kvm_memory_slot *slot)
 {
+	kvm_mmu_zap_all_fast(kvm);
+
 	kvm_page_track_flush_slot(kvm, slot);
 }