
[v4,03/30] KVM: x86/mmu: Formalize TDP MMU's (unintended?) deferred TLB flush logic

Message ID 20220303193842.370645-4-pbonzini@redhat.com
State New, archived
Series KVM: x86/mmu: Overhaul TDP MMU zapping and flushing

Commit Message

Paolo Bonzini March 3, 2022, 7:38 p.m. UTC
From: Sean Christopherson <seanjc@google.com>

Explicitly ignore the result of zap_gfn_range() when putting the last
reference to a TDP MMU root, and add a pile of comments to formalize the
TDP MMU's behavior of deferring TLB flushes to alloc/reuse.  Note, this
only affects the !shared case, as zap_gfn_range() subtly never returns
true for "flush" as the flush is handled by tdp_mmu_zap_spte_atomic().

Putting the root without a flush is ok because even if there are stale
references to the root in the TLB, they are unreachable because KVM will
not run the guest with the same ASID without first flushing (where ASID
in this context refers to both SVM's explicit ASID and Intel's implicit
ASID that is constructed from VPID+PCID+EPT4A+etc...).

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220226001546.360188-5-seanjc@google.com>
Reviewed-by: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/mmu/mmu.c     |  8 ++++++++
 arch/x86/kvm/mmu/tdp_mmu.c | 10 +++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)
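
The invariant the commit message describes can be condensed into a
freestanding C sketch.  Everything below (struct root, put_root(),
load_root(), run_guest()) is a hypothetical model, not KVM code; the
point is only that every path that makes a root current flushes first,
so a root freed without a flush is unreachable through the TLB.

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct root {
	int id;
	bool freed;
};

static struct root *current_root;	/* root the "guest" runs on */
static struct root *tlb_tagged_root;	/* root stale TLB entries belong to */

/* Model of kvm_tdp_mmu_put_root(): free the root, deliberately no flush. */
static void put_root(struct root *r)
{
	r->freed = true;	/* the TLB may still hold entries for r */
}

/* Model of kvm_mmu_load(): making a root current always flushes first. */
static void load_root(struct root *r)
{
	current_root = r;
	tlb_tagged_root = r;	/* flush_tlb_current(): stale entries dropped */
}

/* Model of VM-entry: the guest must never hit a freed root via the TLB. */
static void run_guest(void)
{
	assert(current_root == tlb_tagged_root);
	assert(!tlb_tagged_root->freed);
	printf("running on root %d\n", current_root->id);
}

int main(void)
{
	struct root a = { .id = 1 }, b = { .id = 2 };

	load_root(&a);
	run_guest();
	put_root(&a);	/* no flush here ... */
	load_root(&b);	/* ... because loading the next root flushes */
	run_guest();	/* the freed root is never observed */
	return 0;
}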

Comments

Mingwei Zhang March 3, 2022, 11:39 p.m. UTC | #1
On Thu, Mar 03, 2022, Paolo Bonzini wrote:
> [...]
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 32c041ed65cb..9a6df2d02777 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5083,6 +5083,14 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
>  	kvm_mmu_sync_roots(vcpu);
>  
>  	kvm_mmu_load_pgd(vcpu);
> +
> +	/*
> +	 * Flush any TLB entries for the new root, the provenance of the root
> +	 * is unknown.  Even if KVM ensures there are no stale TLB entries
> +	 * for a freed root, in theory another hypervisor could have left
> +	 * stale entries.  Flushing on alloc also allows KVM to skip the TLB
> +	 * flush when freeing a root (see kvm_tdp_mmu_put_root()).
> +	 */
>  	static_call(kvm_x86_flush_tlb_current)(vcpu);
>  out:
>  	return r;
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index b97a4125feac..921fa386df99 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -93,7 +93,15 @@ void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
>  	list_del_rcu(&root->link);
>  	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
>  
> -	zap_gfn_range(kvm, root, 0, -1ull, false, false, shared);
> +	/*
> +	 * A TLB flush is not necessary as KVM performs a local TLB flush when
> +	 * allocating a new root (see kvm_mmu_load()), and when migrating vCPU
> +	 * to a different pCPU.  Note, the local TLB flush on reuse also
> +	 * invalidates any paging-structure-cache entries, i.e. TLB entries for
> +	 * intermediate paging structures, that may be zapped, as such entries
> +	 * are associated with the ASID on both VMX and SVM.
> +	 */
> +	(void)zap_gfn_range(kvm, root, 0, -1ull, false, false, shared);

Discussed offline with Sean. I'm now comfortable with the MMU's style
of keeping multiple 'roots' and leaving the TLB unflushed for
invalidated roots.

One minor improvement to the comment could be:

"A TLB flush is not necessary as each vCPU performs a local TLB flush
when allocating or assigning a new root (see kvm_mmu_load()), and when
migrating to a different pCPU."

The suggested wording may read better: "KVM performs a local TLB flush"
leaves readers wondering why the 'remote' TLB flushes are missing.
(The reworded comment is sketched in full below the quote.)
>  
>  	call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
>  }
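
For reference, the reworded comment would read like this in context (an
illustrative sketch combining the suggestion above with the rest of the
original comment, not a posted follow-up):

	/*
	 * A TLB flush is not necessary as each vCPU performs a local TLB
	 * flush when allocating or assigning a new root (see kvm_mmu_load()),
	 * and when migrating to a different pCPU.  Note, the local TLB flush
	 * on reuse also invalidates any paging-structure-cache entries, i.e.
	 * TLB entries for intermediate paging structures, that may be zapped,
	 * as such entries are associated with the ASID on both VMX and SVM.
	 */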

Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 32c041ed65cb..9a6df2d02777 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5083,6 +5083,14 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 	kvm_mmu_sync_roots(vcpu);
 
 	kvm_mmu_load_pgd(vcpu);
+
+	/*
+	 * Flush any TLB entries for the new root, the provenance of the root
+	 * is unknown.  Even if KVM ensures there are no stale TLB entries
+	 * for a freed root, in theory another hypervisor could have left
+	 * stale entries.  Flushing on alloc also allows KVM to skip the TLB
+	 * flush when freeing a root (see kvm_tdp_mmu_put_root()).
+	 */
 	static_call(kvm_x86_flush_tlb_current)(vcpu);
 out:
 	return r;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index b97a4125feac..921fa386df99 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -93,7 +93,15 @@ void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
 	list_del_rcu(&root->link);
 	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
 
-	zap_gfn_range(kvm, root, 0, -1ull, false, false, shared);
+	/*
+	 * A TLB flush is not necessary as KVM performs a local TLB flush when
+	 * allocating a new root (see kvm_mmu_load()), and when migrating vCPU
+	 * to a different pCPU.  Note, the local TLB flush on reuse also
+	 * invalidates any paging-structure-cache entries, i.e. TLB entries for
+	 * intermediate paging structures, that may be zapped, as such entries
+	 * are associated with the ASID on both VMX and SVM.
+	 */
+	(void)zap_gfn_range(kvm, root, 0, -1ull, false, false, shared);
 
 	call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
 }