
[v2,4/4] KVM: x86/mmu: Improve TLB flush comment in kvm_mmu_slot_remove_write_access()

Message ID 20220113233020.3986005-5-dmatlack@google.com (mailing list archive)
State New, archived
Series KVM: x86/mmu: Fix write-protection bug in the TDP MMU

Commit Message

David Matlack Jan. 13, 2022, 11:30 p.m. UTC
Rewrite the comment in kvm_mmu_slot_remove_write_access() that explains
why it is safe to flush TLBs outside of the MMU lock after
write-protecting SPTEs for dirty logging. The current comment is a long
run-on sentence that is difficult to understand. In addition, it is
specific to the shadow MMU (mentioning mmu_spte_update()) even though
the TDP MMU must handle this case as well.

The new comment explains:
 - Why the TLB flush is necessary at all.
 - Why it is desirable to do the TLB flush outside of the MMU lock.
 - Why it is safe to do the TLB flush outside of the MMU lock.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

Comments

Sean Christopherson Jan. 14, 2022, 11:58 p.m. UTC | #1
On Thu, Jan 13, 2022, David Matlack wrote:
> Rewrite the comment in kvm_mmu_slot_remove_write_access() that explains
> why it is safe to flush TLBs outside of the MMU lock after
> write-protecting SPTEs for dirty logging. The current comment is a long
> run-on sentence that is difficult to understand. In addition, it is
> specific to the shadow MMU (mentioning mmu_spte_update()) even though
> the TDP MMU must handle this case as well.
> 
> The new comment explains:
>  - Why the TLB flush is necessary at all.
>  - Why it is desirable to do the TLB flush outside of the MMU lock.
>  - Why it is safe to do the TLB flush outside of the MMU lock.
> 
> No functional change intended.
> 
> Signed-off-by: David Matlack <dmatlack@google.com>

One nit below,

Reviewed-by: Sean Christopherson <seanjc@google.com>

> ---
>  arch/x86/kvm/mmu/mmu.c | 31 ++++++++++++++++++++++---------
>  1 file changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 1d275e9d76b5..8ed2b42a7aa3 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5756,6 +5756,7 @@ static bool __kvm_zap_rmaps(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
>  				continue;
>  
>  			flush = slot_handle_level_range(kvm, memslot, kvm_zap_rmapp,
> +
>  							PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL,
>  							start, end - 1, true, flush);
>  		}
> @@ -5825,15 +5826,27 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
>  	}
>  
>  	/*
> -	 * We can flush all the TLBs out of the mmu lock without TLB
> -	 * corruption since we just change the spte from writable to
> -	 * readonly so that we only need to care the case of changing
> -	 * spte from present to present (changing the spte from present
> -	 * to nonpresent will flush all the TLBs immediately), in other
> -	 * words, the only case we care is mmu_spte_update() where we
> -	 * have checked Host-writable | MMU-writable instead of
> -	 * PT_WRITABLE_MASK, that means it does not depend on PT_WRITABLE_MASK
> -	 * anymore.
> +	 * Flush TLBs if any SPTEs had to be write-protected to ensure that
> +	 * guest writes are reflected in the dirty bitmap before the memslot
> +	 * update completes, i.e. before enabling dirty logging is visible to
> +	 * userspace.
> +	 *
> +	 * Perform the TLB flush outside the mmu_lock to reduce the amount of
> +	 * time the lock is held. However, this does mean that another CPU can
> +	 * now grab the mmu_lock and encounter an SPTE that is write-protected
> +	 * while CPUs still have writable versions of that SPTE in their TLB.

Uber nit on "SPTE in their TLB".  Maybe this?

	 * now grab mmu_lock and encounter a write-protected SPTE while CPUs
	 * still have a writable mapping for the associated GFN in their TLB.

> +	 *
> +	 * This is safe but requires KVM to be careful when making decisions
> +	 * based on the write-protection status of an SPTE. Specifically, KVM
> +	 * also write-protects SPTEs to monitor changes to guest page tables
> +	 * during shadow paging, and must guarantee no CPUs can write to those
> +	 * pages before the lock is dropped. As mentioned in the previous
> +	 * paragraph, a write-protected SPTE is no guarantee that a CPU cannot
> +	 * perform writes. So to determine if a TLB flush is truly required, KVM
> +	 * will clear a separate software-only bit (MMU-writable) and skip the
> +	 * flush if-and-only-if this bit was already clear.
> +	 *
> +	 * See DEFAULT_SPTE_MMU_WRITEABLE for more details.
>  	 */
>  	if (flush)
>  		kvm_arch_flush_remote_tlbs_memslot(kvm, memslot);
> -- 
> 2.34.1.703.g22d0c6ccf7-goog
>

Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 1d275e9d76b5..8ed2b42a7aa3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5756,6 +5756,7 @@  static bool __kvm_zap_rmaps(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 				continue;
 
 			flush = slot_handle_level_range(kvm, memslot, kvm_zap_rmapp,
+
 							PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL,
 							start, end - 1, true, flush);
 		}
@@ -5825,15 +5826,27 @@  void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
 	}
 
 	/*
-	 * We can flush all the TLBs out of the mmu lock without TLB
-	 * corruption since we just change the spte from writable to
-	 * readonly so that we only need to care the case of changing
-	 * spte from present to present (changing the spte from present
-	 * to nonpresent will flush all the TLBs immediately), in other
-	 * words, the only case we care is mmu_spte_update() where we
-	 * have checked Host-writable | MMU-writable instead of
-	 * PT_WRITABLE_MASK, that means it does not depend on PT_WRITABLE_MASK
-	 * anymore.
+	 * Flush TLBs if any SPTEs had to be write-protected to ensure that
+	 * guest writes are reflected in the dirty bitmap before the memslot
+	 * update completes, i.e. before enabling dirty logging is visible to
+	 * userspace.
+	 *
+	 * Perform the TLB flush outside the mmu_lock to reduce the amount of
+	 * time the lock is held. However, this does mean that another CPU can
+	 * now grab the mmu_lock and encounter an SPTE that is write-protected
+	 * while CPUs still have writable versions of that SPTE in their TLB.
+	 *
+	 * This is safe but requires KVM to be careful when making decisions
+	 * based on the write-protection status of an SPTE. Specifically, KVM
+	 * also write-protects SPTEs to monitor changes to guest page tables
+	 * during shadow paging, and must guarantee no CPUs can write to those
+	 * pages before the lock is dropped. As mentioned in the previous
+	 * paragraph, a write-protected SPTE is no guarantee that a CPU cannot
+	 * perform writes. So to determine if a TLB flush is truly required, KVM
+	 * will clear a separate software-only bit (MMU-writable) and skip the
+	 * flush if-and-only-if this bit was already clear.
+	 *
+	 * See DEFAULT_SPTE_MMU_WRITEABLE for more details.
 	 */
 	if (flush)
 		kvm_arch_flush_remote_tlbs_memslot(kvm, memslot);