Message ID | 20220113233020.3986005-5-dmatlack@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: x86/mmu: Fix write-protection bug in the TDP MMU | expand |
On Thu, Jan 13, 2022, David Matlack wrote: > Rewrite the comment in kvm_mmu_slot_remove_write_access() that explains > why it is safe to flush TLBs outside of the MMU lock after > write-protecting SPTEs for dirty logging. The current comment is a long > run-on sentence that was difficult to understand. In addition it was > specific to the shadow MMU (mentioning mmu_spte_update()) when the TDP > MMU has to handle this as well. > > The new comment explains: > - Why the TLB flush is necessary at all. > - Why it is desirable to do the TLB flush outside of the MMU lock. > - Why it is safe to do the TLB flush outside of the MMU lock. > > No functional change intended. > > Signed-off-by: David Matlack <dmatlack@google.com> One nit below, Reviewed-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/mmu/mmu.c | 31 ++++++++++++++++++++++--------- > 1 file changed, 22 insertions(+), 9 deletions(-) > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c > index 1d275e9d76b5..8ed2b42a7aa3 100644 > --- a/arch/x86/kvm/mmu/mmu.c > +++ b/arch/x86/kvm/mmu/mmu.c > @@ -5756,6 +5756,7 @@ static bool __kvm_zap_rmaps(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) > continue; > > flush = slot_handle_level_range(kvm, memslot, kvm_zap_rmapp, > + > PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL, > start, end - 1, true, flush); > } > @@ -5825,15 +5826,27 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, > } > > /* > - * We can flush all the TLBs out of the mmu lock without TLB > - * corruption since we just change the spte from writable to > - * readonly so that we only need to care the case of changing > - * spte from present to present (changing the spte from present > - * to nonpresent will flush all the TLBs immediately), in other > - * words, the only case we care is mmu_spte_update() where we > - * have checked Host-writable | MMU-writable instead of > - * PT_WRITABLE_MASK, that means it does not depend on PT_WRITABLE_MASK > - * anymore. > + * Flush TLBs if any SPTEs had to be write-protected to ensure that > + * guest writes are reflected in the dirty bitmap before the memslot > + * update completes, i.e. before enabling dirty logging is visible to > + * userspace. > + * > + * Perform the TLB flush outside the mmu_lock to reduce the amount of > + * time the lock is held. However, this does mean that another CPU can > + * now grab the mmu_lock and encounter an SPTE that is write-protected > + * while CPUs still have writable versions of that SPTE in their TLB. Uber nit on "SPTE in their TLB". Maybe this? * now grab mmu_lock and encounter a write-protected SPTE while CPUs * still have a writable mapping for the associated GFN in their TLB. > + * > + * This is safe but requires KVM to be careful when making decisions > + * based on the write-protection status of an SPTE. Specifically, KVM > + * also write-protects SPTEs to monitor changes to guest page tables > + * during shadow paging, and must guarantee no CPUs can write to those > + * page before the lock is dropped. As mentioned in the previous > + * paragraph, a write-protected SPTE is no guarantee that CPU cannot > + * perform writes. So to determine if a TLB flush is truly required, KVM > + * will clear a separate software-only bit (MMU-writable) and skip the > + * flush if-and-only-if this bit was already clear. > + * > + * See DEFAULT_SPTE_MMU_WRITEABLE for more details. > */ > if (flush) > kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); > -- > 2.34.1.703.g22d0c6ccf7-goog >
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 1d275e9d76b5..8ed2b42a7aa3 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5756,6 +5756,7 @@ static bool __kvm_zap_rmaps(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) continue; flush = slot_handle_level_range(kvm, memslot, kvm_zap_rmapp, + PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL, start, end - 1, true, flush); } @@ -5825,15 +5826,27 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, } /* - * We can flush all the TLBs out of the mmu lock without TLB - * corruption since we just change the spte from writable to - * readonly so that we only need to care the case of changing - * spte from present to present (changing the spte from present - * to nonpresent will flush all the TLBs immediately), in other - * words, the only case we care is mmu_spte_update() where we - * have checked Host-writable | MMU-writable instead of - * PT_WRITABLE_MASK, that means it does not depend on PT_WRITABLE_MASK - * anymore. + * Flush TLBs if any SPTEs had to be write-protected to ensure that + * guest writes are reflected in the dirty bitmap before the memslot + * update completes, i.e. before enabling dirty logging is visible to + * userspace. + * + * Perform the TLB flush outside the mmu_lock to reduce the amount of + * time the lock is held. However, this does mean that another CPU can + * now grab the mmu_lock and encounter an SPTE that is write-protected + * while CPUs still have writable versions of that SPTE in their TLB. + * + * This is safe but requires KVM to be careful when making decisions + * based on the write-protection status of an SPTE. Specifically, KVM + * also write-protects SPTEs to monitor changes to guest page tables + * during shadow paging, and must guarantee no CPUs can write to those + * page before the lock is dropped. As mentioned in the previous + * paragraph, a write-protected SPTE is no guarantee that CPU cannot + * perform writes. So to determine if a TLB flush is truly required, KVM + * will clear a separate software-only bit (MMU-writable) and skip the + * flush if-and-only-if this bit was already clear. + * + * See DEFAULT_SPTE_MMU_WRITEABLE for more details. */ if (flush) kvm_arch_flush_remote_tlbs_memslot(kvm, memslot);
Rewrite the comment in kvm_mmu_slot_remove_write_access() that explains why it is safe to flush TLBs outside of the MMU lock after write-protecting SPTEs for dirty logging. The current comment is a long run-on sentence that was difficult to understand. In addition it was specific to the shadow MMU (mentioning mmu_spte_update()) when the TDP MMU has to handle this as well. The new comment explains: - Why the TLB flush is necessary at all. - Why it is desirable to do the TLB flush outside of the MMU lock. - Why it is safe to do the TLB flush outside of the MMU lock. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> --- arch/x86/kvm/mmu/mmu.c | 31 ++++++++++++++++++++++--------- 1 file changed, 22 insertions(+), 9 deletions(-)