Message ID | 20220429210025.3293691-7-seanjc@google.com (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: Fix mmu_notifier vs. pfncache vs. pfncache races |
On 4/29/22 23:00, Sean Christopherson wrote:
> diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
> index 05cb0bcbf662..eaef31462bbe 100644
> --- a/virt/kvm/pfncache.c
> +++ b/virt/kvm/pfncache.c
> @@ -157,6 +157,13 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
>  	if (page_offset + len > PAGE_SIZE)
>  		return -EINVAL;
>  
> +	/*
> +	 * If another task is refreshing the cache, wait for it to complete.
> +	 * There is no guarantee that concurrent refreshes will see the same
> +	 * gpa, memslots generation, etc..., so they must be fully serialized.
> +	 */
> +	mutex_lock(&gpc->refresh_lock);
> +
>  	write_lock_irq(&gpc->lock);
>  
>  	old_pfn = gpc->pfn;
> @@ -248,6 +255,8 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
>  out:
>  	write_unlock_irq(&gpc->lock);
>  
> +	mutex_unlock(&gpc->refresh_lock);
> +
>  	gpc_release_pfn_and_khva(kvm, old_pfn, old_khva);
>  
>  	return ret;

Does kvm_gfn_to_pfn_cache_unmap also need to take the mutex, to avoid the
WARN_ON(gpc->valid)?

Paolo
On Fri, May 20, 2022, Paolo Bonzini wrote:
> On 4/29/22 23:00, Sean Christopherson wrote:
> > diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
> > index 05cb0bcbf662..eaef31462bbe 100644
> > --- a/virt/kvm/pfncache.c
> > +++ b/virt/kvm/pfncache.c
> > @@ -157,6 +157,13 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
> >  	if (page_offset + len > PAGE_SIZE)
> >  		return -EINVAL;
> >  
> > +	/*
> > +	 * If another task is refreshing the cache, wait for it to complete.
> > +	 * There is no guarantee that concurrent refreshes will see the same
> > +	 * gpa, memslots generation, etc..., so they must be fully serialized.
> > +	 */
> > +	mutex_lock(&gpc->refresh_lock);
> > +
> >  	write_lock_irq(&gpc->lock);
> >  
> >  	old_pfn = gpc->pfn;
> > @@ -248,6 +255,8 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
> >  out:
> >  	write_unlock_irq(&gpc->lock);
> >  
> > +	mutex_unlock(&gpc->refresh_lock);
> > +
> >  	gpc_release_pfn_and_khva(kvm, old_pfn, old_khva);
> >  
> >  	return ret;
> 
> Does kvm_gfn_to_pfn_cache_unmap also need to take the mutex, to avoid the
> WARN_ON(gpc->valid)?

I don't know what WARN_ON() you're referring to, but there is a double-free
bug if unmap() runs during an invalidation.  That can be solved without
having to take the mutex though, just reset valid/pfn/khva before the retry.
When searching to see how unmap() was used in the original series (there's
no other user besides destroy...), I stumbled across this likely-related
syzbot bug that unfortunately didn't Cc KVM :-(

https://lore.kernel.org/all/00000000000073f09205db439577@google.com

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 72eee096a7cd..1719b0249dbc 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -228,6 +228,11 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 	if (!old_valid || old_uhva != gpc->uhva) {
 		void *new_khva = NULL;
 
+		/* blah blah blah */
+		gpc->valid = false;
+		gpc->pfn = KVM_PFN_ERR_FAULT;
+		gpc->khva = NULL;
+
 		new_pfn = hva_to_pfn_retry(kvm, gpc);
 		if (is_error_noslot_pfn(new_pfn)) {
 			ret = -EFAULT;
@@ -251,11 +256,7 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 		/* Nothing more to do, the pfn is consumed only by the guest. */
 	}
 
-	if (ret) {
-		gpc->valid = false;
-		gpc->pfn = KVM_PFN_ERR_FAULT;
-		gpc->khva = NULL;
-	} else {
+	if (!ret) {
 		gpc->valid = true;
 		gpc->pfn = new_pfn;
 		gpc->khva = new_khva;
@@ -283,6 +284,11 @@ void kvm_gfn_to_pfn_cache_unmap(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 
 	write_lock_irq(&gpc->lock);
 
+	if (!gpc->valid) {
+		write_unlock_irq(&gpc->lock);
+		return;
+	}
+
 	gpc->valid = false;
 
 	old_khva = gpc->khva - offset_in_page(gpc->khva);
On 5/20/22 17:53, Sean Christopherson wrote:
> > Does kvm_gfn_to_pfn_cache_unmap also need to take the mutex, to avoid the
> > WARN_ON(gpc->valid)?
> 
> I don't know what WARN_ON() you're referring to, but there is a double-free
> bug if unmap() runs during an invalidation.  That can be solved without
> having to take the mutex though, just reset valid/pfn/khva before the retry.

I was thinking of this one:

	/*
	 * Other tasks must wait for _this_ refresh to complete before
	 * attempting to refresh.
	 */
	WARN_ON_ONCE(gpc->valid);

but unmap sets gpc->valid to false, not true. ಠ_ಠ

Still, as you point out unmap() and refresh() can easily clash.  In practice
they probably exclude each other by different means (e.g. running in a single
vCPU thread), but then in practice neither is a fast path.  It seems easier
to just make them thread-safe the easy way now that there is a mutex.

> When searching to see how unmap() was used in the original series (there's
> no other user besides destroy...), I stumbled across this likely-related
> syzbot bug that unfortunately didn't Cc KVM :-(

To give an example, VMCLEAR would do an unmap() if the VMCS12 was mapped
with a gfn-to-pfn cache.

Paolo
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index ac1ebb37a0ff..f328a01db4fe 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -19,6 +19,7 @@ struct kvm_memslots;
 enum kvm_mr_change;
 
 #include <linux/bits.h>
+#include <linux/mutex.h>
 #include <linux/types.h>
 #include <linux/spinlock_types.h>
 
@@ -69,6 +70,7 @@ struct gfn_to_pfn_cache {
 	struct kvm_vcpu *vcpu;
 	struct list_head list;
 	rwlock_t lock;
+	struct mutex refresh_lock;
 	void *khva;
 	kvm_pfn_t pfn;
 	enum pfn_cache_usage usage;
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 05cb0bcbf662..eaef31462bbe 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -157,6 +157,13 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 	if (page_offset + len > PAGE_SIZE)
 		return -EINVAL;
 
+	/*
+	 * If another task is refreshing the cache, wait for it to complete.
+	 * There is no guarantee that concurrent refreshes will see the same
+	 * gpa, memslots generation, etc..., so they must be fully serialized.
+	 */
+	mutex_lock(&gpc->refresh_lock);
+
 	write_lock_irq(&gpc->lock);
 
 	old_pfn = gpc->pfn;
@@ -248,6 +255,8 @@ int kvm_gfn_to_pfn_cache_refresh(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 out:
 	write_unlock_irq(&gpc->lock);
 
+	mutex_unlock(&gpc->refresh_lock);
+
 	gpc_release_pfn_and_khva(kvm, old_pfn, old_khva);
 
 	return ret;
@@ -288,6 +297,7 @@ int kvm_gfn_to_pfn_cache_init(struct kvm *kvm, struct gfn_to_pfn_cache *gpc,
 
 	if (!gpc->active) {
 		rwlock_init(&gpc->lock);
+		mutex_init(&gpc->refresh_lock);
 		gpc->khva = NULL;
 		gpc->pfn = KVM_PFN_ERR_FAULT;
Protect gfn=>pfn cache refresh with a mutex to fully serialize refreshes.
The refresh logic doesn't protect against concurrent refreshes with
different GPAs (which may or may not be a desired use case, but it's
allowed in the code), nor does it protect against a false negative on the
memslot generation.  If the first refresh sees a stale memslot generation,
it will refresh the hva and generation before moving on to the hva=>pfn
translation.  If it then drops gpc->lock, a different user of the cache
can come along, acquire gpc->lock, see that the memslot generation is
fresh, and skip the hva=>pfn update due to the userspace address also
matching (because it too was updated).

The refresh path can already sleep during hva=>pfn resolution, so wrap
the refresh with a mutex to ensure that any given refresh runs to
completion before other callers can start their refresh.

Cc: stable@vger.kernel.org
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 include/linux/kvm_types.h |  2 ++
 virt/kvm/pfncache.c       | 10 ++++++++++
 2 files changed, 12 insertions(+)