| Message ID | 20240115125707.1183-19-paul@xen.org (mailing list archive) |
|---|---|
| State | New, archived |
| Series | KVM: xen: update shared_info and vcpu_info handling |
On Mon, Jan 15, 2024, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
>
> Taking a write lock on a pfncache will be disruptive if the cache is

*Unnecessarily* taking a write lock. Please save readers a bit of brain power
and explain that this is beneficial when there are _unrelated_ invalidations.

> heavily used (which only requires a read lock). Hence, in the MMU notifier
> callback, take read locks on caches to check for a match; only taking a
> write lock to actually perform an invalidation (after another check).

This doesn't have any dependency on this series, does it? I.e. this should be
posted separately, and preferably with some performance data. Not having data
isn't a sticking point, but it would be nice to verify that this isn't a
pointless optimization.
On Tue, 2024-02-06 at 20:22 -0800, Sean Christopherson wrote:
> On Mon, Jan 15, 2024, Paul Durrant wrote:
> > From: Paul Durrant <pdurrant@amazon.com>
> >
> > Taking a write lock on a pfncache will be disruptive if the cache is
>
> *Unnecessarily* taking a write lock.

No. Taking a write lock will be disrupting.

Unnecessarily taking a write lock will be unnecessarily disrupting.

Taking a write lock on a Thursday will be disrupting on a Thursday.

But the key is that if the cache is heavily used, the user gets
disrupted.

> Please save readers a bit of brain power
> and explain that this is beneficial when there are _unrelated_ invalidations.

I don't understand what you're saying there. Paul's sentence did have
an implicit "...so do that less, then", but that didn't take much brain
power to infer.

> > heavily used (which only requires a read lock). Hence, in the MMU notifier
> > callback, take read locks on caches to check for a match; only taking a
> > write lock to actually perform an invalidation (after another check).
>
> This doesn't have any dependency on this series, does it? I.e. this should be
> posted separately, and preferably with some performance data. Not having data
> isn't a sticking point, but it would be nice to verify that this isn't a
> pointless optimization.

No fundamental dependency, no. But it was triggered by the previous patch,
which makes kvm_xen_set_evtchn_fast() use read_trylock() and makes it take
the slow path when there's contention. It lives here just fine as part of
the series.
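For context, the fast path David refers to follows the read_trylock()-and-defer pattern sketched below. This is an illustration only, not the actual kvm_xen_set_evtchn_fast() code: the helper name and the work done under the lock are hypothetical, and it ignores the irq-safe locking that the real pfncache code uses (see the `_irq` variants in the diff at the end of the thread).

```c
#include <linux/kvm_host.h>

/*
 * Hypothetical helper showing the pattern under discussion: if the
 * pfncache's read lock is contended (e.g. because an invalidation
 * holds it for write), bail out immediately and let the caller defer
 * the work to a slow path instead of waiting in the fast path.
 */
static bool evtchn_deliver_fast(struct gfn_to_pfn_cache *gpc)
{
	if (!read_trylock(&gpc->lock))
		return false;	/* contended: caller falls back to the slow path */

	/* ... deliver the event via the cached mapping ... */

	read_unlock(&gpc->lock);
	return true;
}
```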
On Tue, Feb 06, 2024, David Woodhouse wrote:
> On Tue, 2024-02-06 at 20:22 -0800, Sean Christopherson wrote:
> > On Mon, Jan 15, 2024, Paul Durrant wrote:
> > > From: Paul Durrant <pdurrant@amazon.com>
> > >
> > > Taking a write lock on a pfncache will be disruptive if the cache is
> >
> > *Unnecessarily* taking a write lock.
>
> No. Taking a write lock will be disrupting.
>
> Unnecessarily taking a write lock will be unnecessarily disrupting.
>
> Taking a write lock on a Thursday will be disrupting on a Thursday.
>
> But the key is that if the cache is heavily used, the user gets
> disrupted.

If the invalidation is relevant, then this code is taking gpc->lock for write
no matter what. The purpose of the changelog is to explain _why_ a patch adds
value.

> > Please save readers a bit of brain power
> > and explain that this is beneficial when there are _unrelated_ invalidations.
>
> I don't understand what you're saying there. Paul's sentence did have
> an implicit "...so do that less, then", but that didn't take much brain
> power to infer.

I'm saying this:

  When processing mmu_notifier invalidations for gpc caches, pre-check for
  overlap with the invalidation event while holding gpc->lock for read, and
  only take gpc->lock for write if the cache needs to be invalidated. Doing
  a pre-check without taking gpc->lock for write avoids unnecessarily
  contending the lock for unrelated invalidations, which is very beneficial
  for caches that are heavily used (but rarely subjected to mmu_notifier
  invalidations).

is much friendlier to readers than this:

  Taking a write lock on a pfncache will be disruptive if the cache is
  heavily used (which only requires a read lock). Hence, in the MMU notifier
  callback, take read locks on caches to check for a match; only taking a
  write lock to actually perform an invalidation (after another check).

Is it too much hand-holding, and bordering on stating the obvious? Maybe. But
(a) a lot of people that read mailing lists and KVM code are *not* kernel
experts, and (b) a changelog is written _once_, and read hundreds if not
thousands of times. If we can save each reader even a few seconds, then
taking an extra minute or two to write a more verbose changelog is a net win.
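The locking structure Sean's proposed wording describes is a pre-check under the read lock followed by a re-check under the write lock; the actual patch appears at the end of the thread. A simplified sketch, with a hypothetical helper name, plain (non-irq) lock variants for brevity, and the pfn error check omitted:

```c
#include <linux/kvm_host.h>

/* Hypothetical helper sketching the pre-check/re-check pattern. */
static void gpc_maybe_invalidate(struct gfn_to_pfn_cache *gpc,
				 unsigned long start, unsigned long end)
{
	/* Cheap pre-check: unrelated invalidations never block readers. */
	read_lock(&gpc->lock);
	if (!gpc->valid || gpc->uhva < start || gpc->uhva >= end) {
		read_unlock(&gpc->lock);
		return;
	}
	read_unlock(&gpc->lock);

	/*
	 * The cache may have changed while no lock was held, so the
	 * condition must be re-evaluated under the write lock before
	 * actually invalidating.
	 */
	write_lock(&gpc->lock);
	if (gpc->valid && gpc->uhva >= start && gpc->uhva < end)
		gpc->valid = false;
	write_unlock(&gpc->lock);
}
```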
On Tue, 2024-02-06 at 20:47 -0800, Sean Christopherson wrote:
>
> I'm saying this:
>
>   When processing mmu_notifier invalidations for gpc caches, pre-check for
>   overlap with the invalidation event while holding gpc->lock for read, and
>   only take gpc->lock for write if the cache needs to be invalidated. Doing
>   a pre-check without taking gpc->lock for write avoids unnecessarily
>   contending the lock for unrelated invalidations, which is very beneficial
>   for caches that are heavily used (but rarely subjected to mmu_notifier
>   invalidations).
>
> is much friendlier to readers than this:
>
>   Taking a write lock on a pfncache will be disruptive if the cache is
>   heavily used (which only requires a read lock). Hence, in the MMU notifier
>   callback, take read locks on caches to check for a match; only taking a
>   write lock to actually perform an invalidation (after another check).

That's a somewhat subjective observation. I actually find the latter to
be far more succinct and obvious.

Actually... maybe I find yours harder because it isn't actually stating
the situation as I understand it. You said "unrelated invalidations" in
your first email, and "overlap with the invalidation event" in this
one... neither of which makes sense to me because there is no *other*
invalidation here.

We're only talking about the MMU notifier gratuitously taking the write
lock on a GPC that it *isn't* going to invalidate (the common case),
and that disrupting users which are trying to take the read lock on
that GPC.
On Tue, Feb 06, 2024, David Woodhouse wrote:
> On Tue, 2024-02-06 at 20:47 -0800, Sean Christopherson wrote:
> >
> > I'm saying this:
> >
> >   When processing mmu_notifier invalidations for gpc caches, pre-check for
> >   overlap with the invalidation event while holding gpc->lock for read, and
> >   only take gpc->lock for write if the cache needs to be invalidated. Doing
> >   a pre-check without taking gpc->lock for write avoids unnecessarily
> >   contending the lock for unrelated invalidations, which is very beneficial
> >   for caches that are heavily used (but rarely subjected to mmu_notifier
> >   invalidations).
> >
> > is much friendlier to readers than this:
> >
> >   Taking a write lock on a pfncache will be disruptive if the cache is
> >   heavily used (which only requires a read lock). Hence, in the MMU notifier
> >   callback, take read locks on caches to check for a match; only taking a
> >   write lock to actually perform an invalidation (after another check).
>
> That's a somewhat subjective observation. I actually find the latter to
> be far more succinct and obvious.
>
> Actually... maybe I find yours harder because it isn't actually stating
> the situation as I understand it. You said "unrelated invalidations" in
> your first email, and "overlap with the invalidation event" in this
> one... neither of which makes sense to me because there is no *other*
> invalidation here.

I am referring to the "mmu_notifier invalidation event". While a particular
GPC may not be affected by the invalidation, it's entirely possible that a
different GPC and/or some chunk of guest memory does need to be
invalidated/zapped.

> We're only talking about the MMU notifier gratuitously taking the write

It's not "the MMU notifier" though, it's KVM that unnecessarily takes a lock.
I know I'm being somewhat pedantic, but the distinction does matter. E.g. with
guest_memfd, there will be invalidations that get routed through this code,
but that do not originate in the mmu_notifier. And I think it's important to
make it clear to readers that an mmu_notifier really is just a notification
from the primary MMU, albeit a notification that comes with a rather strict
contract.

> lock on a GPC that it *isn't* going to invalidate (the common case),
> and that disrupting users which are trying to take the read lock on
> that GPC.
```diff
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index ae822bff812f..70394d7c9a38 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -29,14 +29,30 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start,
 
 	spin_lock(&kvm->gpc_lock);
 	list_for_each_entry(gpc, &kvm->gpc_list, list) {
-		write_lock_irq(&gpc->lock);
+		read_lock_irq(&gpc->lock);
 
 		/* Only a single page so no need to care about length */
 		if (gpc->valid && !is_error_noslot_pfn(gpc->pfn) &&
 		    gpc->uhva >= start && gpc->uhva < end) {
-			gpc->valid = false;
+			read_unlock_irq(&gpc->lock);
+
+			/*
+			 * There is a small window here where the cache could
+			 * be modified, and invalidation would no longer be
+			 * necessary. Hence check again whether invalidation
+			 * is still necessary once the write lock has been
+			 * acquired.
+			 */
+
+			write_lock_irq(&gpc->lock);
+			if (gpc->valid && !is_error_noslot_pfn(gpc->pfn) &&
+			    gpc->uhva >= start && gpc->uhva < end)
+				gpc->valid = false;
+			write_unlock_irq(&gpc->lock);
+			continue;
 		}
-		write_unlock_irq(&gpc->lock);
+
+		read_unlock_irq(&gpc->lock);
 	}
 	spin_unlock(&kvm->gpc_lock);
 }
```