[v2,0/8] KVM: x86/xen updates

Message ID 20240227115648.3104-1-dwmw2@infradead.org (mailing list archive)

Message

David Woodhouse Feb. 27, 2024, 11:49 a.m. UTC
These apply to the kvm-x86/xen tree.

First, deal with the awful brokenness of the KVM clock, and its systemic 
drift especially when TSC scaling is used. This is a bit of a workaround 
for Xen timers where it hurts *most*, but it's actually easier in this 
case because there is a vCPU (and associated PV clock information) to 
use for the scaling. A better fix for __get_kvmclock() is in the works, 
but there's an enormous yak to shave there because there are so many 
interrelated bugs in the TSC and timekeeping code.
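
For readers unfamiliar with the PV clock scaling mentioned above, here is a minimal userspace sketch of how a TSC delta is converted to nanoseconds from a multiplier and shift. It is not code from this series. The field names follow the pvclock ABI, but the struct and both function names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative stand-in for the per-vCPU PV clock record. The field names
 * follow the pvclock ABI, but this struct is a sketch, not the kernel's.
 */
struct pv_clock_sketch {
	uint64_t tsc_timestamp;     /* guest TSC when system_time was sampled */
	uint64_t system_time;       /* nanoseconds at tsc_timestamp */
	uint32_t tsc_to_system_mul; /* 32-bit fixed-point fraction (value / 2^32) */
	int8_t   tsc_shift;         /* pre-shift applied to the TSC delta */
};

/* Scale a TSC delta to nanoseconds: ((delta << shift) * mul) >> 32. */
uint64_t scale_tsc_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
	/* Guard against shift counts that would be undefined in C. */
	assert(shift > -64 && shift < 64);

	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;

	/* Split the 64x32 multiply so it never needs a 128-bit type. */
	uint64_t hi = delta >> 32;
	uint64_t lo = delta & 0xffffffffu;

	return hi * mul + ((lo * mul) >> 32);
}

/* Guest nanoseconds corresponding to a given guest TSC reading. */
uint64_t guest_ns_at_tsc(const struct pv_clock_sketch *pv, uint64_t tsc)
{
	return pv->system_time +
	       scale_tsc_delta(tsc - pv->tsc_timestamp,
			       pv->tsc_to_system_mul, pv->tsc_shift);
}
```

With a multiplier of 0x80000000 (one half) and a shift of 1, each TSC tick maps to one nanosecond, which corresponds to a 1 GHz guest TSC. Because the multiplier and shift are per-vCPU values, a calculation like this needs a vCPU to hand, which is the convenience the paragraph above refers to.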

Ensure that the guest doesn't miss Xen event channel wakeups which are
already pending when the local APIC is enabled. Userspace doesn't get
to interpose here, so KVM needs to do the same as Xen and explicitly
check for the pending event. While looking at that, Michal spotted a
potential false positive from the WARN_ON_ONCE() when delivering the
vector, so fix that too.

The remainder of the series is about cleaning up locking, simplifying
the pfncache locking so that a recursive lock deadlock in the Xen code
can be eliminated (by virtue of the inner function not having to take
that lock at all any more). The final patch in the series is optional,
but probably worth doing anyway.

In moving the rwlock cleanup to be an optional patch at the end of the
series, I've reworked the commit messages so most of the lamentation
about the existing horridness, and the mention of the "bug that should
not happen", is in the simpler ->refresh_lock patch.

In v2 I rounded up the patches which were dropped from Paul's shared-info
series, to (cosmetically) split up kvm_xen_set_evtchn_fast() and then fix
the RT_PREEMPT locking issue. To address the concerns about fairness when
using read_trylock(), I've adjusted it to only do so from IRQ context, so
if it does fall back to the slow path it still takes the lock normally as
before.

David Woodhouse (6):
      KVM: x86/xen: improve accuracy of Xen timers
      KVM: x86/xen: inject vCPU upcall vector when local APIC is enabled
      KVM: x86/xen: remove WARN_ON_ONCE() with false positives in evtchn delivery
      KVM: pfncache: simplify locking and make more self-contained
      KVM: x86/xen: fix recursive deadlock in timer injection
      KVM: pfncache: clean up rwlock abuse

Paul Durrant (2):
      KVM: x86/xen: split up kvm_xen_set_evtchn_fast()
      KVM: x86/xen: avoid blocking in hardirq context in kvm_xen_set_evtchn_fast()

 arch/x86/kvm/lapic.c |   5 +-
 arch/x86/kvm/x86.c   |  61 +++++++++-
 arch/x86/kvm/x86.h   |   1 +
 arch/x86/kvm/xen.c   | 327 +++++++++++++++++++++++++++++++++------------------
 arch/x86/kvm/xen.h   |  18 +++
 virt/kvm/pfncache.c  | 216 +++++++++++++++++-----------------
 6 files changed, 403 insertions(+), 225 deletions(-)

Comments

Sean Christopherson Feb. 29, 2024, 11:12 p.m. UTC | #1
On Tue, Feb 27, 2024, David Woodhouse wrote:
> David Woodhouse (6):
>       KVM: x86/xen: improve accuracy of Xen timers
>       KVM: x86/xen: inject vCPU upcall vector when local APIC is enabled
>       KVM: x86/xen: remove WARN_ON_ONCE() with false positives in evtchn delivery
>       KVM: pfncache: simplify locking and make more self-contained
>       KVM: x86/xen: fix recursive deadlock in timer injection
>       KVM: pfncache: clean up rwlock abuse
> 
> Paul Durrant (2):
>       KVM: x86/xen: split up kvm_xen_set_evtchn_fast()
>       KVM: x86/xen: avoid blocking in hardirq context in kvm_xen_set_evtchn_fast()
> 
>  arch/x86/kvm/lapic.c |   5 +-
>  arch/x86/kvm/x86.c   |  61 +++++++++-
>  arch/x86/kvm/x86.h   |   1 +
>  arch/x86/kvm/xen.c   | 327 +++++++++++++++++++++++++++++++++------------------
>  arch/x86/kvm/xen.h   |  18 +++
>  virt/kvm/pfncache.c  | 216 +++++++++++++++++-----------------
>  6 files changed, 403 insertions(+), 225 deletions(-)

FYI, I'm planning on grabbing at least the first 3 for 6.9, but I'm off tomorrow
and don't want to risk having to fix breakage in -next, so it won't happen until
next week.

I might also grab 4 and 5, I just need to stare at that locking code a bit.

Sean Christopherson March 5, 2024, 12:35 a.m. UTC | #2
On Tue, 27 Feb 2024 11:49:14 +0000, David Woodhouse wrote:
> These apply to the kvm-x86/xen tree.
> 
> First, deal with the awful brokenness of the KVM clock, and its systemic
> drift especially when TSC scaling is used. This is a bit of a workaround
> for Xen timers where it hurts *most*, but it's actually easier in this
> case because there is a vCPU (and associated PV clock information) to
> use for the scaling. A better fix for __get_kvmclock() is in the works,
> but there's an enormous yak to shave there because there are so many
> interrelated bugs in the TSC and timekeeping code.
> 
> [...]

Applied 1-5 to kvm-x86 xen.  Please take a look and test the result (patches 1
and 4 in particular).  I didn't _intend_ to make any functional changes outside
of fixing up the unlock goof, but I'd greatly appreciate extra eyeballs,
especially this close to the merge window.

Oh, and can you look at v2[*] of Vitaly's fixes for xen_shinfo_test?  I'd like
to get that applied soonish (I see intermittent failures), but I'm nowhere near
competent enough with clocks to give it a proper review.

Thanks!

[*] https://lore.kernel.org/all/20240206151950.31174-1-vkuznets@redhat.com


[1/8] KVM: x86/xen: improve accuracy of Xen timers
      https://github.com/kvm-x86/linux/commit/451a707813ae
[2/8] KVM: x86/xen: inject vCPU upcall vector when local APIC is enabled
      https://github.com/kvm-x86/linux/commit/8e62bf2bfa46
[3/8] KVM: x86/xen: remove WARN_ON_ONCE() with false positives in evtchn delivery
      https://github.com/kvm-x86/linux/commit/66e3cf729b1e
[4/8] KVM: pfncache: simplify locking and make more self-contained
      https://github.com/kvm-x86/linux/commit/6addfcf27139
[5/8] KVM: x86/xen: fix recursive deadlock in timer injection
      https://github.com/kvm-x86/linux/commit/7a36d680658b
[6/8] KVM: x86/xen: split up kvm_xen_set_evtchn_fast()
      (not applied)
[7/8] KVM: x86/xen: avoid blocking in hardirq context in kvm_xen_set_evtchn_fast()
      (not applied)
[8/8] KVM: pfncache: clean up rwlock abuse
      (not applied)

--
https://github.com/kvm-x86/linux/tree/next