
[v5,00/10] KVM: xen: update shared_info and vcpu_info handling

Message ID 20230922150009.3319-1-paul@xen.org

Message

Paul Durrant Sept. 22, 2023, 2:59 p.m. UTC
From: Paul Durrant <pdurrant@amazon.com>

The following part of the original cover letter still applies...

"Currently we treat the shared_info page as guest memory and the VMM
informs KVM of its location using a GFN. However it is not guest memory as
such; it's an overlay page. So we pointlessly invalidate and re-cache a
mapping to the *same page* of memory every time the guest requests that
shared_info be mapped into its address space. Let's avoid doing that by
modifying the pfncache code to allow activation using a fixed userspace
HVA as well as a GPA."
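
To illustrate the VMM-visible difference, here is a minimal userspace sketch. The GFN-based attribute below is existing UAPI; the HVA attribute type and union field names are assumptions based on this series, not finalized UAPI:

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Today: shared_info is described by a GFN, so whenever the guest
     * re-requests the overlay the kernel invalidates and re-caches a
     * mapping of the same backing page. */
    static int set_shinfo_by_gfn(int vm_fd, uint64_t gfn)
    {
        struct kvm_xen_hvm_attr ha = {
            .type = KVM_XEN_ATTR_TYPE_SHARED_INFO,
            .u.shared_info.gfn = gfn,
        };

        return ioctl(vm_fd, KVM_XEN_HVM_SET_ATTR, &ha);
    }

    /* With this series: the VMM hands KVM the fixed userspace address
     * of the overlay page instead (attribute and field names assumed). */
    static int set_shinfo_by_hva(int vm_fd, void *shinfo)
    {
        struct kvm_xen_hvm_attr ha = {
            .type = KVM_XEN_ATTR_TYPE_SHARED_INFO_HVA,   /* assumed */
            .u.shared_info.hva = (uintptr_t)shinfo,      /* assumed */
        };

        return ioctl(vm_fd, KVM_XEN_HVM_SET_ATTR, &ha);
    }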

However, this version of the series has dropped the other changes that tried
to handle the default vcpu_info location directly in KVM. With all the
corner cases, it was getting sufficiently complex that the functionality is
better off staying in the VMM. So, instead of that code, two new patches
have been added:

"xen: allow vcpu_info to be mapped by fixed HVA" is analogous to the
"xen: allow shared_info to be mapped by fixed HVA" patch that has been
present from the original version of the series and simply provides an
attribute to that vcpu_info can be mapped using a fixed userspace HVA,
which is desirable when using one embedded in the shared_info page (since
we similarly avoid pointless cache invalidations).
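
A minimal sketch of using that attribute from the VMM side (the HVA attribute type and union field are assumptions from this series, not confirmed UAPI):

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Point a vCPU's vcpu_info at a fixed userspace address, e.g. the
     * copy embedded in the shared_info overlay page. */
    static int set_vcpu_info_by_hva(int vcpu_fd, void *vcpu_info)
    {
        struct kvm_xen_vcpu_attr va = {
            .type = KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO_HVA, /* assumed */
            .u.hva = (uintptr_t)vcpu_info,                /* assumed */
        };

        return ioctl(vcpu_fd, KVM_XEN_VCPU_SET_ATTR, &va);
    }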

"selftests / xen: re-map vcpu_info using HVA rather than GPA" is just a
small addition to the 'xen_shinfo_test' selftest to swizzle the vcpu_info
mapping to show that there's no functional change.

Paul Durrant (10):
  KVM: pfncache: add a map helper function
  KVM: pfncache: add a mark-dirty helper
  KVM: pfncache: add a helper to get the gpa
  KVM: pfncache: base offset check on khva rather than gpa
  KVM: pfncache: allow a cache to be activated with a fixed (userspace)
    HVA
  KVM: xen: allow shared_info to be mapped by fixed HVA
  KVM: xen: allow vcpu_info to be mapped by fixed HVA
  KVM: selftests / xen: map shared_info using HVA rather than GFN
  KVM: selftests / xen: re-map vcpu_info using HVA rather than GPA
  KVM: xen: advertize the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA capability

 Documentation/virt/kvm/api.rst                |  53 +++++--
 arch/x86/kvm/x86.c                            |   5 +-
 arch/x86/kvm/xen.c                            |  89 ++++++++----
 include/linux/kvm_host.h                      |  43 ++++++
 include/linux/kvm_types.h                     |   3 +-
 include/uapi/linux/kvm.h                      |   9 +-
 .../selftests/kvm/x86_64/xen_shinfo_test.c    |  59 ++++++--
 virt/kvm/pfncache.c                           | 129 +++++++++++++-----
 8 files changed, 302 insertions(+), 88 deletions(-)
---
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org

Comments

David Woodhouse Sept. 22, 2023, 3:37 p.m. UTC | #1
On Fri, 2023-09-22 at 14:59 +0000, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> The following part of the original cover letter still applies...
> 
> "Currently we treat the shared_info page as guest memory and the VMM
> informs KVM of its location using a GFN. However it is not guest memory as
> such; it's an overlay page. So we pointlessly invalidate and re-cache a
> mapping to the *same page* of memory every time the guest requests that
> shared_info be mapped into its address space. Let's avoid doing that by
> modifying the pfncache code to allow activation using a fixed userspace
> HVA as well as a GPA."
> 
> However, this version of the series has dropped the other changes to try
> to handle the default vcpu_info location directly in KVM. With all the
> corner cases, it was getting sufficiently complex the functionality is
> better off staying in the VMM. So, instead of that code, two new patches
> have been added:

I think there's key information missing from this cover letter (and
since cover letters don't get preserved, it probably wants to end up in
one of the commits too).

This isn't *just* an optimisation; it's not just that we're pointlessly
invalidating and re-caching it. The problem is the window *between* those
two operations, because we don't have atomic memslot updates (qv).

If we have to break apart a large memslot which contains the shared_info
GPA, then add back the two pieces plus whatever we've overlaid in the
middle that broke it in two, there are long periods of time when an
interrupt might arrive and find the shared_info GPA simply *absent*.
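
Roughly like this, as a hedged userspace sketch (not code from the series; the helper and slot numbering are illustrative). Each KVM_SET_USER_MEMORY_REGION call is a separate operation, so after step 1 the shared_info GPA has no memslot at all until the relevant piece is added back:

    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    static void punch_overlay(int vm_fd,
                              struct kvm_userspace_memory_region big,
                              __u64 ov_gpa, __u64 ov_size, __u64 ov_hva,
                              __u32 lo_slot, __u32 ov_slot, __u32 hi_slot)
    {
        struct kvm_userspace_memory_region r = big;

        r.memory_size = 0;                      /* 1: delete the big slot */
        ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &r);

        r = big;                                /* 2: re-add the low piece */
        r.slot = lo_slot;
        r.memory_size = ov_gpa - big.guest_phys_addr;
        ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &r);

        r.slot = ov_slot;                       /* 3: add the overlay */
        r.guest_phys_addr = ov_gpa;
        r.memory_size = ov_size;
        r.userspace_addr = ov_hva;
        ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &r);

        r.slot = hi_slot;                       /* 4: re-add the high piece */
        r.guest_phys_addr = ov_gpa + ov_size;
        r.userspace_addr = big.userspace_addr +
                           (r.guest_phys_addr - big.guest_phys_addr);
        r.memory_size = big.memory_size -
                        (r.guest_phys_addr - big.guest_phys_addr);
        ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &r);
    }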

Using the HVA for the shinfo page makes a whole bunch of sense since
it's kind of supposed to be a xenheap page anyway and not defined by
the guest address it may — or may NOT — be mapped at. But more to the
point, using the HVA means that the kernel can continue to deliver
event channel interrupts (e.g. timer virqs, MSI pirqs from passthrough
devices, etc.) to it even when it *isn't* mapped.

We don't have the same problem for the vcpu_info because there's a per-
vcpu *shadow* of evtchn_pending_sel for that very purpose, which the
vCPU itself will OR into the real vcpu_info on the way into guest mode.

So since we have to stop all vCPUs before changing the memslots anyway,
the events can gather in that evtchn_pending_sel and it all works out
OK.
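
As a simplified illustration of that shadow pattern (userspace-style C; the names and layout here are illustrative, not the actual KVM code):

    #include <stdatomic.h>
    #include <stdint.h>

    struct vcpu_info_stub {
        _Atomic uint64_t evtchn_pending_sel;
        _Atomic uint8_t  evtchn_upcall_pending;
    };

    /* Per-vCPU shadow, accumulated while the real vcpu_info is
     * unreachable (e.g. its memslot is mid-update). */
    static _Atomic uint64_t shadow_pending_sel;

    static void set_pending(struct vcpu_info_stub *vi, int word,
                            int vcpu_info_mapped)
    {
        if (vcpu_info_mapped)
            atomic_fetch_or(&vi->evtchn_pending_sel, 1ULL << word);
        else
            atomic_fetch_or(&shadow_pending_sel, 1ULL << word);
    }

    /* On the way into guest mode: fold the shadow back into the real
     * vcpu_info so no event is lost. */
    static void inject_pending(struct vcpu_info_stub *vi)
    {
        uint64_t sel = atomic_exchange(&shadow_pending_sel, 0);

        if (sel) {
            atomic_fetch_or(&vi->evtchn_pending_sel, sel);
            atomic_store(&vi->evtchn_upcall_pending, 1);
        }
    }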

It would still be nice to have a way of atomically sticking an overlay
page over the middle of an existing memslot and breaking it apart, but
we can live without it. And even if we *did* get that, what you're
doing here makes a lot of sense anyway.