[0/7] Make ASIDs static for SVM

Message ID 20250313215540.4171762-1-yosry.ahmed@linux.dev (mailing list archive)

Yosry Ahmed March 13, 2025, 9:55 p.m. UTC
This series changes SVM to use a single ASID per-VM, instead of using
dynamic generation-based ASIDs per-vCPU. Dynamic ASIDs were added for
CPUs without FLUSHBYASID to avoid full TLB flushes, but as Sean said,
FLUSHBYASID was added in 2010, and the case for this is no longer as
strong [1].

Furthermore, having different ASIDs for different vCPUs is not required.
ASIDs are local to physical CPUs. The only requirement is to make sure
the ASID is flushed before a different vCPU runs on the same physical
CPU (see below). Also, SEV VMs already use a single ASID per-VM
(required for different reasons).

A new ASID is currently allocated in 3 cases:
(a) Once when the vCPU is initialized.
(b) When the vCPU moves to a new physical CPU.
(c) On TLB flushes when FLUSHBYASID is not available.

Case (a) is trivial: the ASID is instead allocated at VM creation.
Case (b) is handled by flushing the ASID instead of assigning a new one.
Case (c) is handled by doing a full TLB flush (i.e.
TLB_CONTROL_FLUSH_ALL_ASID) instead of assigning a new ASID. This is
a bit aggressive, but FLUSHBYASID is available in all modern CPUs.
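
To illustrate how cases (b) and (c) could map onto a flush decision
before entering the guest, here is a rough userspace sketch. It is not
the code from the patches: struct vcpu_sketch, pick_tlb_ctl() and the
TLB_CTL_* names are hypothetical stand-ins, and it assumes that a move
to a new physical CPU falls back to a full flush when FLUSHBYASID is
missing. Flushing when a different vCPU of the same VM runs on the same
physical CPU is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the VMCB TLB control values. */
enum tlb_ctl {
	TLB_CTL_DO_NOTHING,
	TLB_CTL_FLUSH_ALL_ASID,	/* flush the entire TLB */
	TLB_CTL_FLUSH_ASID,	/* flush only this guest's ASID */
};

/* Hypothetical per-vCPU state; the ASID itself is fixed per VM. */
struct vcpu_sketch {
	int last_cpu;		/* physical CPU last run on, -1 if none */
	bool flush_requested;	/* a TLB flush was requested for the guest */
};

static enum tlb_ctl pick_tlb_ctl(struct vcpu_sketch *v, int cpu,
				 bool has_flushbyasid)
{
	bool need_flush = v->flush_requested;

	assert(cpu >= 0);

	/* Case (b): new physical CPU, flush instead of taking a new ASID. */
	if (v->last_cpu != cpu) {
		need_flush = true;
		v->last_cpu = cpu;
	}
	v->flush_requested = false;

	if (!need_flush)
		return TLB_CTL_DO_NOTHING;

	/* Case (c): without FLUSHBYASID, fall back to flushing everything. */
	return has_flushbyasid ? TLB_CTL_FLUSH_ASID : TLB_CTL_FLUSH_ALL_ASID;
}
```

In this sketch the ASID never changes after VM creation; only the
flush type varies with the hardware capability.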

The series is organized as follows:
- Patch 1 generalizes the VPID allocation code in VMX to be
  vendor-neutral, to reuse for SVM.
- Patches 2-3 do some refactoring and cleanups.
- Patches 4-5 address cases (b) and (c) above.
- Patch 6 moves to a single ASID per-VM.
- Patch 7 performs some minimal unification between SVM and SEV code.
  More unification can be done. In particular, SEV can use the
  generalized kvm_tlb_tags to allocate ASIDs, and can stop tracking the
  ASID separately in struct kvm_sev_info. However, I didn't have enough
  SEV knowledge (or testability) to do this.
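
For readers unfamiliar with this kind of allocator, a vendor-neutral tag
(VPID/ASID) allocator can be pictured roughly as below. This is only a
toy userspace sketch under stated assumptions, not the kvm_tlb_tags code
from patch 1: the struct layout, the function names, the fixed
SKETCH_MAX_TAGS bound, and the use of 0 as the "no tag available" value
are all made up for illustration, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_TAGS 64u	/* arbitrary bound for the sketch */

/* Hypothetical allocator state: one "in use" flag per tag. */
struct tlb_tags_sketch {
	unsigned int min, max;	/* inclusive range of allocatable tags */
	bool used[SKETCH_MAX_TAGS];
};

static void tags_init(struct tlb_tags_sketch *t, unsigned int min,
		      unsigned int max)
{
	/* Tag 0 is kept out of the range so it can signal exhaustion. */
	assert(min >= 1 && min <= max && max < SKETCH_MAX_TAGS);
	t->min = min;
	t->max = max;
	for (unsigned int i = 0; i < SKETCH_MAX_TAGS; i++)
		t->used[i] = false;
}

/* Returns the lowest free tag, or 0 if the range is exhausted. */
static unsigned int tags_alloc(struct tlb_tags_sketch *t)
{
	for (unsigned int tag = t->min; tag <= t->max; tag++) {
		if (!t->used[tag]) {
			t->used[tag] = true;
			return tag;
		}
	}
	return 0;
}

/* Releases a tag; out-of-range values (including 0) are ignored. */
static void tags_free(struct tlb_tags_sketch *t, unsigned int tag)
{
	if (tag >= t->min && tag <= t->max)
		t->used[tag] = false;
}
```

With one tag per VM, tags_alloc() would be called once at VM creation
and tags_free() once at VM destruction.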

The performance impact appears to be small. To test this
series, I ran 3 benchmarks in an SVM guest on a Milan machine:
- netperf
- cpuid_rate [2]
- A simple program doing mmap() and munmap() of 100M for 100 iterations,
  to trigger MMU syncs and TLB flushes when using the shadow MMU.
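
The mmap()/munmap() program could look roughly like the sketch below.
This is a reconstruction under assumptions, not the exact program used:
the function name mmap_churn() is made up, "100M" is taken to mean
100 MiB, and writing to the mapping before unmapping (so that pages are
actually faulted in) is an assumption as well.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE	/* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Map, touch and unmap an anonymous region 'iterations' times.
 * Returns the number of iterations that completed successfully.
 */
static int mmap_churn(size_t bytes, int iterations)
{
	int done = 0;

	assert(bytes > 0 && iterations >= 0);

	for (int i = 0; i < iterations; i++) {
		void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED)
			break;

		/* Touch every page so the guest populates page tables. */
		memset(p, 1, bytes);

		if (munmap(p, bytes) != 0)
			break;
		done++;
	}
	return done;
}
```

Calling mmap_churn((size_t)100 << 20, 100) from main() and timing the
run would approximate the benchmark described above.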

The benchmarks were run with and without the patches for 5 iterations
each, and also with and without NPT and FLUSHBYASID to emulate old
hardware. In all cases, there was either no difference or a 1-2%
performance hit for the old hardware case. The performance hit could be
larger for specific workloads, but niche performance-sensitive workloads
should not be running on very old hardware.

[1] https://lore.kernel.org/lkml/Z8JOvMx6iLexT3pK@google.com/
[2] https://lore.kernel.org/kvm/20231109180646.2963718-1-khorenko@virtuozzo.com/

Yosry Ahmed (7):
  KVM: VMX: Generalize VPID allocation to be vendor-neutral
  KVM: SVM: Use cached local variable in init_vmcb()
  KVM: SVM: Add helpers to set/clear ASID flush
  KVM: SVM: Flush everything if FLUSHBYASID is not available
  KVM: SVM: Flush the ASID when running on a new CPU
  KVM: SVM: Use a single ASID per VM
  KVM: SVM: Share more code between pre_sev_run() and pre_svm_run()

 arch/x86/include/asm/svm.h |  5 ---
 arch/x86/kvm/svm/nested.c  |  4 +-
 arch/x86/kvm/svm/sev.c     | 26 +++++-------
 arch/x86/kvm/svm/svm.c     | 87 ++++++++++++++++++++------------------
 arch/x86/kvm/svm/svm.h     | 28 ++++++++----
 arch/x86/kvm/vmx/nested.c  |  4 +-
 arch/x86/kvm/vmx/vmx.c     | 38 +++--------------
 arch/x86/kvm/vmx/vmx.h     |  4 +-
 arch/x86/kvm/x86.c         | 58 +++++++++++++++++++++++++
 arch/x86/kvm/x86.h         | 13 ++++++
 10 files changed, 161 insertions(+), 106 deletions(-)