[00/89] KVM: arm64: Base support for the pKVM hypervisor at EL2

Message ID 20220519134204.5379-1-will@kernel.org (mailing list archive)

Will Deacon May 19, 2022, 1:40 p.m. UTC
Hi all,

This rather large series (based on -rc2) builds on top of the limited
pKVM support available upstream and gets us to a point where the
hypervisor code at EL2 is capable of running guests in both
non-protected and protected mode on the same system. For more background
information about pKVM, the following (slightly dated) LWN article may
be informative:

  https://lwn.net/Articles/836693/

The structure of this series is roughly as follows:

  * Patches 01-06 :
    - Some small cleanups and minor fixes.

  * Patches 07-12 :
    - Memory management changes at EL2 to allow the donation of memory
      from the host to the hypervisor and the "pinning" of shared memory
      at EL2.

  * Patches 13-16 :
    - Introduction of shadow VM and vCPU state at EL2 so that the
      hypervisor can manage guest state using its own private data
      structures, initially populated from the host structures.

  * Patches 17-33 :
    - Further memory management changes at EL2 to allow the allocation
      and reclaim of guest memory by the host. This then allows us to
      manage guest stage-2 page-tables entirely at EL2, with the host
      issuing hypercalls to map guest pages in response to faults.

  * Patches 34-78 :
    - Gradual reduction of EL2 trust in host data; rather than copy
      blindly between the host and shadow structures, we instead
      selectively sync/flush between them and reduce the amount of host
      data that is accessed directly by EL2.

  * Patches 79-81 :
    - Inject an abort into the host if it tries to access a guest page
      for which it does not have permission. This will then deliver a
      SEGV if the access originated from userspace.

  * Patches 82-87 :
    - Expose hypercalls to protected guests for sharing memory back with
      the host.

  * Patches 88-89 :
    - Introduce the new machine type and add some documentation.
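
To make the ownership model behind the memory management patches a bit
more concrete, here is a toy userspace model of the donate/share/pin
transitions described above. Every name in it (page_owner,
donate_page(), pin_shared_page() and friends) is made up for
illustration and is not taken from the series, and the flat array is
only a stand-in for the real bookkeeping at EL2. Treat it as a sketch of
the idea rather than a description of the implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical owners of a physical page (names are not from the series). */
enum page_owner { OWNER_HOST, OWNER_HYP, OWNER_GUEST };

/* Toy per-page metadata: exclusive owner plus sharing/pinning state. */
struct page_state {
	enum page_owner owner;
	bool shared;            /* the owner has shared the page with another party */
	unsigned int pin_count; /* pins held on the page while it is shared */
};

#define NR_PAGES 8
/* Zero-initialised, so every page starts out host-owned and unshared. */
static struct page_state pages[NR_PAGES];

static struct page_state *lookup_page(size_t pfn)
{
	return pfn < NR_PAGES ? &pages[pfn] : NULL;
}

/* Donation moves exclusive ownership; refuse it for shared or pinned pages. */
static int donate_page(size_t pfn, enum page_owner from, enum page_owner to)
{
	struct page_state *p = lookup_page(pfn);

	if (!p || p->owner != from || p->shared || p->pin_count)
		return -1;
	p->owner = to;
	return 0;
}

/* Sharing keeps ownership with 'from' but marks the page as visible elsewhere. */
static int share_page(size_t pfn, enum page_owner from)
{
	struct page_state *p = lookup_page(pfn);

	if (!p || p->owner != from || p->shared)
		return -1;
	p->shared = true;
	return 0;
}

/* A pinned page cannot be unshared until every pin has been dropped. */
static int unshare_page(size_t pfn, enum page_owner from)
{
	struct page_state *p = lookup_page(pfn);

	if (!p || p->owner != from || !p->shared || p->pin_count)
		return -1;
	p->shared = false;
	return 0;
}

static int pin_shared_page(size_t pfn)
{
	struct page_state *p = lookup_page(pfn);

	if (!p || !p->shared)
		return -1;
	p->pin_count++;
	return 0;
}

static int unpin_shared_page(size_t pfn)
{
	struct page_state *p = lookup_page(pfn);

	if (!p || !p->pin_count)
		return -1;
	p->pin_count--;
	return 0;
}

/* Reclaim is modelled as a donation from the guest back to the host. */
static int reclaim_page(size_t pfn)
{
	return donate_page(pfn, OWNER_GUEST, OWNER_HOST);
}
```

In this model a host page that has been donated to a guest cannot be
donated again by the host, and a shared page that is still pinned
cannot be unshared; both attempts return -1 until the page has been
reclaimed or unpinned.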

We considered splitting this into multiple series, but decided to keep
everything together initially so that reviewers can more easily get an
idea of what we're trying to do and also take it for a spin. The patches
are also available in our git tree here:

  https://android-kvm.googlesource.com/linux/+/refs/heads/for-upstream/pkvm-base-v1

It's worth pointing out that, although we've been tracking the fd-based
proposal around KVM private memory [1], for now the approach taken here
interacts directly with anonymous pages using a longterm GUP pin. We're
expecting to prototype an fd-based implementation once the discussion at
[2] has converged. In the meantime, we hope to progress the non-protected
VM support.

Finally, there are still some features that we have not included in this
posting and will come later on:

  - Support for read-only memslots and dirty logging for non-protected
    VMs. We currently document that this doesn't work (setting the
    memslot flags will fail), but we're working to enable this.

  - Support for IOMMU configuration to protect guest memory from DMA
    attacks by the host.

  - Support for optional loading of the guest's initial firmware by the
    hypervisor.

  - Proxying of host interactions with TrustZone, intercepting and
    validating FF-A [3] calls at EL2.

  - Support for restricting MMIO exits to only those regions designated
    as MMIO by the guest. An earlier version of this work was posted
    at [4].

  - Hardware debug and PMU support for non-protected guests -- this
    builds on the separate series posted at [5], which is now queued
    for 5.19.

  - Guest-side changes to issue the new pKVM hypercalls, for example
    sharing back the SWIOTLB buffer with the host for virtio traffic.
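
Two of the items above boil down to simple admission checks, so here is
a rough userspace sketch of what they could look like. The flag macros,
the function names and the -EINVAL return value are all assumptions made
for the example; they are not lifted from the series, and the eventual
MMIO guard may well look quite different.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for the memslot flag bits (hypothetical names). */
#define SKETCH_MEM_LOG_DIRTY_PAGES (1u << 0)
#define SKETCH_MEM_READONLY        (1u << 1)

/*
 * Reject dirty logging and read-only memslots while pKVM is enabled.
 * The specific error code is an assumption.
 */
static int check_memslot_flags(bool pkvm_enabled, uint32_t flags)
{
	const uint32_t unsupported = SKETCH_MEM_LOG_DIRTY_PAGES |
				     SKETCH_MEM_READONLY;

	if (pkvm_enabled && (flags & unsupported))
		return -EINVAL;
	return 0;
}

/* A range that the guest has designated as MMIO (hypothetical structure). */
struct mmio_region {
	uint64_t base;
	uint64_t size;
};

/*
 * Allow an MMIO exit only if [addr, addr + len) lies entirely inside one
 * registered region. The comparison is written to avoid overflowing
 * addr + len.
 */
static bool mmio_exit_allowed(const struct mmio_region *regions, size_t nr,
			      uint64_t addr, uint64_t len)
{
	for (size_t i = 0; i < nr; i++) {
		const struct mmio_region *r = &regions[i];

		if (addr >= r->base && len <= r->size &&
		    addr - r->base <= r->size - len)
			return true;
	}
	return false;
}
```

With a single region { 0x09000000, 0x1000 } registered, a 4-byte access
at 0x09000ffc is allowed while the same access at 0x09000ffe is refused
because it runs past the end of the region.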

Please enjoy,

Will, Quentin, Fuad and Marc

[1] https://lore.kernel.org/all/20220310140911.50924-1-chao.p.peng@linux.intel.com/
[2] https://lore.kernel.org/r/20220422105612.GB61987@chaop.bj.intel.com
[3] https://developer.arm.com/documentation/den0077/latest
[4] https://lore.kernel.org/all/20211004174849.2831548-1-maz@kernel.org/
[5] https://lore.kernel.org/all/20220510095710.148178-1-tabba@google.com/

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Chao Peng <chao.p.peng@linux.intel.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Marc Zyngier <maz@kernel.org>

Cc: kernel-team@android.com
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org

--->8

Fuad Tabba (23):
  KVM: arm64: Add hyp_spinlock_t static initializer
  KVM: arm64: Introduce shadow VM state at EL2
  KVM: arm64: Instantiate VM shadow data from EL1
  KVM: arm64: Do not allow memslot changes after first VM run under pKVM
  KVM: arm64: Add hyp per_cpu variable to track current physical cpu
    number
  KVM: arm64: Ensure that TLBs and I-cache are private to each vcpu
  KVM: arm64: Do not pass the vcpu to __pkvm_host_map_guest()
  KVM: arm64: Check directly whether the vcpu is protected
  KVM: arm64: Trap debug break and watch from guest
  KVM: arm64: Restrict protected VM capabilities
  KVM: arm64: Do not support MTE for protected VMs
  KVM: arm64: Refactor reset_mpidr to extract its computation
  KVM: arm64: Reset sysregs for protected VMs
  KVM: arm64: Move pkvm_vcpu_init_traps to shadow vcpu init
  KVM: arm64: Fix initializing traps in protected mode
  KVM: arm64: Add EL2 entry/exit handlers for pKVM guests
  KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use
  KVM: arm64: Initialize shadow vm state at hyp
  KVM: arm64: Add HVC handling for protected guests at EL2
  KVM: arm64: Move pstate reset values to kvm_arm.h
  KVM: arm64: Move some kvm_psci functions to a shared header
  KVM: arm64: Factor out vcpu_reset code for core registers and PSCI
  KVM: arm64: Handle PSCI for protected VMs in EL2

Marc Zyngier (20):
  KVM: arm64: Handle all ID registers trapped for a protected VM
  KVM: arm64: Drop stale comment
  KVM: arm64: Check for PTE validity when checking for
    executable/cacheable
  KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code
  KVM: arm64: Simplify vgic-v3 hypercalls
  KVM: arm64: Add the {flush,sync}_vgic_state() primitives
  KVM: arm64: Introduce predicates to check for protected state
  KVM: arm64: Add the {flush,sync}_timer_state() primitives
  KVM: arm64: Introduce the pkvm_vcpu_{load,put} hypercalls
  KVM: arm64: Add current vcpu and shadow_state lookup primitive
  KVM: arm64: Skip __kvm_adjust_pc() for protected vcpus
  KVM: arm64: Introduce per-EC entry/exit handlers
  KVM: arm64: Introduce lazy-ish state sync for non-protected VMs
  KVM: arm64: Lazy host FP save/restore
  KVM: arm64: Reduce host/shadow vcpu state copying
  KVM: arm64: Force injection of a data abort on NISV MMIO exit
  KVM: arm64: Donate memory to protected guests
  KVM: arm64: Move vgic state between host and shadow vcpu structures
  KVM: arm64: Do not update virtual timer state for protected VMs
  KVM: arm64: Track the SVE state in the shadow vcpu

Quentin Perret (22):
  KVM: arm64: Move hyp refcount manipulation helpers
  KVM: arm64: Back hyp_vmemmap for all of memory
  KVM: arm64: Implement do_donate() helper for donating memory
  KVM: arm64: Prevent the donation of no-map pages
  KVM: arm64: Add helpers to pin memory shared with hyp
  KVM: arm64: Make hyp stage-1 refcnt correct on the whole range
  KVM: arm64: Factor out private range VA allocation
  KVM: arm64: Add pcpu fixmap infrastructure at EL2
  KVM: arm64: Allow non-coallescable pages in a hyp_pool
  KVM: arm64: Add generic hyp_memcache helpers
  KVM: arm64: Instantiate guest stage-2 page-tables at EL2
  KVM: arm64: Return guest memory from EL2 via dedicated teardown
    memcache
  KVM: arm64: Add flags to struct hyp_page
  KVM: arm64: Consolidate stage-2 init in one function
  KVM: arm64: Disallow dirty logging and RO memslots with pKVM
  KVM: arm64: Don't access kvm_arm_hyp_percpu_base at EL1
  KVM: arm64: Unmap kvm_arm_hyp_percpu_base from the host
  KVM: arm64: Explicitly map kvm_vgic_global_state at EL2
  KVM: arm64: Don't map host sections in pkvm
  KVM: arm64: Add is_pkvm_initialized() helper
  KVM: arm64: Refactor enter_exception64()
  KVM: arm64: Inject SIGSEGV on illegal accesses

Will Deacon (24):
  KVM: arm64: Remove redundant hyp_assert_lock_held() assertions
  KVM: arm64: Return error from kvm_arch_init_vm() on allocation failure
  KVM: arm64: Ignore 'kvm-arm.mode=protected' when using VHE
  KVM: arm64: Extend comment in has_vhe()
  KVM: arm64: Unify identifiers used to distinguish host and hypervisor
  KVM: arm64: Include asm/kvm_mmu.h in nvhe/mem_protect.h
  KVM: arm64: Provide I-cache invalidation by VA at EL2
  KVM: arm64: Provide a hypercall for the host to reclaim guest memory
  KVM: arm64: Extend memory sharing to allow host-to-guest transitions
  KVM: arm64: Use the shadow vCPU structure in handle___kvm_vcpu_run()
  KVM: arm64: Handle guest stage-2 page-tables entirely at EL2
  KVM: arm64: Maintain a copy of 'kvm_arm_vmid_bits' at EL2
  KVM: arm64: Extend memory donation to allow host-to-guest transitions
  KVM: arm64: Split up nvhe/fixed_config.h
  KVM: arm64: Advertise GICv3 sysreg interface to protected guests
  KVM: arm64: Don't expose TLBI hypercalls after de-privilege
  KVM: arm64: Support TLB invalidation in guest context
  KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE
  KVM: arm64: Extend memory sharing to allow guest-to-host transitions
  KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst
  KVM: arm64: Reformat/beautify PTP hypercall documentation
  KVM: arm64: Expose memory sharing hypercalls to protected guests
  KVM: arm64: Introduce KVM_VM_TYPE_ARM_PROTECTED machine type for PVMs
  Documentation: KVM: Add some documentation for Protected KVM on arm64

 .../admin-guide/kernel-parameters.txt         |    5 +-
 Documentation/virt/kvm/api.rst                |    7 +
 Documentation/virt/kvm/arm/hypercalls.rst     |  118 ++
 Documentation/virt/kvm/arm/index.rst          |    2 +
 Documentation/virt/kvm/arm/pkvm.rst           |   96 ++
 Documentation/virt/kvm/arm/ptp_kvm.rst        |   38 +-
 arch/arm64/include/asm/kvm_arm.h              |   11 +-
 arch/arm64/include/asm/kvm_asm.h              |   28 +-
 arch/arm64/include/asm/kvm_emulate.h          |   92 ++
 arch/arm64/include/asm/kvm_host.h             |  123 +-
 arch/arm64/include/asm/kvm_hyp.h              |   10 +-
 arch/arm64/include/asm/kvm_mmu.h              |    2 +-
 arch/arm64/include/asm/kvm_pgtable.h          |    8 +
 arch/arm64/include/asm/kvm_pkvm.h             |  257 ++++
 arch/arm64/include/asm/virt.h                 |   15 +-
 arch/arm64/kernel/cpufeature.c                |   10 +-
 arch/arm64/kernel/image-vars.h                |   15 -
 arch/arm64/kvm/arch_timer.c                   |    7 +-
 arch/arm64/kvm/arm.c                          |  194 ++-
 arch/arm64/kvm/handle_exit.c                  |   22 +
 arch/arm64/kvm/hyp/exception.c                |   89 +-
 arch/arm64/kvm/hyp/hyp-constants.c            |    3 +
 .../arm64/kvm/hyp/include/nvhe/fixed_config.h |  205 ---
 arch/arm64/kvm/hyp/include/nvhe/gfp.h         |    6 +-
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |   25 +-
 arch/arm64/kvm/hyp/include/nvhe/memory.h      |   33 +-
 arch/arm64/kvm/hyp/include/nvhe/mm.h          |   18 +-
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h        |  119 ++
 arch/arm64/kvm/hyp/include/nvhe/spinlock.h    |   10 +-
 .../arm64/kvm/hyp/include/nvhe/trap_handler.h |    2 -
 arch/arm64/kvm/hyp/nvhe/cache.S               |   11 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  937 +++++++++++++-
 arch/arm64/kvm/hyp/nvhe/hyp-smp.c             |    4 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 1035 +++++++++++++++-
 arch/arm64/kvm/hyp/nvhe/mm.c                  |  177 ++-
 arch/arm64/kvm/hyp/nvhe/page_alloc.c          |   42 +-
 arch/arm64/kvm/hyp/nvhe/pkvm.c                | 1095 ++++++++++++++++-
 arch/arm64/kvm/hyp/nvhe/setup.c               |   97 +-
 arch/arm64/kvm/hyp/nvhe/switch.c              |    9 +-
 arch/arm64/kvm/hyp/nvhe/sys_regs.c            |  139 ++-
 arch/arm64/kvm/hyp/nvhe/tlb.c                 |   96 +-
 arch/arm64/kvm/hyp/pgtable.c                  |   31 +-
 arch/arm64/kvm/hyp/vgic-v3-sr.c               |   27 +-
 arch/arm64/kvm/mmio.c                         |    9 +
 arch/arm64/kvm/mmu.c                          |  202 ++-
 arch/arm64/kvm/pkvm.c                         |  156 ++-
 arch/arm64/kvm/pmu.c                          |   16 +-
 arch/arm64/kvm/psci.c                         |   28 -
 arch/arm64/kvm/reset.c                        |   99 +-
 arch/arm64/kvm/sys_regs.c                     |   34 +-
 arch/arm64/kvm/sys_regs.h                     |   19 +
 arch/arm64/kvm/vgic/vgic-v2.c                 |    9 +-
 arch/arm64/kvm/vgic/vgic-v3.c                 |   28 +-
 arch/arm64/kvm/vgic/vgic.c                    |   17 +-
 arch/arm64/kvm/vgic/vgic.h                    |    6 +-
 arch/arm64/mm/fault.c                         |   22 +
 include/kvm/arm_vgic.h                        |    3 +-
 include/linux/arm-smccc.h                     |   21 +
 include/uapi/linux/kvm.h                      |    6 +
 59 files changed, 5128 insertions(+), 817 deletions(-)
 create mode 100644 Documentation/virt/kvm/arm/hypercalls.rst
 create mode 100644 Documentation/virt/kvm/arm/pkvm.rst
 delete mode 100644 arch/arm64/kvm/hyp/include/nvhe/fixed_config.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/pkvm.h