mbox series

[v8,00/69] KVM: arm64: ARMv8.3/8.4 Nested Virtualization support

Message ID 20230131092504.2880505-1-maz@kernel.org (mailing list archive)
Headers show
Series KVM: arm64: ARMv8.3/8.4 Nested Virtualization support | expand

Message

Marc Zyngier Jan. 31, 2023, 9:23 a.m. UTC
This is the second drop of NV support on arm64 for this year. This
month even. Something is happening!

For the previous episodes, see [1].

What's changed:

- TLB invalidation has been fixed! A couple of terrible bugs were
  preventing the TLBI emulation code from selecting the correct MMU
  context, leading the shadow page-tables never being invalidated.

- As a consequence, EDK2 now works correctly at L2 with QEMU as the
  VMM, as we now teardown the S2 PTs when L1 unmaps the memslot on
  write to the emulated flash.

- As a consequence of the above consequence, the Debian installer
  passes with flying colors at L2. Yay!

- A few other fixes, such as better handling of AT instructions.

- All rebased on 6.2-rc4 + kvmarm-fixes-6.2-* + kvmarm/next. Full
  branch at [2].

Performance:

- This is surprisingly good. Specially after years of watching paint
  dry in a model. Of course, if your workload is exit/trap heavy, it
  will suck. But not quite as much as on the model, which is the
  benchmark.

- For CPU intensive workloads (kernel compilation), I have seen as
  little a 6% overhead between L0 and L2. YMMV.

Testing:

- No model involved this time around. I have solely run the series on
  an Apple M2 with 16k pages, because that's what I have, and I
  consider it the reference (HW you cannot get your hands on doesn't
  really count).

- Because the M2 is "pretty lax" with the architecture, a bunch of
  things cannot be tested (E2H=0 being the most obvious one).

- For the same reason, I may be relying on non-architectural
  behaviours. I've done my best not to, but you never know. Until I
  get a machine that *is* fully compliant with the architecture, it is
  a risk I'll have to take.

- The series has a number of hacks for the M2 to actually work (vgic
  emulation, mostly).

- I haven't tested FEAT_NV (only FEAT_NV2), and seriously considering
  dropping the support for it. I doubt there is any HW solely
  implementing FEAT_NV and not FEAT_NV2.

- There is a kvmtool branch that allows you to use the --nested option
  at [3]. I have no idea what became of the equivalent QEMU patches.

Merging:

- I would suggest that the first 12 patches are in a mergeable
  state. Only patch #2 hasn't been reviewed yet (because it is new),
  but it doesn't affect anything non-NV and makes NV work. I really
  want this in before CCA! :D

- I have a yet another fix for the Apple AIC driver, but I'll post
  that separately.

Many thanks to Alexandru, Chase, Ganapatrao and Russell for spending a
lot of time reviewing this, spotting issues and providing helpful
suggestions.

[1] https://lore.kernel.org/r/20230112191927.1814989-1-maz@kernel.org
[2] git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git kvm-arm64/nv-6.3-WIP
[3] https://git.kernel.org/pub/scm/linux/kernel/git/maz/kvmtool.git/log/?h=arm64/nv-5.16

Andre Przywara (1):
  KVM: arm64: nv: vgic: Allow userland to set VGIC maintenance IRQ

Christoffer Dall (13):
  KVM: arm64: nv: Introduce nested virtualization VCPU feature
  KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set
  KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x
  KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state
  KVM: arm64: nv: Reset VMPIDR_EL2 and VPIDR_EL2 to sane values
  KVM: arm64: nv: Handle trapped ERET from virtual EL2
  KVM: arm64: nv: Trap EL1 VM register accesses in virtual EL2
  KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2
    changes
  KVM: arm64: nv: Implement nested Stage-2 page table walk logic
  KVM: arm64: nv: Unmap/flush shadow stage 2 page tables
  KVM: arm64: nv: vgic: Emulate the HW bit in software
  KVM: arm64: nv: Add nested GICv3 tracepoints
  KVM: arm64: nv: Sync nested timer state with FEAT_NV2

Jintack Lim (16):
  arm64: Add ARM64_HAS_NESTED_VIRT cpufeature
  KVM: arm64: nv: Handle HCR_EL2.NV system register traps
  KVM: arm64: nv: Support virtual EL2 exceptions
  KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
  KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from
    virtual EL2
  KVM: arm64: nv: Trap CPACR_EL1 access in virtual EL2
  KVM: arm64: nv: Handle PSCI call via smc from the guest
  KVM: arm64: nv: Respect virtual HCR_EL2.TWX setting
  KVM: arm64: nv: Respect virtual CPTR_EL2.{TFP,FPEN} settings
  KVM: arm64: nv: Respect the virtual HCR_EL2.NV bit setting
  KVM: arm64: nv: Respect virtual HCR_EL2.TVM and TRVM settings
  KVM: arm64: nv: Respect the virtual HCR_EL2.NV1 bit setting
  KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
  KVM: arm64: nv: Configure HCR_EL2 for nested virtualization
  KVM: arm64: nv: Trap and emulate TLBI instructions from virtual EL2
  KVM: arm64: nv: Nested GICv3 Support

Marc Zyngier (39):
  KVM: arm64: Use the S2 MMU context to iterate over S2 table
  KVM: arm64: nv: Add EL2 system registers to vcpu context
  KVM: arm64: nv: Add non-VHE-EL2->EL1 translation helpers
  KVM: arm64: nv: Handle virtual EL2 registers in
    vcpu_read/write_sys_reg()
  KVM: arm64: nv: Handle SPSR_EL2 specially
  KVM: arm64: nv: Handle HCR_EL2.E2H specially
  KVM: arm64: nv: Save/Restore vEL2 sysregs
  KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor
  KVM: arm64: nv: Allow a sysreg to be hidden from userspace only
  KVM: arm64: nv: Forward debug traps to the nested guest
  KVM: arm64: nv: Filter out unsupported features from ID regs
  KVM: arm64: nv: Hide RAS from nested guests
  KVM: arm64: nv: Support multiple nested Stage-2 mmu structures
  KVM: arm64: nv: Handle shadow stage 2 page faults
  KVM: arm64: nv: Restrict S2 RD/WR permissions to match the guest's
  KVM: arm64: nv: Set a handler for the system instruction traps
  KVM: arm64: nv: Trap and emulate AT instructions from virtual EL2
  KVM: arm64: nv: Fold guest's HCR_EL2 configuration into the host's
  KVM: arm64: nv: arch_timer: Support hyp timer emulation
  KVM: arm64: nv: Add handling of EL2-specific timer registers
  KVM: arm64: nv: Load timer before the GIC
  KVM: arm64: nv: Don't load the GICv4 context on entering a nested
    guest
  KVM: arm64: nv: Implement maintenance interrupt forwarding
  KVM: arm64: nv: Deal with broken VGIC on maintenance interrupt
    delivery
  KVM: arm64: nv: Allow userspace to request KVM_ARM_VCPU_NESTED_VIRT
  KVM: arm64: nv: Add handling of FEAT_TTL TLB invalidation
  KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like
    information
  KVM: arm64: nv: Tag shadow S2 entries with nested level
  KVM: arm64: nv: Add include containing the VNCR_EL2 offsets
  KVM: arm64: nv: Map VNCR-capable registers to a separate page
  KVM: arm64: nv: Move nested vgic state into the sysreg file
  KVM: arm64: Add FEAT_NV2 cpu feature
  KVM: arm64: nv: Publish emulated timer interrupt state in the
    in-memory state
  KVM: arm64: nv: Allocate VNCR page when required
  KVM: arm64: nv: Enable ARMv8.4-NV support
  KVM: arm64: nv: Fast-track 'InHost' exception returns
  KVM: arm64: nv: Fast-track EL1 TLBIs for VHE guests
  KVM: arm64: nv: Use FEAT_ECV to trap access to EL0 timers
  KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV is on

 .../admin-guide/kernel-parameters.txt         |    7 +-
 .../virt/kvm/devices/arm-vgic-v3.rst          |   12 +-
 arch/arm64/include/asm/esr.h                  |    5 +
 arch/arm64/include/asm/kvm_arm.h              |   27 +-
 arch/arm64/include/asm/kvm_asm.h              |    4 +
 arch/arm64/include/asm/kvm_emulate.h          |  159 +-
 arch/arm64/include/asm/kvm_host.h             |  195 ++-
 arch/arm64/include/asm/kvm_hyp.h              |    2 +
 arch/arm64/include/asm/kvm_mmu.h              |   23 +-
 arch/arm64/include/asm/kvm_nested.h           |  150 ++
 arch/arm64/include/asm/sysreg.h               |  102 +-
 arch/arm64/include/asm/vncr_mapping.h         |   74 +
 arch/arm64/include/uapi/asm/kvm.h             |    2 +
 arch/arm64/kernel/cpufeature.c                |   36 +
 arch/arm64/kvm/Makefile                       |    4 +-
 arch/arm64/kvm/arch_timer.c                   |  301 +++-
 arch/arm64/kvm/arm.c                          |   38 +-
 arch/arm64/kvm/at.c                           |  221 +++
 arch/arm64/kvm/emulate-nested.c               |  217 +++
 arch/arm64/kvm/guest.c                        |    6 +
 arch/arm64/kvm/handle_exit.c                  |   74 +-
 arch/arm64/kvm/hyp/exception.c                |   48 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h       |   12 +-
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h    |   24 +-
 arch/arm64/kvm/hyp/nvhe/switch.c              |    2 +-
 arch/arm64/kvm/hyp/nvhe/sysreg-sr.c           |    2 +-
 arch/arm64/kvm/hyp/vgic-v3-sr.c               |    2 +-
 arch/arm64/kvm/hyp/vhe/switch.c               |  230 ++-
 arch/arm64/kvm/hyp/vhe/sysreg-sr.c            |  125 +-
 arch/arm64/kvm/hyp/vhe/tlb.c                  |   83 +
 arch/arm64/kvm/inject_fault.c                 |   61 +-
 arch/arm64/kvm/mmu.c                          |  259 +++-
 arch/arm64/kvm/nested.c                       |  896 +++++++++++
 arch/arm64/kvm/reset.c                        |   17 +
 arch/arm64/kvm/sys_regs.c                     | 1377 ++++++++++++++++-
 arch/arm64/kvm/sys_regs.h                     |   14 +-
 arch/arm64/kvm/trace_arm.h                    |   65 +-
 arch/arm64/kvm/vgic/vgic-init.c               |   33 +
 arch/arm64/kvm/vgic/vgic-kvm-device.c         |   29 +-
 arch/arm64/kvm/vgic/vgic-nested-trace.h       |  137 ++
 arch/arm64/kvm/vgic/vgic-v3-nested.c          |  244 +++
 arch/arm64/kvm/vgic/vgic-v3.c                 |   39 +-
 arch/arm64/kvm/vgic/vgic.c                    |   44 +
 arch/arm64/kvm/vgic/vgic.h                    |   10 +
 arch/arm64/tools/cpucaps                      |    2 +
 include/clocksource/arm_arch_timer.h          |    4 +
 include/kvm/arm_arch_timer.h                  |    9 +-
 include/kvm/arm_vgic.h                        |   16 +
 include/uapi/linux/kvm.h                      |    1 +
 tools/arch/arm/include/uapi/asm/kvm.h         |    1 +
 50 files changed, 5221 insertions(+), 224 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_nested.h
 create mode 100644 arch/arm64/include/asm/vncr_mapping.h
 create mode 100644 arch/arm64/kvm/at.c
 create mode 100644 arch/arm64/kvm/emulate-nested.c
 create mode 100644 arch/arm64/kvm/nested.c
 create mode 100644 arch/arm64/kvm/vgic/vgic-nested-trace.h
 create mode 100644 arch/arm64/kvm/vgic/vgic-v3-nested.c

Comments

Oliver Upton Jan. 31, 2023, 10:13 p.m. UTC | #1
On Tue, Jan 31, 2023 at 09:23:55AM +0000, Marc Zyngier wrote:
> - I would suggest that the first 12 patches are in a mergeable
>   state. Only patch #2 hasn't been reviewed yet (because it is new),
>   but it doesn't affect anything non-NV and makes NV work. I really
>   want this in before CCA! :D

Sounds like a reasonable plan to me. I'm still chewing through these
patches, but after they've soaked for a bit on the mailing list could
you repost the set destined for immediate inclusion?

OTOH, if you feel like dropping another patch bomb for the hell of it I
can just grab the subset. But please don't :)
Marc Zyngier Jan. 31, 2023, 10:20 p.m. UTC | #2
On Tue, 31 Jan 2023 22:13:37 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> On Tue, Jan 31, 2023 at 09:23:55AM +0000, Marc Zyngier wrote:
> > - I would suggest that the first 12 patches are in a mergeable
> >   state. Only patch #2 hasn't been reviewed yet (because it is new),
> >   but it doesn't affect anything non-NV and makes NV work. I really
> >   want this in before CCA! :D
> 
> Sounds like a reasonable plan to me. I'm still chewing through these
> patches, but after they've soaked for a bit on the mailing list could
> you repost the set destined for immediate inclusion?

Yup, no problem. Probably 11 patches, maybe a couple more if I manage
to reorder the series.

> OTOH, if you feel like dropping another patch bomb for the hell of it I
> can just grab the subset. But please don't :)

Hey, now you know what it feels like! :D

Thanks,

	M.