From patchwork Thu Jan 12 19:18:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13098640 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CBF0DC63797 for ; Thu, 12 Jan 2023 19:26:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232268AbjALT04 (ORCPT ); Thu, 12 Jan 2023 14:26:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47598 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240376AbjALT0Z (ORCPT ); Thu, 12 Jan 2023 14:26:25 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 83B6E5F52 for ; Thu, 12 Jan 2023 11:19:40 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2092761AA0 for ; Thu, 12 Jan 2023 19:19:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5C997C433D2; Thu, 12 Jan 2023 19:19:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1673551179; bh=cLsuM6GQSJJ+WSNit88smbjtMO9dez2naL2xP2db/qE=; h=From:To:Cc:Subject:Date:From; b=F7QcqbPwgqqSfrbPn7MbFXPDXceyx/9ezfMUWMVfwxBaA+xdJOyExApV5w5QHsX9t vjkfJegaCxqSaE6g6lup38/x3ihalJr/QuoNFmozl7f5fTWkRGE3xQe88hMl2RybTg equJ/G96Cqa6dJmI3007+uqKIa3YF4weUpMXfRw+3jJMowpdH8PDS+4OS0ZLOmnCck PRGRxuPpe6dh5lGGayKldxWp1LtbgYYnDLSH9EbcTuxJLPsHlfyE1Xp/WJAEYWXHNW QYCIxXEIHqDNQ8QfqdG6FztRNOijdtE1YPg1+6CbVb2O57uxsFKaLDWAyd5Pf/T9a9 fg86uLkUaFckw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pG36v-001IWu-1I; Thu, 12 Jan 2023 19:19:37 +0000 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Jintack Lim , Russell King , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v7 00/68] KVM: arm64: ARMv8.3/8.4 Nested Virtualization support Date: Thu, 12 Jan 2023 19:18:19 +0000 Message-Id: <20230112191927.1814989-1-maz@kernel.org> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org After almost a year of not doing much on the NV front, I've finally landed some actual hardware that allowed me to make decent progress. For the previous episodes, see [1]. What's changed: - Timers have been massively reworked, mostly enabling ECV. The support is still incomplete, as we don't support ECV in guests, but it allows timers to be reliable. - A lot of bugs (both big and small) have been squashed, too many to list here. Funny how quickly these show up on real HW! - A bunch of things have been reworked, mostly a result of Alexandru's thorough review effort. - All rebased on 6.2-rc2, with a few dependencies that I have already posted. Full branch at [2]. Performance: - This is surprisingly good. Specially after years of watching paint dry in a model. Of course, if your workload is exit/trap heavy, it will suck. But not quite as much as on the model, which is the benchmark. Testing: - No model involved this time around. I have solely run the series on an Apple M2 with 16k pages, because that's what I have, and I consider it the reference (HW you cannot get your hands on doesn't really count). - Because the M2 is "pretty lax" with the architecture, a bunch of things cannot be tested (E2H=0 being the most obvious one). - For the same reason, I may be relying on non-architectural behaviours. I've done my best not to, but you never know. Until I get a machine that *is* fully compliant with the architecture, it is a risk I'll have to take. - The series has a number of hacks for the M2 to actually work (vgic emulation, mostly). - I haven't tested FEAT_NV (only FEAT_NV2), and seriously considering dropping the support for it. I doubt there is any HW solely implementing FEAT_NV and not FEAT_NV2. - There is a kvmtool branch that allows you to use the --nested option at [3]. I have no idea what became of the equivalent QEMU patches. Many thanks to Alexandru, Chase, Ganapatrao and Russell for spending a lot of time reviewing this, spotting issues and providing helpful suggestions. [1] https://lore.kernel.org/r/20220128121912.509006-1-maz@kernel.org [2] git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git kvm-arm64/nv-6.2-WIP [3] https://git.kernel.org/pub/scm/linux/kernel/git/maz/kvmtool.git/log/?h=arm64/nv-5.16 Andre Przywara (1): KVM: arm64: nv: vgic: Allow userland to set VGIC maintenance IRQ Christoffer Dall (13): KVM: arm64: nv: Introduce nested virtualization VCPU feature KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state KVM: arm64: nv: Reset VMPIDR_EL2 and VPIDR_EL2 to sane values KVM: arm64: nv: Handle trapped ERET from virtual EL2 KVM: arm64: nv: Trap EL1 VM register accesses in virtual EL2 KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes KVM: arm64: nv: Implement nested Stage-2 page table walk logic KVM: arm64: nv: Unmap/flush shadow stage 2 page tables KVM: arm64: nv: vgic: Emulate the HW bit in software KVM: arm64: nv: Add nested GICv3 tracepoints KVM: arm64: nv: Sync nested timer state with FEAT_NV2 Jintack Lim (16): arm64: Add ARM64_HAS_NESTED_VIRT cpufeature KVM: arm64: nv: Handle HCR_EL2.NV system register traps KVM: arm64: nv: Support virtual EL2 exceptions KVM: arm64: nv: Inject HVC exceptions to the virtual EL2 KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2 KVM: arm64: nv: Trap CPACR_EL1 access in virtual EL2 KVM: arm64: nv: Handle PSCI call via smc from the guest KVM: arm64: nv: Respect virtual HCR_EL2.TWX setting KVM: arm64: nv: Respect virtual CPTR_EL2.{TFP,FPEN} settings KVM: arm64: nv: Respect the virtual HCR_EL2.NV bit setting KVM: arm64: nv: Respect virtual HCR_EL2.TVM and TRVM settings KVM: arm64: nv: Respect the virtual HCR_EL2.NV1 bit setting KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2 KVM: arm64: nv: Configure HCR_EL2 for nested virtualization KVM: arm64: nv: Trap and emulate TLBI instructions from virtual EL2 KVM: arm64: nv: Nested GICv3 Support Marc Zyngier (38): KVM: arm64: nv: Add EL2 system registers to vcpu context KVM: arm64: nv: Add non-VHE-EL2->EL1 translation helpers KVM: arm64: nv: Handle virtual EL2 registers in vcpu_read/write_sys_reg() KVM: arm64: nv: Handle SPSR_EL2 specially KVM: arm64: nv: Handle HCR_EL2.E2H specially KVM: arm64: nv: Save/Restore vEL2 sysregs KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor KVM: arm64: nv: Allow a sysreg to be hidden from userspace only KVM: arm64: nv: Forward debug traps to the nested guest KVM: arm64: nv: Filter out unsupported features from ID regs KVM: arm64: nv: Hide RAS from nested guests KVM: arm64: nv: Support multiple nested Stage-2 mmu structures KVM: arm64: nv: Handle shadow stage 2 page faults KVM: arm64: nv: Restrict S2 RD/WR permissions to match the guest's KVM: arm64: nv: Set a handler for the system instruction traps KVM: arm64: nv: Trap and emulate AT instructions from virtual EL2 KVM: arm64: nv: Fold guest's HCR_EL2 configuration into the host's KVM: arm64: nv: arch_timer: Support hyp timer emulation KVM: arm64: nv: Add handling of EL2-specific timer registers KVM: arm64: nv: Load timer before the GIC KVM: arm64: nv: Don't load the GICv4 context on entering a nested guest KVM: arm64: nv: Implement maintenance interrupt forwarding KVM: arm64: nv: Deal with broken VGIC on maintenance interrupt delivery KVM: arm64: nv: Allow userspace to request KVM_ARM_VCPU_NESTED_VIRT KVM: arm64: nv: Add handling of FEAT_TTL TLB invalidation KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information KVM: arm64: nv: Tag shadow S2 entries with nested level KVM: arm64: nv: Add include containing the VNCR_EL2 offsets KVM: arm64: nv: Map VNCR-capable registers to a separate page KVM: arm64: nv: Move nested vgic state into the sysreg file KVM: arm64: Add FEAT_NV2 cpu feature KVM: arm64: nv: Publish emulated timer interrupt state in the in-memory state KVM: arm64: nv: Allocate VNCR page when required KVM: arm64: nv: Enable ARMv8.4-NV support KVM: arm64: nv: Fast-track 'InHost' exception returns KVM: arm64: nv: Fast-track EL1 TLBIs for VHE guests KVM: arm64: nv: Use FEAT_ECV to trap access to EL0 timers KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV is on .../admin-guide/kernel-parameters.txt | 7 +- .../virt/kvm/devices/arm-vgic-v3.rst | 12 +- arch/arm64/include/asm/esr.h | 5 + arch/arm64/include/asm/kvm_arm.h | 27 +- arch/arm64/include/asm/kvm_asm.h | 4 + arch/arm64/include/asm/kvm_emulate.h | 159 +- arch/arm64/include/asm/kvm_host.h | 195 ++- arch/arm64/include/asm/kvm_hyp.h | 2 + arch/arm64/include/asm/kvm_mmu.h | 23 +- arch/arm64/include/asm/kvm_nested.h | 150 ++ arch/arm64/include/asm/sysreg.h | 102 +- arch/arm64/include/asm/vncr_mapping.h | 74 + arch/arm64/include/uapi/asm/kvm.h | 2 + arch/arm64/kernel/cpufeature.c | 36 + arch/arm64/kvm/Makefile | 4 +- arch/arm64/kvm/arch_timer.c | 301 +++- arch/arm64/kvm/arm.c | 38 +- arch/arm64/kvm/at.c | 219 +++ arch/arm64/kvm/emulate-nested.c | 217 +++ arch/arm64/kvm/guest.c | 6 + arch/arm64/kvm/handle_exit.c | 74 +- arch/arm64/kvm/hyp/exception.c | 48 +- arch/arm64/kvm/hyp/include/hyp/switch.h | 12 +- arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 24 +- arch/arm64/kvm/hyp/nvhe/switch.c | 2 +- arch/arm64/kvm/hyp/nvhe/sysreg-sr.c | 2 +- arch/arm64/kvm/hyp/vgic-v3-sr.c | 2 +- arch/arm64/kvm/hyp/vhe/switch.c | 230 ++- arch/arm64/kvm/hyp/vhe/sysreg-sr.c | 127 +- arch/arm64/kvm/hyp/vhe/tlb.c | 83 + arch/arm64/kvm/inject_fault.c | 61 +- arch/arm64/kvm/mmu.c | 247 ++- arch/arm64/kvm/nested.c | 896 +++++++++++ arch/arm64/kvm/reset.c | 17 + arch/arm64/kvm/sys_regs.c | 1377 ++++++++++++++++- arch/arm64/kvm/sys_regs.h | 14 +- arch/arm64/kvm/trace_arm.h | 65 +- arch/arm64/kvm/vgic/vgic-init.c | 32 + arch/arm64/kvm/vgic/vgic-kvm-device.c | 29 +- arch/arm64/kvm/vgic/vgic-nested-trace.h | 137 ++ arch/arm64/kvm/vgic/vgic-v3-nested.c | 244 +++ arch/arm64/kvm/vgic/vgic-v3.c | 39 +- arch/arm64/kvm/vgic/vgic.c | 44 + arch/arm64/kvm/vgic/vgic.h | 10 + arch/arm64/tools/cpucaps | 2 + include/clocksource/arm_arch_timer.h | 4 + include/kvm/arm_arch_timer.h | 9 +- include/kvm/arm_vgic.h | 16 + include/uapi/linux/kvm.h | 1 + tools/arch/arm/include/uapi/asm/kvm.h | 1 + 50 files changed, 5215 insertions(+), 217 deletions(-) create mode 100644 arch/arm64/include/asm/kvm_nested.h create mode 100644 arch/arm64/include/asm/vncr_mapping.h create mode 100644 arch/arm64/kvm/at.c create mode 100644 arch/arm64/kvm/emulate-nested.c create mode 100644 arch/arm64/kvm/nested.c create mode 100644 arch/arm64/kvm/vgic/vgic-nested-trace.h create mode 100644 arch/arm64/kvm/vgic/vgic-v3-nested.c