From patchwork Mon Nov 20 13:09:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13461281 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7EE89C54E76 for ; Mon, 20 Nov 2023 13:16:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=eyYQWpqNaKH81mrWGu+D3YvsR+JgguoN9it/iSrz9tA=; b=bZh72dAUAFyRcC QLm24zfl1AOGD36h7XUmRcH18FFyobdfVTWFPnxUx24s2iYYVFDSmKBljdDZWnNG0+dB+a/YbJdW1 XVr6G0yf7UXHOPKdjgmZwJJRHc6piywr7zNyMu/K9CJdt0RCBGQd7umpsLojwWth/nj/afIySUW0F 3gps4gZpVZVVhc/emyjQ+YdrJ2/Vt9IhheuWWAjUcW+e7INASbgUcCIFNILTp9HBap0I4fJObLdEN ZbHntpGmjOJGiB2sg+4JqmmjEQALwrEa8Wx00uCQL086S2t+u4D2ubjzzfT/QzH9Dlt6gK8jbi4zY B1g5/YDnCY3YRAFaS+vw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1r548H-00C6s2-0o; Mon, 20 Nov 2023 13:16:09 +0000 Received: from ams.source.kernel.org ([2604:1380:4601:e00::1]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1r543F-00C4eF-1F for linux-arm-kernel@lists.infradead.org; Mon, 20 Nov 2023 13:11:16 +0000 Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by ams.source.kernel.org (Postfix) with ESMTP id 0F9ECB81711; Mon, 20 Nov 2023 13:10:55 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 41A33C433CA; Mon, 20 Nov 2023 13:10:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1700485854; bh=pcq9eDyd3zroSkY4fvUbP91Vx46hWwotJq9B7JexxGw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Tc82GFDbzWen3urS9jdf7npHTU6DfON91zJ8Wdnl6D5uAdFU+H8/DzII1TSb2ykQp gdJ4GTiPY3XVp/+lAyilDAT6Ce/Crdy63wNH/hFI8p7sUzE6m/SgZAESCByTBhB1kD E+IADbbRuYck+RUt8gMoIfvwbSbk7kAUFzP+G4YIcvKh+Jp7MCGSpTzuZqX4/4445O TLd2ojofojdcGJmqf9YLZgOCBa7YZiYc0Ka2MBm5nEb1gfrmDwGUWAv4wzm62hbtUx MmmuD+OZFBPtdH2eDbsxOfl2GM81ohwFadwLkhLiqDylXQRmlvRJsZucXS8X+H0mq9 k9ahoizeqQVTw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1r543A-00EjnU-Bb; Mon, 20 Nov 2023 13:10:52 +0000 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v11 13/43] KVM: arm64: nv: Save/Restore vEL2 sysregs Date: Mon, 20 Nov 2023 13:09:57 +0000 Message-Id: <20231120131027.854038-14-maz@kernel.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231120131027.854038-1-maz@kernel.org> References: <20231120131027.854038-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231120_051058_634576_598FC37F X-CRM114-Status: GOOD ( 22.76 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Whenever we need to restore the guest's system registers to the CPU, we now need to take care of the EL2 system registers as well. Most of them are accessed via traps only, but some have an immediate effect and also a guest running in VHE mode would expect them to be accessible via their EL1 encoding, which we do not trap. For vEL2 we write the virtual EL2 registers with an identical format directly into their EL1 counterpart, and translate the few registers that have a different format for the same effect on the execution when running a non-VHE guest guest hypervisor. Based on an initial patch from Andre Przywara, rewritten many times since. Reviewed-by: Alexandru Elisei Signed-off-by: Marc Zyngier --- arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 5 +- arch/arm64/kvm/hyp/nvhe/sysreg-sr.c | 2 +- arch/arm64/kvm/hyp/vhe/sysreg-sr.c | 133 ++++++++++++++++++++- 3 files changed, 135 insertions(+), 5 deletions(-) diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h index bb6b571ec627..3472eb6628e6 100644 --- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h +++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h @@ -97,9 +97,10 @@ static inline void __sysreg_restore_user_state(struct kvm_cpu_context *ctxt) write_sysreg(ctxt_sys_reg(ctxt, TPIDRRO_EL0), tpidrro_el0); } -static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt) +static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt, + u64 mpidr) { - write_sysreg(ctxt_sys_reg(ctxt, MPIDR_EL1), vmpidr_el2); + write_sysreg(mpidr, vmpidr_el2); if (has_vhe() || !cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) { diff --git a/arch/arm64/kvm/hyp/nvhe/sysreg-sr.c b/arch/arm64/kvm/hyp/nvhe/sysreg-sr.c index 29305022bc04..dba101565de3 100644 --- a/arch/arm64/kvm/hyp/nvhe/sysreg-sr.c +++ b/arch/arm64/kvm/hyp/nvhe/sysreg-sr.c @@ -28,7 +28,7 @@ void __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt) void __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt) { - __sysreg_restore_el1_state(ctxt); + __sysreg_restore_el1_state(ctxt, ctxt_sys_reg(ctxt, MPIDR_EL1)); __sysreg_restore_common_state(ctxt); __sysreg_restore_user_state(ctxt); __sysreg_restore_el2_return_state(ctxt); diff --git a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c index 8e1e0d5033b6..d33f64cd0745 100644 --- a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c +++ b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c @@ -15,6 +15,104 @@ #include #include +static void __sysreg_save_vel2_state(struct kvm_cpu_context *ctxt) +{ + /* These registers are common with EL1 */ + ctxt_sys_reg(ctxt, PAR_EL1) = read_sysreg(par_el1); + ctxt_sys_reg(ctxt, TPIDR_EL1) = read_sysreg(tpidr_el1); + + ctxt_sys_reg(ctxt, ESR_EL2) = read_sysreg_el1(SYS_ESR); + ctxt_sys_reg(ctxt, AFSR0_EL2) = read_sysreg_el1(SYS_AFSR0); + ctxt_sys_reg(ctxt, AFSR1_EL2) = read_sysreg_el1(SYS_AFSR1); + ctxt_sys_reg(ctxt, FAR_EL2) = read_sysreg_el1(SYS_FAR); + ctxt_sys_reg(ctxt, MAIR_EL2) = read_sysreg_el1(SYS_MAIR); + ctxt_sys_reg(ctxt, VBAR_EL2) = read_sysreg_el1(SYS_VBAR); + ctxt_sys_reg(ctxt, CONTEXTIDR_EL2) = read_sysreg_el1(SYS_CONTEXTIDR); + ctxt_sys_reg(ctxt, AMAIR_EL2) = read_sysreg_el1(SYS_AMAIR); + + /* + * In VHE mode those registers are compatible between EL1 and EL2, + * and the guest uses the _EL1 versions on the CPU naturally. + * So we save them into their _EL2 versions here. + * For nVHE mode we trap accesses to those registers, so our + * _EL2 copy in sys_regs[] is always up-to-date and we don't need + * to save anything here. + */ + if (__vcpu_el2_e2h_is_set(ctxt)) { + u64 val; + + ctxt_sys_reg(ctxt, SCTLR_EL2) = read_sysreg_el1(SYS_SCTLR); + ctxt_sys_reg(ctxt, CPTR_EL2) = read_sysreg_el1(SYS_CPACR); + ctxt_sys_reg(ctxt, TTBR0_EL2) = read_sysreg_el1(SYS_TTBR0); + ctxt_sys_reg(ctxt, TTBR1_EL2) = read_sysreg_el1(SYS_TTBR1); + ctxt_sys_reg(ctxt, TCR_EL2) = read_sysreg_el1(SYS_TCR); + + /* + * The EL1 view of CNTKCTL_EL1 has a bunch of RES0 bits where + * the interesting CNTHCTL_EL2 bits live. So preserve these + * bits when reading back the guest-visible value. + */ + val = read_sysreg_el1(SYS_CNTKCTL); + val &= CNTKCTL_VALID_BITS; + ctxt_sys_reg(ctxt, CNTHCTL_EL2) &= ~CNTKCTL_VALID_BITS; + ctxt_sys_reg(ctxt, CNTHCTL_EL2) |= val; + } + + ctxt_sys_reg(ctxt, SP_EL2) = read_sysreg(sp_el1); + ctxt_sys_reg(ctxt, ELR_EL2) = read_sysreg_el1(SYS_ELR); + ctxt_sys_reg(ctxt, SPSR_EL2) = read_sysreg_el1(SYS_SPSR); +} + +static void __sysreg_restore_vel2_state(struct kvm_cpu_context *ctxt) +{ + u64 val; + + /* These registers are common with EL1 */ + write_sysreg(ctxt_sys_reg(ctxt, PAR_EL1), par_el1); + write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL1), tpidr_el1); + + write_sysreg(read_cpuid_id(), vpidr_el2); + write_sysreg(ctxt_sys_reg(ctxt, MPIDR_EL1), vmpidr_el2); + write_sysreg_el1(ctxt_sys_reg(ctxt, MAIR_EL2), SYS_MAIR); + write_sysreg_el1(ctxt_sys_reg(ctxt, VBAR_EL2), SYS_VBAR); + write_sysreg_el1(ctxt_sys_reg(ctxt, CONTEXTIDR_EL2),SYS_CONTEXTIDR); + write_sysreg_el1(ctxt_sys_reg(ctxt, AMAIR_EL2), SYS_AMAIR); + + if (__vcpu_el2_e2h_is_set(ctxt)) { + /* + * In VHE mode those registers are compatible between + * EL1 and EL2. + */ + write_sysreg_el1(ctxt_sys_reg(ctxt, SCTLR_EL2), SYS_SCTLR); + write_sysreg_el1(ctxt_sys_reg(ctxt, CPTR_EL2), SYS_CPACR); + write_sysreg_el1(ctxt_sys_reg(ctxt, TTBR0_EL2), SYS_TTBR0); + write_sysreg_el1(ctxt_sys_reg(ctxt, TTBR1_EL2), SYS_TTBR1); + write_sysreg_el1(ctxt_sys_reg(ctxt, TCR_EL2), SYS_TCR); + write_sysreg_el1(ctxt_sys_reg(ctxt, CNTHCTL_EL2), SYS_CNTKCTL); + } else { + /* + * CNTHCTL_EL2 only affects EL1 when running nVHE, so + * no need to restore it. + */ + val = translate_sctlr_el2_to_sctlr_el1(ctxt_sys_reg(ctxt, SCTLR_EL2)); + write_sysreg_el1(val, SYS_SCTLR); + val = translate_cptr_el2_to_cpacr_el1(ctxt_sys_reg(ctxt, CPTR_EL2)); + write_sysreg_el1(val, SYS_CPACR); + val = translate_ttbr0_el2_to_ttbr0_el1(ctxt_sys_reg(ctxt, TTBR0_EL2)); + write_sysreg_el1(val, SYS_TTBR0); + val = translate_tcr_el2_to_tcr_el1(ctxt_sys_reg(ctxt, TCR_EL2)); + write_sysreg_el1(val, SYS_TCR); + } + + write_sysreg_el1(ctxt_sys_reg(ctxt, ESR_EL2), SYS_ESR); + write_sysreg_el1(ctxt_sys_reg(ctxt, AFSR0_EL2), SYS_AFSR0); + write_sysreg_el1(ctxt_sys_reg(ctxt, AFSR1_EL2), SYS_AFSR1); + write_sysreg_el1(ctxt_sys_reg(ctxt, FAR_EL2), SYS_FAR); + write_sysreg(ctxt_sys_reg(ctxt, SP_EL2), sp_el1); + write_sysreg_el1(ctxt_sys_reg(ctxt, ELR_EL2), SYS_ELR); + write_sysreg_el1(ctxt_sys_reg(ctxt, SPSR_EL2), SYS_SPSR); +} + /* * VHE: Host and guest must save mdscr_el1 and sp_el0 (and the PC and * pstate, which are handled as part of the el2 return state) on every @@ -66,6 +164,7 @@ void __vcpu_load_switch_sysregs(struct kvm_vcpu *vcpu) { struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt; struct kvm_cpu_context *host_ctxt; + u64 mpidr; host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; __sysreg_save_user_state(host_ctxt); @@ -89,7 +188,29 @@ void __vcpu_load_switch_sysregs(struct kvm_vcpu *vcpu) */ __sysreg32_restore_state(vcpu); __sysreg_restore_user_state(guest_ctxt); - __sysreg_restore_el1_state(guest_ctxt); + + if (unlikely(__is_hyp_ctxt(guest_ctxt))) { + __sysreg_restore_vel2_state(guest_ctxt); + } else { + if (vcpu_has_nv(vcpu)) { + /* + * Only set VPIDR_EL2 for nested VMs, as this is the + * only time it changes. We'll restore the MIDR_EL1 + * view on put. + */ + write_sysreg(ctxt_sys_reg(guest_ctxt, VPIDR_EL2), vpidr_el2); + + /* + * As we're restoring a nested guest, set the value + * provided by the guest hypervisor. + */ + mpidr = ctxt_sys_reg(guest_ctxt, VMPIDR_EL2); + } else { + mpidr = ctxt_sys_reg(guest_ctxt, MPIDR_EL1); + } + + __sysreg_restore_el1_state(guest_ctxt, mpidr); + } vcpu_set_flag(vcpu, SYSREGS_ON_CPU); } @@ -112,12 +233,20 @@ void __vcpu_put_switch_sysregs(struct kvm_vcpu *vcpu) host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt; - __sysreg_save_el1_state(guest_ctxt); + if (unlikely(__is_hyp_ctxt(guest_ctxt))) + __sysreg_save_vel2_state(guest_ctxt); + else + __sysreg_save_el1_state(guest_ctxt); + __sysreg_save_user_state(guest_ctxt); __sysreg32_save_state(vcpu); /* Restore host user state */ __sysreg_restore_user_state(host_ctxt); + /* If leaving a nesting guest, restore MPIDR_EL1 default view */ + if (vcpu_has_nv(vcpu)) + write_sysreg(read_cpuid_id(), vpidr_el2); + vcpu_clear_flag(vcpu, SYSREGS_ON_CPU); }