From patchwork Wed Apr 5 15:39:19 2023
X-Patchwork-Submitter: Marc Zyngier
X-Patchwork-Id: 13202021
From: Marc Zyngier <maz@kernel.org>
To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: Alexandru Elisei, Andre Przywara, Chase Conklin, Christoffer Dall,
    Ganapatrao Kulkarni, Darren Hart, Jintack Lim, Russell King,
    Miguel Luis, James Morse, Suzuki K Poulose, Oliver Upton, Zenghui Yu
Subject: [PATCH v9 01/50] KVM: arm64: nv: Add non-VHE-EL2->EL1 translation helpers
Date: Wed, 5 Apr 2023 16:39:19 +0100
Message-Id: <20230405154008.3552854-2-maz@kernel.org>
In-Reply-To: <20230405154008.3552854-1-maz@kernel.org>

Some EL2 system registers immediately affect the current
execution of the system, so we need to use their respective EL1
counterparts. For this we need to define a mapping between the two.

In general, this only affects non-VHE guest hypervisors, as VHE
system registers are compatible with their EL1 counterparts.

These helpers will get used in subsequent patches.

Co-developed-by: Andre Przywara
Signed-off-by: Andre Przywara
Signed-off-by: Marc Zyngier
---
 arch/arm64/include/asm/kvm_nested.h | 48 +++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h
index 8fb67f032fd1..f8c4db870520 100644
--- a/arch/arm64/include/asm/kvm_nested.h
+++ b/arch/arm64/include/asm/kvm_nested.h
@@ -2,6 +2,7 @@
 #ifndef __ARM64_KVM_NESTED_H
 #define __ARM64_KVM_NESTED_H
 
+#include
 #include
 
 static inline bool vcpu_has_nv(const struct kvm_vcpu *vcpu)
@@ -11,6 +12,53 @@ static inline bool vcpu_has_nv(const struct kvm_vcpu *vcpu)
 		test_bit(KVM_ARM_VCPU_HAS_EL2, vcpu->arch.features));
 }
 
+/* Translation helpers from non-VHE EL2 to EL1 */
+static inline u64 tcr_el2_ps_to_tcr_el1_ips(u64 tcr_el2)
+{
+	return (u64)FIELD_GET(TCR_EL2_PS_MASK, tcr_el2) << TCR_IPS_SHIFT;
+}
+
+static inline u64 translate_tcr_el2_to_tcr_el1(u64 tcr)
+{
+	return TCR_EPD1_MASK |				/* disable TTBR1_EL1 */
+	       ((tcr & TCR_EL2_TBI) ? TCR_TBI0 : 0) |
+	       tcr_el2_ps_to_tcr_el1_ips(tcr) |
+	       (tcr & TCR_EL2_TG0_MASK) |
+	       (tcr & TCR_EL2_ORGN0_MASK) |
+	       (tcr & TCR_EL2_IRGN0_MASK) |
+	       (tcr & TCR_EL2_T0SZ_MASK);
+}
+
+static inline u64 translate_cptr_el2_to_cpacr_el1(u64 cptr_el2)
+{
+	u64 cpacr_el1 = 0;
+
+	if (cptr_el2 & CPTR_EL2_TTA)
+		cpacr_el1 |= CPACR_ELx_TTA;
+	if (!(cptr_el2 & CPTR_EL2_TFP))
+		cpacr_el1 |= CPACR_ELx_FPEN;
+	if (!(cptr_el2 & CPTR_EL2_TZ))
+		cpacr_el1 |= CPACR_ELx_ZEN;
+
+	return cpacr_el1;
+}
+
+static inline u64 translate_sctlr_el2_to_sctlr_el1(u64 val)
+{
+	/* Only preserve the minimal set of bits we support */
+	val &= (SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C | SCTLR_ELx_SA |
+		SCTLR_ELx_I | SCTLR_ELx_IESB | SCTLR_ELx_WXN | SCTLR_ELx_EE);
+	val |= SCTLR_EL1_RES1;
+
+	return val;
+}
+
+static inline u64 translate_ttbr0_el2_to_ttbr0_el1(u64 ttbr0)
+{
+	/* Clear the ASID field */
+	return ttbr0 & ~GENMASK_ULL(63, 48);
+}
+
 struct sys_reg_params;
 struct sys_reg_desc;
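[As a quick illustration of what the CPTR_EL2 translation does, here is a
minimal user-space sketch. The bit positions are open-coded from the
architecture (CPTR_EL2.TZ/TFP/TTA at bits 8/10/20, CPACR_EL1.ZEN/FPEN/TTA
at bits [17:16]/[21:20]/28); the constant names mirror the kernel's, but
the values below are spelled out here and are not taken from the patch.]

    #include <stdint.h>
    #include <stdio.h>

    #define CPTR_EL2_TZ     (UINT64_C(1) << 8)
    #define CPTR_EL2_TFP    (UINT64_C(1) << 10)
    #define CPTR_EL2_TTA    (UINT64_C(1) << 20)
    #define CPACR_ELx_ZEN   (UINT64_C(3) << 16)
    #define CPACR_ELx_FPEN  (UINT64_C(3) << 20)
    #define CPACR_ELx_TTA   (UINT64_C(1) << 28)

    /*
     * Same logic as translate_cptr_el2_to_cpacr_el1(): a *set* trap bit
     * in the nVHE CPTR_EL2 format means the corresponding FPEN/ZEN
     * enable field must stay 0 (i.e. keep trapping) in CPACR_EL1.
     */
    static uint64_t xlate_cptr(uint64_t cptr)
    {
            uint64_t cpacr = 0;

            if (cptr & CPTR_EL2_TTA)
                    cpacr |= CPACR_ELx_TTA;
            if (!(cptr & CPTR_EL2_TFP))
                    cpacr |= CPACR_ELx_FPEN;
            if (!(cptr & CPTR_EL2_TZ))
                    cpacr |= CPACR_ELx_ZEN;
            return cpacr;
    }

    int main(void)
    {
            /* TFP set: FP/SIMD traps, so FPEN stays 0 -> prints 0x30000 */
            printf("%#llx\n", (unsigned long long)xlate_cptr(CPTR_EL2_TFP));
            /* nothing trapped: FPEN and ZEN enabled -> prints 0x330000 */
            printf("%#llx\n", (unsigned long long)xlate_cptr(0));
            return 0;
    }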
From patchwork Wed Apr 5 15:39:20 2023
X-Patchwork-Submitter: Marc Zyngier
X-Patchwork-Id: 13202023
From: Marc Zyngier <maz@kernel.org>
Subject: [PATCH v9 02/50] KVM: arm64: nv: Handle virtual EL2 registers in vcpu_read/write_sys_reg()
Date: Wed, 5 Apr 2023 16:39:20 +0100
Message-Id: <20230405154008.3552854-3-maz@kernel.org>
In-Reply-To: <20230405154008.3552854-1-maz@kernel.org>

KVM internally uses accessor functions when reading or writing the
guest's system registers. This takes care of accessing either the
stored copy or using the "live" EL1 system registers when the host
uses VHE.

With the introduction of virtual EL2 we add a bunch of EL2 system
registers, which now must also be taken care of:

- If the guest is running in vEL2, and we access an EL1 sysreg, we
  must revert to the stored version of that register, and not use the
  CPU's copy.

- If the guest is running in vEL1, and we access an EL2 sysreg, we
  must also use the stored version, since the CPU carries the EL1
  copy.

- Some EL2 system registers are supposed to affect the current
  execution of the system, so we need to put them into their
  respective EL1 counterparts. For this we need to define a mapping
  between the two. This is done by the newly introduced
  get_el2_to_el1_mapping() helper.

- Some EL2 system registers have a different format than their EL1
  counterpart, so we need to translate them before writing them to
  the CPU. This is done using an (optional) translate function in the
  mapping.

- There are three special registers, SP_EL2, SPSR_EL2 and ELR_EL2,
  which need some separate handling (SPSR_EL2 is handled in a
  separate patch).
All of these cases are now wrapped into the existing accessor
functions, so KVM users don't need to care whether they access EL2 or
EL1 registers, nor which state the guest is in. This handles what was
formerly known as the "shadow state" dynamically, without requiring a
separate copy for each vCPU EL.

Reviewed-by: Ganapatrao Kulkarni
Reviewed-by: Alexandru Elisei
Reviewed-by: Russell King (Oracle)
Co-developed-by: Andre Przywara
Signed-off-by: Andre Przywara
Signed-off-by: Marc Zyngier
---
 arch/arm64/kvm/sys_regs.c | 143 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 138 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index f19c69ef4471..73580d9440fe 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -63,24 +63,157 @@ static bool write_to_read_only(struct kvm_vcpu *vcpu,
 	return false;
 }
 
+#define PURE_EL2_SYSREG(el2)						\
+	case el2: {							\
+		*el1r = el2;						\
+		return true;						\
+	}
+
+#define MAPPED_EL2_SYSREG(el2, el1, fn)					\
+	case el2: {							\
+		*xlate = fn;						\
+		*el1r = el1;						\
+		return true;						\
+	}
+
+static bool get_el2_to_el1_mapping(unsigned int reg,
+				   unsigned int *el1r, u64 (**xlate)(u64))
+{
+	switch (reg) {
+		PURE_EL2_SYSREG(  VPIDR_EL2	);
+		PURE_EL2_SYSREG(  VMPIDR_EL2	);
+		PURE_EL2_SYSREG(  ACTLR_EL2	);
+		PURE_EL2_SYSREG(  HCR_EL2	);
+		PURE_EL2_SYSREG(  MDCR_EL2	);
+		PURE_EL2_SYSREG(  HSTR_EL2	);
+		PURE_EL2_SYSREG(  HACR_EL2	);
+		PURE_EL2_SYSREG(  VTTBR_EL2	);
+		PURE_EL2_SYSREG(  VTCR_EL2	);
+		PURE_EL2_SYSREG(  RVBAR_EL2	);
+		PURE_EL2_SYSREG(  TPIDR_EL2	);
+		PURE_EL2_SYSREG(  HPFAR_EL2	);
+		PURE_EL2_SYSREG(  ELR_EL2	);
+		PURE_EL2_SYSREG(  SPSR_EL2	);
+		PURE_EL2_SYSREG(  CNTHCTL_EL2	);
+		MAPPED_EL2_SYSREG(SCTLR_EL2,   SCTLR_EL1,
+				  translate_sctlr_el2_to_sctlr_el1	);
+		MAPPED_EL2_SYSREG(CPTR_EL2,    CPACR_EL1,
+				  translate_cptr_el2_to_cpacr_el1	);
+		MAPPED_EL2_SYSREG(TTBR0_EL2,   TTBR0_EL1,
+				  translate_ttbr0_el2_to_ttbr0_el1	);
+		MAPPED_EL2_SYSREG(TTBR1_EL2,   TTBR1_EL1,   NULL	);
+		MAPPED_EL2_SYSREG(TCR_EL2,     TCR_EL1,
+				  translate_tcr_el2_to_tcr_el1		);
+		MAPPED_EL2_SYSREG(VBAR_EL2,    VBAR_EL1,    NULL	);
+		MAPPED_EL2_SYSREG(AFSR0_EL2,   AFSR0_EL1,   NULL	);
+		MAPPED_EL2_SYSREG(AFSR1_EL2,   AFSR1_EL1,   NULL	);
+		MAPPED_EL2_SYSREG(ESR_EL2,     ESR_EL1,     NULL	);
+		MAPPED_EL2_SYSREG(FAR_EL2,     FAR_EL1,     NULL	);
+		MAPPED_EL2_SYSREG(MAIR_EL2,    MAIR_EL1,    NULL	);
+		MAPPED_EL2_SYSREG(AMAIR_EL2,   AMAIR_EL1,   NULL	);
+	default:
+		return false;
+	}
+}
+
 u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
 {
 	u64 val = 0x8badf00d8badf00d;
+	u64 (*xlate)(u64) = NULL;
+	unsigned int el1r;
 
-	if (vcpu_get_flag(vcpu, SYSREGS_ON_CPU) &&
-	    __vcpu_read_sys_reg_from_cpu(reg, &val))
+	if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
+		goto memory_read;
+
+	if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
+		if (!is_hyp_ctxt(vcpu))
+			goto memory_read;
+
+		/*
+		 * ELR_EL2 is special cased for now.
+		 */
+		switch (reg) {
+		case ELR_EL2:
+			return read_sysreg_el1(SYS_ELR);
+		}
+
+		/*
+		 * If this register does not have an EL1 counterpart,
+		 * then read the stored EL2 version.
+		 */
+		if (reg == el1r)
+			goto memory_read;
+
+		/*
+		 * If we have a non-VHE guest and the sysreg requires
+		 * translation to be used at EL1, use the in-memory
+		 * copy instead.
+		 */
+		if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
+			goto memory_read;
+
+		/* Get the current version of the EL1 counterpart. */
+		WARN_ON(!__vcpu_read_sys_reg_from_cpu(el1r, &val));
+		return val;
+	}
+
+	/* EL1 register can't be on the CPU if the guest is in vEL2. */
+	if (unlikely(is_hyp_ctxt(vcpu)))
+		goto memory_read;
+
+	if (__vcpu_read_sys_reg_from_cpu(reg, &val))
 		return val;
 
+memory_read:
 	return __vcpu_sys_reg(vcpu, reg);
 }
 
 void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
 {
-	if (vcpu_get_flag(vcpu, SYSREGS_ON_CPU) &&
-	    __vcpu_write_sys_reg_to_cpu(val, reg))
+	u64 (*xlate)(u64) = NULL;
+	unsigned int el1r;
+
+	if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
+		goto memory_write;
+
+	if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
+		if (!is_hyp_ctxt(vcpu))
+			goto memory_write;
+
+		/*
+		 * Always store a copy of the write to memory to avoid having
+		 * to reverse-translate virtual EL2 system registers for a
+		 * non-VHE guest hypervisor.
+		 */
+		__vcpu_sys_reg(vcpu, reg) = val;
+
+		switch (reg) {
+		case ELR_EL2:
+			write_sysreg_el1(val, SYS_ELR);
+			return;
+		}
+
+		/* No EL1 counterpart? We're done here. */
+		if (reg == el1r)
+			return;
+
+		if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
+			val = xlate(val);
+
+		/* Redirect this to the EL1 version of the register. */
+		WARN_ON(!__vcpu_write_sys_reg_to_cpu(val, el1r));
+		return;
+	}
+
+	/* EL1 register can't be on the CPU if the guest is in vEL2. */
+	if (unlikely(is_hyp_ctxt(vcpu)))
+		goto memory_write;
+
+	if (__vcpu_write_sys_reg_to_cpu(val, reg))
 		return;
 
-	__vcpu_sys_reg(vcpu, reg) = val;
+memory_write:
+	__vcpu_sys_reg(vcpu, reg) = val;
 }
 
 /* CSSELR values; used to index KVM_REG_ARM_DEMUX_ID_CCSIDR */
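[To make the macro-driven mapping concrete, this is what one of the
MAPPED_EL2_SYSREG() entries expands to inside the switch in
get_el2_to_el1_mapping(); hand-expanded here for illustration, not part
of the patch itself:]

    /* MAPPED_EL2_SYSREG(SCTLR_EL2, SCTLR_EL1,
     *                   translate_sctlr_el2_to_sctlr_el1) becomes: */
    case SCTLR_EL2: {
            /* how to convert the EL2 value to the EL1 format */
            *xlate = translate_sctlr_el2_to_sctlr_el1;
            /* which EL1 register carries it while the guest runs */
            *el1r = SCTLR_EL1;
            return true;
    }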
From patchwork Wed Apr 5 15:39:21 2023
X-Patchwork-Submitter: Marc Zyngier
X-Patchwork-Id: 13202025
From: Marc Zyngier <maz@kernel.org>
Subject: [PATCH v9 03/50] KVM: arm64: nv: Handle SPSR_EL2 specially
Date: Wed, 5 Apr 2023 16:39:21 +0100
Message-Id: <20230405154008.3552854-4-maz@kernel.org>
In-Reply-To: <20230405154008.3552854-1-maz@kernel.org>

SPSR_EL2 needs special attention when running nested on ARMv8.3:

If taking an exception while running at vEL2 (actually EL1), the HW
will update the SPSR_EL1 register with the EL1 mode. We need to track
this in order to make sure that accesses to the virtual view of
SPSR_EL2 are correct.

To do so, we place an illegal value in SPSR_EL1.M, and patch it
accordingly if required when accessing it.

Reviewed-by: Alexandru Elisei
Reviewed-by: Russell King (Oracle)
Signed-off-by: Marc Zyngier
---
 arch/arm64/include/asm/kvm_emulate.h | 37 ++++++++++++++++++++++++++++
 arch/arm64/kvm/sys_regs.c            | 23 +++++++++++++++--
 2 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index b31b32ecbe2d..2f5d3e27c017 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -245,6 +245,43 @@ static inline bool is_hyp_ctxt(const struct kvm_vcpu *vcpu)
 	return __is_hyp_ctxt(&vcpu->arch.ctxt);
 }
 
+static inline u64 __fixup_spsr_el2_write(struct kvm_cpu_context *ctxt, u64 val)
+{
+	if (!__vcpu_el2_e2h_is_set(ctxt)) {
+		/*
+		 * Clear the .M field when writing SPSR to the CPU, so that we
+		 * can detect when the CPU clobbered our SPSR copy during a
+		 * local exception.
+		 */
+		val &= ~0xc;
+	}
+
+	return val;
+}
+
+static inline u64 __fixup_spsr_el2_read(const struct kvm_cpu_context *ctxt, u64 val)
+{
+	if (__vcpu_el2_e2h_is_set(ctxt))
+		return val;
+
+	/*
+	 * SPSR.M == 0 means the CPU has not touched the SPSR, so the
+	 * register still has the value we saved on the last write.
+	 */
+	if ((val & 0xc) == 0)
+		return ctxt_sys_reg(ctxt, SPSR_EL2);
+
+	/*
+	 * Otherwise there was a "local" exception on the CPU, which from
+	 * the guest's point of view was being taken from EL2 to EL2,
+	 * although it actually happened to be from EL1 to EL1.
+	 * So we need to fix the .M field in SPSR, to make it look like
+	 * EL2, which is what the guest would expect.
+	 */
+	return (val & ~0x0c) | CurrentEL_EL2;
+}
+
 /*
  * The layout of SPSR for an AArch32 state is different when observed from an
  * AArch64 SPSR_ELx or an AArch32 SPSR_*.
 * This function generates the AArch32

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 73580d9440fe..6a7edda9e287 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -130,11 +130,14 @@ u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
 			goto memory_read;
 
 		/*
-		 * ELR_EL2 is special cased for now.
+		 * ELR_EL2 and SPSR_EL2 are special cased for now.
		 */
 		switch (reg) {
 		case ELR_EL2:
 			return read_sysreg_el1(SYS_ELR);
+		case SPSR_EL2:
+			val = read_sysreg_el1(SYS_SPSR);
+			return __fixup_spsr_el2_read(&vcpu->arch.ctxt, val);
 		}
 
@@ -191,6 +194,10 @@ void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
 		case ELR_EL2:
 			write_sysreg_el1(val, SYS_ELR);
 			return;
+		case SPSR_EL2:
+			val = __fixup_spsr_el2_write(&vcpu->arch.ctxt, val);
+			write_sysreg_el1(val, SYS_SPSR);
+			return;
 		}
 
 		/* No EL1 counterpart? We're done here. */
@@ -1877,6 +1884,18 @@ static bool access_spsr(struct kvm_vcpu *vcpu,
 	return true;
 }
 
+static bool access_spsr_el2(struct kvm_vcpu *vcpu,
+			    struct sys_reg_params *p,
+			    const struct sys_reg_desc *r)
+{
+	if (p->is_write)
+		vcpu_write_sys_reg(vcpu, p->regval, SPSR_EL2);
+	else
+		p->regval = vcpu_read_sys_reg(vcpu, SPSR_EL2);
+
+	return true;
+}
+
 /*
  * Architected system registers.
  * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
@@ -2325,7 +2344,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	EL2_REG(VTCR_EL2, access_rw, reset_val, 0),
 
 	{ SYS_DESC(SYS_DACR32_EL2), NULL, reset_unknown, DACR32_EL2 },
-	EL2_REG(SPSR_EL2, access_rw, reset_val, 0),
+	EL2_REG(SPSR_EL2, access_spsr_el2, reset_val, 0),
 	EL2_REG(ELR_EL2, access_rw, reset_val, 0),
 	{ SYS_DESC(SYS_SP_EL1), access_sp_el1},
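[A standalone sketch of the .M-field trick for the !E2H case. The
encodings are spelled out here rather than taken from the patch:
CurrentEL lives in PSTATE.M[3:2], hence the 0xc mask, and EL2 encodes
as 0b10 in those bits (0x8).]

    #include <stdint.h>

    #define CurrentEL_EL2   0x8     /* PSTATE.M[3:2] == 0b10 */

    /* Model of __fixup_spsr_el2_read() for a non-VHE guest hypervisor */
    static uint64_t fixup_read(uint64_t saved_spsr_el2, uint64_t hw_spsr)
    {
            if ((hw_spsr & 0xc) == 0)
                    /* CPU untouched: our last saved write is still valid */
                    return saved_spsr_el2;

            /*
             * A "local" EL1->EL1 exception clobbered SPSR_EL1; present
             * it to the guest as an EL2->EL2 exception by patching .M.
             */
            return (hw_spsr & ~(uint64_t)0xc) | CurrentEL_EL2;
    }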
From patchwork Wed Apr 5 15:39:22 2023
X-Patchwork-Submitter: Marc Zyngier
X-Patchwork-Id: 13202024
From: Marc Zyngier <maz@kernel.org>
Subject: [PATCH v9 04/50] KVM: arm64: nv: Handle HCR_EL2.E2H specially
Date: Wed, 5 Apr 2023 16:39:22 +0100
Message-Id: <20230405154008.3552854-5-maz@kernel.org>
In-Reply-To: <20230405154008.3552854-1-maz@kernel.org>

HCR_EL2.E2H is nasty, as a flip of this bit completely changes the way
we deal with a lot of the state. So when the guest flips this bit
(sysregs are live), do the put/load dance so that we have a consistent
state.

Yes, this is slow. Don't do it.

Suggested-by: Alexandru Elisei
Reviewed-by: Russell King (Oracle)
Signed-off-by: Marc Zyngier
---
 arch/arm64/kvm/sys_regs.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 6a7edda9e287..3311a6d822ee 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -180,9 +180,24 @@ void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
 		goto memory_write;
 
 	if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
+		bool need_put_load;
+
 		if (!is_hyp_ctxt(vcpu))
 			goto memory_write;
 
+		/*
+		 * HCR_EL2.E2H is nasty: it changes the way we interpret a
+		 * lot of the EL2 state, so treat it as a full state
+		 * transition.
+		 */
+		need_put_load = ((reg == HCR_EL2) &&
+				 vcpu_el2_e2h_is_set(vcpu) != !!(val & HCR_E2H));
+
+		if (need_put_load) {
+			preempt_disable();
+			kvm_arch_vcpu_put(vcpu);
+		}
+
 		/*
 		 * Always store a copy of the write to memory to avoid having
 		 * to reverse-translate virtual EL2 system registers for a
@@ -190,6 +205,11 @@ void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
 		 */
 		__vcpu_sys_reg(vcpu, reg) = val;
 
+		if (need_put_load) {
+			kvm_arch_vcpu_load(vcpu, smp_processor_id());
+			preempt_enable();
+		}
+
 		switch (reg) {
 		case ELR_EL2:
 			write_sysreg_el1(val, SYS_ELR);
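[The flip detection itself is just a comparison of the old and new E2H
values; a minimal standalone sketch, with HCR_EL2.E2H open-coded as bit
34 per the architecture (the value is an assumption spelled out here,
not taken from the patch):]

    #include <stdbool.h>
    #include <stdint.h>

    #define HCR_E2H         (UINT64_C(1) << 34)

    /* Does this write to HCR_EL2 toggle E2H relative to the old value? */
    static bool e2h_flips(uint64_t old_hcr, uint64_t new_hcr)
    {
            return !!(old_hcr & HCR_E2H) != !!(new_hcr & HCR_E2H);
    }

    /* Only on a flip does the write path pay for the full put/load dance. */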
From patchwork Wed Apr 5 15:39:23 2023
X-Patchwork-Submitter: Marc Zyngier
X-Patchwork-Id: 13202026
From: Marc Zyngier <maz@kernel.org>
Subject: [PATCH v9 05/50] KVM: arm64: nv: Save/Restore vEL2 sysregs
Date: Wed, 5 Apr 2023 16:39:23 +0100
Message-Id: <20230405154008.3552854-6-maz@kernel.org>
In-Reply-To: <20230405154008.3552854-1-maz@kernel.org>

Whenever we need to restore the guest's system registers to the CPU,
we now need to take care of the EL2 system registers as well. Most of
them are accessed via traps only, but some have an immediate effect,
and a guest running in VHE mode would expect them to be accessible
via their EL1 encoding, which we do not trap.

For vEL2 we write the virtual EL2 registers with an identical format
directly into their EL1 counterpart, and translate the few registers
that have a different format for the same effect on the execution
when running a non-VHE guest hypervisor.

Based on an initial patch from Andre Przywara, rewritten many times
since.

Reviewed-by: Alexandru Elisei
Signed-off-by: Marc Zyngier
---
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h |   5 +-
 arch/arm64/kvm/hyp/nvhe/sysreg-sr.c        |   2 +-
 arch/arm64/kvm/hyp/vhe/sysreg-sr.c         | 125 ++++++++++++++++++++-
 3 files changed, 127 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
index 699ea1f8d409..d67c620f2af7 100644
--- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
+++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
@@ -91,9 +91,10 @@ static inline void __sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
 	write_sysreg(ctxt_sys_reg(ctxt, TPIDRRO_EL0), tpidrro_el0);
 }
 
-static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
+static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt,
+					      u64 mpidr)
 {
-	write_sysreg(ctxt_sys_reg(ctxt, MPIDR_EL1), vmpidr_el2);
+	write_sysreg(mpidr, vmpidr_el2);
 
 	if (has_vhe() ||
	    !cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {

diff --git a/arch/arm64/kvm/hyp/nvhe/sysreg-sr.c b/arch/arm64/kvm/hyp/nvhe/sysreg-sr.c
index 29305022bc04..dba101565de3 100644
--- a/arch/arm64/kvm/hyp/nvhe/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/nvhe/sysreg-sr.c
@@ -28,7 +28,7 @@ void __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt)
 
 void __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt)
 {
-	__sysreg_restore_el1_state(ctxt);
+	__sysreg_restore_el1_state(ctxt, ctxt_sys_reg(ctxt, MPIDR_EL1));
 	__sysreg_restore_common_state(ctxt);
 	__sysreg_restore_user_state(ctxt);
 	__sysreg_restore_el2_return_state(ctxt);

diff --git a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
index 7b44f6b3b547..3f9d6952fbd7 100644
--- a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
@@ -13,6 +13,96 @@
 #include
 #include
 #include
+#include
+
+static void __sysreg_save_vel2_state(struct kvm_cpu_context *ctxt)
+{
+	/* These registers are common with EL1 */
+	ctxt_sys_reg(ctxt, PAR_EL1)	= read_sysreg(par_el1);
+	ctxt_sys_reg(ctxt, TPIDR_EL1)	= read_sysreg(tpidr_el1);
+
+	ctxt_sys_reg(ctxt, ESR_EL2)	= read_sysreg_el1(SYS_ESR);
+	ctxt_sys_reg(ctxt, AFSR0_EL2)	= read_sysreg_el1(SYS_AFSR0);
+	ctxt_sys_reg(ctxt, AFSR1_EL2)	= read_sysreg_el1(SYS_AFSR1);
+	ctxt_sys_reg(ctxt, FAR_EL2)	= read_sysreg_el1(SYS_FAR);
+	ctxt_sys_reg(ctxt, MAIR_EL2)	= read_sysreg_el1(SYS_MAIR);
+	ctxt_sys_reg(ctxt, VBAR_EL2)	= read_sysreg_el1(SYS_VBAR);
+	ctxt_sys_reg(ctxt, CONTEXTIDR_EL2) = read_sysreg_el1(SYS_CONTEXTIDR);
+	ctxt_sys_reg(ctxt, AMAIR_EL2)	= read_sysreg_el1(SYS_AMAIR);
+
+	/*
+	 * In VHE mode those registers are compatible between EL1 and EL2,
+	 * and the guest uses the _EL1 versions on the CPU naturally.
+	 * So we save them into their _EL2 versions here.
+	 * For nVHE mode we trap accesses to those registers, so our
+	 * _EL2 copy in sys_regs[] is always up-to-date and we don't need
+	 * to save anything here.
+	 */
+	if (__vcpu_el2_e2h_is_set(ctxt)) {
+		ctxt_sys_reg(ctxt, SCTLR_EL2)	= read_sysreg_el1(SYS_SCTLR);
+		ctxt_sys_reg(ctxt, CPTR_EL2)	= read_sysreg_el1(SYS_CPACR);
+		ctxt_sys_reg(ctxt, TTBR0_EL2)	= read_sysreg_el1(SYS_TTBR0);
+		ctxt_sys_reg(ctxt, TTBR1_EL2)	= read_sysreg_el1(SYS_TTBR1);
+		ctxt_sys_reg(ctxt, TCR_EL2)	= read_sysreg_el1(SYS_TCR);
+		ctxt_sys_reg(ctxt, CNTHCTL_EL2)	= read_sysreg_el1(SYS_CNTKCTL);
+	}
+
+	ctxt_sys_reg(ctxt, SP_EL2)	= read_sysreg(sp_el1);
+	ctxt_sys_reg(ctxt, ELR_EL2)	= read_sysreg_el1(SYS_ELR);
+	ctxt_sys_reg(ctxt, SPSR_EL2)	= __fixup_spsr_el2_read(ctxt, read_sysreg_el1(SYS_SPSR));
+}
+
+static void __sysreg_restore_vel2_state(struct kvm_cpu_context *ctxt)
+{
+	u64 val;
+
+	/* These registers are common with EL1 */
+	write_sysreg(ctxt_sys_reg(ctxt, PAR_EL1),	par_el1);
+	write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL1),	tpidr_el1);
+
+	write_sysreg(read_cpuid_id(),			vpidr_el2);
+	write_sysreg(ctxt_sys_reg(ctxt, MPIDR_EL1),	vmpidr_el2);
+	write_sysreg_el1(ctxt_sys_reg(ctxt, MAIR_EL2),	SYS_MAIR);
+	write_sysreg_el1(ctxt_sys_reg(ctxt, VBAR_EL2),	SYS_VBAR);
+	write_sysreg_el1(ctxt_sys_reg(ctxt, CONTEXTIDR_EL2), SYS_CONTEXTIDR);
+	write_sysreg_el1(ctxt_sys_reg(ctxt, AMAIR_EL2),	SYS_AMAIR);
+
+	if (__vcpu_el2_e2h_is_set(ctxt)) {
+		/*
+		 * In VHE mode those registers are compatible between
+		 * EL1 and EL2.
+		 */
+		write_sysreg_el1(ctxt_sys_reg(ctxt, SCTLR_EL2),	SYS_SCTLR);
+		write_sysreg_el1(ctxt_sys_reg(ctxt, CPTR_EL2),	SYS_CPACR);
+		write_sysreg_el1(ctxt_sys_reg(ctxt, TTBR0_EL2),	SYS_TTBR0);
+		write_sysreg_el1(ctxt_sys_reg(ctxt, TTBR1_EL2),	SYS_TTBR1);
+		write_sysreg_el1(ctxt_sys_reg(ctxt, TCR_EL2),	SYS_TCR);
+		write_sysreg_el1(ctxt_sys_reg(ctxt, CNTHCTL_EL2), SYS_CNTKCTL);
+	} else {
+		/*
+		 * CNTHCTL_EL2 only affects EL1 when running nVHE, so
+		 * no need to restore it.
+		 */
+		val = translate_sctlr_el2_to_sctlr_el1(ctxt_sys_reg(ctxt, SCTLR_EL2));
+		write_sysreg_el1(val, SYS_SCTLR);
+		val = translate_cptr_el2_to_cpacr_el1(ctxt_sys_reg(ctxt, CPTR_EL2));
+		write_sysreg_el1(val, SYS_CPACR);
+		val = translate_ttbr0_el2_to_ttbr0_el1(ctxt_sys_reg(ctxt, TTBR0_EL2));
+		write_sysreg_el1(val, SYS_TTBR0);
+		val = translate_tcr_el2_to_tcr_el1(ctxt_sys_reg(ctxt, TCR_EL2));
+		write_sysreg_el1(val, SYS_TCR);
+	}
+
+	write_sysreg_el1(ctxt_sys_reg(ctxt, ESR_EL2),	SYS_ESR);
+	write_sysreg_el1(ctxt_sys_reg(ctxt, AFSR0_EL2),	SYS_AFSR0);
+	write_sysreg_el1(ctxt_sys_reg(ctxt, AFSR1_EL2),	SYS_AFSR1);
+	write_sysreg_el1(ctxt_sys_reg(ctxt, FAR_EL2),	SYS_FAR);
+	write_sysreg(ctxt_sys_reg(ctxt, SP_EL2),	sp_el1);
+	write_sysreg_el1(ctxt_sys_reg(ctxt, ELR_EL2),	SYS_ELR);
+
+	val = __fixup_spsr_el2_write(ctxt, ctxt_sys_reg(ctxt, SPSR_EL2));
+	write_sysreg_el1(val, SYS_SPSR);
+}
 
 /*
  * VHE: Host and guest must save mdscr_el1 and sp_el0 (and the PC and
@@ -65,6 +155,7 @@ void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt;
 	struct kvm_cpu_context *host_ctxt;
+	u64 mpidr;
 
 	host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
 	__sysreg_save_user_state(host_ctxt);
@@ -77,7 +168,29 @@ void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu)
 	 */
 	__sysreg32_restore_state(vcpu);
 	__sysreg_restore_user_state(guest_ctxt);
-	__sysreg_restore_el1_state(guest_ctxt);
+
+	if (unlikely(__is_hyp_ctxt(guest_ctxt))) {
+		__sysreg_restore_vel2_state(guest_ctxt);
+	} else {
+		if (vcpu_has_nv(vcpu)) {
+			/*
+			 * Only set VPIDR_EL2 for nested VMs, as this is the
+			 * only time it changes. We'll restore the MIDR_EL1
+			 * view on put.
+			 */
+			write_sysreg(ctxt_sys_reg(guest_ctxt, VPIDR_EL2), vpidr_el2);
+
+			/*
+			 * As we're restoring a nested guest, set the value
+			 * provided by the guest hypervisor.
+			 */
+			mpidr = ctxt_sys_reg(guest_ctxt, VMPIDR_EL2);
+		} else {
+			mpidr = ctxt_sys_reg(guest_ctxt, MPIDR_EL1);
+		}
+
+		__sysreg_restore_el1_state(guest_ctxt, mpidr);
+	}
 
 	vcpu_set_flag(vcpu, SYSREGS_ON_CPU);
 
@@ -103,12 +216,20 @@ void kvm_vcpu_put_sysregs_vhe(struct kvm_vcpu *vcpu)
 	host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
 	deactivate_traps_vhe_put(vcpu);
 
-	__sysreg_save_el1_state(guest_ctxt);
+	if (unlikely(__is_hyp_ctxt(guest_ctxt)))
+		__sysreg_save_vel2_state(guest_ctxt);
+	else
+		__sysreg_save_el1_state(guest_ctxt);
+
 	__sysreg_save_user_state(guest_ctxt);
 	__sysreg32_save_state(vcpu);
 
 	/* Restore host user state */
 	__sysreg_restore_user_state(host_ctxt);
 
+	/* If leaving a nesting guest, restore MPIDR_EL1 default view */
+	if (vcpu_has_nv(vcpu))
+		write_sysreg(read_cpuid_id(), vpidr_el2);
+
 	vcpu_clear_flag(vcpu, SYSREGS_ON_CPU);
 }
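[The E2H-dependent restore boils down to "copy verbatim if the formats
match, translate otherwise". A condensed restatement for a single
register (TCR), distilled from __sysreg_restore_vel2_state() above and
reusing the helper names from patch 01; this is illustrative, not new
mechanism:]

    /* Restore the guest's TCR_EL2 into the hardware TCR_EL1 */
    static void restore_vel2_tcr(struct kvm_cpu_context *ctxt)
    {
            u64 tcr = ctxt_sys_reg(ctxt, TCR_EL2);

            if (__vcpu_el2_e2h_is_set(ctxt))
                    /* VHE guest hypervisor: EL2 format == EL1 format */
                    write_sysreg_el1(tcr, SYS_TCR);
            else
                    /* nVHE guest hypervisor: formats differ, translate */
                    write_sysreg_el1(translate_tcr_el2_to_tcr_el1(tcr),
                                     SYS_TCR);
    }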
From patchwork Wed Apr 5 15:39:24 2023
X-Patchwork-Submitter: Marc Zyngier
X-Patchwork-Id: 13202028
From: Marc Zyngier <maz@kernel.org>
Subject: [PATCH v9 06/50] KVM: arm64: nv: Trap EL1 VM register accesses in virtual EL2
Date: Wed, 5 Apr 2023 16:39:24 +0100
Message-Id: <20230405154008.3552854-7-maz@kernel.org>
In-Reply-To: <20230405154008.3552854-1-maz@kernel.org>

From: Christoffer Dall

When running in virtual EL2 mode, we actually run the hardware in EL1
and therefore have to use the EL1 registers to ensure correct
operation.

By setting HCR.TVM and HCR.TRVM, we ensure that the virtual EL2 mode
doesn't shoot itself in the foot when setting up what it believes to
be a different mode's system register state (for example when
preparing to switch to a VM).

We can leverage the existing sysregs infrastructure to support trapped
accesses to these registers.

Reviewed-by: Russell King (Oracle)
Reviewed-by: Alexandru Elisei
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
---
 arch/arm64/kvm/hyp/include/hyp/switch.h |  4 +---
 arch/arm64/kvm/hyp/nvhe/switch.c        |  2 +-
 arch/arm64/kvm/hyp/vhe/switch.c         |  7 ++++++-
 arch/arm64/kvm/sys_regs.c               | 19 ++++++++++++++++---
 4 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index c41166f1a1dd..f7bc6f1a4e0b 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -119,10 +119,8 @@ static inline void __deactivate_traps_common(struct kvm_vcpu *vcpu)
 	}
 }
 
-static inline void ___activate_traps(struct kvm_vcpu *vcpu)
+static inline void ___activate_traps(struct kvm_vcpu *vcpu, u64 hcr)
 {
-	u64 hcr = vcpu->arch.hcr_el2;
-
 	if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM))
 		hcr |= HCR_TVM;
 
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index c2cb46ca4fb6..efac8fbe0b20 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -40,7 +40,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
 {
 	u64 val;
 
-	___activate_traps(vcpu);
+	___activate_traps(vcpu, vcpu->arch.hcr_el2);
 	__activate_traps_common(vcpu);
 
 	val = vcpu->arch.cptr_el2;

diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index cd3f3117bf16..c8da8d350453 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -35,9 +35,14 @@ DEFINE_PER_CPU(unsigned long, kvm_hyp_vector);
 
 static void __activate_traps(struct kvm_vcpu *vcpu)
 {
+	u64 hcr = vcpu->arch.hcr_el2;
 	u64 val;
 
-	___activate_traps(vcpu);
+	/* Trap VM sysreg accesses if an EL2 guest is not using VHE. */
+	if (vcpu_is_el2(vcpu) && !vcpu_el2_e2h_is_set(vcpu))
+		hcr |= HCR_TVM | HCR_TRVM;
+
+	___activate_traps(vcpu, hcr);
 
 	val = read_sysreg(cpacr_el1);
 	val |= CPACR_ELx_TTA;

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 3311a6d822ee..5dc158c39baa 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -391,8 +391,15 @@ static void get_access_mask(const struct sys_reg_desc *r, u64 *mask, u64 *shift)
 
 /*
  * Generic accessor for VM registers. Only called as long as HCR_TVM
- * is set. If the guest enables the MMU, we stop trapping the VM
- * sys_regs and leave it in complete control of the caches.
+ * is set.
+ *
+ * This is set in two cases: either (1) we're running at vEL2, or (2)
+ * we're running at EL1 and the guest has its MMU off.
+ *
+ * (1) TVM/TRVM is set, as we need to virtualise some of the VM
+ *     registers for the guest hypervisor
+ * (2) Once the guest enables the MMU, we stop trapping the VM sys_regs
+ *     and leave it in complete control of the caches.
  */
 static bool access_vm_reg(struct kvm_vcpu *vcpu,
 			  struct sys_reg_params *p,
@@ -401,7 +408,13 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 	bool was_enabled = vcpu_has_cache_enabled(vcpu);
 	u64 val, mask, shift;
 
-	BUG_ON(!p->is_write);
+	/* We don't expect TRVM on the host */
+	BUG_ON(!vcpu_is_el2(vcpu) && !p->is_write);
+
+	if (!p->is_write) {
+		p->regval = vcpu_read_sys_reg(vcpu, r->reg);
+		return true;
+	}
 
 	get_access_mask(r, &mask, &shift);
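[A standalone sketch of the trap-bit computation added to the VHE
__activate_traps() path. HCR_EL2.TVM and TRVM are bits 26 and 30 in the
architecture; those values are open-coded here for illustration and are
not taken from the patch:]

    #include <stdbool.h>
    #include <stdint.h>

    #define HCR_TVM         (UINT64_C(1) << 26)     /* trap VM reg writes */
    #define HCR_TRVM        (UINT64_C(1) << 30)     /* trap VM reg reads  */

    /* Add the vEL2 VM-register traps for a non-VHE guest hypervisor */
    static uint64_t activate_vel2_vm_traps(uint64_t hcr, bool at_vel2,
                                           bool vel2_e2h_set)
    {
            if (at_vel2 && !vel2_e2h_set)
                    hcr |= HCR_TVM | HCR_TRVM;
            return hcr;
    }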
From patchwork Wed Apr 5 15:39:25 2023
X-Patchwork-Submitter: Marc Zyngier
X-Patchwork-Id: 13202030
From: Marc Zyngier <maz@kernel.org>
Subject: [PATCH v9 07/50] KVM: arm64: nv: Trap CPACR_EL1 access in virtual EL2
Date: Wed, 5 Apr 2023 16:39:25 +0100
Message-Id: <20230405154008.3552854-8-maz@kernel.org>
In-Reply-To: <20230405154008.3552854-1-maz@kernel.org>

From: Jintack Lim

For the same reason we trap virtual memory register accesses in
virtual EL2, we trap CPACR_EL1 access too: we allow the virtual EL2
mode to access the EL1 system register state instead of the virtual
EL2 one.

Reviewed-by: Alexandru Elisei
Signed-off-by: Jintack Lim
Signed-off-by: Marc Zyngier
---
 arch/arm64/kvm/hyp/vhe/switch.c | 3 +++
 arch/arm64/kvm/sys_regs.c       | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index c8da8d350453..41385bb02d6b 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -68,6 +68,9 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
 		__activate_traps_fpsimd32(vcpu);
 	}
 
+	if (vcpu_is_el2(vcpu) && !vcpu_el2_e2h_is_set(vcpu))
+		val |= CPTR_EL2_TCPAC;
+
 	write_sysreg(val, cpacr_el1);
 
 	write_sysreg(__this_cpu_read(kvm_hyp_vector), vbar_el1);

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 5dc158c39baa..cc3104167232 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2066,7 +2066,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 0x00C50078 },
 	{ SYS_DESC(SYS_ACTLR_EL1), access_actlr, reset_actlr, ACTLR_EL1 },
-	{ SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
+	{ SYS_DESC(SYS_CPACR_EL1), access_rw, reset_val, CPACR_EL1, 0 },
 
 	MTE_REG(RGSR_EL1),
 	MTE_REG(GCR_EL1),
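[The mechanism hinges on CPTR_EL2.TCPAC, architecturally bit 31, which
on a VHE host is written through the CPACR_EL1 encoding; a minimal
sketch of the added trap configuration, with the bit value open-coded
here as an assumption rather than taken from the patch:]

    #include <stdint.h>

    #define CPTR_EL2_TCPAC  (UINT64_C(1) << 31)

    /* Make the guest hypervisor's CPACR_EL1 accesses trap to the host */
    static uint64_t add_cpacr_trap(uint64_t cpacr, int at_vel2,
                                   int vel2_e2h_set)
    {
            if (at_vel2 && !vel2_e2h_set)
                    cpacr |= CPTR_EL2_TCPAC;
            return cpacr;
    }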
"EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238807AbjDEPkd (ORCPT ); Wed, 5 Apr 2023 11:40:33 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5CEF1559A for ; Wed, 5 Apr 2023 08:40:32 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 0D2B463ED0 for ; Wed, 5 Apr 2023 15:40:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D1773C433A1; Wed, 5 Apr 2023 15:40:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709230; bh=jbq5Iy6DFxO0TwOY6vUO/BAaSe6Vl+DDm//UTjVVy9A=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=YO2zc3D8Kmw6jbiN1XUmiA5LWc49yV6vXX2bM1j1UubSMAXXbhxFCCyT65QGCRPfj HZ/REU6nFCInsbX7hNxXoCWUh7Lo/buaesmMffvLqaSD34tQFo1xqqweAw4IwT/Hb4 7bHtPUNU/GwvNQcLr18k9VUNeL0RY1UDGlTT9d8gr3IxhLiI/kDXwAYyW7C05kY6RH ffQ6pwyFt9QiXJztIP62aBhwPdZQFRyCP6Ts8aoLV5oECy0MZCPLZfz3v6ZvTxuwXC 0cs88rHY5e0xOyiJD5WUQxn4duUkaOupb+LlUFzJ7fSz1+n2204G+j/vU/Bt0nbFrP 9Nl39MYCX/VnA== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FN-0062PV-3q; Wed, 05 Apr 2023 16:40:29 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 08/50] KVM: arm64: nv: Respect virtual HCR_EL2.TWX setting Date: Wed, 5 Apr 2023 16:39:26 +0100 Message-Id: <20230405154008.3552854-9-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Jintack Lim Forward exceptions due to WFI or WFE instructions to the virtual EL2 if they are not coming from the virtual EL2 and virtual HCR_EL2.TWX is set. 
Signed-off-by: Jintack Lim
Signed-off-by: Marc Zyngier
---
 arch/arm64/include/asm/kvm_emulate.h | 14 ++++++++++++++
 arch/arm64/kvm/handle_exit.c         |  6 +++++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 2f5d3e27c017..6e9c0d76a818 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -333,6 +334,19 @@ static __always_inline u64 kvm_vcpu_get_esr(const struct kvm_vcpu *vcpu)
 	return vcpu->arch.fault.esr_el2;
 }
 
+static inline bool guest_hyp_wfx_traps_enabled(const struct kvm_vcpu *vcpu)
+{
+	u64 esr = kvm_vcpu_get_esr(vcpu);
+	bool is_wfe = !!(esr & ESR_ELx_WFx_ISS_WFE);
+	u64 hcr_el2 = __vcpu_sys_reg(vcpu, HCR_EL2);
+
+	if (!vcpu_has_nv(vcpu) || vcpu_is_el2(vcpu))
+		return false;
+
+	return ((is_wfe && (hcr_el2 & HCR_TWE)) ||
+		(!is_wfe && (hcr_el2 & HCR_TWI)));
+}
+
 static __always_inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
 {
 	u64 esr = kvm_vcpu_get_esr(vcpu);

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index a798c0b4d717..758ab72e4dc4 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -126,8 +126,12 @@ static int handle_no_fpsimd(struct kvm_vcpu *vcpu)
 static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
 {
 	u64 esr = kvm_vcpu_get_esr(vcpu);
+	bool is_wfe = !!(esr & ESR_ELx_WFx_ISS_WFE);
 
-	if (esr & ESR_ELx_WFx_ISS_WFE) {
+	if (guest_hyp_wfx_traps_enabled(vcpu))
+		return kvm_inject_nested_sync(vcpu, kvm_vcpu_get_esr(vcpu));
+
+	if (is_wfe) {
 		trace_kvm_wfx_arm64(*vcpu_pc(vcpu), true);
 		vcpu->stat.wfe_exit_stat++;
 	} else {
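[A self-contained model of the forwarding decision. HCR_EL2.TWI and TWE
are bits 13 and 14, and the WFE flag is bit 0 of the WFx ESR ISS; those
encodings are spelled out here rather than taken from the patch:]

    #include <stdbool.h>
    #include <stdint.h>

    #define HCR_TWI         (UINT64_C(1) << 13)
    #define HCR_TWE         (UINT64_C(1) << 14)
    #define ESR_WFx_ISS_WFE (UINT64_C(1) << 0)

    /* Should a WFI/WFE trapped from vEL1 be forwarded to vEL2? */
    static bool forward_wfx(uint64_t vhcr_el2, uint64_t esr, bool at_vel2)
    {
            bool is_wfe = esr & ESR_WFx_ISS_WFE;

            if (at_vel2)    /* already at vEL2: nothing to forward */
                    return false;

            return is_wfe ? !!(vhcr_el2 & HCR_TWE)
                          : !!(vhcr_el2 & HCR_TWI);
    }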
di+LKUgtDZjOTPUjs64irDfh1JDpNNHYO6+3fDBgAfnNXNyyOlJUp8DOrkdv2ARa6L yPlGmRUow59fYJovukPcMVs0d9qSoQkYT5zVK6p0cMF3N5u/Cjisoe/TgY9QJw5/gh Iq9zDGpIABokQ== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FN-0062PV-Bt; Wed, 05 Apr 2023 16:40:29 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 09/50] KVM: arm64: nv: Respect virtual CPTR_EL2.{TFP,FPEN} settings Date: Wed, 5 Apr 2023 16:39:27 +0100 Message-Id: <20230405154008.3552854-10-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Jintack Lim Forward traps due to FP/ASIMD register accesses to the virtual EL2 if the virtual CPTR_EL2.TFP bit is set (with HCR_EL2.E2H == 0) or if CPTR_EL2.FPEN is configured to do so (with HCR_EL2.E2H == 1).
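Since the E2H-dependent layouts are easy to mix up, here is the same decision table as a standalone sketch (plain C with simplified stand-ins for the kernel types: with E2H == 0, the nVHE-layout CPTR_EL2.TFP bit at position 10 traps when set; with E2H == 1, the register takes the CPACR_EL1 layout and FPEN sits in bits [21:20]):

#include <stdbool.h>
#include <stdint.h>

#define CPTR_EL2_TFP     (UINT64_C(1) << 10) /* nVHE layout */
#define CPACR_FPEN_SHIFT 20                  /* VHE (CPACR_EL1) layout */

/*
 * Does the guest hypervisor's virtual CPTR_EL2 request that FP/ASIMD
 * accesses from its guest be trapped to (virtual) EL2?
 */
static bool vcptr_traps_fpsimd(uint64_t vcptr, bool e2h, bool tge, bool in_vel2)
{
	if (!e2h)
		return vcptr & CPTR_EL2_TFP;

	switch ((vcptr >> CPACR_FPEN_SHIFT) & 3) {
	case 0:		/* FPEN == 0b00: trap at EL0 and EL1 */
	case 2:		/* FPEN == 0b10: behaves as 0b00 */
		return true;
	case 1:		/* FPEN == 0b01: trap EL0 only when TGE is set */
		return tge && !in_vel2;
	default:	/* FPEN == 0b11: no trap */
		return false;
	}
}
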
Signed-off-by: Jintack Lim Signed-off-by: Christoffer Dall [maz: account for HCR_EL2.E2H when testing for TFP/FPEN, with all the hard work actually being done by Chase Conklin] Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_emulate.h | 25 +++++++++++++++++++++++++ arch/arm64/kvm/handle_exit.c | 16 ++++++++++++---- arch/arm64/kvm/hyp/include/hyp/switch.h | 8 ++++++-- 3 files changed, 43 insertions(+), 6 deletions(-) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 6e9c0d76a818..fe4b4b893fb8 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -11,6 +11,7 @@ #ifndef __ARM64_KVM_EMULATE_H__ #define __ARM64_KVM_EMULATE_H__ +#include #include #include @@ -329,6 +330,30 @@ static inline bool vcpu_mode_priv(const struct kvm_vcpu *vcpu) return mode != PSR_MODE_EL0t; } +static inline bool guest_hyp_fpsimd_traps_enabled(const struct kvm_vcpu *vcpu) +{ + u64 val; + + if (!vcpu_has_nv(vcpu)) + return false; + + val = vcpu_read_sys_reg(vcpu, CPTR_EL2); + + if (!vcpu_el2_e2h_is_set(vcpu)) + return (val & CPTR_EL2_TFP); + + switch (FIELD_GET(CPACR_ELx_FPEN, val)) { + case 0b00: + case 0b10: + return true; + case 0b01: + return vcpu_el2_tge_is_set(vcpu) && !vcpu_is_el2(vcpu); + case 0b11: + default: /* GCC is dumb */ + return false; + } +} + static __always_inline u64 kvm_vcpu_get_esr(const struct kvm_vcpu *vcpu) { return vcpu->arch.fault.esr_el2; diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index 758ab72e4dc4..8355f7c2054f 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -99,11 +99,19 @@ static int handle_smc(struct kvm_vcpu *vcpu) } /* - * Guest access to FP/ASIMD registers are routed to this handler only - * when the system doesn't support FP/ASIMD. + * This handles the cases where the system does not support FP/ASIMD or when + * we are running nested virtualization and the guest hypervisor is trapping + * FP/ASIMD accesses by its own guest. + * + * All other handling of guest vs. host FP/ASIMD register state is handled in + * fixup_guest_exit(). */ -static int handle_no_fpsimd(struct kvm_vcpu *vcpu) +static int kvm_handle_fpasimd(struct kvm_vcpu *vcpu) { + if (guest_hyp_fpsimd_traps_enabled(vcpu)) + return kvm_inject_nested_sync(vcpu, kvm_vcpu_get_esr(vcpu)); + + /* This is the case when the system doesn't support FP/ASIMD. */ kvm_inject_undefined(vcpu); return 1; } @@ -265,7 +273,7 @@ static exit_handle_fn arm_exit_handlers[] = { [ESR_ELx_EC_BREAKPT_LOW]= kvm_handle_guest_debug, [ESR_ELx_EC_BKPT32] = kvm_handle_guest_debug, [ESR_ELx_EC_BRK64] = kvm_handle_guest_debug, - [ESR_ELx_EC_FP_ASIMD] = kvm_handle_fpasimd, [ESR_ELx_EC_PAC] = kvm_handle_ptrauth, }; diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h index f7bc6f1a4e0b..5092bfcf4c96 100644 --- a/arch/arm64/kvm/hyp/include/hyp/switch.h +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h @@ -175,8 +175,12 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code) sve_guest = vcpu_has_sve(vcpu); esr_ec = kvm_vcpu_trap_get_class(vcpu); - /* Don't handle SVE traps for non-SVE vcpus here: */ - if (!sve_guest && esr_ec != ESR_ELx_EC_FP_ASIMD) + /* + * Don't handle SVE traps for non-SVE vcpus here. This + * includes NV guests for the time being. + */ + if (!sve_guest && (esr_ec != ESR_ELx_EC_FP_ASIMD || + guest_hyp_fpsimd_traps_enabled(vcpu))) return false; /* Valid trap.
Switch the context: */ From patchwork Wed Apr 5 15:39:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202053 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5CA48C77B60 for ; Wed, 5 Apr 2023 15:42:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238981AbjDEPm3 (ORCPT ); Wed, 5 Apr 2023 11:42:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49222 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238984AbjDEPm1 (ORCPT ); Wed, 5 Apr 2023 11:42:27 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7FA986E8B for ; Wed, 5 Apr 2023 08:42:05 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7CFE963ED2 for ; Wed, 5 Apr 2023 15:41:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E53AAC433EF; Wed, 5 Apr 2023 15:41:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709296; bh=dFYqWAkqI7rmHSrp0oaI2dKputJS77Ob1ZODlvanPJU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nlf4WSg1EtXJ/cCOn5ndtcsowJ9MqaVXfcilDhAxCdvkxm7YZ0IsYAUY8lhDDhR6W BbSPsu80Z+sY9b277vPrd1qCYtEZlbE0VhK6fYT8BsnXySbh3Evubw+ToiTLTJ8g1v /LUJ0KEzwYAMdIKSfgHH06hIFn3f9q28Ogw2atbMOYPbjjOS0RBWYY0rVDhAoYbMfc Yd9wdZxjeBr6X1wOhynHZM9YdJRVw/UCxa/xluD1iJF6oyetOh7j33fsgJgqws1WWB YsN1+zM2S5ngN3k16bxivzXHXGTlRhWZrwdka9HeGLI6T+QnAX/umIQOwkbohblHfv 3Iajw2+Xu7rHw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FN-0062PV-KM; Wed, 05 Apr 2023 16:40:29 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 10/50] KVM: arm64: nv: Respect the virtual HCR_EL2.NV bit setting Date: Wed, 5 Apr 2023 16:39:28 +0100 Message-Id: <20230405154008.3552854-11-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Jintack Lim Forward traps due 
to the HCR_EL2.NV bit to the virtual EL2 if they are not coming from the virtual EL2 and the virtual HCR_EL2.NV bit is set. In addition to EL2 register accesses, setting the NV bit also makes EL12 register accesses trap to EL2. To emulate this for the virtual EL2, forward traps due to EL12 register accesses to the virtual EL2 if the virtual HCR_EL2.NV bit is set. The same is done for SMCs issued by a guest and trapped by the virtual EL2. This is for recursive nested virtualization. Signed-off-by: Jintack Lim [Moved code to emulate-nested.c] Signed-off-by: Christoffer Dall Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_arm.h | 1 + arch/arm64/include/asm/kvm_nested.h | 3 +++ arch/arm64/kvm/emulate-nested.c | 27 +++++++++++++++++++++++++++ arch/arm64/kvm/handle_exit.c | 7 +++++++ arch/arm64/kvm/sys_regs.c | 21 +++++++++++++++++++++ 5 files changed, 59 insertions(+) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index baef29fcbeee..e47faadbfde1 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -20,6 +20,7 @@ #define HCR_AMVOFFEN (UL(1) << 51) #define HCR_FIEN (UL(1) << 47) #define HCR_FWB (UL(1) << 46) +#define HCR_NV (UL(1) << 42) #define HCR_API (UL(1) << 41) #define HCR_APK (UL(1) << 40) #define HCR_TEA (UL(1) << 37) diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index f8c4db870520..04096113cdf0 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -59,6 +59,9 @@ static inline u64 translate_ttbr0_el2_to_ttbr0_el1(u64 ttbr0) return ttbr0 & ~GENMASK_ULL(63, 48); } +extern bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit); +extern bool forward_nv_traps(struct kvm_vcpu *vcpu); + struct sys_reg_params; struct sys_reg_desc; diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c index b96662029fb1..75cf6f15accc 100644 --- a/arch/arm64/kvm/emulate-nested.c +++ b/arch/arm64/kvm/emulate-nested.c @@ -14,6 +14,26 @@ #include "trace.h" +bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit) +{ + bool control_bit_set; + + if (!vcpu_has_nv(vcpu)) + return false; + + control_bit_set = __vcpu_sys_reg(vcpu, HCR_EL2) & control_bit; + if (!vcpu_is_el2(vcpu) && control_bit_set) { + kvm_inject_nested_sync(vcpu, kvm_vcpu_get_esr(vcpu)); + return true; + } + return false; +} + +bool forward_nv_traps(struct kvm_vcpu *vcpu) +{ + return forward_traps(vcpu, HCR_NV); +} + static u64 kvm_check_illegal_exception_return(struct kvm_vcpu *vcpu, u64 spsr) { u64 mode = spsr & PSR_MODE_MASK; @@ -52,6 +72,13 @@ void kvm_emulate_nested_eret(struct kvm_vcpu *vcpu) u64 spsr, elr, mode; bool direct_eret; + /* + * Forward this trap to the virtual EL2 if the virtual + * HCR_EL2.NV bit is set and this is coming from !EL2. + */ + if (forward_nv_traps(vcpu)) + return; + /* * Going through the whole put/load motions is a waste of time * if this is a VHE guest hypervisor returning to its own diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c index 8355f7c2054f..49468f86aa2e 100644 --- a/arch/arm64/kvm/handle_exit.c +++ b/arch/arm64/kvm/handle_exit.c @@ -65,6 +65,13 @@ static int handle_smc(struct kvm_vcpu *vcpu) { int ret; + /* + * Forward this trapped smc instruction to the virtual EL2 if + * the guest has asked for it.
+ */ + if (forward_traps(vcpu, HCR_TSC)) + return 1; + /* * "If an SMC instruction executed at Non-secure EL1 is * trapped to EL2 because HCR_EL2.TSC is 1, the exception is a diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index cc3104167232..6628b66d02fb 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -336,10 +336,19 @@ static int set_ccsidr(struct kvm_vcpu *vcpu, u32 csselr, u32 val) return 0; } +static bool el12_reg(struct sys_reg_params *p) +{ + /* All *_EL12 registers have Op1=5. */ + return (p->Op1 == 5); +} + static bool access_rw(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { + if (el12_reg(p) && forward_nv_traps(vcpu)) + return false; + if (p->is_write) vcpu_write_sys_reg(vcpu, p->regval, r->reg); else @@ -408,6 +417,9 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu, bool was_enabled = vcpu_has_cache_enabled(vcpu); u64 val, mask, shift; + if (el12_reg(p) && forward_nv_traps(vcpu)) + return false; + /* We don't expect TRVM on the host */ BUG_ON(!vcpu_is_el2(vcpu) && !p->is_write); @@ -1897,6 +1909,9 @@ static bool access_elr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { + if (el12_reg(p) && forward_nv_traps(vcpu)) + return false; + if (p->is_write) vcpu_write_sys_reg(vcpu, p->regval, ELR_EL1); else @@ -1909,6 +1924,9 @@ static bool access_spsr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { + if (el12_reg(p) && forward_nv_traps(vcpu)) + return false; + if (p->is_write) __vcpu_sys_reg(vcpu, SPSR_EL1) = p->regval; else @@ -1921,6 +1939,9 @@ static bool access_spsr_el2(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { + if (el12_reg(p) && forward_nv_traps(vcpu)) + return false; + if (p->is_write) vcpu_write_sys_reg(vcpu, p->regval, SPSR_EL2); else From patchwork Wed Apr 5 15:39:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202052 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 31ED5C76188 for ; Wed, 5 Apr 2023 15:42:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238982AbjDEPm2 (ORCPT ); Wed, 5 Apr 2023 11:42:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49028 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238902AbjDEPm1 (ORCPT ); Wed, 5 Apr 2023 11:42:27 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2E4334C20 for ; Wed, 5 Apr 2023 08:42:05 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1B9AD63ED4 for ; Wed, 5 Apr 2023 15:41:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 71F7AC4339B; Wed, 5 Apr 2023 15:41:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709285; bh=+Dgp2ZystkCF2LXfDawgMy9ijTfY3yI7W5pOElk7QNY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gpLtZ49aq6xshvE9Avjwx1vUS2du8qAoMfANObn1e0tIAFqqgrflklTTR8V1mEqZD 
QNQ60Hx6SogXnZTWYGolIsJhqQ11kecOhqiHt1OAVmFRmQdQb/r+bgVDZ3YcbhysoP imk8+4q6r4cEj4/xAAhdb8ViTipZ5PkBpy/2nL4ugwE070jqACH8z/RW8ZHRfHYejH q2s/W3N9plS/fM50NIgGikCECfsSybfD72GpFzOX/dMcu+CxSoxE81IUi30l+E5hs5 DYCbiOj7jVtDu8yPz0gOl3geOZW8ER4VlYBXDmGHftW/paRBTDnQTc5JUiZzosocJH okfnYO8n06qFw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FN-0062PV-U3; Wed, 05 Apr 2023 16:40:29 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 11/50] KVM: arm64: nv: Respect virtual HCR_EL2.TVM and TRVM settings Date: Wed, 5 Apr 2023 16:39:29 +0100 Message-Id: <20230405154008.3552854-12-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Jintack Lim Forward the EL1 virtual memory register traps to the virtual EL2 if they are not coming from the virtual EL2 and the virtual HCR_EL2.TVM or TRVM bit is set. This is for recursive nested virtualization. Reviewed-by: Alexandru Elisei Signed-off-by: Jintack Lim Signed-off-by: Marc Zyngier --- arch/arm64/kvm/sys_regs.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 6628b66d02fb..56cc60eede59 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -420,6 +420,13 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu, if (el12_reg(p) && forward_nv_traps(vcpu)) return false; + if (!el12_reg(p)) { + u64 bit = p->is_write ? 
HCR_TVM : HCR_TRVM; + + if (forward_traps(vcpu, bit)) + return false; + } + /* We don't expect TRVM on the host */ BUG_ON(!vcpu_is_el2(vcpu) && !p->is_write); From patchwork Wed Apr 5 15:39:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202051 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4DD4DC76188 for ; Wed, 5 Apr 2023 15:42:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238841AbjDEPmZ (ORCPT ); Wed, 5 Apr 2023 11:42:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48866 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238981AbjDEPmW (ORCPT ); Wed, 5 Apr 2023 11:42:22 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AE45172A9 for ; Wed, 5 Apr 2023 08:42:01 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3926A63EE2 for ; Wed, 5 Apr 2023 15:41:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9C83FC4339C; Wed, 5 Apr 2023 15:41:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709304; bh=PtqEYkawU6HTlhZU1Y8UJwggJinVQ5xhTLJHUn3rVyE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=rEWNPY/QYBo2fbKmjJ1tgT4SaRv4XhMjgFQSCcah3iWG6tZXYrMSv8RYksZ/IPur1 4LxKEKmzuB3YyNlbliaNXegcP9xAGsb8JvuMrfWloSfoNZUBCdZdjMV9wfSrPFFfOZ Sno/zcGR90IwhlXZBBgtuI1ytrg1oJgR8C04FuJss+AL9FQSfoaxOq+P1J+A7ixksO jpPZgPK2zb5kVQqYaHN5DqcpTp/I98GBmXrhViWGeEIx6P+60IUVBsx3bww1AgT5SZ cUaVn65EsqyRTsJtkBvX854lEeLPSKi8cwzf3CQMOxCh2slKnucGDwt5JiUZTBqngi fDS1wyaz+Zmug== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FO-0062PV-6e; Wed, 05 Apr 2023 16:40:30 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 12/50] KVM: arm64: nv: Respect the virtual HCR_EL2.NV1 bit setting Date: Wed, 5 Apr 2023 16:39:30 +0100 Message-Id: <20230405154008.3552854-13-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on 
disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Jintack Lim Forward ELR_EL1, SPSR_EL1 and VBAR_EL1 traps to the virtual EL2 if the virtual HCR_EL2.{NV,NV1}={1,1}. This is for recursive nested virtualization. Signed-off-by: Jintack Lim Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_arm.h | 1 + arch/arm64/include/asm/kvm_nested.h | 1 + arch/arm64/kvm/emulate-nested.c | 5 +++++ arch/arm64/kvm/sys_regs.c | 21 ++++++++++++++++++++- 4 files changed, 27 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index e47faadbfde1..8e2b0bf1f484 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -20,6 +20,7 @@ #define HCR_AMVOFFEN (UL(1) << 51) #define HCR_FIEN (UL(1) << 47) #define HCR_FWB (UL(1) << 46) +#define HCR_NV1 (UL(1) << 43) #define HCR_NV (UL(1) << 42) #define HCR_API (UL(1) << 41) #define HCR_APK (UL(1) << 40) diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index 04096113cdf0..4b7f9f35ece9 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -61,6 +61,7 @@ static inline u64 translate_ttbr0_el2_to_ttbr0_el1(u64 ttbr0) extern bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit); extern bool forward_nv_traps(struct kvm_vcpu *vcpu); +extern bool forward_nv1_traps(struct kvm_vcpu *vcpu); struct sys_reg_params; struct sys_reg_desc; diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c index 75cf6f15accc..7c06844a5113 100644 --- a/arch/arm64/kvm/emulate-nested.c +++ b/arch/arm64/kvm/emulate-nested.c @@ -34,6 +34,11 @@ bool forward_nv_traps(struct kvm_vcpu *vcpu) return forward_traps(vcpu, HCR_NV); } +bool forward_nv1_traps(struct kvm_vcpu *vcpu) +{ + return forward_traps(vcpu, HCR_NV1); +} + static u64 kvm_check_illegal_exception_return(struct kvm_vcpu *vcpu, u64 spsr) { u64 mode = spsr & PSR_MODE_MASK; diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 56cc60eede59..d40e5181bf74 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -357,6 +357,16 @@ static bool access_rw(struct kvm_vcpu *vcpu, return true; } +static bool access_vbar_el1(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + if (forward_nv1_traps(vcpu)) + return false; + + return access_rw(vcpu, p, r); +} + /* * See note at ARMv7 ARM B1.14.4 (TL;DR: S/W ops are not easily virtualized). 
*/ @@ -1919,6 +1929,9 @@ static bool access_elr(struct kvm_vcpu *vcpu, if (el12_reg(p) && forward_nv_traps(vcpu)) return false; + if (!el12_reg(p) && forward_nv1_traps(vcpu)) + return false; + if (p->is_write) vcpu_write_sys_reg(vcpu, p->regval, ELR_EL1); else @@ -1934,6 +1947,9 @@ static bool access_spsr(struct kvm_vcpu *vcpu, if (el12_reg(p) && forward_nv_traps(vcpu)) return false; + if (!el12_reg(p) && forward_nv1_traps(vcpu)) + return false; + if (p->is_write) __vcpu_sys_reg(vcpu, SPSR_EL1) = p->regval; else @@ -1949,6 +1965,9 @@ static bool access_spsr_el2(struct kvm_vcpu *vcpu, if (el12_reg(p) && forward_nv_traps(vcpu)) return false; + if (!el12_reg(p) && forward_nv1_traps(vcpu)) + return false; + if (p->is_write) vcpu_write_sys_reg(vcpu, p->regval, SPSR_EL2); else @@ -2163,7 +2182,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { { SYS_DESC(SYS_LORC_EL1), trap_loregion }, { SYS_DESC(SYS_LORID_EL1), trap_loregion }, - { SYS_DESC(SYS_VBAR_EL1), access_rw, reset_val, VBAR_EL1, 0 }, + { SYS_DESC(SYS_VBAR_EL1), access_vbar_el1, reset_val, VBAR_EL1, 0 }, { SYS_DESC(SYS_DISR_EL1), NULL, reset_val, DISR_EL1, 0 }, { SYS_DESC(SYS_ICC_IAR0_EL1), write_to_read_only }, From patchwork Wed Apr 5 15:39:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202046 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C5B18C7619A for ; Wed, 5 Apr 2023 15:42:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238978AbjDEPmN (ORCPT ); Wed, 5 Apr 2023 11:42:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48352 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238762AbjDEPmJ (ORCPT ); Wed, 5 Apr 2023 11:42:09 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F09F77289 for ; Wed, 5 Apr 2023 08:41:54 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5ECFF63EE3 for ; Wed, 5 Apr 2023 15:41:15 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C73F3C433D2; Wed, 5 Apr 2023 15:41:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709274; bh=9g8STqPXgzD9Bbvt489TT/ehnNJeuV9gTn/DLP9G+xA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hePkJJlk5YajhTZKr/QJVXlr7camHFaYl03u4jag1YT+bDB1S917xnet7sPA54+yf Cgv8ewYMNzOeKQ/Zg70TM159xzXIlqhrWznEu3oPmv8CnCXmlz2Bo6Ro66tMxcQR8r QKMLMSxNHilDwNkq3eec/UxdOegUzvgBYFSYWo0q442upgGPwMYOSg3MR7DWkT+xUx dKoJEgVdpVsVWoBdP0FX7DEoRQheW49z83+J8YVzsLBtNYQ+GotiQQSsayMLra96mo xze7eZqH/y1anCpXfhxRU5lDaRK8GwyWvcmtj2DJVCCRj0T5pPtZFzXlb56RNy2KDN zOzpVx+IW04/Q== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FO-0062PV-Fq; Wed, 05 Apr 2023 16:40:30 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , 
Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 13/50] KVM: arm64: nv: Forward debug traps to the nested guest Date: Wed, 5 Apr 2023 16:39:31 +0100 Message-Id: <20230405154008.3552854-14-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org On handling a debug trap, check whether we need to forward it to the guest before handling it. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_nested.h | 2 ++ arch/arm64/kvm/emulate-nested.c | 9 +++++++-- arch/arm64/kvm/sys_regs.c | 5 +++++ 3 files changed, 14 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index 4b7f9f35ece9..f2820c82e956 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -59,6 +59,8 @@ static inline u64 translate_ttbr0_el2_to_ttbr0_el1(u64 ttbr0) return ttbr0 & ~GENMASK_ULL(63, 48); } +extern bool __forward_traps(struct kvm_vcpu *vcpu, unsigned int reg, + u64 control_bit); extern bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit); extern bool forward_nv_traps(struct kvm_vcpu *vcpu); extern bool forward_nv1_traps(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c index 7c06844a5113..555771e1260d 100644 --- a/arch/arm64/kvm/emulate-nested.c +++ b/arch/arm64/kvm/emulate-nested.c @@ -14,14 +14,14 @@ #include "trace.h" -bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit) +bool __forward_traps(struct kvm_vcpu *vcpu, unsigned int reg, u64 control_bit) { bool control_bit_set; if (!vcpu_has_nv(vcpu)) return false; - control_bit_set = __vcpu_sys_reg(vcpu, HCR_EL2) & control_bit; + control_bit_set = __vcpu_sys_reg(vcpu, reg) & control_bit; if (!vcpu_is_el2(vcpu) && control_bit_set) { kvm_inject_nested_sync(vcpu, kvm_vcpu_get_esr(vcpu)); return true; @@ -29,6 +29,11 @@ bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit) return false; } +bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit) +{ + return __forward_traps(vcpu, HCR_EL2, control_bit); +} + bool forward_nv_traps(struct kvm_vcpu *vcpu) { return forward_traps(vcpu, HCR_NV); diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index d40e5181bf74..548f637e57ac 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -666,6 +666,11 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { + if (forward_traps(vcpu, HCR_TGE) || + __forward_traps(vcpu, MDCR_EL2, MDCR_EL2_TDE) || + __forward_traps(vcpu, MDCR_EL2, MDCR_EL2_TDA)) + return false; + access_rw(vcpu, p, r); if (p->is_write) vcpu_set_flag(vcpu, DEBUG_DIRTY); From 
patchwork Wed Apr 5 15:39:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202045 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CC2A4C76188 for ; Wed, 5 Apr 2023 15:42:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238974AbjDEPmL (ORCPT ); Wed, 5 Apr 2023 11:42:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48354 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238948AbjDEPmH (ORCPT ); Wed, 5 Apr 2023 11:42:07 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1357D5259 for ; Wed, 5 Apr 2023 08:41:52 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id A3B0063ED1 for ; Wed, 5 Apr 2023 15:41:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 17E97C4339B; Wed, 5 Apr 2023 15:41:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709302; bh=YcShdA+tKDmSUCRb0tpq6I2td/HqIONC//0886kHbTk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=vJ/ZjPamONV55S5ZY9UJQbKaARAEMNJ2gcjFltknwRc22ksxKfqSqe06VtjVHLH3D UIFJNKnESOlpyt3SxQpGiTQTmTesYftimqi3W9upc/cicrZDJOcUecpa+VJe4wnhS8 BPli7wNU7+82GpLZ4Q7t1tSzydyxmCc7L2x4s3o3k4A5mWuXPcXPfuaBu3Ygc2EV5u r0rlHYnOqAjvGDbNYbApG7Bwk1Dqi8z2MgU0DXWG29eWPGF2VUeiH8yEJAs9Vg5HNs P92ROAjhQG+p2gVqljp0J1uL0vSD4SaR7TtHVM8ySlzPt1Qv9jtT13JuPHDd1wLroD THUoJdFBClQ9A== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FO-0062PV-NZ; Wed, 05 Apr 2023 16:40:30 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 14/50] KVM: arm64: nv: Configure HCR_EL2 for nested virtualization Date: Wed, 5 Apr 2023 16:39:32 +0100 Message-Id: <20230405154008.3552854-15-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Jintack Lim We enable nested virtualization by setting 
the HCR_EL2 NV and NV1 bits. When the virtual E2H bit is set, we can support EL2 register accesses via EL1 registers from the virtual EL2 by doing trap-and-emulate. A better alternative, however, is to allow the virtual EL2 to access the EL2 register state without trapping. This can be easily achieved by not trapping EL1 registers, since those registers already hold the EL2 register state. Signed-off-by: Jintack Lim Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_arm.h | 1 + arch/arm64/kvm/hyp/vhe/switch.c | 38 +++++++++++++++++++++++++++++--- 2 files changed, 36 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 8e2b0bf1f484..1ea71d26823c 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -89,6 +89,7 @@ HCR_BSU_IS | HCR_FB | HCR_TACR | \ HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \ HCR_FMO | HCR_IMO | HCR_PTW | HCR_TID3 | HCR_TID2) +#define HCR_GUEST_NV_FILTER_FLAGS (HCR_ATA | HCR_API | HCR_APK | HCR_FIEN) #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF) #define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK | HCR_ATA) #define HCR_HOST_NVHE_PROTECTED_FLAGS (HCR_HOST_NVHE_FLAGS | HCR_TSC) diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index 41385bb02d6b..36b50e315504 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -38,9 +38,41 @@ static void __activate_traps(struct kvm_vcpu *vcpu) u64 hcr = vcpu->arch.hcr_el2; u64 val; - /* Trap VM sysreg accesses if an EL2 guest is not using VHE. */ - if (vcpu_is_el2(vcpu) && !vcpu_el2_e2h_is_set(vcpu)) - hcr |= HCR_TVM | HCR_TRVM; + if (is_hyp_ctxt(vcpu)) { + hcr |= HCR_NV; + + if (!vcpu_el2_e2h_is_set(vcpu)) { + /* + * For a guest hypervisor on v8.0, trap and emulate + * the EL1 virtual memory control register accesses. + */ + hcr |= HCR_TVM | HCR_TRVM | HCR_NV1; + } else { + /* + * For a guest hypervisor on v8.1 (VHE), allow it to + * access the EL1 virtual memory control registers + * natively. These accesses actually target the EL2 + * register state. + * Note that we still need to respect the virtual + * HCR_EL2 state. + */ + u64 vhcr_el2 = __vcpu_sys_reg(vcpu, HCR_EL2); + + vhcr_el2 &= ~HCR_GUEST_NV_FILTER_FLAGS; + + /* + * We already set TVM to handle set/way cache maint + * ops traps, and this somewhat collides with the nested + * virt trapping for nVHE. So turn this off here for now, + * in the hope that VHE guests won't ever do this. + * TODO: find out whether it's worth supporting both + * cases at the same time.
+ */ + hcr &= ~HCR_TVM; + + hcr |= vhcr_el2 & (HCR_TVM | HCR_TRVM); + } + } ___activate_traps(vcpu, hcr); From patchwork Wed Apr 5 15:39:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202037 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3DDDC761A6 for ; Wed, 5 Apr 2023 15:41:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238936AbjDEPlv (ORCPT ); Wed, 5 Apr 2023 11:41:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47552 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238932AbjDEPlq (ORCPT ); Wed, 5 Apr 2023 11:41:46 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7CB794C20 for ; Wed, 5 Apr 2023 08:41:30 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 799A363EDA for ; Wed, 5 Apr 2023 15:40:58 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9E16DC433D2; Wed, 5 Apr 2023 15:40:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709257; bh=tnYNvzJonISHgSOHNoBUrnmPX/XDh86aiSUNtkqFOSM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dGM7ghyxECovs1w81Bop6wCEbZ72AZERPFXDAYhEhSnQEE5lpMSJrw5gHVE+0yfiC 98yDegeRDPVmhDuDsEC5Hqip0P4TMU/44db4ub+OI82XGXiKmgsQz2uwLF7QPk4hCg 9aRXKj/VPJtc5QPxzMc4b+c9ZHyfQUoR+FErNSklv9jWVk2wKXcvOaXihPtPIJnx3J NAER+0e6PjugXZESgAlBwBiClKfMZVKNDnlcLOOu8fAmt9qQ6Oh98tqIcgZAsqIWLS pPUgzhJ3CEvZR83UDZvhYKDJpgkUcbZgvsuanEORsXbPiZvREuPa0ZdMuoemBsWG0m PayZe1FzcxWjg== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FP-0062PV-3T; Wed, 05 Apr 2023 16:40:31 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 15/50] KVM: arm64: nv: Support multiple nested Stage-2 mmu structures Date: Wed, 5 Apr 2023 16:39:33 +0100 Message-Id: <20230405154008.3552854-16-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: 
bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add Stage-2 mmu data structures for virtual EL2 and for nested guests. We don't yet populate shadow Stage-2 page tables, but we now have a framework for getting to a shadow Stage-2 pgd. We allocate twice the number of vcpus as Stage-2 mmu structures because that's sufficient for each vcpu running two translation regimes without having to flush the Stage-2 page tables. Co-developed-by: Christoffer Dall Signed-off-by: Christoffer Dall Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_host.h | 40 ++++++ arch/arm64/include/asm/kvm_mmu.h | 9 ++ arch/arm64/include/asm/kvm_nested.h | 7 + arch/arm64/kvm/arm.c | 18 ++- arch/arm64/kvm/mmu.c | 77 +++++++--- arch/arm64/kvm/nested.c | 210 ++++++++++++++++++++++++++++ 6 files changed, 337 insertions(+), 24 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 633a7c0750bb..fefd5156049e 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -159,8 +159,40 @@ struct kvm_s2_mmu { int __percpu *last_vcpu_ran; struct kvm_arch *arch; + + /* + * For a shadow stage-2 MMU, the virtual vttbr used by the + * host to parse the guest S2. + * This either contains: + * - the virtual VTTBR programmed by the guest hypervisor with + * CnP cleared + * - The value 1 (VMID=0, BADDR=0, CnP=1) if invalid + * + * We also cache the full VTCR which gets used for TLB invalidation, + * taking the ARM ARM's "Any of the bits in VTCR_EL2 are permitted + * to be cached in a TLB" to the letter. + */ + u64 vttbr; + u64 vtcr; + + /* + * true when this represents a nested context where virtual + * HCR_EL2.VM == 1 + */ + bool nested_stage2_enabled; + + /* + * 0: Nobody is currently using this, check vttbr for validity + * >0: Somebody is actively using this. + */ + atomic_t refcnt; }; +static inline bool kvm_s2_mmu_valid(struct kvm_s2_mmu *mmu) +{ + return !(mmu->vttbr & 1); +} + struct kvm_arch_memory_slot { }; @@ -187,6 +219,14 @@ struct kvm_protected_vm { struct kvm_arch { struct kvm_s2_mmu mmu; + /* + * Stage 2 paging state for VMs with nested S2 using a virtual + * VMID. 
+ */ + struct kvm_s2_mmu *nested_mmus; + size_t nested_mmus_size; + int nested_mmus_next; + /* VTCR_EL2 value for this VM */ u64 vtcr; diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index 27e63c111f78..cbb198ebd2a0 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -119,6 +119,7 @@ alternative_cb_end #include #include #include +#include void kvm_update_va_mask(struct alt_instr *alt, __le32 *origptr, __le32 *updptr, int nr_inst); @@ -170,6 +171,7 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size, void **haddr); void __init free_hyp_pgds(void); +void kvm_unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size); void stage2_unmap_vm(struct kvm *kvm); int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type); void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu); @@ -311,5 +313,12 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu) { return container_of(mmu->arch, struct kvm, arch); } + +static inline u64 get_vmid(u64 vttbr) +{ + return (vttbr & VTTBR_VMID_MASK(kvm_get_vmid_bits())) >> + VTTBR_VMID_SHIFT; +} + #endif /* __ASSEMBLY__ */ #endif /* __ARM64_KVM_MMU_H__ */ diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index f2820c82e956..f9c57b37de0a 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -59,6 +59,13 @@ static inline u64 translate_ttbr0_el2_to_ttbr0_el1(u64 ttbr0) return ttbr0 & ~GENMASK_ULL(63, 48); } +extern void kvm_init_nested(struct kvm *kvm); +extern int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu); +extern void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu); +extern struct kvm_s2_mmu *lookup_s2_mmu(struct kvm_vcpu *vcpu); +extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu); +extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu); + extern bool __forward_traps(struct kvm_vcpu *vcpu, unsigned int reg, u64 control_bit); extern bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 4c5e9dfbf83a..2fc529519ae7 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -36,9 +36,10 @@ #include #include #include +#include #include +#include #include -#include #include #include @@ -128,6 +129,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) { int ret; + kvm_init_nested(kvm); + ret = kvm_share_hyp(kvm, kvm + 1); if (ret) return ret; @@ -390,6 +393,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) struct kvm_s2_mmu *mmu; int *last_ran; + if (vcpu_has_nv(vcpu)) + kvm_vcpu_load_hw_mmu(vcpu); + mmu = vcpu->arch.hw_mmu; last_ran = this_cpu_ptr(mmu->last_vcpu_ran); @@ -440,9 +446,12 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) kvm_timer_vcpu_put(vcpu); kvm_vgic_put(vcpu); kvm_vcpu_pmu_restore_host(vcpu); + if (vcpu_has_nv(vcpu)) + kvm_vcpu_put_hw_mmu(vcpu); kvm_arm_vmid_clear_active(); vcpu_clear_on_unsupported_cpu(vcpu); + vcpu->cpu = -1; } @@ -1172,8 +1181,13 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, vcpu->arch.target = phys_target; + /* Prepare for nested if required */ + ret = kvm_vcpu_init_nested(vcpu); + /* Now we know what it is, we can reset it. 
*/ - ret = kvm_reset_vcpu(vcpu); + if (!ret) + ret = kvm_reset_vcpu(vcpu); + if (ret) { vcpu->arch.target = -1; bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES); diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 3b9d4d24c361..b2612763abc1 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -217,7 +217,7 @@ static void invalidate_icache_guest_page(void *va, size_t size) * does. */ /** - * unmap_stage2_range -- Clear stage2 page table entries to unmap a range + * __unmap_stage2_range -- Clear stage2 page table entries to unmap a range * @mmu: The KVM stage-2 MMU pointer * @start: The intermediate physical base address of the range to unmap * @size: The size of the area to unmap @@ -240,7 +240,7 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 may_block)); } -static void unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size) +void kvm_unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size) { __unmap_stage2_range(mmu, start, size, true); } @@ -711,21 +711,9 @@ static struct kvm_pgtable_mm_ops kvm_s2_mm_ops = { .icache_inval_pou = invalidate_icache_guest_page, }; -/** - * kvm_init_stage2_mmu - Initialise a S2 MMU structure - * @kvm: The pointer to the KVM structure - * @mmu: The pointer to the s2 MMU structure - * @type: The machine type of the virtual machine - * - * Allocates only the stage-2 HW PGD level table(s). - * Note we don't need locking here as this is only called when the VM is - * created, which can only be done once. - */ -int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type) +static int kvm_init_ipa_range(struct kvm *kvm, unsigned long type) { u32 kvm_ipa_limit = get_kvm_ipa_limit(); - int cpu, err; - struct kvm_pgtable *pgt; u64 mmfr0, mmfr1; u32 phys_shift; @@ -752,7 +740,53 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift); + return 0; +} + +/** + * kvm_init_stage2_mmu - Initialise a S2 MMU structure + * @kvm: The pointer to the KVM structure + * @mmu: The pointer to the s2 MMU structure + * @type: The machine type of the virtual machine + * + * Allocates only the stage-2 HW PGD level table(s). + * Note we don't need locking here as this is only called in two cases: + * + * - when the VM is created, which can't race against anything + * + * - when secondary kvm_s2_mmu structures are initialised for NV + * guests, and the caller must hold kvm->lock as this is called on a + * per-vcpu basis. + */ +int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type) +{ + int cpu, err; + struct kvm_pgtable *pgt; + + /* + * We only initialise the IPA range on the canonical MMU, so + * the type is meaningless in all other situations. + */ + if (&kvm->arch.mmu == mmu) { + err = kvm_init_ipa_range(kvm, type); + if (err) + return err; + } + + /* + * If we already have our page tables in place, and the MMU + * context is the canonical one, we have a bug somewhere, + * as this is only supposed to ever happen once per VM. + * + * Otherwise, we're building nested page tables, and that's + * probably because userspace called KVM_ARM_VCPU_INIT more + * than once on the same vcpu. Since that's actually legal, + * don't kick up a fuss and leave gracefully.
+ */ if (mmu->pgt != NULL) { + if (&kvm->arch.mmu != mmu) + return 0; + kvm_err("kvm_arch already initialized?\n"); return -EINVAL; } @@ -777,6 +811,10 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t mmu->pgt = pgt; mmu->pgd_phys = __pa(pgt->pgd); + + if (&kvm->arch.mmu != mmu) + kvm_init_nested_s2_mmu(mmu); + return 0; out_destroy_pgtable: @@ -822,7 +860,7 @@ static void stage2_unmap_memslot(struct kvm *kvm, if (!(vma->vm_flags & VM_PFNMAP)) { gpa_t gpa = addr + (vm_start - memslot->userspace_addr); - unmap_stage2_range(&kvm->arch.mmu, gpa, vm_end - vm_start); + kvm_unmap_stage2_range(&kvm->arch.mmu, gpa, vm_end - vm_start); } hva = vm_end; } while (hva < reg_end); @@ -1875,11 +1913,6 @@ void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) { } -void kvm_arch_flush_shadow_all(struct kvm *kvm) -{ - kvm_free_stage2_pgd(&kvm->arch.mmu); -} - void kvm_arch_flush_shadow_memslot(struct kvm *kvm, struct kvm_memory_slot *slot) { @@ -1887,7 +1920,7 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm, phys_addr_t size = slot->npages << PAGE_SHIFT; write_lock(&kvm->mmu_lock); - unmap_stage2_range(&kvm->arch.mmu, gpa, size); + kvm_unmap_stage2_range(&kvm->arch.mmu, gpa, size); write_unlock(&kvm->mmu_lock); } diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index 315354d27978..8b21925bf086 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -7,7 +7,9 @@ #include #include +#include #include +#include #include #include @@ -16,6 +18,214 @@ /* Protection against the sysreg repainting madness... */ #define NV_FTR(r, f) ID_AA64##r##_EL1_##f +void kvm_init_nested(struct kvm *kvm) +{ + kvm->arch.nested_mmus = NULL; + kvm->arch.nested_mmus_size = 0; +} + +int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu) +{ + struct kvm *kvm = vcpu->kvm; + struct kvm_s2_mmu *tmp; + int num_mmus; + int ret = -ENOMEM; + + if (!test_bit(KVM_ARM_VCPU_HAS_EL2, vcpu->arch.features)) + return 0; + + if (!cpus_have_final_cap(ARM64_HAS_NESTED_VIRT)) + return -EINVAL; + + mutex_lock(&kvm->lock); + + /* + * Let's treat memory allocation failures as benign: If we fail to + * allocate anything, return an error and keep the allocated array + * alive. Userspace may try to recover by initializing the vcpu + * again, and there is no reason to affect the whole VM for this. + */ + num_mmus = atomic_read(&kvm->online_vcpus) * 2; + tmp = krealloc(kvm->arch.nested_mmus, + num_mmus * sizeof(*kvm->arch.nested_mmus), + GFP_KERNEL_ACCOUNT | __GFP_ZERO); + if (tmp) { + /* + * If we went through a reallocation, adjust the MMU + * back-pointers in the previously initialised + * kvm_pgtable structures.
+ */ + if (kvm->arch.nested_mmus != tmp) { + int i; + + for (i = 0; i < num_mmus - 2; i++) + tmp[i].pgt->mmu = &tmp[i]; + } + + if (kvm_init_stage2_mmu(kvm, &tmp[num_mmus - 1], 0) || + kvm_init_stage2_mmu(kvm, &tmp[num_mmus - 2], 0)) { + kvm_free_stage2_pgd(&tmp[num_mmus - 1]); + kvm_free_stage2_pgd(&tmp[num_mmus - 2]); + } else { + kvm->arch.nested_mmus_size = num_mmus; + ret = 0; + } + + kvm->arch.nested_mmus = tmp; + } + + mutex_unlock(&kvm->lock); + return ret; +} + +/* Must be called with kvm->mmu_lock held */ +struct kvm_s2_mmu *lookup_s2_mmu(struct kvm_vcpu *vcpu) +{ + bool nested_stage2_enabled; + u64 vttbr, vtcr, hcr; + struct kvm *kvm; + int i; + + kvm = vcpu->kvm; + + vttbr = vcpu_read_sys_reg(vcpu, VTTBR_EL2); + vtcr = vcpu_read_sys_reg(vcpu, VTCR_EL2); + hcr = vcpu_read_sys_reg(vcpu, HCR_EL2); + + nested_stage2_enabled = hcr & HCR_VM; + + /* Don't consider the CnP bit for the vttbr match */ + vttbr = vttbr & ~VTTBR_CNP_BIT; + + /* + * Two possibilities when looking up a S2 MMU context: + * + * - either S2 is enabled in the guest, and we need a context that is + * S2-enabled and matches the full VTTBR (VMID+BADDR) and VTCR, + * which makes it safe from a TLB conflict perspective (a broken + * guest won't be able to generate them), + * + * - or S2 is disabled, and we need a context that is S2-disabled + * and matches the VMID only, as all TLBs are tagged by VMID even + * if S2 translation is disabled. + */ + for (i = 0; i < kvm->arch.nested_mmus_size; i++) { + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i]; + + if (!kvm_s2_mmu_valid(mmu)) + continue; + + if (nested_stage2_enabled && + mmu->nested_stage2_enabled && + vttbr == mmu->vttbr && + vtcr == mmu->vtcr) + return mmu; + + if (!nested_stage2_enabled && + !mmu->nested_stage2_enabled && + get_vmid(vttbr) == get_vmid(mmu->vttbr)) + return mmu; + } + return NULL; +} + +/* Must be called with kvm->mmu_lock held */ +static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu) +{ + struct kvm *kvm = vcpu->kvm; + struct kvm_s2_mmu *s2_mmu; + int i; + + s2_mmu = lookup_s2_mmu(vcpu); + if (s2_mmu) + goto out; + + /* + * Make sure we don't always search from the same point, or we + * will always reuse a potentially active context, leaving + * free contexts unused. + */ + for (i = kvm->arch.nested_mmus_next; + i < (kvm->arch.nested_mmus_size + kvm->arch.nested_mmus_next); + i++) { + s2_mmu = &kvm->arch.nested_mmus[i % kvm->arch.nested_mmus_size]; + + if (atomic_read(&s2_mmu->refcnt) == 0) + break; + } + BUG_ON(atomic_read(&s2_mmu->refcnt)); /* We have struct MMUs to spare */ + + /* Set the scene for the next search */ + kvm->arch.nested_mmus_next = (i + 1) % kvm->arch.nested_mmus_size; + + if (kvm_s2_mmu_valid(s2_mmu)) { + /* Clear the old state */ + kvm_unmap_stage2_range(s2_mmu, 0, kvm_phys_size(kvm)); + if (atomic64_read(&s2_mmu->vmid.id)) + kvm_call_hyp(__kvm_tlb_flush_vmid, s2_mmu); + } + + /* + * The virtual VMID (modulo CnP) will be used as a key when matching + * an existing kvm_s2_mmu. + * + * We cache VTCR at allocation time, once and for all. It'd be great + * if the guest didn't screw that one up, as this is not very + * forgiving... 
+ */ + s2_mmu->vttbr = vcpu_read_sys_reg(vcpu, VTTBR_EL2) & ~VTTBR_CNP_BIT; + s2_mmu->vtcr = vcpu_read_sys_reg(vcpu, VTCR_EL2); + s2_mmu->nested_stage2_enabled = vcpu_read_sys_reg(vcpu, HCR_EL2) & HCR_VM; + +out: + atomic_inc(&s2_mmu->refcnt); + return s2_mmu; +} + +void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu) +{ + mmu->vttbr = 1; + mmu->nested_stage2_enabled = false; + atomic_set(&mmu->refcnt, 0); +} + +void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu) +{ + if (is_hyp_ctxt(vcpu)) { + vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu; + } else { + write_lock(&vcpu->kvm->mmu_lock); + vcpu->arch.hw_mmu = get_s2_mmu_nested(vcpu); + write_unlock(&vcpu->kvm->mmu_lock); + } +} + +void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu) +{ + if (vcpu->arch.hw_mmu != &vcpu->kvm->arch.mmu) { + atomic_dec(&vcpu->arch.hw_mmu->refcnt); + vcpu->arch.hw_mmu = NULL; + } +} + +void kvm_arch_flush_shadow_all(struct kvm *kvm) +{ + int i; + + for (i = 0; i < kvm->arch.nested_mmus_size; i++) { + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i]; + + WARN_ON(atomic_read(&mmu->refcnt)); + + if (!atomic_read(&mmu->refcnt)) + kvm_free_stage2_pgd(mmu); + } + kfree(kvm->arch.nested_mmus); + kvm->arch.nested_mmus = NULL; + kvm->arch.nested_mmus_size = 0; + kvm_free_stage2_pgd(&kvm->arch.mmu); +} + /* * Our emulated CPU doesn't support all the possible features. For the * sake of simplicity (and probably mental sanity), wipe out a number From patchwork Wed Apr 5 15:39:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202049 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B8B89C7619A for ; Wed, 5 Apr 2023 15:42:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238935AbjDEPmR (ORCPT ); Wed, 5 Apr 2023 11:42:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48664 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238982AbjDEPmO (ORCPT ); Wed, 5 Apr 2023 11:42:14 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B376D65A6 for ; Wed, 5 Apr 2023 08:41:56 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 11EB6623DE for ; Wed, 5 Apr 2023 15:41:07 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5D189C433EF; Wed, 5 Apr 2023 15:41:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709266; bh=cPf7WEU+FDqwaRzogF7wQCS4u+2E7ZKl56VZokbQoFQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VsaBczl7m4YkA3pWlpD78NjuYOIhktp3JOk7cPSQzg0kP2DkwR0Wb+FmQU/7qdKMX 6599PTN22Dp8jp1ibRasSyg5h7EZnycFNCEsDiX5AZrbUQzJ/37tPLxoxfERL5IdvA kINiwcl8QYFnbTw82ch5dyRXzwArypp6bYH8lktETqVgQnMEfhV2hqg4qYbkLTKBfk awNynLlhz2/Baub8VjMgQOemQYZMPETT/c1kJk/8I9Qw+sv3Juwe2DZAvZcRfw2fbQ ZV32SUvIU8uB8kJQaRo0BXx9rznCiJt4kLYIaznxJvzc1SeTXhpSg7/vTvRxC4T4ab bs1SClEsOeFiw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) 
id 1pk5FP-0062PV-Bs; Wed, 05 Apr 2023 16:40:31 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 16/50] KVM: arm64: nv: Implement nested Stage-2 page table walk logic Date: Wed, 5 Apr 2023 16:39:34 +0100 Message-Id: <20230405154008.3552854-17-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Christoffer Dall Based on the pseudo-code in the ARM ARM, implement a stage 2 software page table walker. Signed-off-by: Christoffer Dall Signed-off-by: Jintack Lim [maz: heavily reworked for future ARMv8.4-TTL support] Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/esr.h | 1 + arch/arm64/include/asm/kvm_arm.h | 2 + arch/arm64/include/asm/kvm_nested.h | 13 ++ arch/arm64/kvm/nested.c | 270 ++++++++++++++++++++++++++++ 4 files changed, 286 insertions(+) diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h index 8487aec9b658..380156a9b956 100644 --- a/arch/arm64/include/asm/esr.h +++ b/arch/arm64/include/asm/esr.h @@ -141,6 +141,7 @@ #define ESR_ELx_CM (UL(1) << ESR_ELx_CM_SHIFT) /* ISS field definitions for exceptions taken in to Hyp */ +#define ESR_ELx_FSC_ADDRSZ (0x00) #define ESR_ELx_CV (UL(1) << 24) #define ESR_ELx_COND_SHIFT (20) #define ESR_ELx_COND_MASK (UL(0xF) << ESR_ELx_COND_SHIFT) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 1ea71d26823c..680c02d8f38f 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -274,6 +274,8 @@ #define VTTBR_VMID_SHIFT (UL(48)) #define VTTBR_VMID_MASK(size) (_AT(u64, (1 << size) - 1) << VTTBR_VMID_SHIFT) +#define SCTLR_EE (UL(1) << 25) + /* Hyp System Trap Register */ #define HSTR_EL2_T(x) (1 << x) diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index f9c57b37de0a..19796d4b0798 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -66,6 +66,19 @@ extern struct kvm_s2_mmu *lookup_s2_mmu(struct kvm_vcpu *vcpu); extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu); extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu); +struct kvm_s2_trans { + phys_addr_t output; + unsigned long block_size; + bool writable; + bool readable; + int level; + u32 esr; + u64 upper_attr; +}; + +extern int kvm_walk_nested_s2(struct kvm_vcpu *vcpu, phys_addr_t gipa, + struct kvm_s2_trans *result); + extern bool __forward_traps(struct kvm_vcpu *vcpu, unsigned int reg, u64 control_bit); extern bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit); 
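The kvm_s2_trans structure above is the walker's result descriptor: on success it carries the L1 IPA (output, with the page offset already folded in), the mapping granularity (block_size), the guest's R/W permissions, and the upper attribute bits of the final descriptor; on a translation fault, esr holds the fault code to reflect back to the virtual EL2. As a rough illustration of the intended calling convention — a hypothetical consumer, not part of this patch — resolving an L2 IPA looks like this (patch 17 wires this exact pattern into kvm_handle_guest_abort()):

/* Hypothetical caller of the walker, for illustration only. */
static int resolve_l2_ipa(struct kvm_vcpu *vcpu, phys_addr_t l2_ipa,
			  phys_addr_t *l1_ipa)
{
	struct kvm_s2_trans trans;
	int ret;

	/* <0: error accessing the guest tables; 1: fault, trans.esr is set */
	ret = kvm_walk_nested_s2(vcpu, l2_ipa, &trans);
	if (ret)
		return ret;

	*l1_ipa = trans.output;	/* offset within the block already included */
	return 0;
}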
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index 8b21925bf086..1e5eb8140012 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -78,6 +78,276 @@ int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu) return ret; } +struct s2_walk_info { + int (*read_desc)(phys_addr_t pa, u64 *desc, void *data); + void *data; + u64 baddr; + unsigned int max_pa_bits; + unsigned int max_ipa_bits; + unsigned int pgshift; + unsigned int pgsize; + unsigned int ps; + unsigned int sl; + unsigned int t0sz; + bool be; + bool el1_aarch32; +}; + +static unsigned int ps_to_output_size(unsigned int ps) +{ + switch (ps) { + case 0: return 32; + case 1: return 36; + case 2: return 40; + case 3: return 42; + case 4: return 44; + case 5: + default: + return 48; + } +} + +static u32 compute_fsc(int level, u32 fsc) +{ + return fsc | (level & 0x3); +} + +static int check_base_s2_limits(struct s2_walk_info *wi, + int level, int input_size, int stride) +{ + int start_size; + + /* Check translation limits */ + switch (wi->pgsize) { + case SZ_64K: + if (level == 0 || (level == 1 && wi->max_ipa_bits <= 42)) + return -EFAULT; + break; + case SZ_16K: + if (level == 0 || (level == 1 && wi->max_ipa_bits <= 40)) + return -EFAULT; + break; + case SZ_4K: + if (level < 0 || (level == 0 && wi->max_ipa_bits <= 42)) + return -EFAULT; + break; + } + + /* Check input size limits */ + if (input_size > wi->max_ipa_bits && + (!wi->el1_aarch32 || input_size > 40)) + return -EFAULT; + + /* Check number of entries in starting level table */ + start_size = input_size - ((3 - level) * stride + wi->pgshift); + if (start_size < 1 || start_size > stride + 4) + return -EFAULT; + + return 0; +} + +/* Check if output is within boundaries */ +static int check_output_size(struct s2_walk_info *wi, phys_addr_t output) +{ + unsigned int output_size = ps_to_output_size(wi->ps); + + if (output_size > wi->max_pa_bits) + output_size = wi->max_pa_bits; + + if (output_size != 48 && (output & GENMASK_ULL(47, output_size))) + return -1; + + return 0; +} + +/* + * This is essentially a C-version of the pseudo code from the ARM ARM + * AArch64.TranslationTableWalk function. I strongly recommend looking at + * that pseudocode in trying to understand this. 
+ * + * Must be called with the kvm->srcu read lock held + */ +static int walk_nested_s2_pgd(phys_addr_t ipa, + struct s2_walk_info *wi, struct kvm_s2_trans *out) +{ + int first_block_level, level, stride, input_size, base_lower_bound; + phys_addr_t base_addr; + unsigned int addr_top, addr_bottom; + u64 desc; /* page table entry */ + int ret; + phys_addr_t paddr; + + switch (wi->pgsize) { + case SZ_64K: + case SZ_16K: + level = 3 - wi->sl; + first_block_level = 2; + break; + case SZ_4K: + level = 2 - wi->sl; + first_block_level = 1; + break; + default: + /* GCC is braindead */ + unreachable(); + } + + stride = wi->pgshift - 3; + input_size = 64 - wi->t0sz; + if (input_size > 48 || input_size < 25) + return -EFAULT; + + ret = check_base_s2_limits(wi, level, input_size, stride); + if (WARN_ON(ret)) + return ret; + + base_lower_bound = 3 + input_size - ((3 - level) * stride + + wi->pgshift); + base_addr = wi->baddr & GENMASK_ULL(47, base_lower_bound); + + if (check_output_size(wi, base_addr)) { + out->esr = compute_fsc(level, ESR_ELx_FSC_ADDRSZ); + return 1; + } + + addr_top = input_size - 1; + + while (1) { + phys_addr_t index; + + addr_bottom = (3 - level) * stride + wi->pgshift; + index = (ipa & GENMASK_ULL(addr_top, addr_bottom)) + >> (addr_bottom - 3); + + paddr = base_addr | index; + ret = wi->read_desc(paddr, &desc, wi->data); + if (ret < 0) + return ret; + + /* + * Handle reversed descriptors if endianness differs between the + * host and the guest hypervisor. + */ + if (wi->be) + desc = be64_to_cpu(desc); + else + desc = le64_to_cpu(desc); + + /* Check for valid descriptor at this point */ + if (!(desc & 1) || ((desc & 3) == 1 && level == 3)) { + out->esr = compute_fsc(level, ESR_ELx_FSC_FAULT); + out->upper_attr = desc; + return 1; + } + + /* We're at the final level or block translation level */ + if ((desc & 3) == 1 || level == 3) + break; + + if (check_output_size(wi, desc)) { + out->esr = compute_fsc(level, ESR_ELx_FSC_ADDRSZ); + out->upper_attr = desc; + return 1; + } + + base_addr = desc & GENMASK_ULL(47, wi->pgshift); + + level += 1; + addr_top = addr_bottom - 1; + } + + if (level < first_block_level) { + out->esr = compute_fsc(level, ESR_ELx_FSC_FAULT); + out->upper_attr = desc; + return 1; + } + + /* + * We don't use the contiguous bit in the stage-2 ptes, so skip the check + * for misprogramming of the contiguous bit.
+ */ + + if (check_output_size(wi, desc)) { + out->esr = compute_fsc(level, ESR_ELx_FSC_ADDRSZ); + out->upper_attr = desc; + return 1; + } + + if (!(desc & BIT(10))) { + out->esr = compute_fsc(level, ESR_ELx_FSC_ACCESS); + out->upper_attr = desc; + return 1; + } + + /* Calculate and return the result */ + paddr = (desc & GENMASK_ULL(47, addr_bottom)) | + (ipa & GENMASK_ULL(addr_bottom - 1, 0)); + out->output = paddr; + out->block_size = 1UL << ((3 - level) * stride + wi->pgshift); + out->readable = desc & (0b01 << 6); + out->writable = desc & (0b10 << 6); + out->level = level; + out->upper_attr = desc & GENMASK_ULL(63, 52); + return 0; +} + +static int read_guest_s2_desc(phys_addr_t pa, u64 *desc, void *data) +{ + struct kvm_vcpu *vcpu = data; + + return kvm_read_guest(vcpu->kvm, pa, desc, sizeof(*desc)); +} + +static void vtcr_to_walk_info(u64 vtcr, struct s2_walk_info *wi) +{ + wi->t0sz = vtcr & TCR_EL2_T0SZ_MASK; + + switch (vtcr & VTCR_EL2_TG0_MASK) { + case VTCR_EL2_TG0_4K: + wi->pgshift = 12; break; + case VTCR_EL2_TG0_16K: + wi->pgshift = 14; break; + case VTCR_EL2_TG0_64K: + default: + wi->pgshift = 16; break; + } + + wi->pgsize = BIT(wi->pgshift); + wi->ps = FIELD_GET(VTCR_EL2_PS_MASK, vtcr); + wi->sl = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr); + wi->max_ipa_bits = VTCR_EL2_IPA(vtcr); + /* Global limit for now, should eventually be per-VM */ + wi->max_pa_bits = get_kvm_ipa_limit(); +} + +int kvm_walk_nested_s2(struct kvm_vcpu *vcpu, phys_addr_t gipa, + struct kvm_s2_trans *result) +{ + u64 vtcr = vcpu_read_sys_reg(vcpu, VTCR_EL2); + struct s2_walk_info wi; + int ret; + + result->esr = 0; + + if (!vcpu_has_nv(vcpu)) + return 0; + + wi.read_desc = read_guest_s2_desc; + wi.data = vcpu; + wi.baddr = vcpu_read_sys_reg(vcpu, VTTBR_EL2); + + vtcr_to_walk_info(vtcr, &wi); + + wi.be = vcpu_read_sys_reg(vcpu, SCTLR_EL2) & SCTLR_EE; + wi.el1_aarch32 = vcpu_mode_is_32bit(vcpu); + + ret = walk_nested_s2_pgd(gipa, &wi, result); + if (ret) + result->esr |= (kvm_vcpu_get_esr(vcpu) & ~ESR_ELx_FSC); + + return ret; +} + /* Must be called with kvm->mmu_lock held */ struct kvm_s2_mmu *lookup_s2_mmu(struct kvm_vcpu *vcpu) { From patchwork Wed Apr 5 15:39:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202048 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EC943C76188 for ; Wed, 5 Apr 2023 15:42:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238979AbjDEPmP (ORCPT ); Wed, 5 Apr 2023 11:42:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48500 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238969AbjDEPmK (ORCPT ); Wed, 5 Apr 2023 11:42:10 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CD3FF6EB7 for ; Wed, 5 Apr 2023 08:41:53 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 83ADD63EE5 for ; Wed, 5 Apr 2023 15:41:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CD810C4339B; Wed, 5 Apr 2023 15:41:39 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709300; bh=z0pFL4yVq3sodb751Dorfz1dJBXY3wzfMUaHYK2wnVI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OFlRXXQsCCaS9C+cWCAaEIT1p+jHczkJ57CBGtvg2rJ2hKfIN7IF/y5Pgxw15ptdy XfBY4nOtp7V0bC4Vmhvb8Vlq1ejYiPiaBa7zpnsriu4TClNBE69ai9DfpEFcM2SGRc T1+1D8Sdq9A1OZNpVe8ZZdHyAWKQGUyP30p7Gduy6t4z77FbdAzTDgam39iatJhcEL 5Ek1YCn/k1Lh7kVo/kIo951XFxPNFT0mEt9gWkpz7U6wNo/gWxu9GFoqdKsKJ5hNyx +oTNd6I86fkY+8G2M3VoheAt6YqrxdewYLEZaSdC59u4DZzCi27whkFEfOTlkRqlug 2ZoytP8QPywBQ== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FP-0062PV-Kx; Wed, 05 Apr 2023 16:40:31 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 17/50] KVM: arm64: nv: Handle shadow stage 2 page faults Date: Wed, 5 Apr 2023 16:39:35 +0100 Message-Id: <20230405154008.3552854-18-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org If we are faulting on a shadow stage 2 translation, we first walk the guest hypervisor's stage 2 page table to see if it has a mapping. If not, we inject a stage 2 page fault to the virtual EL2. Otherwise, we create a mapping in the shadow stage 2 page table. Note that we have to deal with two IPAs when we got a shadow stage 2 page fault. One is the address we faulted on, and is in the L2 guest phys space. The other is from the guest stage-2 page table walk, and is in the L1 guest phys space. To differentiate them, we rename variables so that fault_ipa is used for the former and ipa is used for the latter. Co-developed-by: Christoffer Dall Co-developed-by: Jintack Lim Signed-off-by: Christoffer Dall Signed-off-by: Jintack Lim [maz: rewrote this multiple times...] 
Signed-off-by: Marc Zyngier Reviewed-by: Ganapatrao Kulkarni --- arch/arm64/include/asm/kvm_emulate.h | 6 ++ arch/arm64/include/asm/kvm_nested.h | 19 ++++++ arch/arm64/kvm/mmu.c | 89 ++++++++++++++++++++++++---- arch/arm64/kvm/nested.c | 48 +++++++++++++++ 4 files changed, 152 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index fe4b4b893fb8..1aa059ebb569 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -646,4 +646,10 @@ static inline bool vcpu_has_feature(struct kvm_vcpu *vcpu, int feature) return test_bit(feature, vcpu->arch.features); } +static inline bool kvm_is_shadow_s2_fault(struct kvm_vcpu *vcpu) +{ + return (vcpu->arch.hw_mmu != &vcpu->kvm->arch.mmu && + vcpu->arch.hw_mmu->nested_stage2_enabled); +} + #endif /* __ARM64_KVM_EMULATE_H__ */ diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index 19796d4b0798..33a25ac0e258 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -76,9 +76,28 @@ struct kvm_s2_trans { u64 upper_attr; }; +static inline phys_addr_t kvm_s2_trans_output(struct kvm_s2_trans *trans) +{ + return trans->output; +} + +static inline unsigned long kvm_s2_trans_size(struct kvm_s2_trans *trans) +{ + return trans->block_size; +} + +static inline u32 kvm_s2_trans_esr(struct kvm_s2_trans *trans) +{ + return trans->esr; +} + extern int kvm_walk_nested_s2(struct kvm_vcpu *vcpu, phys_addr_t gipa, struct kvm_s2_trans *result); +extern int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu, + struct kvm_s2_trans *trans); +extern int kvm_inject_s2_fault(struct kvm_vcpu *vcpu, u64 esr_el2); +int handle_wfx_nested(struct kvm_vcpu *vcpu, bool is_wfe); extern bool __forward_traps(struct kvm_vcpu *vcpu, unsigned int reg, u64 control_bit); extern bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit); diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index b2612763abc1..e08001a45a89 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1251,14 +1251,16 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma) } static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, - struct kvm_memory_slot *memslot, unsigned long hva, - unsigned long fault_status) + struct kvm_s2_trans *nested, + struct kvm_memory_slot *memslot, + unsigned long hva, unsigned long fault_status) { int ret = 0; bool write_fault, writable, force_pte = false; bool exec_fault, mte_allowed; bool device = false; unsigned long mmu_seq; + phys_addr_t ipa = fault_ipa; struct kvm *kvm = vcpu->kvm; struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache; struct vm_area_struct *vma; @@ -1343,10 +1345,38 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, } vma_pagesize = 1UL << vma_shift; + + if (nested) { + unsigned long max_map_size; + + max_map_size = force_pte ? PAGE_SIZE : PUD_SIZE; + + ipa = kvm_s2_trans_output(nested); + + /* + * If we're about to create a shadow stage 2 entry, then we + * can only create a block mapping if the guest stage 2 page + * table uses at least as big a mapping. + */ + max_map_size = min(kvm_s2_trans_size(nested), max_map_size); + + /* + * Be careful: if the mapping size falls between + * two host sizes, take the smaller of the two.
+ */ + if (max_map_size >= PMD_SIZE && max_map_size < PUD_SIZE) + max_map_size = PMD_SIZE; + else if (max_map_size >= PAGE_SIZE && max_map_size < PMD_SIZE) + max_map_size = PAGE_SIZE; + + force_pte = (max_map_size == PAGE_SIZE); + vma_pagesize = min(vma_pagesize, (long)max_map_size); + } + if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE) fault_ipa &= ~(vma_pagesize - 1); - gfn = fault_ipa >> PAGE_SHIFT; + gfn = ipa >> PAGE_SHIFT; mte_allowed = kvm_vma_mte_allowed(vma); /* Don't use the VMA after the unlock -- it may have vanished */ @@ -1497,8 +1527,10 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa) */ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) { + struct kvm_s2_trans nested_trans, *nested = NULL; unsigned long fault_status; - phys_addr_t fault_ipa; + phys_addr_t fault_ipa; /* The address we faulted on */ + phys_addr_t ipa; /* Always the IPA in the L1 guest phys space */ struct kvm_memory_slot *memslot; unsigned long hva; bool is_iabt, write_fault, writable; @@ -1507,7 +1539,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) fault_status = kvm_vcpu_trap_get_fault_type(vcpu); - fault_ipa = kvm_vcpu_get_fault_ipa(vcpu); + ipa = fault_ipa = kvm_vcpu_get_fault_ipa(vcpu); is_iabt = kvm_vcpu_trap_is_iabt(vcpu); if (fault_status == ESR_ELx_FSC_FAULT) { @@ -1548,6 +1580,12 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) if (fault_status != ESR_ELx_FSC_FAULT && fault_status != ESR_ELx_FSC_PERM && fault_status != ESR_ELx_FSC_ACCESS) { + /* + * We must never see an address size fault on a shadow stage 2 + * page table walk, because we would have injected an address + * size fault when we walked the nested s2 page and not + * created the shadow entry. + */ kvm_err("Unsupported FSC: EC=%#x xFSC=%#lx ESR_EL2=%#lx\n", kvm_vcpu_trap_get_class(vcpu), (unsigned long)kvm_vcpu_trap_get_fault(vcpu), @@ -1557,7 +1595,37 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) idx = srcu_read_lock(&vcpu->kvm->srcu); - gfn = fault_ipa >> PAGE_SHIFT; + /* + * We may have faulted on a shadow stage 2 page table if we are + * running a nested guest. In this case, we have to resolve the L2 + * IPA to the L1 IPA first, before knowing what kind of memory should + * back the L1 IPA. + * + * If the shadow stage 2 page table walk faults, then we simply inject + * it into the guest and carry on. + */ + if (kvm_is_shadow_s2_fault(vcpu)) { + u32 esr; + + ret = kvm_walk_nested_s2(vcpu, fault_ipa, &nested_trans); + if (ret) { + esr = kvm_s2_trans_esr(&nested_trans); + kvm_inject_s2_fault(vcpu, esr); + goto out_unlock; + } + + ret = kvm_s2_handle_perm_fault(vcpu, &nested_trans); + if (ret) { + esr = kvm_s2_trans_esr(&nested_trans); + kvm_inject_s2_fault(vcpu, esr); + goto out_unlock; + } + + ipa = kvm_s2_trans_output(&nested_trans); + nested = &nested_trans; + } + + gfn = ipa >> PAGE_SHIFT; memslot = gfn_to_memslot(vcpu->kvm, gfn); hva = gfn_to_hva_memslot_prot(memslot, gfn, &writable); write_fault = kvm_is_write_fault(vcpu); @@ -1601,13 +1669,13 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) * faulting VA. This is always 12 bits, irrespective * of the page size.
*/ - fault_ipa |= kvm_vcpu_get_hfar(vcpu) & ((1 << 12) - 1); - ret = io_mem_abort(vcpu, fault_ipa); + ipa |= kvm_vcpu_get_hfar(vcpu) & ((1 << 12) - 1); + ret = io_mem_abort(vcpu, ipa); goto out_unlock; } /* Userspace should not be able to register out-of-bounds IPAs */ - VM_BUG_ON(fault_ipa >= kvm_phys_size(vcpu->kvm)); + VM_BUG_ON(ipa >= kvm_phys_size(vcpu->kvm)); if (fault_status == ESR_ELx_FSC_ACCESS) { handle_access_fault(vcpu, fault_ipa); @@ -1615,7 +1683,8 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu) goto out_unlock; } - ret = user_mem_abort(vcpu, fault_ipa, memslot, hva, fault_status); + ret = user_mem_abort(vcpu, fault_ipa, nested, + memslot, hva, fault_status); if (ret == 0) ret = 1; out: diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index 1e5eb8140012..1cf2ad18a5cd 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -112,6 +112,15 @@ static u32 compute_fsc(int level, u32 fsc) return fsc | (level & 0x3); } +static int esr_s2_fault(struct kvm_vcpu *vcpu, int level, u32 fsc) +{ + u32 esr; + + esr = kvm_vcpu_get_esr(vcpu) & ~ESR_ELx_FSC; + esr |= compute_fsc(level, fsc); + return esr; +} + static int check_base_s2_limits(struct s2_walk_info *wi, int level, int input_size, int stride) { @@ -478,6 +487,45 @@ void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu) } } +/* + * Returns non-zero if permission fault is handled by injecting it to the next + * level hypervisor. + */ +int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu, struct kvm_s2_trans *trans) +{ + unsigned long fault_status = kvm_vcpu_trap_get_fault_type(vcpu); + bool forward_fault = false; + + trans->esr = 0; + + if (fault_status != ESR_ELx_FSC_PERM) + return 0; + + if (kvm_vcpu_trap_is_iabt(vcpu)) { + forward_fault = (trans->upper_attr & BIT(54)); + } else { + bool write_fault = kvm_is_write_fault(vcpu); + + forward_fault = ((write_fault && !trans->writable) || + (!write_fault && !trans->readable)); + } + + if (forward_fault) { + trans->esr = esr_s2_fault(vcpu, trans->level, ESR_ELx_FSC_PERM); + return 1; + } + + return 0; +} + +int kvm_inject_s2_fault(struct kvm_vcpu *vcpu, u64 esr_el2) +{ + vcpu_write_sys_reg(vcpu, vcpu->arch.fault.far_el2, FAR_EL2); + vcpu_write_sys_reg(vcpu, vcpu->arch.fault.hpfar_el2, HPFAR_EL2); + + return kvm_inject_nested_sync(vcpu, esr_el2); +} + void kvm_arch_flush_shadow_all(struct kvm *kvm) { int i; From patchwork Wed Apr 5 15:39:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202044 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F5D5C77B60 for ; Wed, 5 Apr 2023 15:42:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238977AbjDEPmJ (ORCPT ); Wed, 5 Apr 2023 11:42:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48344 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238967AbjDEPmH (ORCPT ); Wed, 5 Apr 2023 11:42:07 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 48BDC65AB for ; Wed, 5 Apr 2023 08:41:51 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
dfw.source.kernel.org (Postfix) with ESMTPS id 1EE8663EDE for ; Wed, 5 Apr 2023 15:41:05 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8755AC433D2; Wed, 5 Apr 2023 15:41:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709264; bh=yT9wXbPiMdT1IZpoNETik8P5JTreoipBVd7VTRLDztw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Isttl4B29P9T1uQfSywxLfKn0JbUSDLINzjydZG5sou9TOVW2ePq5WpwDbr3TxhCx 15GF2K35cnuKCAKlgL1Pla4ZFfhfHg2zGzLAb1gVWFgHEhDtX0AE9fGTfc/UBZsIGK 3hSPeEhu0/uQNDhFksCflU7dEoWE0wFm+3jBZ5y5tvIkr1kxT7o+zjV528+WJOT4rA I93PCEepZ7Ve5zVAeZ+Mz/fQ0DJhbGx91ksNelqfAM+c3wH+V64btkCT92fQ+sl/WK fsPIIbALBEnEqoSfTPoPbakA0wDXJ0UQQvLk0MaKE7kASUK0YMs+jIkjc+TfJXYUEl 40f97meh+Wwjg== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FP-0062PV-TG; Wed, 05 Apr 2023 16:40:31 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 18/50] KVM: arm64: nv: Restrict S2 RD/WR permissions to match the guest's Date: Wed, 5 Apr 2023 16:39:36 +0100 Message-Id: <20230405154008.3552854-19-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When mapping a page in a shadow stage-2, special care must be taken not to be more permissive than the guest is (writable or readable page when the guest hasn't set that permission). 
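In other words, the effective shadow stage-2 permission is the intersection of what the host would grant with what the guest hypervisor's own stage-2 grants for the corresponding L1 IPA. A minimal sketch of the combining rule (hypothetical helper for illustration; the diff below open-codes it directly in user_mem_abort()):

/* Illustration only: intersect host and guest stage-2 permissions. */
static void restrict_to_guest_s2(bool *writable, enum kvm_pgtable_prot *prot,
				 struct kvm_s2_trans *guest_s2)
{
	*writable &= kvm_s2_trans_writable(guest_s2);
	if (!kvm_s2_trans_readable(guest_s2))
		*prot &= ~KVM_PGTABLE_PROT_R;
}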
Reviewed-by: Alexandru Elisei Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_nested.h | 15 +++++++++++++++ arch/arm64/kvm/mmu.c | 14 +++++++++++++- arch/arm64/kvm/nested.c | 2 +- 3 files changed, 29 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index 33a25ac0e258..d414749cb791 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -91,6 +91,21 @@ static inline u32 kvm_s2_trans_esr(struct kvm_s2_trans *trans) return trans->esr; } +static inline bool kvm_s2_trans_readable(struct kvm_s2_trans *trans) +{ + return trans->readable; +} + +static inline bool kvm_s2_trans_writable(struct kvm_s2_trans *trans) +{ + return trans->writable; +} + +static inline bool kvm_s2_trans_executable(struct kvm_s2_trans *trans) +{ + return !(trans->upper_attr & BIT(54)); +} + extern int kvm_walk_nested_s2(struct kvm_vcpu *vcpu, phys_addr_t gipa, struct kvm_s2_trans *result); diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index e08001a45a89..6ef930365cc3 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1425,6 +1425,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (exec_fault && device) return -ENOEXEC; + /* + * Potentially reduce shadow S2 permissions to match the guest's own + * S2. For exec faults, we'd only reach this point if the guest + * actually allowed it (see kvm_s2_handle_perm_fault). + */ + if (nested) { + writable &= kvm_s2_trans_writable(nested); + if (!kvm_s2_trans_readable(nested)) + prot &= ~KVM_PGTABLE_PROT_R; + } + read_lock(&kvm->mmu_lock); pgt = vcpu->arch.hw_mmu->pgt; if (mmu_invalidate_retry(kvm, mmu_seq)) @@ -1467,7 +1478,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, if (device) prot |= KVM_PGTABLE_PROT_DEVICE; - else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC)) + else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC) && + (!nested || kvm_s2_trans_executable(nested))) prot |= KVM_PGTABLE_PROT_X; /* diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index 1cf2ad18a5cd..bca209cf3fac 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -502,7 +502,7 @@ int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu, struct kvm_s2_trans *trans) return 0; if (kvm_vcpu_trap_is_iabt(vcpu)) { - forward_fault = (trans->upper_attr & BIT(54)); + forward_fault = !kvm_s2_trans_executable(trans); } else { bool write_fault = kvm_is_write_fault(vcpu); From patchwork Wed Apr 5 15:39:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202039 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A4CCC77B6C for ; Wed, 5 Apr 2023 15:41:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238952AbjDEPlz (ORCPT ); Wed, 5 Apr 2023 11:41:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47552 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238944AbjDEPlv (ORCPT ); Wed, 5 Apr 2023 11:41:51 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6ADA7AC for ; Wed, 5 Apr 2023 08:41:41 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org 
[52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8C49C63EE0 for ; Wed, 5 Apr 2023 15:41:00 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D689AC433EF; Wed, 5 Apr 2023 15:40:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709260; bh=SNsRQLuosDp6ca8xk50yttdRmZgnFUo3rnmCUR30FB0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Y4ns/M33Z6ZuvnEyRRZ1vBT68EibosLMhZFMmeDbwbFqFZliIGMKSGsZbEV7gKuTG llHwBQw0804uleamRnEi4GyHokPuyAneJJdQzwpDbnOFlPAoYAekD04gC0f9tt5yaE cAXKh2BXtfuGDIzRfssRsm9NSkqMDyGJnbS4FwzvdKm3WMoS20ODTOhRjn7ydy/EZ1 0AnXJSbSpGSe5ouULr/OPeMAjvZObOFyaBtDnQKetINeuRsFOs7oaufFBqzdNz+j4o j5SH/DAb1IXhTs+fDvHoK4ysBVWXUT9d2aDV5PPCkFSp4MOGiB+f+PfKRu+c5iTyIr wcwmp/xBeK4mg== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FQ-0062PV-7M; Wed, 05 Apr 2023 16:40:32 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 19/50] KVM: arm64: nv: Unmap/flush shadow stage 2 page tables Date: Wed, 5 Apr 2023 16:39:37 +0100 Message-Id: <20230405154008.3552854-20-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Christoffer Dall Unmap/flush shadow stage 2 page tables for the nested VMs as well as the stage 2 page table for the guest hypervisor. Note: A bunch of the code in mmu.c relating to MMU notifiers is currently dealt with in an extremely abrupt way, for example by clearing out an entire shadow stage-2 table. This will be handled in a more efficient way using the reverse mapping feature in a later version of the patch series. 
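Until that reverse mapping exists, every notifier-driven invalidation visits each allocated shadow MMU and applies the operation to its whole IPA range. A sketch of this coarse pattern (a hypothetical generalisation; the diff below open-codes one such loop each for write-protect, unmap and flush):

/*
 * Illustration only: walk every valid shadow stage-2 MMU and apply
 * @fn to its entire IPA range, under kvm->mmu_lock.
 */
static void nested_s2_apply_all(struct kvm *kvm,
				void (*fn)(struct kvm_s2_mmu *mmu,
					   phys_addr_t start, phys_addr_t end))
{
	int i;

	for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
		struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];

		if (kvm_s2_mmu_valid(mmu))
			fn(mmu, 0, kvm_phys_size(kvm));
	}
}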
Signed-off-by: Christoffer Dall Signed-off-by: Jintack Lim Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_mmu.h | 3 +++ arch/arm64/include/asm/kvm_nested.h | 3 +++ arch/arm64/kvm/mmu.c | 30 ++++++++++++++++++---- arch/arm64/kvm/nested.c | 39 +++++++++++++++++++++++++++++ 4 files changed, 70 insertions(+), 5 deletions(-) diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index cbb198ebd2a0..2d068a1d097b 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -169,6 +169,8 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size, void __iomem **haddr); int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size, void **haddr); +void kvm_stage2_flush_range(struct kvm_s2_mmu *mmu, + phys_addr_t addr, phys_addr_t end); void __init free_hyp_pgds(void); void kvm_unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size); @@ -177,6 +179,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu); int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, phys_addr_t pa, unsigned long size, bool writable); +void kvm_stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end); int kvm_handle_guest_abort(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index d414749cb791..bcf5fb80c033 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -112,6 +112,9 @@ extern int kvm_walk_nested_s2(struct kvm_vcpu *vcpu, phys_addr_t gipa, extern int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu, struct kvm_s2_trans *trans); extern int kvm_inject_s2_fault(struct kvm_vcpu *vcpu, u64 esr_el2); +extern void kvm_nested_s2_wp(struct kvm *kvm); +extern void kvm_nested_s2_unmap(struct kvm *kvm); +extern void kvm_nested_s2_flush(struct kvm *kvm); int handle_wfx_nested(struct kvm_vcpu *vcpu, bool is_wfe); extern bool __forward_traps(struct kvm_vcpu *vcpu, unsigned int reg, u64 control_bit); diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 6ef930365cc3..968cdef56f76 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -245,13 +245,19 @@ void kvm_unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size) __unmap_stage2_range(mmu, start, size, true); } +void kvm_stage2_flush_range(struct kvm_s2_mmu *mmu, + phys_addr_t addr, phys_addr_t end) +{ + stage2_apply_range_resched(mmu, addr, end, kvm_pgtable_stage2_flush); +} + static void stage2_flush_memslot(struct kvm *kvm, struct kvm_memory_slot *memslot) { phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT; phys_addr_t end = addr + PAGE_SIZE * memslot->npages; - stage2_apply_range_resched(&kvm->arch.mmu, addr, end, kvm_pgtable_stage2_flush); + kvm_stage2_flush_range(&kvm->arch.mmu, addr, end); } /** @@ -274,6 +280,8 @@ static void stage2_flush_vm(struct kvm *kvm) kvm_for_each_memslot(memslot, bkt, slots) stage2_flush_memslot(kvm, memslot); + kvm_nested_s2_flush(kvm); + write_unlock(&kvm->mmu_lock); srcu_read_unlock(&kvm->srcu, idx); } @@ -887,6 +895,8 @@ void stage2_unmap_vm(struct kvm *kvm) kvm_for_each_memslot(memslot, bkt, slots) stage2_unmap_memslot(kvm, memslot); + kvm_nested_s2_unmap(kvm); + write_unlock(&kvm->mmu_lock); mmap_read_unlock(current->mm); srcu_read_unlock(&kvm->srcu, idx); @@ -985,12 +995,12 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, } /** - * stage2_wp_range() - write protect stage2 memory region range + * 
kvm_stage2_wp_range() - write protect stage2 memory region range * @mmu: The KVM stage-2 MMU pointer * @addr: Start address of range * @end: End address of range */ -static void stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end) +void kvm_stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end) { stage2_apply_range_resched(mmu, addr, end, kvm_pgtable_stage2_wrprotect); } @@ -1021,7 +1031,8 @@ static void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot) end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT; write_lock(&kvm->mmu_lock); - stage2_wp_range(&kvm->arch.mmu, start, end); + kvm_stage2_wp_range(&kvm->arch.mmu, start, end); + kvm_nested_s2_wp(kvm); write_unlock(&kvm->mmu_lock); kvm_flush_remote_tlbs(kvm); } @@ -1045,7 +1056,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, phys_addr_t start = (base_gfn + __ffs(mask)) << PAGE_SHIFT; phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT; - stage2_wp_range(&kvm->arch.mmu, start, end); + kvm_stage2_wp_range(&kvm->arch.mmu, start, end); } /* @@ -1060,6 +1071,7 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, gfn_t gfn_offset, unsigned long mask) { kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask); + kvm_nested_s2_wp(kvm); } static void kvm_send_hwpoison_signal(unsigned long address, short lsb) @@ -1718,6 +1730,7 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) (range->end - range->start) << PAGE_SHIFT, range->may_block); + kvm_nested_s2_unmap(kvm); return false; } @@ -1752,6 +1765,7 @@ bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) PAGE_SIZE, __pfn_to_phys(pfn), KVM_PGTABLE_PROT_R, NULL, 0); + kvm_nested_s2_unmap(kvm); return false; } @@ -1770,6 +1784,11 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) range->start << PAGE_SHIFT); pte = __pte(kpte); return pte_valid(pte) && pte_young(pte); + + /* + * TODO: Handle nested_mmu structures here using the reverse mapping in + * a later version of patch series. 
+ */ } bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) @@ -2002,6 +2021,7 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm, write_lock(&kvm->mmu_lock); kvm_unmap_stage2_range(&kvm->arch.mmu, gpa, size); + kvm_nested_s2_unmap(kvm); write_unlock(&kvm->mmu_lock); } diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index bca209cf3fac..3340850e0fcf 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -526,6 +526,45 @@ int kvm_inject_s2_fault(struct kvm_vcpu *vcpu, u64 esr_el2) return kvm_inject_nested_sync(vcpu, esr_el2); } +/* expects kvm->mmu_lock to be held */ +void kvm_nested_s2_wp(struct kvm *kvm) +{ + int i; + + for (i = 0; i < kvm->arch.nested_mmus_size; i++) { + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i]; + + if (kvm_s2_mmu_valid(mmu)) + kvm_stage2_wp_range(mmu, 0, kvm_phys_size(kvm)); + } +} + +/* expects kvm->mmu_lock to be held */ +void kvm_nested_s2_unmap(struct kvm *kvm) +{ + int i; + + for (i = 0; i < kvm->arch.nested_mmus_size; i++) { + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i]; + + if (kvm_s2_mmu_valid(mmu)) + kvm_unmap_stage2_range(mmu, 0, kvm_phys_size(kvm)); + } +} + +/* expects kvm->mmu_lock to be held */ +void kvm_nested_s2_flush(struct kvm *kvm) +{ + int i; + + for (i = 0; i < kvm->arch.nested_mmus_size; i++) { + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i]; + + if (kvm_s2_mmu_valid(mmu)) + kvm_stage2_flush_range(mmu, 0, kvm_phys_size(kvm)); + } +} + void kvm_arch_flush_shadow_all(struct kvm *kvm) { int i; From patchwork Wed Apr 5 15:39:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202050 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3B322C76188 for ; Wed, 5 Apr 2023 15:42:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238975AbjDEPmW (ORCPT ); Wed, 5 Apr 2023 11:42:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48938 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238981AbjDEPmU (ORCPT ); Wed, 5 Apr 2023 11:42:20 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE9A572A8 for ; Wed, 5 Apr 2023 08:42:00 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 48E1B63EEA for ; Wed, 5 Apr 2023 15:41:17 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B1718C4339B; Wed, 5 Apr 2023 15:41:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709276; bh=mbfRSCVS0+LD0xUmsRpeGuUS2JXLEcE0dUkMqujNWEA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hmo4XSYvv7wVxbwsPK4wrVk9jsAgSxBHyi43axSPMlQ6cd3wg90HfR+LWxQB1yCN4 4/Eu84LDHSgg8NwPYlf1xurV4HF5+DpLxiFl4XyBHsxswJ4PH3FMyt3irMUWh5vVNa /0Cg8rbotqFt7vcNEMNV5nWyzd/bl7tpBuwuLCKecBEZbkYr8by8eXqnxa1MVwoHzW ySUCuHuKyNIXtuu3h9mpqaTwx3nS/RUrTFTayEDIYmd0jsXj//Lh1eVrQ5EdL0ShoO AvbE+ZxT8JOSdk0ow/qHGnmPG4g6sT8aVlIY88c6wM+rQ2TzTJrHzmPqYyW9+9X2T4 hrj3Qequ5o6GQ== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by 
disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FQ-0062PV-GE; Wed, 05 Apr 2023 16:40:32 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 20/50] KVM: arm64: nv: Set a handler for the system instruction traps Date: Wed, 5 Apr 2023 16:39:38 +0100 Message-Id: <20230405154008.3552854-21-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When the HCR.NV bit is set, execution of the EL2 translation regime address translation instructions and TLB maintenance instructions is trapped to EL2. In addition, execution of the EL1 translation regime address translation instructions and TLB maintenance instructions that are only accessible from EL2 and above is trapped to EL2. In these cases, ESR_EL2.EC will be set to 0x18. Rework the system instruction emulation framework to handle potentially all system instruction traps other than MSR/MRS instructions. Those system instructions would be AT and TLBI instructions controlled by HCR_EL2.NV, AT, and TTLB bits. Signed-off-by: Jintack Lim [maz: squashed two patches together, redispatched various bits around] Signed-off-by: Marc Zyngier --- arch/arm64/kvm/sys_regs.c | 47 +++++++++++++++++++++++++++++++-------- 1 file changed, 38 insertions(+), 9 deletions(-) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 548f637e57ac..900b96970452 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1993,10 +1993,6 @@ static bool access_spsr_el2(struct kvm_vcpu *vcpu, * guest...
*/ static const struct sys_reg_desc sys_reg_descs[] = { - { SYS_DESC(SYS_DC_ISW), access_dcsw }, - { SYS_DESC(SYS_DC_CSW), access_dcsw }, - { SYS_DESC(SYS_DC_CISW), access_dcsw }, - DBG_BCR_BVR_WCR_WVR_EL1(0), DBG_BCR_BVR_WCR_WVR_EL1(1), { SYS_DESC(SYS_MDCCINT_EL1), trap_debug_regs, reset_val, MDCCINT_EL1, 0 }, @@ -2475,6 +2471,12 @@ static const struct sys_reg_desc sys_reg_descs[] = { EL2_REG(SP_EL2, NULL, reset_unknown, 0), }; +static struct sys_reg_desc sys_insn_descs[] = { + { SYS_DESC(SYS_DC_ISW), access_dcsw }, + { SYS_DESC(SYS_DC_CSW), access_dcsw }, + { SYS_DESC(SYS_DC_CISW), access_dcsw }, +}; + static bool trap_dbgdidr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) @@ -3164,6 +3166,24 @@ static bool emulate_sys_reg(struct kvm_vcpu *vcpu, return false; } +static int emulate_sys_instr(struct kvm_vcpu *vcpu, struct sys_reg_params *p) +{ + const struct sys_reg_desc *r; + + /* Search from the system instruction table. */ + r = find_reg(p, sys_insn_descs, ARRAY_SIZE(sys_insn_descs)); + + if (likely(r)) { + perform_access(vcpu, p, r); + } else { + kvm_err("Unsupported guest sys instruction at: %lx\n", + *vcpu_pc(vcpu)); + print_sys_reg_instr(p); + kvm_inject_undefined(vcpu); + } + return 1; +} + /** * kvm_reset_sys_regs - sets system registers to reset value * @vcpu: The VCPU pointer @@ -3181,7 +3201,8 @@ void kvm_reset_sys_regs(struct kvm_vcpu *vcpu) } /** - * kvm_handle_sys_reg -- handles a mrs/msr trap on a guest sys_reg access + * kvm_handle_sys_reg -- handles a system instruction or mrs/msr instruction + * trap on a guest execution * @vcpu: The VCPU pointer */ int kvm_handle_sys_reg(struct kvm_vcpu *vcpu) @@ -3195,12 +3216,19 @@ int kvm_handle_sys_reg(struct kvm_vcpu *vcpu) params = esr_sys64_to_params(esr); params.regval = vcpu_get_reg(vcpu, Rt); - if (!emulate_sys_reg(vcpu, ¶ms)) + /* System register? 
*/ + if (params.Op0 == 2 || params.Op0 == 3) { + if (!emulate_sys_reg(vcpu, ¶ms)) + return 1; + + if (!params.is_write) + vcpu_set_reg(vcpu, Rt, params.regval); + return 1; + } - if (!params.is_write) - vcpu_set_reg(vcpu, Rt, params.regval); - return 1; + /* Hints, PSTATE (Op0 == 0) and System instructions (Op0 == 1) */ + return emulate_sys_instr(vcpu, ¶ms); } /****************************************************************************** @@ -3592,6 +3620,7 @@ int __init kvm_sys_reg_table_init(void) valid &= check_sysreg_table(cp15_regs, ARRAY_SIZE(cp15_regs), true); valid &= check_sysreg_table(cp15_64_regs, ARRAY_SIZE(cp15_64_regs), true); valid &= check_sysreg_table(invariant_sys_regs, ARRAY_SIZE(invariant_sys_regs), false); + valid &= check_sysreg_table(sys_insn_descs, ARRAY_SIZE(sys_insn_descs), false); if (!valid) return -EINVAL; From patchwork Wed Apr 5 15:39:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202041 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B1B5C77B60 for ; Wed, 5 Apr 2023 15:42:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238944AbjDEPmD (ORCPT ); Wed, 5 Apr 2023 11:42:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48212 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238935AbjDEPmC (ORCPT ); Wed, 5 Apr 2023 11:42:02 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E45A96EA0 for ; Wed, 5 Apr 2023 08:41:46 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C52F363EED for ; Wed, 5 Apr 2023 15:41:22 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E9E4EC433D2; Wed, 5 Apr 2023 15:41:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709282; bh=rIVcG44fLsjhhBejkFRUjrQAp0+Pfvmabnlh9VRw4wo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MVFKGke9PZAd+YQSdI++sramSK3+zApTo3RSek9v8ur4yCpd0ls/bMrRJnJDADfQH a/85mIsgvLT9bP9aE9qe5itYkK/DwK4yGpfLk0rRMS/fIHu3VtrVEabbvaTs6A0XpR qGGopECu0HYrQomwA9MzD6TwKLOnqVXuv6xRgwGgMqlmwzEk5ecPIqITu//kt2lMbO S/aUMCSDXPq1iYWWtiP2fGUt2egpjwU8H6O/TH34XamV18S+p8MudcHvkDYm6hOZeX wHbVjz/8tIvq2+Y2SBCkxnnBzJS5KlSpAMo/cm73LZvY9jJNQF49+T98KfK/Kkz512 QxPi6qWYJ9eKA== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FQ-0062PV-Ot; Wed, 05 Apr 2023 16:40:32 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 21/50] KVM: arm64: nv: Trap and emulate AT instructions from virtual EL2 Date: Wed, 5 Apr 2023 16:39:39 +0100 Message-Id: <20230405154008.3552854-22-maz@kernel.org> X-Mailer: 
git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When supporting nested virtualization, AT instructions executed by a guest hypervisor must be trapped and emulated by the host hypervisor, because untrapped AT instructions operating on S1E1 will use the wrong translation regime (the one used to emulate virtual EL2 in EL1 instead of virtual EL1) and AT instructions operating on S12 will not work from EL1. This patch does several things. 1. List and define all AT system instructions to emulate and document the emulation design. 2. Implement AT instruction handling logic in EL2. This will be used to emulate AT instructions executed in the virtual EL2. AT instruction emulation works by loading the proper processor context, which depends on the trapped instruction and the virtual HCR_EL2, into the EL1 virtual memory control registers and executing AT instructions. Note that ctxt->hw_sys_regs is expected to have the proper processor context before calling the handling function (__kvm_at_insn) implemented in this patch. 3. Emulate AT S1E[01] instructions by issuing the same instructions in EL2. We set the physical EL1 registers, NV and NV1 bits as described in the AT instruction emulation overview. 4. Emulate AT S12E[01] instructions in two steps: First, do the stage-1 translation by reusing the existing AT emulation functions. Second, do the stage-2 translation by walking the guest hypervisor's stage-2 page table in software. Record the translation result in PAR_EL1. 5. Emulate AT S1E2 instructions by issuing the corresponding S1E1 instructions in EL2. We set the physical EL1 registers and the HCR_EL2 register as described in the AT instruction emulation overview. 6. Forward system instruction traps to the virtual EL2 if the corresponding virtual AT bit is set in the virtual HCR_EL2.
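At its core, the emulation in steps 2-5 is a save/switch/execute/restore sequence around the hardware AT instruction. A condensed sketch, for illustration only (the real __kvm_at_s1e01() in at.c below additionally handles the E2H/TGE and PAN cases and runs under the mmu_lock):

static void emulate_at_s1e1r(struct kvm_vcpu *vcpu, u64 vaddr)
{
	struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt;
	struct mmu_config saved;

	__mmu_config_save(&saved);		/* preserve the live MMU state */

	/* Install the virtual EL1 translation regime */
	write_sysreg_el1(ctxt_sys_reg(ctxt, TTBR0_EL1), SYS_TTBR0);
	write_sysreg_el1(ctxt_sys_reg(ctxt, TTBR1_EL1), SYS_TTBR1);
	write_sysreg_el1(ctxt_sys_reg(ctxt, TCR_EL1), SYS_TCR);
	write_sysreg_el1(ctxt_sys_reg(ctxt, SCTLR_EL1), SYS_SCTLR);
	isb();

	asm volatile("at s1e1r, %0" : : "r" (vaddr));	/* HW does the walk */
	isb();

	/* The result (translation or fault) lands in PAR_EL1 */
	ctxt_sys_reg(ctxt, PAR_EL1) = read_sysreg(par_el1);

	__mmu_config_restore(&saved);		/* put the real state back */
}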
[ Much logic above has been reworked by Marc Zyngier ] Signed-off-by: Jintack Lim Signed-off-by: Marc Zyngier Signed-off-by: Christoffer Dall --- arch/arm64/include/asm/kvm_arm.h | 2 + arch/arm64/include/asm/kvm_asm.h | 2 + arch/arm64/include/asm/sysreg.h | 17 +++ arch/arm64/kvm/Makefile | 2 +- arch/arm64/kvm/at.c | 219 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/vhe/switch.c | 13 +- arch/arm64/kvm/sys_regs.c | 226 +++++++++++++++++++++++++++++++ 7 files changed, 478 insertions(+), 3 deletions(-) create mode 100644 arch/arm64/kvm/at.c diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 680c02d8f38f..d446fccdca5a 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -20,6 +20,7 @@ #define HCR_AMVOFFEN (UL(1) << 51) #define HCR_FIEN (UL(1) << 47) #define HCR_FWB (UL(1) << 46) +#define HCR_AT (UL(1) << 44) #define HCR_NV1 (UL(1) << 43) #define HCR_NV (UL(1) << 42) #define HCR_API (UL(1) << 41) @@ -120,6 +121,7 @@ #define VTCR_EL2_TG0_16K TCR_TG0_16K #define VTCR_EL2_TG0_64K TCR_TG0_64K #define VTCR_EL2_SH0_MASK TCR_SH0_MASK +#define VTCR_EL2_SH0_SHIFT TCR_SH0_SHIFT #define VTCR_EL2_SH0_INNER TCR_SH0_INNER #define VTCR_EL2_ORGN0_MASK TCR_ORGN0_MASK #define VTCR_EL2_ORGN0_WBWA TCR_ORGN0_WBWA diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 43c3bc0f9544..373558ea24d8 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -228,6 +228,8 @@ extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa, extern void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu); extern void __kvm_timer_set_cntvoff(u64 cntvoff); +extern void __kvm_at_s1e01(struct kvm_vcpu *vcpu, u32 op, u64 vaddr); +extern void __kvm_at_s1e2(struct kvm_vcpu *vcpu, u32 op, u64 vaddr); extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index f8da9e1b0c11..473fb2c7d1f8 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -528,6 +528,23 @@ #define SYS_SP_EL2 sys_reg(3, 6, 4, 1, 0) +/* AT instructions */ +#define AT_Op0 1 +#define AT_CRn 7 + +#define OP_AT_S1E1R sys_insn(AT_Op0, 0, AT_CRn, 8, 0) +#define OP_AT_S1E1W sys_insn(AT_Op0, 0, AT_CRn, 8, 1) +#define OP_AT_S1E0R sys_insn(AT_Op0, 0, AT_CRn, 8, 2) +#define OP_AT_S1E0W sys_insn(AT_Op0, 0, AT_CRn, 8, 3) +#define OP_AT_S1E1RP sys_insn(AT_Op0, 0, AT_CRn, 9, 0) +#define OP_AT_S1E1WP sys_insn(AT_Op0, 0, AT_CRn, 9, 1) +#define OP_AT_S1E2R sys_insn(AT_Op0, 4, AT_CRn, 8, 0) +#define OP_AT_S1E2W sys_insn(AT_Op0, 4, AT_CRn, 8, 1) +#define OP_AT_S12E1R sys_insn(AT_Op0, 4, AT_CRn, 8, 4) +#define OP_AT_S12E1W sys_insn(AT_Op0, 4, AT_CRn, 8, 5) +#define OP_AT_S12E0R sys_insn(AT_Op0, 4, AT_CRn, 8, 6) +#define OP_AT_S12E0W sys_insn(AT_Op0, 4, AT_CRn, 8, 7) + /* Common SCTLR_ELx flags. 
*/ #define SCTLR_ELx_ENTP2 (BIT(60)) #define SCTLR_ELx_DSSBS (BIT(44)) diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index c0c050e53157..c2717a8f12f5 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -14,7 +14,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \ inject_fault.o va_layout.o handle_exit.o \ guest.o debug.o reset.o sys_regs.o stacktrace.o \ vgic-sys-reg-v3.o fpsimd.o pkvm.o \ - arch_timer.o trng.o vmid.o emulate-nested.o nested.o \ + arch_timer.o trng.o vmid.o emulate-nested.o nested.o at.o \ vgic/vgic.o vgic/vgic-init.o \ vgic/vgic-irqfd.o vgic/vgic-v2.o \ vgic/vgic-v3.o vgic/vgic-v4.o \ diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c new file mode 100644 index 000000000000..6d47dd409384 --- /dev/null +++ b/arch/arm64/kvm/at.c @@ -0,0 +1,219 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2017 - Linaro Ltd + * Author: Jintack Lim + */ + +#include +#include + +struct mmu_config { + u64 ttbr0; + u64 ttbr1; + u64 tcr; + u64 sctlr; + u64 vttbr; + u64 vtcr; + u64 hcr; +}; + +static void __mmu_config_save(struct mmu_config *config) +{ + config->ttbr0 = read_sysreg_el1(SYS_TTBR0); + config->ttbr1 = read_sysreg_el1(SYS_TTBR1); + config->tcr = read_sysreg_el1(SYS_TCR); + config->sctlr = read_sysreg_el1(SYS_SCTLR); + config->vttbr = read_sysreg(vttbr_el2); + config->vtcr = read_sysreg(vtcr_el2); + config->hcr = read_sysreg(hcr_el2); +} + +static void __mmu_config_restore(struct mmu_config *config) +{ + write_sysreg_el1(config->ttbr0, SYS_TTBR0); + write_sysreg_el1(config->ttbr1, SYS_TTBR1); + write_sysreg_el1(config->tcr, SYS_TCR); + write_sysreg_el1(config->sctlr, SYS_SCTLR); + write_sysreg(config->vttbr, vttbr_el2); + write_sysreg(config->vtcr, vtcr_el2); + write_sysreg(config->hcr, hcr_el2); + + isb(); +} + +void __kvm_at_s1e01(struct kvm_vcpu *vcpu, u32 op, u64 vaddr) +{ + struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt; + struct mmu_config config; + struct kvm_s2_mmu *mmu; + bool fail; + + write_lock(&vcpu->kvm->mmu_lock); + + /* + * If HCR_EL2.{E2H,TGE} == {1,1}, the MMU context is already + * the right one (as we trapped from vEL2). + */ + if (vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu)) + goto skip_mmu_switch; + + /* + * FIXME: Obtaining the S2 MMU for an L2 is horribly racy, and + * we may not find it (recycled by another vcpu, for example). + * See the other FIXME comment below about the need for a SW + * PTW in this case. + */ + mmu = lookup_s2_mmu(vcpu); + if (WARN_ON(!mmu)) + goto out; + + /* We've trapped, so everything is live on the CPU. */ + __mmu_config_save(&config); + + write_sysreg_el1(ctxt_sys_reg(ctxt, TTBR0_EL1), SYS_TTBR0); + write_sysreg_el1(ctxt_sys_reg(ctxt, TTBR1_EL1), SYS_TTBR1); + write_sysreg_el1(ctxt_sys_reg(ctxt, TCR_EL1), SYS_TCR); + write_sysreg_el1(ctxt_sys_reg(ctxt, SCTLR_EL1), SYS_SCTLR); + write_sysreg(kvm_get_vttbr(mmu), vttbr_el2); + /* + * REVISIT: do we need anything from the guest's VTCR_EL2? It + * looks like keeping the host's configuration is the right + * thing to do at this stage (and we could avoid saving and + * restoring it). Keep the host's version for now. 
+ */ + write_sysreg((config.hcr & ~HCR_TGE) | HCR_VM, hcr_el2); + + isb(); + +skip_mmu_switch: + + switch (op) { + case OP_AT_S1E1R: + case OP_AT_S1E1RP: + fail = __kvm_at("s1e1r", vaddr); + break; + case OP_AT_S1E1W: + case OP_AT_S1E1WP: + fail = __kvm_at("s1e1w", vaddr); + break; + case OP_AT_S1E0R: + fail = __kvm_at("s1e0r", vaddr); + break; + case OP_AT_S1E0W: + fail = __kvm_at("s1e0w", vaddr); + break; + default: + WARN_ON_ONCE(1); + break; + } + + if (!fail) + ctxt_sys_reg(ctxt, PAR_EL1) = read_sysreg(par_el1); + else + ctxt_sys_reg(ctxt, PAR_EL1) = SYS_PAR_EL1_F; + + /* + * Failed? let's leave the building now. + * + * FIXME: how about a failed translation because the shadow S2 + * wasn't populated? We may need to perform a SW PTW, + * populating our shadow S2 and retry the instruction. + */ + if (ctxt_sys_reg(ctxt, PAR_EL1) & SYS_PAR_EL1_F) + goto nopan; + + /* No PAN? No problem. */ + if (!vcpu_el2_e2h_is_set(vcpu) || !(*vcpu_cpsr(vcpu) & PSR_PAN_BIT)) + goto nopan; + + /* + * For PAN-involved AT operations, perform the same + * translation, using EL0 this time. + */ + switch (op) { + case OP_AT_S1E1RP: + fail = __kvm_at("s1e0r", vaddr); + break; + case OP_AT_S1E1WP: + fail = __kvm_at("s1e0w", vaddr); + break; + default: + goto nopan; + } + + /* + * If the EL0 translation has succeeded, we need to pretend + * the AT operation has failed, as the PAN setting forbids + * such a translation. + * + * FIXME: we hardcode a Level-3 permission fault. We really + * should return the real fault level. + */ + if (fail || !(read_sysreg(par_el1) & SYS_PAR_EL1_F)) + ctxt_sys_reg(ctxt, PAR_EL1) = (0xf << 1) | SYS_PAR_EL1_F; + +nopan: + if (!(vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu))) + __mmu_config_restore(&config); + +out: + write_unlock(&vcpu->kvm->mmu_lock); +} + +void __kvm_at_s1e2(struct kvm_vcpu *vcpu, u32 op, u64 vaddr) +{ + struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt; + struct mmu_config config; + struct kvm_s2_mmu *mmu; + u64 val; + + write_lock(&vcpu->kvm->mmu_lock); + + mmu = &vcpu->kvm->arch.mmu; + + /* We've trapped, so everything is live on the CPU. */ + __mmu_config_save(&config); + + if (vcpu_el2_e2h_is_set(vcpu)) { + write_sysreg_el1(ctxt_sys_reg(ctxt, TTBR0_EL2), SYS_TTBR0); + write_sysreg_el1(ctxt_sys_reg(ctxt, TTBR1_EL2), SYS_TTBR1); + write_sysreg_el1(ctxt_sys_reg(ctxt, TCR_EL2), SYS_TCR); + write_sysreg_el1(ctxt_sys_reg(ctxt, SCTLR_EL2), SYS_SCTLR); + + val = config.hcr; + } else { + write_sysreg_el1(ctxt_sys_reg(ctxt, TTBR0_EL2), SYS_TTBR0); + val = translate_tcr_el2_to_tcr_el1(ctxt_sys_reg(ctxt, TCR_EL2)); + write_sysreg_el1(val, SYS_TCR); + val = translate_sctlr_el2_to_sctlr_el1(ctxt_sys_reg(ctxt, SCTLR_EL2)); + write_sysreg_el1(val, SYS_SCTLR); + + val = config.hcr | HCR_NV | HCR_NV1; + } + + write_sysreg(kvm_get_vttbr(mmu), vttbr_el2); + /* FIXME: write S2 MMU VTCR_EL2? 
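+ * (until then, the host's VTCR_EL2 saved by __mmu_config_save() simply stays in place for the walk, mirroring the REVISIT note in __kvm_at_s1e01() above)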
*/ + write_sysreg((val & ~HCR_TGE) | HCR_VM, hcr_el2); + + isb(); + + switch (op) { + case OP_AT_S1E2R: + asm volatile("at s1e1r, %0" : : "r" (vaddr)); + break; + case OP_AT_S1E2W: + asm volatile("at s1e1w, %0" : : "r" (vaddr)); + break; + default: + WARN_ON_ONCE(1); + break; + } + + isb(); + + /* FIXME: handle failed translation due to shadow S2 */ + ctxt_sys_reg(ctxt, PAR_EL1) = read_sysreg(par_el1); + + __mmu_config_restore(&config); + write_unlock(&vcpu->kvm->mmu_lock); +} diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index 36b50e315504..6d186a3e5e5f 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -44,9 +44,10 @@ static void __activate_traps(struct kvm_vcpu *vcpu) if (!vcpu_el2_e2h_is_set(vcpu)) { /* * For a guest hypervisor on v8.0, trap and emulate - * the EL1 virtual memory control register accesses. + * the EL1 virtual memory control register accesses + * as well as the AT S1 operations. */ - hcr |= HCR_TVM | HCR_TRVM | HCR_NV1; + hcr |= HCR_TVM | HCR_TRVM | HCR_AT | HCR_NV1; } else { /* * For a guest hypervisor on v8.1 (VHE), allow to @@ -71,6 +72,14 @@ static void __activate_traps(struct kvm_vcpu *vcpu) hcr &= ~HCR_TVM; hcr |= vhcr_el2 & (HCR_TVM | HCR_TRVM); + + /* + * If we're using the EL1 translation regime + * (TGE clear), then ensure that AT S1 ops are + * trapped too. + */ + if (!vcpu_el2_tge_is_set(vcpu)) + hcr |= HCR_AT; } } diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 900b96970452..028be1db3897 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2471,10 +2471,236 @@ static const struct sys_reg_desc sys_reg_descs[] = { EL2_REG(SP_EL2, NULL, reset_unknown, 0), }; +static bool handle_s1e01(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + int sys_encoding = sys_insn(p->Op0, p->Op1, p->CRn, p->CRm, p->Op2); + + if (vcpu_has_nv(vcpu) && forward_traps(vcpu, HCR_AT)) + return false; + + __kvm_at_s1e01(vcpu, sys_encoding, p->regval); + + return true; +} + +static bool handle_s1e2(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + int sys_encoding = sys_insn(p->Op0, p->Op1, p->CRn, p->CRm, p->Op2); + + if (vcpu_has_nv(vcpu) && forward_nv_traps(vcpu)) + return false; + + __kvm_at_s1e2(vcpu, sys_encoding, p->regval); + + return true; +} + +static u64 setup_par_aborted(u32 esr) +{ + u64 par = 0; + + /* S [9]: fault in the stage 2 translation */ + par |= (1 << 9); + /* FST [6:1]: Fault status code */ + par |= (esr << 1); + /* F [0]: translation is aborted */ + par |= 1; + + return par; +} + +static u64 setup_par_completed(struct kvm_vcpu *vcpu, struct kvm_s2_trans *out) +{ + u64 par, vtcr_sh0; + + /* F [0]: Translation is completed successfully */ + par = 0; + /* ATTR [63:56] */ + par |= out->upper_attr; + /* PA [47:12] */ + par |= out->output & GENMASK_ULL(47, 12); + /* RES1 [11] */ + par |= (1UL << 11); + /* SH [8:7]: Shareability attribute */ + vtcr_sh0 = vcpu_read_sys_reg(vcpu, VTCR_EL2) & VTCR_EL2_SH0_MASK; + par |= (vtcr_sh0 >> VTCR_EL2_SH0_SHIFT) << 7; + + return par; +} + +static bool handle_s12(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r, bool write) +{ + u64 par, va; + u32 esr, op; + phys_addr_t ipa; + struct kvm_s2_trans out; + int ret; + + if (vcpu_has_nv(vcpu) && forward_nv_traps(vcpu)) + return false; + + /* Do the stage-1 translation */ + va = p->regval; + op = sys_insn(p->Op0, p->Op1, p->CRn, p->CRm, p->Op2); + switch (op) { + case
OP_AT_S12E1R: + op = OP_AT_S1E1R; + break; + case OP_AT_S12E1W: + op = OP_AT_S1E1W; + break; + case OP_AT_S12E0R: + op = OP_AT_S1E0R; + break; + case OP_AT_S12E0W: + op = OP_AT_S1E0W; + break; + default: + WARN_ON_ONCE(1); + return true; + } + + __kvm_at_s1e01(vcpu, op, va); + par = vcpu_read_sys_reg(vcpu, PAR_EL1); + if (par & 1) { + /* The stage-1 translation aborted */ + return true; + } + + /* Do the stage-2 translation */ + ipa = (par & GENMASK_ULL(47, 12)) | (va & GENMASK_ULL(11, 0)); + out.esr = 0; + ret = kvm_walk_nested_s2(vcpu, ipa, &out); + if (ret < 0) + return false; + + /* Check if the stage-2 PTW is aborted */ + if (out.esr) { + esr = out.esr; + goto s2_trans_abort; + } + + /* Check the access permission */ + if ((!write && !out.readable) || (write && !out.writable)) { + esr = ESR_ELx_FSC_PERM; + esr |= out.level & 0x3; + goto s2_trans_abort; + } + + vcpu_write_sys_reg(vcpu, setup_par_completed(vcpu, &out), PAR_EL1); + return true; + +s2_trans_abort: + vcpu_write_sys_reg(vcpu, setup_par_aborted(esr), PAR_EL1); + return true; +} + +static bool handle_s12r(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + return handle_s12(vcpu, p, r, false); +} + +static bool handle_s12w(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + return handle_s12(vcpu, p, r, true); +} + +/* + * AT instruction emulation + * + * We emulate AT instructions executed in the virtual EL2. + * Basic strategy for the stage-1 translation emulation is to load proper + * context, which depends on the trapped instruction and the virtual HCR_EL2, + * to the EL1 virtual memory control registers and execute S1E[01] instructions + * in EL2. See below for more detail. + * + * For the stage-2 translation, which is necessary for S12E[01] emulation, + * we walk the guest hypervisor's stage-2 page table in software. + * + * The stage-1 translation emulations can be divided into two groups depending + * on the translation regime. + * + * 1. EL2 AT instructions: S1E2x + * +-----------------------------------------------------------------------+ + * | | Setting for the emulation | + * | Virtual HCR_EL2.E2H on trap |-----------------------------------------+ + * | | Phys EL1 regs | Phys NV, NV1 | Phys TGE | + * |-----------------------------------------------------------------------| + * | 0 | vEL2 | (1, 1) | 0 | + * | 1 | vEL2 | (0, 0) | 0 | + * +-----------------------------------------------------------------------+ + * + * We emulate the EL2 AT instructions by loading virtual EL2 context + * to the EL1 virtual memory control registers and executing corresponding + * EL1 AT instructions. + * + * We set physical NV and NV1 bits to use EL2 page table format for non-VHE + * guest hypervisor (i.e. HCR_EL2.E2H == 0). As a VHE guest hypervisor uses the + * EL1 page table format, we don't set those bits. + * + * We should clear physical TGE bit not to use the EL2 translation regime when + * the host uses the VHE feature. + * + * + * 2. 
EL0/EL1 AT instructions: S1E[01]x, S12E1x + * +----------------------------------------------------------------------+ + * | Virtual HCR_EL2 on trap | Setting for the emulation | + * |----------------------------------------------------------------------+ + * | (vE2H, vTGE) | (vNV, vNV1) | Phys EL1 regs | Phys NV, NV1 | Phys TGE | + * |----------------------------------------------------------------------| + * | (0, 0)* | (0, 0) | vEL1 | (0, 0) | 0 | + * | (0, 0) | (1, 1) | vEL1 | (1, 1) | 0 | + * | (1, 1) | (0, 0) | vEL2 | (0, 0) | 0 | + * | (1, 1) | (1, 1) | vEL2 | (1, 1) | 0 | + * +----------------------------------------------------------------------+ + * + * *For (0, 0) in the 'Virtual HCR_EL2 on trap' column, it actually means + * (1, 1). Keep them (0, 0) just for the readability. + * + * We set physical EL1 virtual memory control registers depending on + * (vE2H, vTGE) pair. When the pair is (0, 0) where AT instructions are + * supposed to use EL0/EL1 translation regime, we load the EL1 registers with + * the virtual EL1 registers (i.e. EL1 registers from the guest hypervisor's + * point of view). When the pair is (1, 1), however, AT instructions are defined + * to apply EL2 translation regime. To emulate this behavior, we load the EL1 + * registers with the virtual EL2 context. (i.e the shadow registers) + * + * We respect the virtual NV and NV1 bit for the emulation. When those bits are + * set, it means that a guest hypervisor would like to use EL2 page table format + * for the EL1 translation regime. We emulate this by setting the physical + * NV and NV1 bits. + */ + +#define SYS_INSN(insn, access_fn) \ + { \ + SYS_DESC(OP_##insn), \ + .access = (access_fn), \ + } + static struct sys_reg_desc sys_insn_descs[] = { { SYS_DESC(SYS_DC_ISW), access_dcsw }, + + SYS_INSN(AT_S1E1R, handle_s1e01), + SYS_INSN(AT_S1E1W, handle_s1e01), + SYS_INSN(AT_S1E0R, handle_s1e01), + SYS_INSN(AT_S1E0W, handle_s1e01), + SYS_INSN(AT_S1E1RP, handle_s1e01), + SYS_INSN(AT_S1E1WP, handle_s1e01), + { SYS_DESC(SYS_DC_CSW), access_dcsw }, { SYS_DESC(SYS_DC_CISW), access_dcsw }, + + SYS_INSN(AT_S1E2R, handle_s1e2), + SYS_INSN(AT_S1E2W, handle_s1e2), + SYS_INSN(AT_S12E1R, handle_s12r), + SYS_INSN(AT_S12E1W, handle_s12w), + SYS_INSN(AT_S12E0R, handle_s12r), + SYS_INSN(AT_S12E0W, handle_s12w), }; static bool trap_dbgdidr(struct kvm_vcpu *vcpu, From patchwork Wed Apr 5 15:39:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202036 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 924DFC77B6C for ; Wed, 5 Apr 2023 15:41:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238965AbjDEPlv (ORCPT ); Wed, 5 Apr 2023 11:41:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47550 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238927AbjDEPlq (ORCPT ); Wed, 5 Apr 2023 11:41:46 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CD5F16E90 for ; Wed, 5 Apr 2023 08:41:29 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
dfw.source.kernel.org (Postfix) with ESMTPS id ECA8263EE9 for ; Wed, 5 Apr 2023 15:41:19 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1A550C433D2; Wed, 5 Apr 2023 15:41:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709279; bh=LUzNr/zI7ceOhectm1++uHiPK784UqYxemqNmrO5ny0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ALIfhH8nLSm8BE8WlnzwjfY78GjP98GNqM1BgCV19EUdPhfNvbAeh+qmMNICPJPeP EgF570wEN9JmjTmRK6nhuNjC9o6b/nGTwIXmnogK66opkLT2YhKTofBXzoY6CWL6MV 41Oyc0SN6mVA3Ykm/p042f6TSpzM7Y0IcZxgrptHjFp7L3vzTWyG4vDCGAJ22XiN3C CELgjuMqdtgdjpaClayIbRctpwXnymdlOKT10/Ht1ld3YwMlUbWn5FF/WfHpR3pP4z mBog6IMISUuIQikUOcETeCRv8UEz+tqN1doXav/iyAqS1td2aYjK3O3kOaRpna29+N 8qmYl+fHvnDBg== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FR-0062PV-10; Wed, 05 Apr 2023 16:40:33 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 22/50] KVM: arm64: nv: Trap and emulate TLBI instructions from virtual EL2 Date: Wed, 5 Apr 2023 16:39:40 +0100 Message-Id: <20230405154008.3552854-23-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Jintack Lim When supporting nested virtualization a guest hypervisor executing TLBI instructions must be trapped and emulated by the host hypervisor, because the guest hypervisor can only affect physical TLB entries relating to its own execution environment (virtual EL2 in EL1) but not to the nested guests as required by the semantics of the instructions and TLBI instructions might also result in updates (invalidations) to shadow page tables. This patch does several things. 1. List and define all TLBI system instructions to emulate. 2. Emulate TLBI ALLE2(IS) instruction executed in the virtual EL2. Since we emulate the virtual EL2 in the EL1, we invalidate EL1&0 regime stage 1 TLB entries with setting vttbr_el2 having the VMID of the virtual EL2. 3. Emulate TLBI VAE2* instruction executed in the virtual EL2. Based on the same principle as TLBI ALLE2 instruction, we can simply emulate those instructions by executing corresponding VAE1* instructions with the virtual EL2's VMID assigned by the host hypervisor. 
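As a rough sketch of what points 2 and 3 boil down to (condensed from the vhe/tlb.c hunk below; the function name is illustrative, and error handling plus the full encoding switch are omitted):

	void emulate_vae2is(struct kvm_s2_mmu *mmu, u64 va)
	{
		struct tlb_inv_context cxt;

		dsb(ishst);				/* publish any page-table update first */
		__tlb_switch_to_guest(mmu, &cxt);	/* VTTBR_EL2 now carries the shadow VMID backing vEL2 */
		__tlbi(vae1is, va);			/* EL1 counterpart of TLBI VAE2IS */
		dsb(ish);
		isb();
		__tlb_switch_to_host(&cxt);
	}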
Note that we could emulate TLBI ALLE2IS precisely by invalidating only stage 1 TLB entries via the TLBI VMALLE1IS instruction, but to keep things simple we reuse the existing function, __kvm_tlb_flush_vmid(), which invalidates both stage 1 and stage 2 TLB entries. 4. The TLBI ALLE1(IS) instruction invalidates all EL1&0 regime stage 1 and 2 TLB entries (on all PEs in the same Inner Shareable domain). To emulate these instructions, we first need to clear all the mappings in the shadow page tables, since executing those instructions implies a change of mappings in the stage 2 page tables maintained by the guest hypervisor. We then need to invalidate all EL1&0 regime stage 1 and 2 TLB entries of all VMIDs, which are assigned by the host hypervisor, for this VM. 5. Based on the same principle as the TLBI ALLE1(IS) emulation, we clear the mappings in the shadow stage-2 page tables and invalidate TLB entries, but this time only for the current VMID from the guest hypervisor's perspective, not for all VMIDs. 6. Based on the same principle as the TLBI ALLE1(IS) and TLBI VMALLS12E1(IS) emulation, we clear the mappings in the shadow stage-2 page tables and invalidate TLB entries, this time for a single mapping of the current VMID from the guest hypervisor's view. 7. Forward system instruction traps to the virtual EL2 if the corresponding bit in the virtual HCR_EL2 is set. 8. Even though a guest hypervisor can execute the TLBI instructions that are accessible at EL1 without trapping, doing so is wrong: all those TLBI instructions work on the current VMID, and when running a guest hypervisor the current VMID is the one assigned to the guest hypervisor itself, not the one from the virtual vttbr_el2. Letting a guest hypervisor execute those TLBI instructions would therefore invalidate its own TLB entries while leaving the stale entries it actually targeted in place. We therefore trap and emulate those TLBI instructions. The emulation is simple: we find the shadow VMID mapped to the virtual vttbr_el2, set it in the physical vttbr_el2, then execute the same instruction in EL2. We don't set the HCR_EL2.TTLB bit yet. [ Changes performed by Marc Zyngier: The TLBI handling code more or less directly executes the same instruction that has been trapped (with an EL2->EL1 conversion in the case of an EL2 TLBI), but that's unfortunately not enough: - TLBIs must be upgraded to the Inner Shareable domain to account for vcpu migration, just like we already do with HCR_EL2.FB. - The DSB instruction that synchronises these must thus be on the Inner Shareable domain as well. - Prior to executing the TLBI, we need another DSB ISHST to make sure that the update to the page tables is visible. Ordering of system instructions fixed. - The current TLB invalidation code is pretty buggy, as it assumes a page mapping. On the contrary, it is likely that a TLB invalidation will cover more than a single page, and the size should be decided by the guest's configuration (and not the host's). Since we don't cache the guest mapping sizes in the shadow PT yet, let's assume the worst case (a block mapping) and invalidate that. Take this opportunity to fix the decoding of the parameter (it isn't a straight IPA). - In general, we always emulate local TLB invalidations as being upgraded to the Inner Shareable domain so that we can easily deal with vcpu migration. This is consistent with the fact that we set HCR_EL2.FB when running non-nested VMs. So let's emulate TLBI ALLE2 as ALLE2IS.
] [ Changes performed by Christoffer Dall: Sometimes when we are invalidating the TLB for a certain S2 MMU context, this context can also have EL2 context associated with it and we have to invalidate this too. ] Signed-off-by: Jintack Lim Signed-off-by: Christoffer Dall Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_asm.h | 2 + arch/arm64/include/asm/kvm_nested.h | 7 + arch/arm64/include/asm/sysreg.h | 36 ++++ arch/arm64/kvm/hyp/vhe/switch.c | 8 +- arch/arm64/kvm/hyp/vhe/tlb.c | 81 +++++++++ arch/arm64/kvm/mmu.c | 17 +- arch/arm64/kvm/nested.c | 35 ++++ arch/arm64/kvm/sys_regs.c | 255 ++++++++++++++++++++++++++++ 8 files changed, 436 insertions(+), 5 deletions(-) diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 373558ea24d8..d73f8d5a8a25 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -226,6 +226,8 @@ extern void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu); extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa, int level); extern void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu); +extern void __kvm_tlb_vae2is(struct kvm_s2_mmu *mmu, u64 va, u64 sys_encoding); +extern void __kvm_tlb_el1_instr(struct kvm_s2_mmu *mmu, u64 val, u64 sys_encoding); extern void __kvm_timer_set_cntvoff(u64 cntvoff); extern void __kvm_at_s1e01(struct kvm_vcpu *vcpu, u32 op, u64 vaddr); diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index bcf5fb80c033..bc783492d1c9 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -63,6 +63,13 @@ extern void kvm_init_nested(struct kvm *kvm); extern int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu); extern void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu); extern struct kvm_s2_mmu *lookup_s2_mmu(struct kvm_vcpu *vcpu); + +union tlbi_info; + +extern void kvm_s2_mmu_iterate_by_vmid(struct kvm *kvm, u16 vmid, + const union tlbi_info *info, + void (*)(struct kvm_s2_mmu *, + const union tlbi_info *)); extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu); extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 473fb2c7d1f8..c4a173b136b1 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -545,6 +545,42 @@ #define OP_AT_S12E0R sys_insn(AT_Op0, 4, AT_CRn, 8, 6) #define OP_AT_S12E0W sys_insn(AT_Op0, 4, AT_CRn, 8, 7) +/* TLBI instructions */ +#define TLBI_Op0 1 +#define TLBI_Op1_EL1 0 /* Accessible from EL1 or higher */ +#define TLBI_Op1_EL2 4 /* Accessible from EL2 or higher */ +#define TLBI_CRn 8 +#define tlbi_insn_el1(CRm, Op2) sys_insn(TLBI_Op0, TLBI_Op1_EL1, TLBI_CRn, (CRm), (Op2)) +#define tlbi_insn_el2(CRm, Op2) sys_insn(TLBI_Op0, TLBI_Op1_EL2, TLBI_CRn, (CRm), (Op2)) + +#define OP_TLBI_VMALLE1IS tlbi_insn_el1(3, 0) +#define OP_TLBI_VAE1IS tlbi_insn_el1(3, 1) +#define OP_TLBI_ASIDE1IS tlbi_insn_el1(3, 2) +#define OP_TLBI_VAAE1IS tlbi_insn_el1(3, 3) +#define OP_TLBI_VALE1IS tlbi_insn_el1(3, 5) +#define OP_TLBI_VAALE1IS tlbi_insn_el1(3, 7) +#define OP_TLBI_VMALLE1 tlbi_insn_el1(7, 0) +#define OP_TLBI_VAE1 tlbi_insn_el1(7, 1) +#define OP_TLBI_ASIDE1 tlbi_insn_el1(7, 2) +#define OP_TLBI_VAAE1 tlbi_insn_el1(7, 3) +#define OP_TLBI_VALE1 tlbi_insn_el1(7, 5) +#define OP_TLBI_VAALE1 tlbi_insn_el1(7, 7) + +#define OP_TLBI_IPAS2E1IS tlbi_insn_el2(0, 1) +#define OP_TLBI_IPAS2LE1IS tlbi_insn_el2(0, 5) +#define OP_TLBI_ALLE2IS tlbi_insn_el2(3, 0) +#define OP_TLBI_VAE2IS 
tlbi_insn_el2(3, 1) +#define OP_TLBI_ALLE1IS tlbi_insn_el2(3, 4) +#define OP_TLBI_VALE2IS tlbi_insn_el2(3, 5) +#define OP_TLBI_VMALLS12E1IS tlbi_insn_el2(3, 6) +#define OP_TLBI_IPAS2E1 tlbi_insn_el2(4, 1) +#define OP_TLBI_IPAS2LE1 tlbi_insn_el2(4, 5) +#define OP_TLBI_ALLE2 tlbi_insn_el2(7, 0) +#define OP_TLBI_VAE2 tlbi_insn_el2(7, 1) +#define OP_TLBI_ALLE1 tlbi_insn_el2(7, 4) +#define OP_TLBI_VALE2 tlbi_insn_el2(7, 5) +#define OP_TLBI_VMALLS12E1 tlbi_insn_el2(7, 6) + /* Common SCTLR_ELx flags. */ #define SCTLR_ELx_ENTP2 (BIT(60)) #define SCTLR_ELx_DSSBS (BIT(44)) diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index 6d186a3e5e5f..a17bf261d501 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -47,7 +47,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu) * the EL1 virtual memory control register accesses * as well as the AT S1 operations. */ - hcr |= HCR_TVM | HCR_TRVM | HCR_AT | HCR_NV1; + hcr |= HCR_TVM | HCR_TRVM | HCR_AT | HCR_TTLB | HCR_NV1; } else { /* * For a guest hypervisor on v8.1 (VHE), allow to @@ -75,11 +75,11 @@ static void __activate_traps(struct kvm_vcpu *vcpu) /* * If we're using the EL1 translation regime - * (TGE clear), then ensure that AT S1 ops are - * trapped too. + * (TGE clear), then ensure that AT S1 and + * TLBI E1 ops are trapped too. */ if (!vcpu_el2_tge_is_set(vcpu)) - hcr |= HCR_AT; + hcr |= HCR_AT | HCR_TTLB; } } diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c index 24cef9b87f9e..c4389db4cc22 100644 --- a/arch/arm64/kvm/hyp/vhe/tlb.c +++ b/arch/arm64/kvm/hyp/vhe/tlb.c @@ -161,3 +161,84 @@ void __kvm_flush_vm_context(void) dsb(ish); } + +void __kvm_tlb_vae2is(struct kvm_s2_mmu *mmu, u64 va, u64 sys_encoding) +{ + struct tlb_inv_context cxt; + + dsb(ishst); + + /* Switch to requested VMID */ + __tlb_switch_to_guest(mmu, &cxt); + + /* + * Execute the EL1 version of TLBI VAE2* instruction, forcing + * an upgrade to the Inner Shareable domain in order to + * perform the invalidation on all CPUs. + */ + switch (sys_encoding) { + case OP_TLBI_VAE2: + case OP_TLBI_VAE2IS: + __tlbi(vae1is, va); + break; + case OP_TLBI_VALE2: + case OP_TLBI_VALE2IS: + __tlbi(vale1is, va); + break; + default: + break; + } + dsb(ish); + isb(); + + __tlb_switch_to_host(&cxt); +} + +void __kvm_tlb_el1_instr(struct kvm_s2_mmu *mmu, u64 val, u64 sys_encoding) +{ + struct tlb_inv_context cxt; + + dsb(ishst); + + /* Switch to requested VMID */ + __tlb_switch_to_guest(mmu, &cxt); + + /* + * Execute the same instruction as the guest hypervisor did, + * expanding the scope of local TLB invalidations to the Inner + * Shareable domain so that it takes place on all CPUs. This + * is equivalent to having HCR_EL2.FB set. 
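+ * + * The dsb(ishst) at the top of this function makes any prior page-table update visible before the invalidation is issued, and the dsb(ish) + isb() below ensure the invalidation has completed across the Inner Shareable domain before we switch back.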
+ */ switch (sys_encoding) { + case OP_TLBI_VMALLE1: + case OP_TLBI_VMALLE1IS: + __tlbi(vmalle1is); + break; + case OP_TLBI_VAE1: + case OP_TLBI_VAE1IS: + __tlbi(vae1is, val); + break; + case OP_TLBI_ASIDE1: + case OP_TLBI_ASIDE1IS: + __tlbi(aside1is, val); + break; + case OP_TLBI_VAAE1: + case OP_TLBI_VAAE1IS: + __tlbi(vaae1is, val); + break; + case OP_TLBI_VALE1: + case OP_TLBI_VALE1IS: + __tlbi(vale1is, val); + break; + case OP_TLBI_VAALE1: + case OP_TLBI_VAALE1IS: + __tlbi(vaale1is, val); + break; + default: + break; + } + dsb(ish); + isb(); + + __tlb_switch_to_host(&cxt); +} diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 968cdef56f76..7fd122d687f7 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -89,7 +89,22 @@ static bool memslot_is_logging(struct kvm_memory_slot *memslot) void kvm_flush_remote_tlbs(struct kvm *kvm) { ++kvm->stat.generic.remote_tlb_flush_requests; - kvm_call_hyp(__kvm_tlb_flush_vmid, &kvm->arch.mmu); + + if (!kvm->arch.nested_mmus) { + /* + * For a normal (i.e. non-nested) guest, flush entries for the + * given VMID. + */ + kvm_call_hyp(__kvm_tlb_flush_vmid, &kvm->arch.mmu); + } else { + /* + * When supporting nested virtualization, we can have multiple + * VMIDs in play for each VCPU in the VM, so it's really not + * worth it to try to quiesce the system and flush all the + * VMIDs that may be in use; instead, just nuke the whole thing. + */ + kvm_call_hyp(__kvm_flush_vm_context); + } } static bool kvm_is_device_pfn(unsigned long pfn) diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index 3340850e0fcf..426d447f82c2 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -357,6 +357,41 @@ int kvm_walk_nested_s2(struct kvm_vcpu *vcpu, phys_addr_t gipa, return ret; } +/* + * We can have multiple *different* MMU contexts with the same VMID: + * + * - S2 being enabled or not, hence differing by the HCR_EL2.VM bit + * + * - Multiple vcpus using private S2s (huh huh...), hence differing by the + * VTTBR_EL2.BADDR address + * + * - A combination of the above... + * + * We can always identify which MMU context to pick at run-time. However, + * TLB invalidation involving a VMID must take action on all the TLBs using + * this particular VMID. This translates into applying the same invalidation + * operation to all the contexts that are using this VMID. Moar phun!
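+ * + * (Concretely: if the guest hypervisor reuses a single VMID for two different stage 2 page tables, we end up with two shadow MMU contexts carrying that VMID, and a single VMID-based invalidation performed by the guest must be applied to both.)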
+ */ +void kvm_s2_mmu_iterate_by_vmid(struct kvm *kvm, u16 vmid, + const union tlbi_info *info, + void (*tlbi_callback)(struct kvm_s2_mmu *, + const union tlbi_info *)) +{ + write_lock(&kvm->mmu_lock); + + for (int i = 0; i < kvm->arch.nested_mmus_size; i++) { + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i]; + + if (!kvm_s2_mmu_valid(mmu)) + continue; + + if (vmid == get_vmid(mmu->vttbr)) + tlbi_callback(mmu, info); + } + + write_unlock(&kvm->mmu_lock); +} + /* Must be called with kvm->mmu_lock held */ struct kvm_s2_mmu *lookup_s2_mmu(struct kvm_vcpu *vcpu) { diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 028be1db3897..0c868c9af8d9 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2610,6 +2610,233 @@ static bool handle_s12w(struct kvm_vcpu *vcpu, struct sys_reg_params *p, return handle_s12(vcpu, p, r, true); } +static bool handle_alle2is(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + if (vcpu_has_nv(vcpu) && forward_nv_traps(vcpu)) + return false; + + /* + * To emulate invalidating all EL2 regime stage 1 TLB entries for all + * PEs, executing TLBI VMALLE1IS is enough. But reuse the existing + * interface for the simplicity; invalidating stage 2 entries doesn't + * affect the correctness. + */ + __kvm_tlb_flush_vmid(&vcpu->kvm->arch.mmu); + return true; +} + +static bool handle_vae2is(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + int sys_encoding = sys_insn(p->Op0, p->Op1, p->CRn, p->CRm, p->Op2); + + if (vcpu_has_nv(vcpu) && forward_nv_traps(vcpu)) + return false; + + /* + * Based on the same principle as TLBI ALLE2 instruction + * emulation, we emulate TLBI VAE2* instructions by executing + * corresponding TLBI VAE1* instructions with the virtual + * EL2's VMID assigned by the host hypervisor. + */ + __kvm_tlb_vae2is(&vcpu->kvm->arch.mmu, p->regval, sys_encoding); + return true; +} + +static bool handle_alle1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + struct kvm_s2_mmu *mmu = &vcpu->kvm->arch.mmu; + + if (vcpu_has_nv(vcpu) && forward_nv_traps(vcpu)) + return false; + + write_lock(&vcpu->kvm->mmu_lock); + + /* + * Clear all mappings in the shadow page tables and invalidate the stage + * 1 and 2 TLB entries via kvm_tlb_flush_vmid_ipa(). + */ + kvm_nested_s2_unmap(vcpu->kvm); + + if (atomic64_read(&mmu->vmid.id)) { + /* + * Invalidate the stage 1 and 2 TLB entries for the host OS + * in a VM only if there is one. 
+ */ + __kvm_tlb_flush_vmid(mmu); + } + + write_unlock(&vcpu->kvm->mmu_lock); + + return true; +} + +/* Only defined here as this is an internal "abstraction" */ +union tlbi_info { + struct { + u64 start; + u64 size; + } range; + + struct { + u64 addr; + } ipa; + + struct { + u64 addr; + u32 encoding; + } va; +}; + +static void s2_mmu_unmap_stage2_range(struct kvm_s2_mmu *mmu, + const union tlbi_info *info) +{ + kvm_unmap_stage2_range(mmu, info->range.start, info->range.size); +} + +static bool handle_vmalls12e1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + u64 vttbr; + + if (vcpu_has_nv(vcpu) && forward_nv_traps(vcpu)) + return false; + + vttbr = vcpu_read_sys_reg(vcpu, VTTBR_EL2); + + kvm_s2_mmu_iterate_by_vmid(vcpu->kvm, get_vmid(vttbr), + &(union tlbi_info) { + .range = { + .start = 0, + .size = kvm_phys_size(vcpu->kvm), + }, + }, + s2_mmu_unmap_stage2_range); + + return true; +} + +static void s2_mmu_unmap_stage2_ipa(struct kvm_s2_mmu *mmu, + const union tlbi_info *info) +{ + unsigned long max_size; + u64 base_addr; + + /* + * We drop a number of things from the supplied value: + * + * - NS bit: we're non-secure only. + * + * - TTL field: We already have the granule size from the + * VTCR_EL2.TG0 field, and the level is only relevant to the + * guest's S2PT. + * + * - IPA[51:48]: We don't support 52bit IPA just yet... + * + * And of course, adjust the IPA to be on an actual address. + */ + base_addr = (info->ipa.addr & GENMASK_ULL(35, 0)) << 12; + + /* Compute the maximum extent of the invalidation */ + switch (mmu->vtcr & VTCR_EL2_TG0_MASK) { + case VTCR_EL2_TG0_4K: + max_size = SZ_1G; + break; + case VTCR_EL2_TG0_16K: + max_size = SZ_32M; + break; + case VTCR_EL2_TG0_64K: + /* + * No, we do not support 52bit IPA in nested yet. Once + * we do, this should be 4TB. + */ + /* FIXME: remove the 52bit PA support from the IDregs */ + max_size = SZ_512M; + break; + default: + BUG(); + } + + base_addr &= ~(max_size - 1); + + kvm_unmap_stage2_range(mmu, base_addr, max_size); +} + +static bool handle_ipas2e1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + u64 vttbr = vcpu_read_sys_reg(vcpu, VTTBR_EL2); + + if (vcpu_has_nv(vcpu) && forward_nv_traps(vcpu)) + return false; + + kvm_s2_mmu_iterate_by_vmid(vcpu->kvm, get_vmid(vttbr), + &(union tlbi_info) { + .ipa = { + .addr = p->regval, + }, + }, + s2_mmu_unmap_stage2_ipa); + + return true; +} + +static void s2_mmu_unmap_stage2_va(struct kvm_s2_mmu *mmu, + const union tlbi_info *info) +{ + __kvm_tlb_el1_instr(mmu, info->va.addr, info->va.encoding); +} + +static bool handle_tlbi_el1(struct kvm_vcpu *vcpu, struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + u32 sys_encoding = sys_insn(p->Op0, p->Op1, p->CRn, p->CRm, p->Op2); + u64 vttbr = vcpu_read_sys_reg(vcpu, VTTBR_EL2); + + if (vcpu_has_nv(vcpu) && forward_traps(vcpu, HCR_TTLB)) + return false; + + /* + * If we're here, this is because we've trapped on a EL1 TLBI + * instruction that affects the EL1 translation regime while + * we're running in a context that doesn't allow us to let the + * HW do its thing (aka vEL2): + * + * - HCR_EL2.E2H == 0 : a non-VHE guest + * - HCR_EL2.{E2H,TGE} == { 1, 0 } : a VHE guest in guest mode + * + * We don't expect these helpers to ever be called when running + * in a vEL1 context. 
+ */ + + WARN_ON(!vcpu_is_el2(vcpu)); + + if ((__vcpu_sys_reg(vcpu, HCR_EL2) & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) { + mutex_lock(&vcpu->kvm->lock); + /* + * ARMv8.4-NV allows the guest to change TGE behind + * our back, so we always trap EL1 TLBIs from vEL2... + */ + __kvm_tlb_el1_instr(&vcpu->kvm->arch.mmu, p->regval, sys_encoding); + mutex_unlock(&vcpu->kvm->lock); + + return true; + } + + kvm_s2_mmu_iterate_by_vmid(vcpu->kvm, get_vmid(vttbr), + &(union tlbi_info) { + .va = { + .addr = p->regval, + .encoding = sys_encoding, + }, + }, + s2_mmu_unmap_stage2_va); + + return true; +} + /* * AT instruction emulation * @@ -2695,12 +2922,40 @@ static struct sys_reg_desc sys_insn_descs[] = { { SYS_DESC(SYS_DC_CSW), access_dcsw }, { SYS_DESC(SYS_DC_CISW), access_dcsw }, + SYS_INSN(TLBI_VMALLE1IS, handle_tlbi_el1), + SYS_INSN(TLBI_VAE1IS, handle_tlbi_el1), + SYS_INSN(TLBI_ASIDE1IS, handle_tlbi_el1), + SYS_INSN(TLBI_VAAE1IS, handle_tlbi_el1), + SYS_INSN(TLBI_VALE1IS, handle_tlbi_el1), + SYS_INSN(TLBI_VAALE1IS, handle_tlbi_el1), + SYS_INSN(TLBI_VMALLE1, handle_tlbi_el1), + SYS_INSN(TLBI_VAE1, handle_tlbi_el1), + SYS_INSN(TLBI_ASIDE1, handle_tlbi_el1), + SYS_INSN(TLBI_VAAE1, handle_tlbi_el1), + SYS_INSN(TLBI_VALE1, handle_tlbi_el1), + SYS_INSN(TLBI_VAALE1, handle_tlbi_el1), + SYS_INSN(AT_S1E2R, handle_s1e2), SYS_INSN(AT_S1E2W, handle_s1e2), SYS_INSN(AT_S12E1R, handle_s12r), SYS_INSN(AT_S12E1W, handle_s12w), SYS_INSN(AT_S12E0R, handle_s12r), SYS_INSN(AT_S12E0W, handle_s12w), + + SYS_INSN(TLBI_IPAS2E1IS, handle_ipas2e1is), + SYS_INSN(TLBI_IPAS2LE1IS, handle_ipas2e1is), + SYS_INSN(TLBI_ALLE2IS, handle_alle2is), + SYS_INSN(TLBI_VAE2IS, handle_vae2is), + SYS_INSN(TLBI_ALLE1IS, handle_alle1is), + SYS_INSN(TLBI_VALE2IS, handle_vae2is), + SYS_INSN(TLBI_VMALLS12E1IS, handle_vmalls12e1is), + SYS_INSN(TLBI_IPAS2E1, handle_ipas2e1is), + SYS_INSN(TLBI_IPAS2LE1, handle_ipas2e1is), + SYS_INSN(TLBI_ALLE2, handle_alle2is), + SYS_INSN(TLBI_VAE2, handle_vae2is), + SYS_INSN(TLBI_ALLE1, handle_alle1is), + SYS_INSN(TLBI_VALE2, handle_vae2is), + SYS_INSN(TLBI_VMALLS12E1, handle_vmalls12e1is), }; static bool trap_dbgdidr(struct kvm_vcpu *vcpu, From patchwork Wed Apr 5 15:39:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202031 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B9B02C76188 for ; Wed, 5 Apr 2023 15:40:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238854AbjDEPk6 (ORCPT ); Wed, 5 Apr 2023 11:40:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45956 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238865AbjDEPky (ORCPT ); Wed, 5 Apr 2023 11:40:54 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D45576A4C for ; Wed, 5 Apr 2023 08:40:46 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 0ADCB63ED1 for ; Wed, 5 Apr 2023 15:40:46 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 72E32C433A0; Wed, 5 Apr 2023 15:40:45 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709245; bh=BqWGGG6P0Q1EZNUaobu4OneFEfRFSI4gFWADamGwyEI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OsId5NIhMgEELAQTVXaett/UwYU0WZDM+4Bc6KforaGi2jmIm8UpjkKfuzGkczQoo szsoDH6dFPh9tefbnv4fs0+17eiT7E+azoG0diQunCWU1T4SAYluqlYNKSGP4U1Vd+ aMgcfOH8zNpAC1TY6JSZM72D2XNqfGPylPV7T5ObH8lehSH4US6FpUaUPfZp1nkEX/ ML1gd5DTEe3i4UcdGjCHq4538Lb5B/9+TzWkmmfGLMw3Cam/+Z8duKccTkuTV0tBN9 k1Ppw0SUOUg5VLwezV9y8RFiXZP9te44ID9sy6uj+4RzatGXLE3gpPKJESiVBpfPIE 13kltom17ImSw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FR-0062PV-FK; Wed, 05 Apr 2023 16:40:33 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 23/50] KVM: arm64: nv: Fold guest's HCR_EL2 configuration into the host's Date: Wed, 5 Apr 2023 16:39:41 +0100 Message-Id: <20230405154008.3552854-24-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When entering a L2 guest (nested virt enabled, but not in hypervisor context), we need to honor the traps the L1 guest has asked enabled. For now, just OR the guest's HCR_EL2 into the host's. We may have to do some filtering in the future though. 
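For example, if the L1 guest hypervisor sets HCR_EL2.TVM to trap its own guest's writes to the virtual memory control registers, OR-ing its view of HCR_EL2 into the host's makes the hardware deliver those traps to the host, which can then forward them to virtual EL2. A minimal sketch of the fold, matching the hunk below (HCR_GUEST_NV_FILTER_FLAGS is defined elsewhere in this series):

	u64 vhcr_el2 = __vcpu_sys_reg(vcpu, HCR_EL2);	/* L1's trap configuration */

	vhcr_el2 &= ~HCR_GUEST_NV_FILTER_FLAGS;		/* strip host-owned bits */
	hcr |= vhcr_el2;				/* honour L1's traps while running L2 */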
Signed-off-by: Marc Zyngier --- arch/arm64/kvm/hyp/vhe/switch.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index a17bf261d501..ff77b37b6f45 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -81,6 +81,11 @@ static void __activate_traps(struct kvm_vcpu *vcpu) if (!vcpu_el2_tge_is_set(vcpu)) hcr |= HCR_AT | HCR_TTLB; } + } else if (vcpu_has_nv(vcpu)) { + u64 vhcr_el2 = __vcpu_sys_reg(vcpu, HCR_EL2); + + vhcr_el2 &= ~HCR_GUEST_NV_FILTER_FLAGS; + hcr |= vhcr_el2; } ___activate_traps(vcpu, hcr); From patchwork Wed Apr 5 15:39:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202043 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73469C761A6 for ; Wed, 5 Apr 2023 15:42:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238968AbjDEPmI (ORCPT ); Wed, 5 Apr 2023 11:42:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48324 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238769AbjDEPmG (ORCPT ); Wed, 5 Apr 2023 11:42:06 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CDF3E6199 for ; Wed, 5 Apr 2023 08:41:50 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C133063EDC for ; Wed, 5 Apr 2023 15:41:02 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 30B2FC433D2; Wed, 5 Apr 2023 15:41:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709262; bh=H4ZPPuw9ENZrxHpUu14yMy4L+2lAlUmeVTn4NNvf+Uc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ZUPC+pyl2rtEsiQF8NjK5PXvlVZPtQ5DUdhO82B6GgGTWjLLL9SVJZAELk73iyz74 pZfxZ62KLt2F2FL3uOqRYU/nEDrm9A/3qmdIctg2BzIGpfuxvi/A1SoJ49ecmp9AaT 5fdo19w7dXoWbggZeTT8JSJKPJS7VM1Xj7lipsyifQgu1Y02fBeQ6nN0jvSbIhnR+C lGwccZXXeNUQYBtNVgGFNZXyiyamlGQGPpSoNfDYFYdohU1zF+2VOKbmiehO3hBS1X kclNhokeEoj0sv0ATVMOMaEgDEzeRVLwp+0bUYIJTu+/PVUOz6BRhkGLdy+lUiEQOO RqoJEPuWcORgA== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FR-0062PV-Ns; Wed, 05 Apr 2023 16:40:33 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 24/50] KVM: arm64: nv: Hide RAS from nested guests Date: Wed, 5 Apr 2023 16:39:42 +0100 Message-Id: <20230405154008.3552854-25-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, 
linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org We don't want to expose complicated features to guests until we have a good grasp on the basic CPU emulation. So let's pretend that RAS, doesn't exist in a nested guest. We already hide the feature bits, let's now make sure VDISR_EL1 will UNDEF. Signed-off-by: Marc Zyngier --- arch/arm64/kvm/sys_regs.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 0c868c9af8d9..42d67f74b54a 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2444,6 +2444,7 @@ static const struct sys_reg_desc sys_reg_descs[] = { EL2_REG(VBAR_EL2, access_rw, reset_val, 0), EL2_REG(RVBAR_EL2, access_rw, reset_val, 0), { SYS_DESC(SYS_RMR_EL2), trap_undef }, + { SYS_DESC(SYS_VDISR_EL2), trap_undef }, EL2_REG(CONTEXTIDR_EL2, access_rw, reset_val, 0), EL2_REG(TPIDR_EL2, access_rw, reset_val, 0), From patchwork Wed Apr 5 15:39:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202033 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ED540C76188 for ; Wed, 5 Apr 2023 15:41:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238895AbjDEPlT (ORCPT ); Wed, 5 Apr 2023 11:41:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46602 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238865AbjDEPlS (ORCPT ); Wed, 5 Apr 2023 11:41:18 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 446CB6E98 for ; Wed, 5 Apr 2023 08:40:52 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4C0AD63ED6 for ; Wed, 5 Apr 2023 15:40:51 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B0205C433EF; Wed, 5 Apr 2023 15:40:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709250; bh=zz1skk9grcs9frfh2cvcY/iWNHaGcTFPg5uckcycoBI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ewkj4KZaQpYZ88FIunFBg38vHaHSOs3Y4YpHswj3QRc1maySFh8l/GaYbqjY9fr/i +3KxM8hWy2X3SiyTKvX8qGwSk5fRt18zXtORQ26bttNg6UUjrNj45ErZJEXvG1Ocu4 Gx7J5l1cDt6+EiHlTxpiUo8CHZ+/pCzv4bqCd555HDaKJkCCS+U4nCrcA1Q9VIy9Uq 5ENCDdM4ndfiC8Rh7fZhWQi+RVJ6+NKFuqYbeRYiELgYAQL7+3F6drltjm9+gvIrim qoGkp4mwhUoJxXC/HrKXRyRU2uCpuGeEIuPe1TP3b+6WEM5DpXe5T++0gZd3+2GAIu 1AyEMl7z0SvNQ== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 
1pk5FS-0062PV-0N; Wed, 05 Apr 2023 16:40:34 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 25/50] KVM: arm64: nv: Add handling of EL2-specific timer registers Date: Wed, 5 Apr 2023 16:39:43 +0100 Message-Id: <20230405154008.3552854-26-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add the required handling for EL2 and EL02 registers, as well as EL1 registers used in the E2H context. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/sysreg.h | 7 +++ arch/arm64/kvm/sys_regs.c | 102 +++++++++++++++++++++++++++++++- 2 files changed, 108 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index c4a173b136b1..355480766b4a 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -389,6 +389,7 @@ #define SYS_CNTFRQ_EL0 sys_reg(3, 3, 14, 0, 0) #define SYS_CNTPCT_EL0 sys_reg(3, 3, 14, 0, 1) +#define SYS_CNTVCT_EL0 sys_reg(3, 3, 14, 0, 2) #define SYS_CNTPCTSS_EL0 sys_reg(3, 3, 14, 0, 5) #define SYS_CNTVCTSS_EL0 sys_reg(3, 3, 14, 0, 6) @@ -503,6 +504,12 @@ #define SYS_CNTVOFF_EL2 sys_reg(3, 4, 14, 0, 3) #define SYS_CNTHCTL_EL2 sys_reg(3, 4, 14, 1, 0) +#define SYS_CNTHP_TVAL_EL2 sys_reg(3, 4, 14, 2, 0) +#define SYS_CNTHP_CTL_EL2 sys_reg(3, 4, 14, 2, 1) +#define SYS_CNTHP_CVAL_EL2 sys_reg(3, 4, 14, 2, 2) +#define SYS_CNTHV_TVAL_EL2 sys_reg(3, 4, 14, 3, 0) +#define SYS_CNTHV_CTL_EL2 sys_reg(3, 4, 14, 3, 1) +#define SYS_CNTHV_CVAL_EL2 sys_reg(3, 4, 14, 3, 2) /* VHE encodings for architectural EL0/1 system registers */ #define SYS_SCTLR_EL12 sys_reg(3, 5, 1, 0, 0) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 42d67f74b54a..96fee2a39859 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1348,26 +1348,111 @@ static bool access_arch_timer(struct kvm_vcpu *vcpu, switch (reg) { case SYS_CNTP_TVAL_EL0: + if (is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu)) + tmr = TIMER_HPTIMER; + else + tmr = TIMER_PTIMER; + treg = TIMER_REG_TVAL; + break; + case SYS_AARCH32_CNTP_TVAL: + case SYS_CNTP_TVAL_EL02: tmr = TIMER_PTIMER; treg = TIMER_REG_TVAL; break; + + case SYS_CNTV_TVAL_EL02: + tmr = TIMER_VTIMER; + treg = TIMER_REG_TVAL; + break; + + case SYS_CNTHP_TVAL_EL2: + tmr = TIMER_HPTIMER; + treg = TIMER_REG_TVAL; + break; + + case SYS_CNTHV_TVAL_EL2: + tmr = TIMER_HVTIMER; + treg = TIMER_REG_TVAL; + break; + case SYS_CNTP_CTL_EL0: + if (is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu)) + tmr = TIMER_HPTIMER; + else + tmr = 
TIMER_PTIMER; + treg = TIMER_REG_CTL; + break; + case SYS_AARCH32_CNTP_CTL: + case SYS_CNTP_CTL_EL02: tmr = TIMER_PTIMER; treg = TIMER_REG_CTL; break; + + case SYS_CNTV_CTL_EL02: + tmr = TIMER_VTIMER; + treg = TIMER_REG_CTL; + break; + + case SYS_CNTHP_CTL_EL2: + tmr = TIMER_HPTIMER; + treg = TIMER_REG_CTL; + break; + + case SYS_CNTHV_CTL_EL2: + tmr = TIMER_HVTIMER; + treg = TIMER_REG_CTL; + break; + case SYS_CNTP_CVAL_EL0: + if (is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu)) + tmr = TIMER_HPTIMER; + else + tmr = TIMER_PTIMER; + treg = TIMER_REG_CVAL; + break; + case SYS_AARCH32_CNTP_CVAL: + case SYS_CNTP_CVAL_EL02: tmr = TIMER_PTIMER; treg = TIMER_REG_CVAL; break; + + case SYS_CNTV_CVAL_EL02: + tmr = TIMER_VTIMER; + treg = TIMER_REG_CVAL; + break; + + case SYS_CNTHP_CVAL_EL2: + tmr = TIMER_HPTIMER; + treg = TIMER_REG_CVAL; + break; + + case SYS_CNTHV_CVAL_EL2: + tmr = TIMER_HVTIMER; + treg = TIMER_REG_CVAL; + break; + + case SYS_CNTVOFF_EL2: + tmr = TIMER_VTIMER; + treg = TIMER_REG_VOFF; + break; + case SYS_CNTPCT_EL0: case SYS_CNTPCTSS_EL0: + if (is_hyp_ctxt(vcpu)) + tmr = TIMER_HPTIMER; + else + tmr = TIMER_PTIMER; + treg = TIMER_REG_CNT; + break; + case SYS_AARCH32_CNTPCT: tmr = TIMER_PTIMER; treg = TIMER_REG_CNT; break; + default: print_sys_reg_msg(p, "%s", "Unhandled trapped timer register"); kvm_inject_undefined(vcpu); @@ -2449,8 +2534,15 @@ static const struct sys_reg_desc sys_reg_descs[] = { EL2_REG(CONTEXTIDR_EL2, access_rw, reset_val, 0), EL2_REG(TPIDR_EL2, access_rw, reset_val, 0), - EL2_REG(CNTVOFF_EL2, access_rw, reset_val, 0), + EL2_REG(CNTVOFF_EL2, access_arch_timer, reset_val, 0), EL2_REG(CNTHCTL_EL2, access_rw, reset_val, 0), + { SYS_DESC(SYS_CNTHP_TVAL_EL2), access_arch_timer }, + EL2_REG(CNTHP_CTL_EL2, access_arch_timer, reset_val, 0), + EL2_REG(CNTHP_CVAL_EL2, access_arch_timer, reset_val, 0), + + { SYS_DESC(SYS_CNTHV_TVAL_EL2), access_arch_timer }, + EL2_REG(CNTHV_CTL_EL2, access_arch_timer, reset_val, 0), + EL2_REG(CNTHV_CVAL_EL2, access_arch_timer, reset_val, 0), EL12_REG(SCTLR, access_vm_reg, reset_val, 0x00C50078), EL12_REG(CPACR, access_rw, reset_val, 0), @@ -2469,6 +2561,14 @@ static const struct sys_reg_desc sys_reg_descs[] = { EL12_REG(CONTEXTIDR, access_vm_reg, reset_val, 0), EL12_REG(CNTKCTL, access_rw, reset_val, 0), + { SYS_DESC(SYS_CNTP_TVAL_EL02), access_arch_timer }, + { SYS_DESC(SYS_CNTP_CTL_EL02), access_arch_timer }, + { SYS_DESC(SYS_CNTP_CVAL_EL02), access_arch_timer }, + + { SYS_DESC(SYS_CNTV_TVAL_EL02), access_arch_timer }, + { SYS_DESC(SYS_CNTV_CTL_EL02), access_arch_timer }, + { SYS_DESC(SYS_CNTV_CVAL_EL02), access_arch_timer }, + EL2_REG(SP_EL2, NULL, reset_unknown, 0), }; From patchwork Wed Apr 5 15:39:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202038 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 014FEC77B70 for ; Wed, 5 Apr 2023 15:41:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238966AbjDEPlx (ORCPT ); Wed, 5 Apr 2023 11:41:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47662 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238946AbjDEPlt (ORCPT ); Wed, 5 Apr 2023 11:41:49 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org 
[IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AB34A65A8 for ; Wed, 5 Apr 2023 08:41:36 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 9742F63EE6 for ; Wed, 5 Apr 2023 15:41:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0B25CC433EF; Wed, 5 Apr 2023 15:41:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709270; bh=RpayedqqXPrcFY7vhwa6lg8OwX/4EFZH0OkzYWmR8Tk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Xv5Qnfr2AH+ZtYQT0U09dw7qDW+1ryK4/mEPzbr0wmAyuDgSVTzm0ZV0s46NM4DaC 2dMAkwnWZtBkomWP0PA4fhS+ogLIoomEVmzYYJnEOEOErXThsrVFj6ANxf/bFUCXiv Ox6KE7YGMt0XXEAh0lmWxrzXijlkYbicmv+OeA0dmggSmgHmeiLXIj3tMHYEBYo0Wf 2ZC5Ck5FmMQADor9/2EsX8tnw21b4HOA2genNyvKxccZPkSaZqWVacLX+YnNeVrY7A azY5Cov0fhUxyjIyXAF9uGEPPl9Q6kLE33MyU2jVX+us3xEBPLdSy8zTH+Hu3ErnaV RAbA8zVwBkCdQ== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FS-0062PV-Aw; Wed, 05 Apr 2023 16:40:34 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 26/50] KVM: arm64: nv: Forward timer traps to nested EL2 Date: Wed, 5 Apr 2023 16:39:44 +0100 Message-Id: <20230405154008.3552854-27-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Signed-off-by: Marc Zyngier --- arch/arm64/kvm/sys_regs.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 96fee2a39859..9bd852d16f70 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -1459,6 +1459,35 @@ static bool access_arch_timer(struct kvm_vcpu *vcpu, return false; } + if (vcpu_has_nv(vcpu) && !is_hyp_ctxt(vcpu) && tmr == TIMER_PTIMER) { + u64 val = __vcpu_sys_reg(vcpu, CNTHCTL_EL2); + bool trap; + + if (!vcpu_el2_e2h_is_set(vcpu)) + val = (val & (CNTHCTL_EL1PCEN | CNTHCTL_EL1PCTEN)) << 10; + + switch (treg) { + case TIMER_REG_CVAL: + case TIMER_REG_CTL: + case TIMER_REG_TVAL: + trap = !(val & (CNTHCTL_EL1PCEN << 10)); + break; + + case TIMER_REG_CNT: + trap = !(val & (CNTHCTL_EL1PCTEN << 10)); + break; + + default: + trap = false; + break; + } + + if (trap) { + kvm_inject_nested_sync(vcpu, 
+ + if (trap) { + kvm_inject_nested_sync(vcpu, kvm_vcpu_get_esr(vcpu)); + return false; + } + } + if (p->is_write) kvm_arm_timer_write_sysreg(vcpu, tmr, treg, p->regval); else p->regval = kvm_arm_timer_read_sysreg(vcpu, tmr, treg);
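A note on the << 10 in this hunk: with HCR_EL2.E2H clear, the EL1PCTEN/EL1PCEN enable bits live at CNTHCTL_EL2[1:0], while with E2H set they live at bits [11:10]. Normalising the non-VHE layout onto the VHE one lets a single pair of tests decide whether to trap. A self-contained sketch of the same transformation (the macro and function names are local to this example, not kernel API):

/* Example only: normalise both CNTHCTL_EL2 layouts to the E2H one. */
#define EX_CNTHCTL_EL1PCTEN	(1ULL << 0)	/* counter enable, !E2H layout */
#define EX_CNTHCTL_EL1PCEN	(1ULL << 1)	/* timer enable,   !E2H layout */

static unsigned long long ex_cnthctl_as_e2h(unsigned long long val, int e2h)
{
	if (!e2h)
		val = (val & (EX_CNTHCTL_EL1PCEN | EX_CNTHCTL_EL1PCTEN)) << 10;
	/* EL1PCTEN now sits at bit 10, EL1PCEN at bit 11, as with E2H == 1 */
	return val;
}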
From patchwork Wed Apr 5 15:39:45 2023
From: Marc Zyngier
Subject: [PATCH v9 27/50] KVM: arm64: nv: Load timer before the GIC
Date: Wed, 5 Apr 2023 16:39:45 +0100
Message-Id: <20230405154008.3552854-28-maz@kernel.org>

In order for vgic_v3_load_nested to be able to observe which timer interrupts have the HW bit set for the current context, the timers must have been loaded in the new mode, and the right timers mapped to their corresponding HW IRQs. At the moment, we load the GIC first, meaning that timer interrupts injected into an L2 guest never have the HW bit set (we still see the old configuration). Swapping the two loads solves this particular problem.

Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/arm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 2fc529519ae7..8d8084a4473e 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -415,8 +415,8 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) vcpu->cpu = cpu; - kvm_vgic_load(vcpu); kvm_timer_vcpu_load(vcpu); + kvm_vgic_load(vcpu); if (has_vhe()) kvm_vcpu_load_sysregs_vhe(vcpu); kvm_arch_vcpu_load_fp(vcpu);

From patchwork Wed Apr 5 15:39:46 2023
From: Marc Zyngier
Subject: [PATCH v9 28/50] KVM: arm64: nv: Nested GICv3 Support
Date: Wed, 5 Apr 2023 16:39:46 +0100
Message-Id: <20230405154008.3552854-29-maz@kernel.org>
From: Jintack Lim

When entering a nested VM, we set up the hypervisor control interface based on what the guest hypervisor has configured. In particular, we inspect each list register written by the guest hypervisor to see whether its HW bit is set. If so, we translate the hardware IRQ number from the guest's point of view into the real hardware IRQ number, provided such a mapping exists.

Signed-off-by: Jintack Lim
[Redesigned execution flow around vcpu load/put]
Signed-off-by: Christoffer Dall
[Rewritten to support GICv3 instead of GICv2]
Signed-off-by: Marc Zyngier
---
arch/arm64/include/asm/kvm_emulate.h | 8 +- arch/arm64/include/asm/kvm_host.h | 13 +- arch/arm64/include/asm/kvm_nested.h | 1 + arch/arm64/kvm/Makefile | 2 +- arch/arm64/kvm/arm.c | 7 ++ arch/arm64/kvm/nested.c | 16 +++ arch/arm64/kvm/sys_regs.c | 179 +++++++++++++++++++++++++- arch/arm64/kvm/vgic/vgic-v3-nested.c | 180 +++++++++++++++++++++++++++ arch/arm64/kvm/vgic/vgic-v3.c | 31 ++++- arch/arm64/kvm/vgic/vgic.c | 27 ++++ include/kvm/arm_vgic.h | 20 +++ 11 files changed, 473 insertions(+), 11 deletions(-) create mode 100644 arch/arm64/kvm/vgic/vgic-v3-nested.c diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 1aa059ebb569..d4721555efc9 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -540,7 +540,13 @@ static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu) static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu) { - return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK; + /* + * Use the in-memory view for MPIDR_EL1. It can't be changed by the + * guest, and is also accessed from the context of *another* vcpu, + * so anything using some other state (such as the NV state that is + * used by vcpu_read_sys_reg) will eventually go wrong.
+ */ + return __vcpu_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK; } static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index fefd5156049e..1b3902d2470f 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -41,12 +41,13 @@ #define KVM_REQ_SLEEP \ KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) -#define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1) -#define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2) -#define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3) -#define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4) -#define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5) -#define KVM_REQ_SUSPEND KVM_ARCH_REQ(6) +#define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1) +#define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2) +#define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3) +#define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4) +#define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5) +#define KVM_REQ_SUSPEND KVM_ARCH_REQ(6) +#define KVM_REQ_GUEST_HYP_IRQ_PENDING KVM_ARCH_REQ(7) #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \ KVM_DIRTY_LOG_INITIALLY_SET) diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index bc783492d1c9..a65db37a44f7 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -72,6 +72,7 @@ extern void kvm_s2_mmu_iterate_by_vmid(struct kvm *kvm, u16 vmid, const union tlbi_info *)); extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu); extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu); +extern void check_nested_vcpu_requests(struct kvm_vcpu *vcpu); struct kvm_s2_trans { phys_addr_t output; diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile index c2717a8f12f5..4462bede5c60 100644 --- a/arch/arm64/kvm/Makefile +++ b/arch/arm64/kvm/Makefile @@ -20,7 +20,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \ vgic/vgic-v3.o vgic/vgic-v4.o \ vgic/vgic-mmio.o vgic/vgic-mmio-v2.o \ vgic/vgic-mmio-v3.o vgic/vgic-kvm-device.o \ - vgic/vgic-its.o vgic/vgic-debug.o + vgic/vgic-its.o vgic/vgic-debug.o vgic/vgic-v3-nested.o kvm-$(CONFIG_HW_PERF_EVENTS) += pmu-emul.o pmu.o diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 8d8084a4473e..e4a77fe1cfa4 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -769,6 +769,8 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu) if (kvm_dirty_ring_check_request(vcpu)) return 0; + + check_nested_vcpu_requests(vcpu); } return 1; @@ -903,6 +905,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) * preserved on VMID roll-over if the task was preempted, * making a thread's VMID inactive. So we need to call * kvm_arm_vmid_update() in non-premptible context. + * + * Note that this must happen after the check_vcpu_request() + * call to pick the correct s2_mmu structure, as a pending + * nested exception (IRQ, for example) can trigger a change + * in translation regime. 
*/ kvm_arm_vmid_update(&vcpu->arch.hw_mmu->vmid); diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index 426d447f82c2..d672157de574 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -618,6 +618,22 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm) kvm_free_stage2_pgd(&kvm->arch.mmu); } +bool vgic_state_is_nested(struct kvm_vcpu *vcpu) +{ + bool imo = __vcpu_sys_reg(vcpu, HCR_EL2) & HCR_IMO; + bool fmo = __vcpu_sys_reg(vcpu, HCR_EL2) & HCR_FMO; + + WARN_ONCE(imo != fmo, "Separate virtual IRQ/FIQ settings not supported\n"); + + return vcpu_has_nv(vcpu) && imo && fmo && !is_hyp_ctxt(vcpu); +} + +void check_nested_vcpu_requests(struct kvm_vcpu *vcpu) +{ + if (kvm_check_request(KVM_REQ_GUEST_HYP_IRQ_PENDING, vcpu)) + kvm_inject_nested_irq(vcpu); +} + /* * Our emulated CPU doesn't support all the possible features. For the * sake of simplicity (and probably mental sanity), wipe out a number diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 9bd852d16f70..25dc3f2fb45c 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -17,6 +17,8 @@ #include #include +#include + #include #include #include @@ -476,6 +478,19 @@ static bool access_actlr(struct kvm_vcpu *vcpu, return true; } +/* + * The architecture says that non-secure write accesses to this register from + * EL1 are trapped to EL2, if either: + * - HCR_EL2.FMO==1, or + * - HCR_EL2.IMO==1 + */ +static bool sgi_traps_to_vel2(struct kvm_vcpu *vcpu) +{ + return (vcpu_has_nv(vcpu) && + !vcpu_is_el2(vcpu) && + !!(vcpu_read_sys_reg(vcpu, HCR_EL2) & (HCR_IMO | HCR_FMO))); +} + /* * Trap handler for the GICv3 SGI generation system register. * Forward the request to the VGIC emulation. @@ -491,6 +506,11 @@ static bool access_gic_sgi(struct kvm_vcpu *vcpu, if (!p->is_write) return read_from_write_only(vcpu, p, r); + if (sgi_traps_to_vel2(vcpu)) { + kvm_inject_nested_sync(vcpu, kvm_vcpu_get_esr(vcpu)); + return false; + } + /* * In a system where GICD_CTLR.DS=1, a ICC_SGI0R_EL1 access generates * Group0 SGIs only, while ICC_SGI1R_EL1 can generate either group, @@ -534,7 +554,13 @@ static bool access_gic_sre(struct kvm_vcpu *vcpu, if (p->is_write) return ignore_write(vcpu, p); - p->regval = vcpu->arch.vgic_cpu.vgic_v3.vgic_sre; + if (p->Op1 == 4) { /* ICC_SRE_EL2 */ + p->regval = (ICC_SRE_EL2_ENABLE | ICC_SRE_EL2_SRE | + ICC_SRE_EL1_DIB | ICC_SRE_EL1_DFB); + } else { /* ICC_SRE_EL1 */ + p->regval = vcpu->arch.vgic_cpu.vgic_v3.vgic_sre; + } + return true; } @@ -2095,6 +2121,122 @@ static bool access_spsr_el2(struct kvm_vcpu *vcpu, return true; } +static bool access_gic_apr(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.nested_vgic_v3; + u32 index, *base; + + index = r->Op2; + if (r->CRm == 8) + base = cpu_if->vgic_ap0r; + else + base = cpu_if->vgic_ap1r; + + if (p->is_write) + base[index] = p->regval; + else + p->regval = base[index]; + + return true; +} + +static bool access_gic_hcr(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.nested_vgic_v3; + + if (p->is_write) + cpu_if->vgic_hcr = p->regval; + else + p->regval = cpu_if->vgic_hcr; + + return true; +} + +static bool access_gic_vtr(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + if (p->is_write) + return write_to_read_only(vcpu, p, r); + + p->regval = kvm_vgic_global_state.ich_vtr_el2; + + return true; +} + 
+static bool access_gic_misr(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + if (p->is_write) + return write_to_read_only(vcpu, p, r); + + p->regval = vgic_v3_get_misr(vcpu); + + return true; +} + +static bool access_gic_eisr(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + if (p->is_write) + return write_to_read_only(vcpu, p, r); + + p->regval = vgic_v3_get_eisr(vcpu); + + return true; +} + +static bool access_gic_elrsr(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + if (p->is_write) + return write_to_read_only(vcpu, p, r); + + p->regval = vgic_v3_get_elrsr(vcpu); + + return true; +} + +static bool access_gic_vmcr(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.nested_vgic_v3; + + if (p->is_write) + cpu_if->vgic_vmcr = p->regval; + else + p->regval = cpu_if->vgic_vmcr; + + return true; +} + +static bool access_gic_lr(struct kvm_vcpu *vcpu, + struct sys_reg_params *p, + const struct sys_reg_desc *r) +{ + struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.nested_vgic_v3; + u32 index; + + index = p->Op2; + if (p->CRm == 13) + index += 8; + + if (p->is_write) + cpu_if->vgic_lr[index] = p->regval; + else + p->regval = cpu_if->vgic_lr[index]; + + return true; +} + /* * Architected system registers. * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2 @@ -2560,6 +2702,41 @@ static const struct sys_reg_desc sys_reg_descs[] = { { SYS_DESC(SYS_RMR_EL2), trap_undef }, { SYS_DESC(SYS_VDISR_EL2), trap_undef }, + { SYS_DESC(SYS_ICH_AP0R0_EL2), access_gic_apr }, + { SYS_DESC(SYS_ICH_AP0R1_EL2), access_gic_apr }, + { SYS_DESC(SYS_ICH_AP0R2_EL2), access_gic_apr }, + { SYS_DESC(SYS_ICH_AP0R3_EL2), access_gic_apr }, + { SYS_DESC(SYS_ICH_AP1R0_EL2), access_gic_apr }, + { SYS_DESC(SYS_ICH_AP1R1_EL2), access_gic_apr }, + { SYS_DESC(SYS_ICH_AP1R2_EL2), access_gic_apr }, + { SYS_DESC(SYS_ICH_AP1R3_EL2), access_gic_apr }, + + { SYS_DESC(SYS_ICC_SRE_EL2), access_gic_sre }, + + { SYS_DESC(SYS_ICH_HCR_EL2), access_gic_hcr }, + { SYS_DESC(SYS_ICH_VTR_EL2), access_gic_vtr }, + { SYS_DESC(SYS_ICH_MISR_EL2), access_gic_misr }, + { SYS_DESC(SYS_ICH_EISR_EL2), access_gic_eisr }, + { SYS_DESC(SYS_ICH_ELRSR_EL2), access_gic_elrsr }, + { SYS_DESC(SYS_ICH_VMCR_EL2), access_gic_vmcr }, + + { SYS_DESC(SYS_ICH_LR0_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR1_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR2_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR3_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR4_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR5_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR6_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR7_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR8_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR9_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR10_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR11_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR12_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR13_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR14_EL2), access_gic_lr }, + { SYS_DESC(SYS_ICH_LR15_EL2), access_gic_lr }, + EL2_REG(CONTEXTIDR_EL2, access_rw, reset_val, 0), EL2_REG(TPIDR_EL2, access_rw, reset_val, 0), diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c new file mode 100644 index 000000000000..ab8ddf490b31 --- /dev/null +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c @@ -0,0 +1,180 @@ +// SPDX-License-Identifier: GPL-2.0-only + 
+#include +#include +#include +#include +#include +#include + +#include + +#include +#include +#include + +#include "vgic.h" + +static inline struct vgic_v3_cpu_if *vcpu_nested_if(struct kvm_vcpu *vcpu) +{ + return &vcpu->arch.vgic_cpu.nested_vgic_v3; +} + +static inline struct vgic_v3_cpu_if *vcpu_shadow_if(struct kvm_vcpu *vcpu) +{ + return &vcpu->arch.vgic_cpu.shadow_vgic_v3; +} + +static inline bool lr_triggers_eoi(u64 lr) +{ + return !(lr & (ICH_LR_STATE | ICH_LR_HW)) && (lr & ICH_LR_EOI); +} + +u16 vgic_v3_get_eisr(struct kvm_vcpu *vcpu) +{ + struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); + u16 reg = 0; + int i; + + for (i = 0; i < kvm_vgic_global_state.nr_lr; i++) { + if (lr_triggers_eoi(cpu_if->vgic_lr[i])) + reg |= BIT(i); + } + + return reg; +} + +u16 vgic_v3_get_elrsr(struct kvm_vcpu *vcpu) +{ + struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); + u16 reg = 0; + int i; + + for (i = 0; i < kvm_vgic_global_state.nr_lr; i++) { + if (!(cpu_if->vgic_lr[i] & ICH_LR_STATE)) + reg |= BIT(i); + } + + return reg; +} + +u64 vgic_v3_get_misr(struct kvm_vcpu *vcpu) +{ + struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); + int nr_lr = kvm_vgic_global_state.nr_lr; + u64 reg = 0; + + if (vgic_v3_get_eisr(vcpu)) + reg |= ICH_MISR_EOI; + + if (cpu_if->vgic_hcr & ICH_HCR_UIE) { + int used_lrs; + + used_lrs = nr_lr - hweight16(vgic_v3_get_elrsr(vcpu)); + if (used_lrs <= 1) + reg |= ICH_MISR_U; + } + + /* TODO: Support remaining bits in this register */ + return reg; +} + +/* + * For LRs which have HW bit set such as timer interrupts, we modify them to + * have the host hardware interrupt number instead of the virtual one programmed + * by the guest hypervisor. + */ +static void vgic_v3_create_shadow_lr(struct kvm_vcpu *vcpu) +{ + struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); + struct vgic_v3_cpu_if *s_cpu_if = vcpu_shadow_if(vcpu); + struct vgic_irq *irq; + int i, used_lrs = 0; + + for (i = 0; i < kvm_vgic_global_state.nr_lr; i++) { + u64 lr = cpu_if->vgic_lr[i]; + int l1_irq; + + if (!(lr & ICH_LR_HW)) + goto next; + + /* We have the HW bit set */ + l1_irq = (lr & ICH_LR_PHYS_ID_MASK) >> ICH_LR_PHYS_ID_SHIFT; + irq = vgic_get_irq(vcpu->kvm, vcpu, l1_irq); + + if (!irq || !irq->hw) { + /* There was no real mapping, so nuke the HW bit */ + lr &= ~ICH_LR_HW; + if (irq) + vgic_put_irq(vcpu->kvm, irq); + goto next; + } + + /* Translate the virtual mapping to the real one */ + lr &= ~ICH_LR_EOI; /* Why? */ + lr &= ~ICH_LR_PHYS_ID_MASK; + lr |= (u64)irq->hwintid << ICH_LR_PHYS_ID_SHIFT; + vgic_put_irq(vcpu->kvm, irq); + +next: + s_cpu_if->vgic_lr[i] = lr; + used_lrs = i + 1; + } + + s_cpu_if->used_lrs = used_lrs; +} + +/* + * Change the shadow HWIRQ field back to the virtual value before copying over + * the entire shadow struct to the nested state. 
+ */ +static void vgic_v3_fixup_shadow_lr_state(struct kvm_vcpu *vcpu) +{ + struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); + struct vgic_v3_cpu_if *s_cpu_if = vcpu_shadow_if(vcpu); + int lr; + + for (lr = 0; lr < kvm_vgic_global_state.nr_lr; lr++) { + s_cpu_if->vgic_lr[lr] &= ~ICH_LR_PHYS_ID_MASK; + s_cpu_if->vgic_lr[lr] |= cpu_if->vgic_lr[lr] & ICH_LR_PHYS_ID_MASK; + } +} + +void vgic_v3_load_nested(struct kvm_vcpu *vcpu) +{ + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; + + vgic_cpu->shadow_vgic_v3 = vgic_cpu->nested_vgic_v3; + vgic_v3_create_shadow_lr(vcpu); + __vgic_v3_restore_state(vcpu_shadow_if(vcpu)); +} + +void vgic_v3_put_nested(struct kvm_vcpu *vcpu) +{ + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; + + __vgic_v3_save_state(vcpu_shadow_if(vcpu)); + + /* + * Translate the shadow state HW fields back to the virtual ones + * before copying the shadow struct back to the nested one. + */ + vgic_v3_fixup_shadow_lr_state(vcpu); + vgic_cpu->nested_vgic_v3 = vgic_cpu->shadow_vgic_v3; +} + +void vgic_v3_handle_nested_maint_irq(struct kvm_vcpu *vcpu) +{ + struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); + + /* + * If we exit a nested VM with a pending maintenance interrupt from the + * GIC, then we need to forward this to the guest hypervisor so that it + * can re-sync the appropriate LRs and sample level triggered interrupts + * again. + */ + if (vgic_state_is_nested(vcpu) && + (cpu_if->vgic_hcr & ICH_HCR_EN) && + vgic_v3_get_misr(vcpu)) + kvm_inject_nested_irq(vcpu); +} diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c index 469d816f356f..749d705d76a4 100644 --- a/arch/arm64/kvm/vgic/vgic-v3.c +++ b/arch/arm64/kvm/vgic/vgic-v3.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include "vgic.h" @@ -278,6 +279,12 @@ void vgic_v3_enable(struct kvm_vcpu *vcpu) vgic_v3->vgic_sre = (ICC_SRE_EL1_DIB | ICC_SRE_EL1_DFB | ICC_SRE_EL1_SRE); + /* + * If nesting is allowed, force GICv3 onto the nested + * guests as well. + */ + if (vcpu_has_nv(vcpu)) + vcpu->arch.vgic_cpu.nested_vgic_v3.vgic_sre = vgic_v3->vgic_sre; vcpu->arch.vgic_cpu.pendbaser = INITIAL_PENDBASER_VALUE; } else { vgic_v3->vgic_sre = 0; @@ -724,6 +731,17 @@ void vgic_v3_load(struct kvm_vcpu *vcpu) { struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3; + vcpu->arch.vgic_cpu.current_cpu_if = cpu_if; + + /* + * vgic_v3_load_nested only affects the LRs in the shadow + * state, so it is fine to pass the nested state around. 
+ */ + if (vgic_state_is_nested(vcpu)) { + cpu_if = &vcpu->arch.vgic_cpu.shadow_vgic_v3; + vcpu->arch.vgic_cpu.current_cpu_if = cpu_if; + } + /* * If dealing with a GICv2 emulation on GICv3, VMCR_EL2.VFIQen * is dependent on ICC_SRE_EL1.SRE, and we have to perform the @@ -737,12 +755,15 @@ void vgic_v3_load(struct kvm_vcpu *vcpu) if (has_vhe()) __vgic_v3_activate_traps(cpu_if); + if (vgic_state_is_nested(vcpu)) + vgic_v3_load_nested(vcpu); + WARN_ON(vgic_v4_load(vcpu)); } void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu) { - struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3; + struct vgic_v3_cpu_if *cpu_if = vcpu->arch.vgic_cpu.current_cpu_if; if (likely(cpu_if->vgic_sre)) cpu_if->vgic_vmcr = kvm_call_hyp_ret(__vgic_v3_read_vmcr); @@ -750,7 +771,7 @@ void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu) void vgic_v3_put(struct kvm_vcpu *vcpu) { - struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3; + struct vgic_v3_cpu_if *cpu_if = vcpu->arch.vgic_cpu.current_cpu_if; WARN_ON(vgic_v4_put(vcpu, false)); @@ -760,4 +781,10 @@ void vgic_v3_put(struct kvm_vcpu *vcpu) if (has_vhe()) __vgic_v3_deactivate_traps(cpu_if); + + if (vgic_state_is_nested(vcpu)) + vgic_v3_put_nested(vcpu); + + /* Default back to the non-nested state for sanity */ + vcpu->arch.vgic_cpu.current_cpu_if = &vcpu->arch.vgic_cpu.vgic_v3; } diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c index ae491ef97188..2c5a4bb6b2f9 100644 --- a/arch/arm64/kvm/vgic/vgic.c +++ b/arch/arm64/kvm/vgic/vgic.c @@ -876,6 +876,10 @@ void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu) { int used_lrs; + /* If nesting, this is a load/put affair, not flush/sync. */ + if (vgic_state_is_nested(vcpu)) + return; + /* An empty ap_list_head implies used_lrs == 0 */ if (list_empty(&vcpu->arch.vgic_cpu.ap_list_head)) return; @@ -920,6 +924,29 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu) !vgic_supports_direct_msis(vcpu->kvm)) return; + /* + * If in a nested state, we must return early. Two possibilities: + * + * - If we have any pending IRQ for the guest and the guest + * expects IRQs to be handled in its virtual EL2 mode (the + * virtual IMO bit is set) and it is not already running in + * virtual EL2 mode, then we have to emulate an IRQ + * exception to virtual EL2. + * + * We do that by placing a request to ourselves which will + * abort the entry procedure and inject the exception at the + * beginning of the run loop. + * + * - Otherwise, do exactly *NOTHING*. The guest state is + * already loaded, and we can carry on with running it. + */ + if (vgic_state_is_nested(vcpu)) { + if (kvm_vgic_vcpu_pending_irq(vcpu)) + kvm_make_request(KVM_REQ_GUEST_HYP_IRQ_PENDING, vcpu); + + return; + } + DEBUG_SPINLOCK_BUG_ON(!irqs_disabled()); if (!list_empty(&vcpu->arch.vgic_cpu.ap_list_head)) { diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index 402b545959af..d775eb50b72c 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -331,6 +331,17 @@ struct vgic_cpu { struct vgic_irq private_irqs[VGIC_NR_PRIVATE_IRQS]; + /* CPU vif control registers for the virtual GICH interface */ + struct vgic_v3_cpu_if nested_vgic_v3; + + /* + * The shadow vif control register loaded to the hardware when + * running a nested L2 guest with the virtual IMO/FMO bit set. 
+ */ + struct vgic_v3_cpu_if shadow_vgic_v3; + + struct vgic_v3_cpu_if *current_cpu_if; + raw_spinlock_t ap_list_lock; /* Protects the ap_list */ /* @@ -389,6 +400,13 @@ void kvm_vgic_load(struct kvm_vcpu *vcpu); void kvm_vgic_put(struct kvm_vcpu *vcpu); void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu); +void vgic_v3_load_nested(struct kvm_vcpu *vcpu); +void vgic_v3_put_nested(struct kvm_vcpu *vcpu); +void vgic_v3_handle_nested_maint_irq(struct kvm_vcpu *vcpu); +u16 vgic_v3_get_eisr(struct kvm_vcpu *vcpu); +u16 vgic_v3_get_elrsr(struct kvm_vcpu *vcpu); +u64 vgic_v3_get_misr(struct kvm_vcpu *vcpu); + #define irqchip_in_kernel(k) (!!((k)->arch.vgic.in_kernel)) #define vgic_initialized(k) ((k)->arch.vgic.initialized) #define vgic_ready(k) ((k)->arch.vgic.ready) @@ -433,6 +451,8 @@ int vgic_v4_load(struct kvm_vcpu *vcpu); void vgic_v4_commit(struct kvm_vcpu *vcpu); int vgic_v4_put(struct kvm_vcpu *vcpu, bool need_db); +bool vgic_state_is_nested(struct kvm_vcpu *vcpu); + /* CPU HP callbacks */ void kvm_vgic_cpu_up(void); void kvm_vgic_cpu_down(void);
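The shadow-LR machinery in this patch pivots on a few ICH_LR_EL2 fields: the HW bit, the state bits, and the physical INTID in the ICH_LR_PHYS_ID field. A hedged sketch of the field manipulation performed by vgic_v3_create_shadow_lr() and vgic_v3_fixup_shadow_lr_state(), using the masks from <linux/irqchip/arm-gic-v3.h> (the helper names are invented for this example):

#include <linux/irqchip/arm-gic-v3.h>
#include <linux/types.h>

/* Example: extract the physical INTID the guest hypervisor programmed. */
static inline u32 ex_lr_phys_id(u64 lr)
{
	return (lr & ICH_LR_PHYS_ID_MASK) >> ICH_LR_PHYS_ID_SHIFT;
}

/* Example: substitute the real hardware INTID before loading the LR. */
static inline u64 ex_lr_set_phys_id(u64 lr, u32 hwintid)
{
	lr &= ~ICH_LR_PHYS_ID_MASK;
	return lr | ((u64)hwintid << ICH_LR_PHYS_ID_SHIFT);
}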
From patchwork Wed Apr 5 15:39:47 2023
From: Marc Zyngier
Subject: [PATCH v9 29/50] KVM: arm64: nv: Don't load the GICv4 context on entering a nested guest
Date: Wed, 5 Apr 2023 16:39:47 +0100
Message-Id: <20230405154008.3552854-30-maz@kernel.org>

When entering a nested guest (vgic_state_is_nested() == true), special care must be taken *not* to make the vPE resident, as these are interrupts targeting the L1 guest, and not any nested guest. By not making the vPE resident, we guarantee that the delivery of a vLPI will result in a doorbell, forcing an exit from the nested guest and a switch to the L1 guest to handle the interrupt.

Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/vgic/vgic-v3.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c index 749d705d76a4..7d4e2fb8f54e 100644 --- a/arch/arm64/kvm/vgic/vgic-v3.c +++ b/arch/arm64/kvm/vgic/vgic-v3.c @@ -757,8 +757,8 @@ void vgic_v3_load(struct kvm_vcpu *vcpu) if (vgic_state_is_nested(vcpu)) vgic_v3_load_nested(vcpu); - - WARN_ON(vgic_v4_load(vcpu)); + else + WARN_ON(vgic_v4_load(vcpu)); } void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu) @@ -773,6 +773,12 @@ void vgic_v3_put(struct kvm_vcpu *vcpu) { struct vgic_v3_cpu_if *cpu_if = vcpu->arch.vgic_cpu.current_cpu_if; + /* + * vgic_v4_put will do nothing if we were not resident. This + * covers both the cases where we've blocked (we already have + * done a vgic_v4_put) and when running a nested guest (the + * vPE was never resident in order to generate a doorbell).
+ */ WARN_ON(vgic_v4_put(vcpu, false)); vgic_v3_vmcr_sync(vcpu);

From patchwork Wed Apr 5 15:39:48 2023
From: Marc Zyngier
Subject: [PATCH v9 30/50] KVM: arm64: nv: vgic: Emulate the HW bit in software
Date: Wed, 5 Apr 2023 16:39:48 +0100
Message-Id: <20230405154008.3552854-31-maz@kernel.org>
From: Christoffer Dall Should the guest hypervisor use the HW bit in the LRs, we need to emulate the deactivation from the L2 guest into the L1 distributor emulation, which is handled by L0. It's all good fun. Signed-off-by: Christoffer Dall Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_hyp.h | 2 ++ arch/arm64/kvm/hyp/vgic-v3-sr.c | 2 +- arch/arm64/kvm/vgic/vgic-v3-nested.c | 32 ++++++++++++++++++++++++++++ arch/arm64/kvm/vgic/vgic.c | 6 ++++-- include/kvm/arm_vgic.h | 1 + 5 files changed, 40 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index bdd9cf546d95..d57889de93e6 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -57,6 +57,8 @@ DECLARE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params); int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu); +u64 __gic_v3_get_lr(unsigned int lr); + void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if); void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if); void __vgic_v3_activate_traps(struct vgic_v3_cpu_if *cpu_if); diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c index 6cb638b184b1..75152c1ce646 100644 --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c @@ -18,7 +18,7 @@ #define vtr_to_nr_pre_bits(v) ((((u32)(v) >> 26) & 7) + 1) #define vtr_to_nr_apr_regs(v) (1 << (vtr_to_nr_pre_bits(v) - 5)) -static u64 __gic_v3_get_lr(unsigned int lr) +u64 __gic_v3_get_lr(unsigned int lr) { switch (lr & 0xf) { case 0: diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c index ab8ddf490b31..e88c75e79010 100644 --- a/arch/arm64/kvm/vgic/vgic-v3-nested.c +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c @@ -140,6 +140,38 @@ static void vgic_v3_fixup_shadow_lr_state(struct kvm_vcpu *vcpu) } } +void vgic_v3_sync_nested(struct kvm_vcpu *vcpu) +{ + struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); + struct vgic_v3_cpu_if *s_cpu_if = vcpu_shadow_if(vcpu); + struct vgic_irq *irq; + int i; + + for (i = 0; i < s_cpu_if->used_lrs; i++) { + u64 lr = cpu_if->vgic_lr[i]; + int l1_irq; + + if (!(lr & ICH_LR_HW) || !(lr & ICH_LR_STATE)) + continue; + + /* + * If we had a HW lr programmed by the guest hypervisor, we + * need to emulate the HW effect between the guest hypervisor + * and the nested guest. + */ + l1_irq = (lr & ICH_LR_PHYS_ID_MASK) >> ICH_LR_PHYS_ID_SHIFT; + irq = vgic_get_irq(vcpu->kvm, vcpu, l1_irq); + if (!irq) + continue; /* oh well, the guest hyp is broken */ + + lr = __gic_v3_get_lr(i); + if (!(lr & ICH_LR_STATE)) + irq->active = false; + + vgic_put_irq(vcpu->kvm, irq); + } +} + void vgic_v3_load_nested(struct kvm_vcpu *vcpu) { struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c index 2c5a4bb6b2f9..05809462e199 100644 --- a/arch/arm64/kvm/vgic/vgic.c +++ b/arch/arm64/kvm/vgic/vgic.c @@ -876,9 +876,11 @@ void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu) { int used_lrs; - /* If nesting, this is a load/put affair, not flush/sync. 
*/ - if (vgic_state_is_nested(vcpu)) + /* If nesting, emulate the HW effect from L0 to L1 */ + if (vgic_state_is_nested(vcpu)) { + vgic_v3_sync_nested(vcpu); return; + } /* An empty ap_list_head implies used_lrs == 0 */ if (list_empty(&vcpu->arch.vgic_cpu.ap_list_head)) diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index d775eb50b72c..ea45accbc690 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -400,6 +400,7 @@ void kvm_vgic_load(struct kvm_vcpu *vcpu); void kvm_vgic_put(struct kvm_vcpu *vcpu); void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu); +void vgic_v3_sync_nested(struct kvm_vcpu *vcpu); void vgic_v3_load_nested(struct kvm_vcpu *vcpu); void vgic_v3_put_nested(struct kvm_vcpu *vcpu); void vgic_v3_handle_nested_maint_irq(struct kvm_vcpu *vcpu);
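vgic_v3_sync_nested() above only acts on LRs that the guest hypervisor programmed with both the HW bit and some pending/active state; everything else needs no deactivation emulation. The same condition written as a standalone predicate (illustrative only, not kernel API):

#include <linux/irqchip/arm-gic-v3.h>
#include <linux/types.h>

/* Example: does this guest-hypervisor LR need HW-deactivation emulation? */
static inline bool ex_lr_needs_hw_emulation(u64 guest_lr)
{
	return (guest_lr & ICH_LR_HW) && (guest_lr & ICH_LR_STATE);
}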
From patchwork Wed Apr 5 15:39:49 2023
From: Marc Zyngier
Subject: [PATCH v9 31/50] KVM: arm64: nv: vgic: Allow userland to set VGIC maintenance IRQ
Date: Wed, 5 Apr 2023 16:39:49 +0100
Message-Id: <20230405154008.3552854-32-maz@kernel.org>

From: Andre Przywara

The VGIC maintenance IRQ signals various conditions about the LRs, when the GIC's virtualization extension is used. So far we didn't need it, but nested virtualization needs to know about this interrupt, so add a userland interface to set up the IRQ number. The architecture mandates that it must be a PPI; on top of that, this code only exports a per-device option, so the PPI is the same on all VCPUs.

Signed-off-by: Andre Przywara
[added some bits of documentation]
Signed-off-by: Marc Zyngier
---
.../virt/kvm/devices/arm-vgic-v3.rst | 12 +++++++- arch/arm64/include/uapi/asm/kvm.h | 1 + arch/arm64/kvm/vgic/vgic-kvm-device.c | 29 +++++++++++++++++-- include/kvm/arm_vgic.h | 3 ++ tools/arch/arm/include/uapi/asm/kvm.h | 1 + 5 files changed, 43 insertions(+), 3 deletions(-) diff --git a/Documentation/virt/kvm/devices/arm-vgic-v3.rst b/Documentation/virt/kvm/devices/arm-vgic-v3.rst index 51e5e5762571..1901e651cc00 100644 --- a/Documentation/virt/kvm/devices/arm-vgic-v3.rst +++ b/Documentation/virt/kvm/devices/arm-vgic-v3.rst @@ -284,8 +284,18 @@ Groups: | Aff3 | Aff2 | Aff1 | Aff0 | Errors: - ======= ============================================= -EINVAL vINTID is not multiple of 32 or info field is not VGIC_LEVEL_INFO_LINE_LEVEL ======= ============================================= + + KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ + Attributes: + + The attr field of kvm_device_attr encodes the following values: + + bits: | 31 .... 5 | 4 .... 0 | + values: | RES0 | vINTID | + + The vINTID specifies which interrupt is generated when the vGIC + must generate a maintenance interrupt. This must be a PPI.
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h index 0921f366c49f..d0b0bd14c02b 100644 --- a/arch/arm64/include/uapi/asm/kvm.h +++ b/arch/arm64/include/uapi/asm/kvm.h @@ -399,6 +399,7 @@ enum { #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO 7 #define KVM_DEV_ARM_VGIC_GRP_ITS_REGS 8 +#define KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ 9 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT 10 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \ (0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT) diff --git a/arch/arm64/kvm/vgic/vgic-kvm-device.c b/arch/arm64/kvm/vgic/vgic-kvm-device.c index 04dd68835b3f..45416f3b3961 100644 --- a/arch/arm64/kvm/vgic/vgic-kvm-device.c +++ b/arch/arm64/kvm/vgic/vgic-kvm-device.c @@ -288,6 +288,12 @@ static int vgic_get_common_attr(struct kvm_device *dev, VGIC_NR_PRIVATE_IRQS, uaddr); break; } + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: { + u32 __user *uaddr = (u32 __user *)(long)attr->addr; + + r = put_user(dev->kvm->arch.vgic.maint_irq, uaddr); + break; + } } return r; @@ -503,7 +509,7 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev, struct vgic_reg_attr reg_attr; gpa_t addr; struct kvm_vcpu *vcpu; - bool uaccess; + bool uaccess, post_init = true; u32 val; int ret; @@ -519,6 +525,9 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev, /* Sysregs uaccess is performed by the sysreg handling code */ uaccess = false; break; + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: + post_init = false; + fallthrough; default: uaccess = true; } @@ -531,7 +540,7 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev, mutex_lock(&dev->kvm->lock); - if (unlikely(!vgic_initialized(dev->kvm))) { + if (post_init != vgic_initialized(dev->kvm)) { ret = -EBUSY; goto out; } @@ -566,6 +575,19 @@ static int vgic_v3_attr_regs_access(struct kvm_device *dev, } break; } + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: + if (!is_write) { + val = dev->kvm->arch.vgic.maint_irq; + ret = 0; + break; + } + + ret = -EINVAL; + if ((val < VGIC_NR_PRIVATE_IRQS) && (val >= VGIC_NR_SGIS)) { + dev->kvm->arch.vgic.maint_irq = val; + ret = 0; + } + break; default: ret = -EINVAL; break; @@ -591,6 +613,7 @@ static int vgic_v3_set_attr(struct kvm_device *dev, case KVM_DEV_ARM_VGIC_GRP_REDIST_REGS: case KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS: case KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO: + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: return vgic_v3_attr_regs_access(dev, attr, true); default: return vgic_set_common_attr(dev, attr); @@ -605,6 +628,7 @@ static int vgic_v3_get_attr(struct kvm_device *dev, case KVM_DEV_ARM_VGIC_GRP_REDIST_REGS: case KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS: case KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO: + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: return vgic_v3_attr_regs_access(dev, attr, false); default: return vgic_get_common_attr(dev, attr); @@ -628,6 +652,7 @@ static int vgic_v3_has_attr(struct kvm_device *dev, case KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS: return vgic_v3_has_attr_regs(dev, attr); case KVM_DEV_ARM_VGIC_GRP_NR_IRQS: + case KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ: return 0; case KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO: { if (((attr->attr & KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK) >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index ea45accbc690..1a2e2e8abd92 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -243,6 +243,9 @@ struct vgic_dist { int nr_spis; + /* The GIC maintenance IRQ for nested hypervisors. 
*/ + u32 maint_irq; + /* base addresses in guest physical address space: */ gpa_t vgic_dist_base; /* distributor */ union { diff --git a/tools/arch/arm/include/uapi/asm/kvm.h b/tools/arch/arm/include/uapi/asm/kvm.h index 03cd7c19a683..d5dd96902817 100644 --- a/tools/arch/arm/include/uapi/asm/kvm.h +++ b/tools/arch/arm/include/uapi/asm/kvm.h @@ -246,6 +246,7 @@ struct kvm_vcpu_events { #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO 7 #define KVM_DEV_ARM_VGIC_GRP_ITS_REGS 8 +#define KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ 9 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT 10 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \ (0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT)
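For completeness, a sketch of how a VMM could program the new attribute from userspace, assuming a GICv3 device fd obtained via KVM_CREATE_DEVICE and uapi headers carrying the definitions added above. Note that while the documentation hunk describes the vINTID as encoded in the attr field, the kernel side in this patch reads the value as a u32 through attr->addr (see the put_user/get_user paths in vgic-kvm-device.c), which is what this example follows; the PPI number 25 is purely illustrative, and this must be done before the vGIC is initialised:

#include <linux/kvm.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Example: select the PPI KVM should use as the vGIC maintenance IRQ. */
static int set_vgic_maint_irq(int vgic_fd, uint32_t intid)
{
	struct kvm_device_attr attr = {
		.group = KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ,
		.addr  = (uint64_t)(unsigned long)&intid,
	};

	/* The kernel accepts only PPIs: 16 <= intid < 32. */
	return ioctl(vgic_fd, KVM_SET_DEVICE_ATTR, &attr);
}

/* e.g. set_vgic_maint_irq(vgic_fd, 25); before creating/running any vcpu. */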
From patchwork Wed Apr 5 15:39:50 2023
From: Marc Zyngier
Subject: [PATCH v9 32/50] KVM: arm64: nv: Implement maintenance interrupt forwarding
Date: Wed, 5 Apr 2023 16:39:50 +0100
Message-Id: <20230405154008.3552854-33-maz@kernel.org>

When we take a maintenance interrupt, we need to decide whether it was generated by an action of the guest, or whether it is something that needs to be forwarded to the guest hypervisor.

Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/vgic/vgic-init.c | 33 ++++++++++++++++++++++++++++ arch/arm64/kvm/vgic/vgic-v3-nested.c | 28 ++++++++++++++++++----- 2 files changed, 55 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c index cd134db41a57..a97279f82bbd 100644 --- a/arch/arm64/kvm/vgic/vgic-init.c +++ b/arch/arm64/kvm/vgic/vgic-init.c @@ -6,10 +6,12 @@ #include #include #include +#include #include #include #include #include +#include #include "vgic.h" /* @@ -222,6 +224,16 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu) if (!irqchip_in_kernel(vcpu->kvm)) return 0; + if (vcpu_has_nv(vcpu)) { + /* Cope with vintage userspace. Maybe we should fail instead */ + if (vcpu->kvm->arch.vgic.maint_irq == 0) + vcpu->kvm->arch.vgic.maint_irq = kvm_vgic_global_state.maint_irq; + ret = kvm_vgic_set_owner(vcpu, vcpu->kvm->arch.vgic.maint_irq, + vcpu); + if (ret) + return ret; + } + /* * If we are creating a VCPU with a GICv3 we must also register the * KVM io device for the redistributor that belongs to this VCPU. @@ -478,12 +490,23 @@ void kvm_vgic_cpu_down(void) static irqreturn_t vgic_maintenance_handler(int irq, void *data) { + struct kvm_vcpu *vcpu = *(struct kvm_vcpu **)data; + /* * We cannot rely on the vgic maintenance interrupt to be * delivered synchronously. This means we can only use it to * exit the VM, and we perform the handling of EOIed * interrupts on the exit path (see vgic_fold_lr_state).
*/ + + /* If not nested, deactivate */ + if (!vcpu || !vgic_state_is_nested(vcpu)) { + irq_set_irqchip_state(irq, IRQCHIP_STATE_ACTIVE, false); + return IRQ_HANDLED; + } + + /* Assume nested from now */ + vgic_v3_handle_nested_maint_irq(vcpu); return IRQ_HANDLED; } @@ -582,6 +605,16 @@ int kvm_vgic_hyp_init(void) return ret; } + if (has_mask) { + ret = irq_set_vcpu_affinity(kvm_vgic_global_state.maint_irq, + kvm_get_running_vcpus()); + if (ret) { + kvm_err("Error setting vcpu affinity\n"); + free_percpu_irq(kvm_vgic_global_state.maint_irq, kvm_get_running_vcpus()); + return ret; + } + } + kvm_info("vgic interrupt IRQ%d\n", kvm_vgic_global_state.maint_irq); return 0; } diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c index e88c75e79010..755e4819603a 100644 --- a/arch/arm64/kvm/vgic/vgic-v3-nested.c +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c @@ -175,10 +175,20 @@ void vgic_v3_sync_nested(struct kvm_vcpu *vcpu) void vgic_v3_load_nested(struct kvm_vcpu *vcpu) { struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; + struct vgic_irq *irq; + unsigned long flags; vgic_cpu->shadow_vgic_v3 = vgic_cpu->nested_vgic_v3; vgic_v3_create_shadow_lr(vcpu); __vgic_v3_restore_state(vcpu_shadow_if(vcpu)); + + irq = vgic_get_irq(vcpu->kvm, vcpu, vcpu->kvm->arch.vgic.maint_irq); + raw_spin_lock_irqsave(&irq->irq_lock, flags); + if (irq->line_level || irq->active) + irq_set_irqchip_state(kvm_vgic_global_state.maint_irq, + IRQCHIP_STATE_ACTIVE, true); + raw_spin_unlock_irqrestore(&irq->irq_lock, flags); + vgic_put_irq(vcpu->kvm, irq); } void vgic_v3_put_nested(struct kvm_vcpu *vcpu) @@ -193,20 +203,26 @@ void vgic_v3_put_nested(struct kvm_vcpu *vcpu) */ vgic_v3_fixup_shadow_lr_state(vcpu); vgic_cpu->nested_vgic_v3 = vgic_cpu->shadow_vgic_v3; + irq_set_irqchip_state(kvm_vgic_global_state.maint_irq, + IRQCHIP_STATE_ACTIVE, false); } void vgic_v3_handle_nested_maint_irq(struct kvm_vcpu *vcpu) { - struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); - /* * If we exit a nested VM with a pending maintenance interrupt from the * GIC, then we need to forward this to the guest hypervisor so that it * can re-sync the appropriate LRs and sample level triggered interrupts * again. 
*/ - if (vgic_state_is_nested(vcpu) && - (cpu_if->vgic_hcr & ICH_HCR_EN) && - vgic_v3_get_misr(vcpu)) - kvm_inject_nested_irq(vcpu); + if (vgic_state_is_nested(vcpu)) { + struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); + bool state; + + state = cpu_if->vgic_hcr & ICH_HCR_EN; + state &= vgic_v3_get_misr(vcpu); + + kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id, + vcpu->kvm->arch.vgic.maint_irq, state, vcpu); + } } From patchwork Wed Apr 5 15:39:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202054 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF72BC76188 for ; Wed, 5 Apr 2023 15:43:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239016AbjDEPnA (ORCPT ); Wed, 5 Apr 2023 11:43:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50412 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238902AbjDEPm4 (ORCPT ); Wed, 5 Apr 2023 11:42:56 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C5B3F6E86 for ; Wed, 5 Apr 2023 08:42:41 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CACF363EE6 for ; Wed, 5 Apr 2023 15:41:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3B581C4339C; Wed, 5 Apr 2023 15:41:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709308; bh=HzXSsUwByW2jAR2hO6nb9l7RS/h2EPWpRifFw83LriQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=eQiN3pdYzvEKc1AGbn74E6rhwGWa5dgKhWjHUOQLqbbMuNqtgs5usa4vOmW72g+qN EOYM4I07ryt9xcCcFkKcwW1Y2LTOn0gtVY5Fab4P0grfH2gDHIGUmfMby1h9DDlSTQ gTa2+hSEEKMiPTadqnSRbQM0Cux6QvJ0XF1Fl5uO6VJGgId7gpCWoVZl/BiSGsTOH4 zPhmXXfoaj9XgyXi4vF+5GCzgTr2gxYpZMm7T7rSA/NV76f6lJT4U/QEHRIDekSGLq prl1a1PNdDwlFX2LVGcOxRTGNQWmJgnaqW60LTf6NOXAwFX6VBuVZv0Fgk1RuywJ3M /WKCr3TnSiFcw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FV-0062PV-CY; Wed, 05 Apr 2023 16:40:39 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 33/50] KVM: arm64: nv: Deal with broken VGIC on maintenance interrupt delivery Date: Wed, 5 Apr 2023 16:39:51 +0100 Message-Id: <20230405154008.3552854-34-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, 
darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Normal, non-nesting KVM deals with the maintenance interrupt in a very simple way: we don't even try to handle it and just turn it off as soon as we exit, long before the kernel can handle it. However, with NV, we rely on the actual handling of the interrupt to leave it active and pass it down to the L1 guest hypervisor (we effectively treat it as an assigned interrupt, just like the timer). This doesn't work with something like the Apple M2, which doesn't have an active state that allows the interrupt to be masked. Instead, just disable the vgic after having taken the interrupt and injected a virtual interrupt. This is enough for the guest to make forward progress, but will limit its ability to handle further interrupts until it next exits (IAR will always report "spurious"). Oh well. Signed-off-by: Marc Zyngier --- arch/arm64/kvm/vgic/vgic-v3-nested.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c index 755e4819603a..919275b94625 100644 --- a/arch/arm64/kvm/vgic/vgic-v3-nested.c +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c @@ -225,4 +225,7 @@ void vgic_v3_handle_nested_maint_irq(struct kvm_vcpu *vcpu) kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id, vcpu->kvm->arch.vgic.maint_irq, state, vcpu); } + + if (unlikely(kvm_vgic_global_state.no_hw_deactivation)) + sysreg_clear_set_s(SYS_ICH_HCR_EL2, ICH_HCR_EN, 0); } From patchwork Wed Apr 5 15:39:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202032 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7475AC76188 for ; Wed, 5 Apr 2023 15:41:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238909AbjDEPlK (ORCPT ); Wed, 5 Apr 2023 11:41:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46356 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238910AbjDEPlI (ORCPT ); Wed, 5 Apr 2023 11:41:08 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5340F6A7B for ; Wed, 5 Apr 2023 08:40:51 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 91B9D63ED0 for ; Wed, 5 Apr 2023 15:40:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0504CC433EF; Wed, 5 Apr 2023 15:40:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709248; bh=Qp6MCbgjyeYLqXhzPsD/TLmpoZonoNw49Oh0VFywzgE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Og1BlC6r7OnttGkUFyEf3L+9qSEDXVHlH2SUhGdme0H/H24l8TAvHPMVolnLH8PN3 CdhW9C4TDAYFAPeBYBoH43CBu6y0baneJFPHYBAcNJSFOQS1hy1GXCi3diwcDHkSDW
7V6FT6oOk5jwCdNoLQj/0aCaGU6OvjDS6Je/J5GHNH/ylZJig9xkcLHjJaJKlU5c+R 4DNfgPNMSFrogxMJeZoGUFNgmQTemFiQLVxe6KrMTauJ2Dpag8oiRfiTxTQyUi8S72 HHCBDt2PezWKVCgY85Pn1wIpStV4afikdJzFN2O5BPALspmRrsp7STJqB5jdUVujcp yuWkWeuy4zFWQ== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FX-0062PV-UG; Wed, 05 Apr 2023 16:40:41 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 34/50] KVM: arm64: nv: Allow userspace to request KVM_ARM_VCPU_NESTED_VIRT Date: Wed, 5 Apr 2023 16:39:52 +0100 Message-Id: <20230405154008.3552854-35-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Since we're (almost) feature complete, let's allow userspace to request KVM_ARM_VCPU_NESTED_VIRT by bumping the KVM_VCPU_MAX_FEATURES up. We also now advertise the feature to userspace with a new capability. It's going to be great... 
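For illustration, userspace is expected to probe the new capability and then request the feature at vcpu init time, roughly as in the following sketch. This is only a minimal sketch, assuming the usual KVM ioctl flow and the KVM_ARM_VCPU_NESTED_VIRT feature bit named above; error handling is mostly elided.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	struct kvm_vcpu_init init;
	int kvm, vm, vcpu;

	kvm = open("/dev/kvm", O_RDWR);
	if (kvm < 0 || ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_ARM_EL2) <= 0) {
		fprintf(stderr, "nested virt not supported\n");
		return 1;
	}

	vm = ioctl(kvm, KVM_CREATE_VM, 0);
	vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);

	/* Start from the preferred target, then ask for an EL2-capable vcpu */
	memset(&init, 0, sizeof(init));
	ioctl(vm, KVM_ARM_PREFERRED_TARGET, &init);
	init.features[0] |= 1U << KVM_ARM_VCPU_NESTED_VIRT;

	return ioctl(vcpu, KVM_ARM_VCPU_INIT, &init) ? 1 : 0;
}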
Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_host.h | 2 +- arch/arm64/kvm/arm.c | 3 +++ include/uapi/linux/kvm.h | 1 + 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 1b3902d2470f..03268d444801 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -37,7 +37,7 @@ #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS -#define KVM_VCPU_MAX_FEATURES 7 +#define KVM_VCPU_MAX_FEATURES 8 #define KVM_REQ_SLEEP \ KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index e4a77fe1cfa4..6af9fdf9b44d 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -272,6 +272,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_ARM_EL1_32BIT: r = cpus_have_const_cap(ARM64_HAS_32BIT_EL1); break; + case KVM_CAP_ARM_EL2: + r = cpus_have_const_cap(ARM64_HAS_NESTED_VIRT); + break; case KVM_CAP_GUEST_DEBUG_HW_BPS: r = get_num_brps(); break; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 6a7e1a0ecf04..70ac878cf99f 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1185,6 +1185,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225 #define KVM_CAP_PMU_EVENT_MASKED_EVENTS 226 #define KVM_CAP_COUNTER_OFFSET 227 +#define KVM_CAP_ARM_EL2 228 #ifdef KVM_CAP_IRQ_ROUTING From patchwork Wed Apr 5 15:39:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202106 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 96575C77B60 for ; Wed, 5 Apr 2023 15:47:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239036AbjDEPrS (ORCPT ); Wed, 5 Apr 2023 11:47:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58086 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239002AbjDEPrQ (ORCPT ); Wed, 5 Apr 2023 11:47:16 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9B4F67285 for ; Wed, 5 Apr 2023 08:46:48 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 679D163F0B for ; Wed, 5 Apr 2023 15:46:47 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D0AB2C433D2; Wed, 5 Apr 2023 15:46:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709606; bh=o5cJBmfCEPFW7Sj2/hEQQSLCVBPoNNQnxjP0r7BKxYA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QXJ631+Bsy2upXwXEAbmCTevLEqjfbGO550oHtopUl1HRXvY0pmvfQxQDL0Wclbsz t6rPNQTpVqQz/muBS8I23k/SumdNfXLBiAJuUY7txexlyzRc2+tFU2XpHP/Ur8wBIc APo8tKK5jBn0dB/kjUUiqLPcD+QWOdi+gzg5gb76LsxxVLcOpdd9t5xZWR09PpUx1e eR1C12KCxg2Oh8A9leFfFyB7GpkbomSiXPu+aBcR/efZiLYAIBCm3gJMCeDwnsc/qw HOHjscrkwpKyR7dmbN/JiYxHfnn4IqBtoT//pFe1t2Gr8AbbM21yVPadKPzNmeUTe4 v9o00FLjrqX2A== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls 
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5FZ-0062PV-9z; Wed, 05 Apr 2023 16:40:42 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 35/50] KVM: arm64: nv: Add handling of FEAT_TTL TLB invalidation Date: Wed, 5 Apr 2023 16:39:53 +0100 Message-Id: <20230405154008.3552854-36-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Support guest-provided information to find out about the range of required invalidation. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_nested.h | 1 + arch/arm64/kvm/nested.c | 90 +++++++++++++++++++++++++++++ arch/arm64/kvm/sys_regs.c | 26 +-------- 3 files changed, 92 insertions(+), 25 deletions(-) diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index a65db37a44f7..9e5ffb68a5cd 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -129,6 +129,7 @@ extern bool __forward_traps(struct kvm_vcpu *vcpu, unsigned int reg, extern bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit); extern bool forward_nv_traps(struct kvm_vcpu *vcpu); extern bool forward_nv1_traps(struct kvm_vcpu *vcpu); +unsigned long compute_tlb_inval_range(struct kvm_s2_mmu *mmu, u64 val); struct sys_reg_params; struct sys_reg_desc; diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index d672157de574..b46d58d4542c 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -357,6 +357,96 @@ int kvm_walk_nested_s2(struct kvm_vcpu *vcpu, phys_addr_t gipa, return ret; } +static unsigned int ttl_to_size(u8 ttl) +{ + int level = ttl & 3; + int gran = (ttl >> 2) & 3; + unsigned int max_size = 0; + + switch (gran) { + case TLBI_TTL_TG_4K: + switch (level) { + case 0: + break; + case 1: + max_size = SZ_1G; + break; + case 2: + max_size = SZ_2M; + break; + case 3: + max_size = SZ_4K; + break; + } + break; + case TLBI_TTL_TG_16K: + switch (level) { + case 0: + case 1: + break; + case 2: + max_size = SZ_32M; + break; + case 3: + max_size = SZ_16K; + break; + } + break; + case TLBI_TTL_TG_64K: + switch (level) { + case 0: + case 1: + /* No 52bit IPA support */ + break; + case 2: + max_size = SZ_512M; + break; + case 3: + max_size = SZ_64K; + break; + } + break; + default: /* No size information */ + break; + } + + return max_size; +} + +unsigned long compute_tlb_inval_range(struct kvm_s2_mmu *mmu, u64 val) +{ + unsigned long max_size; + u8 ttl; + + ttl = FIELD_GET(GENMASK_ULL(47, 44), val); + + max_size =
ttl_to_size(ttl); + + if (!max_size) { + /* Compute the maximum extent of the invalidation */ + switch (mmu->vtcr & VTCR_EL2_TG0_MASK) { + case VTCR_EL2_TG0_4K: + max_size = SZ_1G; + break; + case VTCR_EL2_TG0_16K: + max_size = SZ_32M; + break; + case VTCR_EL2_TG0_64K: + /* + * No, we do not support 52bit IPA in nested yet. Once + * we do, this should be 4TB. + */ + max_size = SZ_512M; + break; + default: + BUG(); + } + } + + WARN_ON(!max_size); + return max_size; +} + /* * We can have multiple *different* MMU contexts with the same VMID: * diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 25dc3f2fb45c..6bd78b58894f 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -3036,36 +3036,12 @@ static void s2_mmu_unmap_stage2_ipa(struct kvm_s2_mmu *mmu, * * - NS bit: we're non-secure only. * - * - TTL field: We already have the granule size from the - * VTCR_EL2.TG0 field, and the level is only relevant to the - * guest's S2PT. - * * - IPA[51:48]: We don't support 52bit IPA just yet... * * And of course, adjust the IPA to be on an actual address. */ base_addr = (info->ipa.addr & GENMASK_ULL(35, 0)) << 12; - - /* Compute the maximum extent of the invalidation */ - switch (mmu->vtcr & VTCR_EL2_TG0_MASK) { - case VTCR_EL2_TG0_4K: - max_size = SZ_1G; - break; - case VTCR_EL2_TG0_16K: - max_size = SZ_32M; - break; - case VTCR_EL2_TG0_64K: - /* - * No, we do not support 52bit IPA in nested yet. Once - * we do, this should be 4TB. - */ - /* FIXME: remove the 52bit PA support from the IDregs */ - max_size = SZ_512M; - break; - default: - BUG(); - } - + max_size = compute_tlb_inval_range(mmu, info->ipa.addr); base_addr &= ~(max_size - 1); kvm_unmap_stage2_range(mmu, base_addr, max_size); From patchwork Wed Apr 5 15:39:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202099 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 03353C76188 for ; Wed, 5 Apr 2023 15:46:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239067AbjDEPqm (ORCPT ); Wed, 5 Apr 2023 11:46:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56858 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239072AbjDEPq1 (ORCPT ); Wed, 5 Apr 2023 11:46:27 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1049A7287 for ; Wed, 5 Apr 2023 08:46:13 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5362463F02 for ; Wed, 5 Apr 2023 15:46:12 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B7727C4339B; Wed, 5 Apr 2023 15:46:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709571; bh=5FKZeF4bcHwk1v+ELgygebiurlG0HlkfqX7UD+u6FrE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HBW7dABKyloKObDla8SIZglPwDt2of9gvt0WxHKYZdVcnLzLMp3Xj+wMR/fxDbjrn icg6cABJEWfJ6PY19wWhPOrHDrscWeTlXe8IRsquAaEGCNvAUzjt7Zfjpw0NsHaskk QraW35EHoU1YfhuOzlU570TLdPizUAHjEJcWci4trgUPfDFZAVkXk86B1HbHgw9jth 
ckut0WJErX9DuwGPNYnKOqn8qdF23OW/XZgQ9LuaLrtclYT6BvsgOO05S7om8jqrmq vmLuqs4jObOo1ihlFTwPJ9iAHD2YqfwOuMfLPVpDtt2lZbGtGmgmeP98sCj6ctenrC yKSfLc98y7UWQ== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fd-0062PV-I7; Wed, 05 Apr 2023 16:40:46 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 36/50] KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information Date: Wed, 5 Apr 2023 16:39:54 +0100 Message-Id: <20230405154008.3552854-37-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In order to be able to make S2 TLB invalidations more performant on NV, let's use a scheme derived from the ARMv8.4 TTL extension. If bits [56:55] in the descriptor are non-zero, they indicate a level which can be used as an invalidation range. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_nested.h | 2 + arch/arm64/kvm/nested.c | 81 +++++++++++++++++++++++++++++ 2 files changed, 83 insertions(+) diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index 9e5ffb68a5cd..52d0a6c0fecc 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -137,4 +137,6 @@ struct sys_reg_desc; void access_nested_id_reg(struct kvm_vcpu *v, struct sys_reg_params *p, const struct sys_reg_desc *r); +#define KVM_NV_GUEST_MAP_SZ (KVM_PGTABLE_PROT_SW1 | KVM_PGTABLE_PROT_SW0) + #endif /* __ARM64_KVM_NESTED_H */ diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index b46d58d4542c..0f535a5ff941 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -4,6 +4,7 @@ * Author: Jintack Lim */ +#include #include #include @@ -413,6 +414,81 @@ static unsigned int ttl_to_size(u8 ttl) return max_size; } +/* + * Compute the equivalent of the TTL field by parsing the shadow PT. The + * granule size is extracted from the cached VTCR_EL2.TG0 while the level is + * retrieved from the first entry carrying the level as a tag.
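+ * The TTL value built here follows the encoding ttl_to_size()
+ * expects: bits [3:2] hold the granule (1 = 4K, 2 = 16K, 3 = 64K) and
+ * bits [1:0] the level. A level-3 mapping in a 4K granule, for
+ * example, yields ttl = (1 << 2) | 3, which ttl_to_size() turns into
+ * a SZ_4K range.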
+ */ +static u8 get_guest_mapping_ttl(struct kvm_s2_mmu *mmu, u64 addr) +{ + u64 tmp, sz = 0, vtcr = mmu->vtcr; + kvm_pte_t pte; + u8 ttl, level; + + switch (vtcr & VTCR_EL2_TG0_MASK) { + case VTCR_EL2_TG0_4K: + ttl = (1 << 2); + break; + case VTCR_EL2_TG0_16K: + ttl = (2 << 2); + break; + case VTCR_EL2_TG0_64K: + ttl = (3 << 2); + break; + default: + BUG(); + } + + tmp = addr; + +again: + /* Iteratively compute the block sizes for a particular granule size */ + switch (vtcr & VTCR_EL2_TG0_MASK) { + case VTCR_EL2_TG0_4K: + if (sz < SZ_4K) sz = SZ_4K; + else if (sz < SZ_2M) sz = SZ_2M; + else if (sz < SZ_1G) sz = SZ_1G; + else sz = 0; + break; + case VTCR_EL2_TG0_16K: + if (sz < SZ_16K) sz = SZ_16K; + else if (sz < SZ_32M) sz = SZ_32M; + else sz = 0; + break; + case VTCR_EL2_TG0_64K: + if (sz < SZ_64K) sz = SZ_64K; + else if (sz < SZ_512M) sz = SZ_512M; + else sz = 0; + break; + default: + BUG(); + } + + if (sz == 0) + return 0; + + tmp &= ~(sz - 1); + if (kvm_pgtable_get_leaf(mmu->pgt, tmp, &pte, NULL)) + goto again; + if (!(pte & PTE_VALID)) + goto again; + level = FIELD_GET(KVM_NV_GUEST_MAP_SZ, pte); + if (!level) + goto again; + + ttl |= level; + + /* + * We now have found some level information in the shadow S2. Check + * that the resulting range is actually including the original IPA. + */ + sz = ttl_to_size(ttl); + if (addr < (tmp + sz)) + return ttl; + + return 0; +} + unsigned long compute_tlb_inval_range(struct kvm_s2_mmu *mmu, u64 val) { unsigned long max_size; @@ -420,6 +496,11 @@ unsigned long compute_tlb_inval_range(struct kvm_s2_mmu *mmu, u64 val) ttl = FIELD_GET(GENMASK_ULL(47, 44), val); + if (!(cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) && ttl)) { + u64 addr = (val & GENMASK_ULL(35, 0)) << 12; + ttl = get_guest_mapping_ttl(mmu, addr); + } + max_size = ttl_to_size(ttl); if (!max_size) { From patchwork Wed Apr 5 15:39:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202098 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 83A77C76188 for ; Wed, 5 Apr 2023 15:46:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239040AbjDEPqa (ORCPT ); Wed, 5 Apr 2023 11:46:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55994 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239036AbjDEPqX (ORCPT ); Wed, 5 Apr 2023 11:46:23 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A39F74ED2 for ; Wed, 5 Apr 2023 08:46:08 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 419ED63E1F for ; Wed, 5 Apr 2023 15:46:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A537CC433D2; Wed, 5 Apr 2023 15:46:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709567; bh=7BO0Ssdij/dFPVApY+QX/fubB90BB/xC9flfiMh1bIA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=vBVuY+KM7T0xq0bWBcHSWVUMSJ5I2Vx5K8zBiDExhc4OXDRYhczoqojSW1VS9AZy6 
I7rCaL62RfN8yLfTZ7GcE8TOKcjW9UgOSGpF//UbC5GmsUF3xxeBUDNC9+dEpw0uAj uNBhUATOdNyvNY32FXqQBklTYGrHLVQFfeq96TT13FWnHhRhmhI0vckJfqFrKQkdQW cQ13wSDeXUauS+em2/nK+Y/IjVAWfBQY3ukFS/s434a57d8aY/QvugGzdKmeicWzwy +5Jt53MD515QVeA5YgxCOFVRYntSUDAmiX5I5C12jDM+unmjn0VaesM0BFiykpL4+1 5MYXZbihNXQ6g== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fe-0062PV-EJ; Wed, 05 Apr 2023 16:40:47 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 37/50] KVM: arm64: nv: Tag shadow S2 entries with nested level Date: Wed, 5 Apr 2023 16:39:55 +0100 Message-Id: <20230405154008.3552854-38-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Populate bits [56:55] of the leaf entry with the level provided by the guest's S2 translation. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_nested.h | 7 +++++++ arch/arm64/kvm/mmu.c | 16 ++++++++++++++-- 2 files changed, 21 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index 52d0a6c0fecc..784168dcb0fe 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -5,6 +5,8 @@ #include #include +#include + static inline bool vcpu_has_nv(const struct kvm_vcpu *vcpu) { return (!__is_defined(__KVM_NVHE_HYPERVISOR__) && @@ -139,4 +141,9 @@ void access_nested_id_reg(struct kvm_vcpu *v, struct sys_reg_params *p, #define KVM_NV_GUEST_MAP_SZ (KVM_PGTABLE_PROT_SW1 | KVM_PGTABLE_PROT_SW0) +static inline u64 kvm_encode_nested_level(struct kvm_s2_trans *trans) +{ + return FIELD_PREP(KVM_NV_GUEST_MAP_SZ, trans->level); +} + #endif /* __ARM64_KVM_NESTED_H */ diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 7fd122d687f7..42bce05261c5 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1456,11 +1456,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, * Potentially reduce shadow S2 permissions to match the guest's own * S2. For exec faults, we'd only reach this point if the guest * actually allowed it (see kvm_s2_handle_perm_fault). + * + * Also encode the level of the nested translation in the SW bits of + * the PTE/PMD/PUD. This will be retrieved on TLB invalidation from + * the guest.
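+ * (The level lands in descriptor bits [56:55], i.e. the
+ * KVM_PGTABLE_PROT_SW0/SW1 software bits covered by
+ * KVM_NV_GUEST_MAP_SZ.)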
*/ if (nested) { writable &= kvm_s2_trans_writable(nested); if (!kvm_s2_trans_readable(nested)) prot &= ~KVM_PGTABLE_PROT_R; + + prot |= kvm_encode_nested_level(nested); } read_lock(&kvm->mmu_lock); @@ -1514,14 +1520,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, * permissions only if vma_pagesize equals fault_granule. Otherwise, * kvm_pgtable_stage2_map() should be called to change block size. */ - if (fault_status == ESR_ELx_FSC_PERM && vma_pagesize == fault_granule) + if (fault_status == ESR_ELx_FSC_PERM && vma_pagesize == fault_granule) { + /* + * Drop the SW bits in favour of those stored in the + * PTE, which will be preserved. + */ + prot &= ~KVM_NV_GUEST_MAP_SZ; ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot); - else + } else { ret = kvm_pgtable_stage2_map(pgt, fault_ipa, vma_pagesize, __pfn_to_phys(pfn), prot, memcache, KVM_PGTABLE_WALK_HANDLE_FAULT | KVM_PGTABLE_WALK_SHARED); + } /* Mark the page dirty only if the fault is handled successfully */ if (writable && !ret) { From patchwork Wed Apr 5 15:39:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202103 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 84B82C7619A for ; Wed, 5 Apr 2023 15:47:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238993AbjDEPrC (ORCPT ); Wed, 5 Apr 2023 11:47:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55916 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238770AbjDEPq4 (ORCPT ); Wed, 5 Apr 2023 11:46:56 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E98B59D2 for ; Wed, 5 Apr 2023 08:46:30 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2AD2B63F02 for ; Wed, 5 Apr 2023 15:46:29 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8D7AEC4339B; Wed, 5 Apr 2023 15:46:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709588; bh=PHs4gaxAaWQmUmtjbLODLMqbuuO3t58QTS/ww8BSn6A=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LWGJOkd2zACE5X491upAsmJRjku26+xZwkENQLcsZYHokf8hNgDzj5eaOPhZ2MDA2 aYNvlot05wLHFAICO9I3R0UERf5D4C/DBBQN/HezhGGSfEzPiwCJTTKernKraYpxs6 f8xS+6ehSIuZQ0UbdvbV+JAe2H4/micyk7sbX1TYB85ITnLBURnsmXFnMWSJROUgJX CP41DdK2PgVpR+uCqiqGStSGSqG3MVR7+yV+Mx3MdcxHhJGW4c9UsVf61fYU4PH9qe 1/WryL3+H5ckujoXvBgrp0b29HpPnWEZZZXvy/wJYAcmnfPUY807EUqKO3u92Pxd2B hJyZ6PmCIG/Xw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fh-0062PV-A1; Wed, 05 Apr 2023 16:40:51 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu 
Subject: [PATCH v9 38/50] KVM: arm64: nv: Add include containing the VNCR_EL2 offsets Date: Wed, 5 Apr 2023 16:39:56 +0100 Message-Id: <20230405154008.3552854-39-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org VNCR_EL2 points to a page containing a number of system registers accessed by a guest hypervisor when ARMv8.4-NV is enabled. Let's document the offsets in that page, as we are going to use this layout. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/vncr_mapping.h | 74 +++++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 arch/arm64/include/asm/vncr_mapping.h diff --git a/arch/arm64/include/asm/vncr_mapping.h b/arch/arm64/include/asm/vncr_mapping.h new file mode 100644 index 000000000000..c0a2acd5045c --- /dev/null +++ b/arch/arm64/include/asm/vncr_mapping.h @@ -0,0 +1,74 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * System register offsets in the VNCR page + * All offsets are *byte* displacements! + */ + +#ifndef __ARM64_VNCR_MAPPING_H__ +#define __ARM64_VNCR_MAPPING_H__ + +#define VNCR_VTTBR_EL2 0x020 +#define VNCR_VTCR_EL2 0x040 +#define VNCR_VMPIDR_EL2 0x050 +#define VNCR_CNTVOFF_EL2 0x060 +#define VNCR_HCR_EL2 0x078 +#define VNCR_HSTR_EL2 0x080 +#define VNCR_VPIDR_EL2 0x088 +#define VNCR_TPIDR_EL2 0x090 +#define VNCR_VNCR_EL2 0x0B0 +#define VNCR_CPACR_EL1 0x100 +#define VNCR_CONTEXTIDR_EL1 0x108 +#define VNCR_SCTLR_EL1 0x110 +#define VNCR_ACTLR_EL1 0x118 +#define VNCR_TCR_EL1 0x120 +#define VNCR_AFSR0_EL1 0x128 +#define VNCR_AFSR1_EL1 0x130 +#define VNCR_ESR_EL1 0x138 +#define VNCR_MAIR_EL1 0x140 +#define VNCR_AMAIR_EL1 0x148 +#define VNCR_MDSCR_EL1 0x158 +#define VNCR_SPSR_EL1 0x160 +#define VNCR_CNTV_CVAL_EL0 0x168 +#define VNCR_CNTV_CTL_EL0 0x170 +#define VNCR_CNTP_CVAL_EL0 0x178 +#define VNCR_CNTP_CTL_EL0 0x180 +#define VNCR_SCXTNUM_EL1 0x188 +#define VNCR_TFSR_EL1 0x190 +#define VNCR_ZCR_EL1 0x1E0 +#define VNCR_TTBR0_EL1 0x200 +#define VNCR_TTBR1_EL1 0x210 +#define VNCR_FAR_EL1 0x220 +#define VNCR_ELR_EL1 0x230 +#define VNCR_SP_EL1 0x240 +#define VNCR_VBAR_EL1 0x250 +#define VNCR_ICH_LR0_EL2 0x400 +// VNCR_ICH_LRN_EL2(n) VNCR_ICH_LR0_EL2+8*((n) & 7) +#define VNCR_ICH_AP0R0_EL2 0x480 +// VNCR_ICH_AP0RN_EL2(n) VNCR_ICH_AP0R0_EL2+8*((n) & 3) +#define VNCR_ICH_AP1R0_EL2 0x4A0 +// VNCR_ICH_AP1RN_EL2(n) VNCR_ICH_AP1R0_EL2+8*((n) & 3) +#define VNCR_ICH_HCR_EL2 0x4C0 +#define VNCR_ICH_VMCR_EL2 0x4C8 +#define VNCR_VDISR_EL2 0x500 +#define VNCR_PMBLIMITR_EL1 0x800 +#define VNCR_PMBPTR_EL1 0x810 +#define VNCR_PMBSR_EL1 0x820 +#define VNCR_PMSCR_EL1 0x828 +#define VNCR_PMSEVFR_EL1 0x830 +#define VNCR_PMSICR_EL1 0x838 +#define VNCR_PMSIRR_EL1 0x840 +#define VNCR_PMSLATFR_EL1 0x848 +#define VNCR_TRFCR_EL1 0x880 +#define VNCR_MPAM1_EL1 0x900 +#define VNCR_MPAMHCR_EL2 0x930 +#define 
VNCR_MPAMVPMV_EL2 0x938 +#define VNCR_MPAMVPM0_EL2 0x940 +#define VNCR_MPAMVPM1_EL2 0x948 +#define VNCR_MPAMVPM2_EL2 0x950 +#define VNCR_MPAMVPM3_EL2 0x958 +#define VNCR_MPAMVPM4_EL2 0x960 +#define VNCR_MPAMVPM5_EL2 0x968 +#define VNCR_MPAMVPM6_EL2 0x970 +#define VNCR_MPAMVPM7_EL2 0x978 + +#endif /* __ARM64_VNCR_MAPPING_H__ */ From patchwork Wed Apr 5 15:39:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202096 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07B79C761A6 for ; Wed, 5 Apr 2023 15:46:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239060AbjDEPqZ (ORCPT ); Wed, 5 Apr 2023 11:46:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55778 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239018AbjDEPqJ (ORCPT ); Wed, 5 Apr 2023 11:46:09 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 093B93C3E for ; Wed, 5 Apr 2023 08:46:04 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8DFB463F09 for ; Wed, 5 Apr 2023 15:46:03 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D8C63C433D2; Wed, 5 Apr 2023 15:46:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709563; bh=WyYOg+yzc7oduQO4QBxWSh44tO70vdaCrw+QhsKcuO4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=oO2JNqv4ZAD9y7WGpH5lPYOivB/LD6P668UDAX59Rlh9SvY6NWPdjWbtq6IcgL3MF SrvXw0uJm5cMGROae5+DU/YVwmaJPRjjwEGmrK3tX6yd1l9AAAvfA9Rjz4AOwCEtrk hGauyL4TC8MTXMbmBH7eEZ8rZpSNDVaSKsgmxO0SVoGsmsVDEDxmJuvbiWvr5iNpHf y8Y8QHMHTxSypDsUNal2ovURwtGDRTM1CpoLpopeKG576hZ0mxcGYpJIzgMlidfUjk yRplnQSu5z7H93fVviFEuZoDl4v0sLNOS+wT83L+W9Y1gipPhr7SiqcPnPv+u0Ti07 NBhh3SRZbHLfw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fk-0062PV-Un; Wed, 05 Apr 2023 16:40:54 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 39/50] KVM: arm64: nv: Map VNCR-capable registers to a separate page Date: Wed, 5 Apr 2023 16:39:57 +0100 Message-Id: <20230405154008.3552854-40-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, 
miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org With ARMv8.4-NV, registers that can be directly accessed in memory by the guest have to live at architected offsets in a special page. Let's annotate the sysreg enum to reflect the offset at which they are in this page, with a little twist: If running on HW that doesn't have the ARMv8.4-NV feature, or even a VM that doesn't use NV, we store all the system registers in the usual sys_regs array. The only difference with the pre-8.4 situation is that VNCR-capable registers are at a "similar" offset as in the VNCR page (we can compute the actual offset at compile time), and that the sys_regs array is both bigger and sparse. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_host.h | 104 ++++++++++++++++++++---------- arch/arm64/tools/cpucaps | 1 + 2 files changed, 70 insertions(+), 35 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 03268d444801..eb095a9eecba 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -26,6 +26,7 @@ #include #include #include +#include #define __KVM_HAVE_ARCH_INTC_INITIALIZED @@ -302,32 +303,33 @@ struct kvm_vcpu_fault_info { u64 disr_el1; /* Deferred [SError] Status Register */ }; +/* + * VNCR() just places the VNCR_capable registers in the enum after + * __VNCR_START__, and makes the value (after correction) an 8-byte offset + * from the VNCR base. As we don't require the enum to be otherwise ordered, + * we need the terrible hack below to ensure that we correctly size the + * sys_regs array, no matter what.
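+ *
+ * As a worked example, VNCR_SCTLR_EL1 is 0x110, so VNCR(SCTLR_EL1)
+ * defines SCTLR_EL1 as __VNCR_START__ + (0x110 / 8), i.e. 34 slots
+ * after __VNCR_START__ in the enum (and thus in the sys_regs array).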
+ * + * The __MAX__ macro has been lifted from Sean Eron Anderson's wonderful + * treasure trove of bit hacks: + * https://graphics.stanford.edu/~seander/bithacks.html#IntegerMinOrMax + */ +#define __MAX__(x,y) ((x) ^ (((x) ^ (y)) & -((x) < (y)))) +#define VNCR(r) \ + __before_##r, \ + r = __VNCR_START__ + ((VNCR_ ## r) / 8), \ + __after_##r = __MAX__(__before_##r - 1, r) + enum vcpu_sysreg { __INVALID_SYSREG__, /* 0 is reserved as an invalid value */ MPIDR_EL1, /* MultiProcessor Affinity Register */ CLIDR_EL1, /* Cache Level ID Register */ CSSELR_EL1, /* Cache Size Selection Register */ - SCTLR_EL1, /* System Control Register */ - ACTLR_EL1, /* Auxiliary Control Register */ - CPACR_EL1, /* Coprocessor Access Control */ - ZCR_EL1, /* SVE Control */ - TTBR0_EL1, /* Translation Table Base Register 0 */ - TTBR1_EL1, /* Translation Table Base Register 1 */ - TCR_EL1, /* Translation Control Register */ - ESR_EL1, /* Exception Syndrome Register */ - AFSR0_EL1, /* Auxiliary Fault Status Register 0 */ - AFSR1_EL1, /* Auxiliary Fault Status Register 1 */ - FAR_EL1, /* Fault Address Register */ - MAIR_EL1, /* Memory Attribute Indirection Register */ - VBAR_EL1, /* Vector Base Address Register */ - CONTEXTIDR_EL1, /* Context ID Register */ TPIDR_EL0, /* Thread ID, User R/W */ TPIDRRO_EL0, /* Thread ID, User R/O */ TPIDR_EL1, /* Thread ID, Privileged */ - AMAIR_EL1, /* Aux Memory Attribute Indirection Register */ CNTKCTL_EL1, /* Timer Control Register (EL1) */ PAR_EL1, /* Physical Address Register */ - MDSCR_EL1, /* Monitor Debug System Control Register */ MDCCINT_EL1, /* Monitor Debug Comms Channel Interrupt Enable Reg */ OSLSR_EL1, /* OS Lock Status Register */ DISR_EL1, /* Deferred Interrupt Status Register */ @@ -358,20 +360,9 @@ enum vcpu_sysreg { APGAKEYLO_EL1, APGAKEYHI_EL1, - ELR_EL1, - SP_EL1, - SPSR_EL1, - - CNTVOFF_EL2, - CNTV_CVAL_EL0, - CNTV_CTL_EL0, - CNTP_CVAL_EL0, - CNTP_CTL_EL0, - /* Memory Tagging Extension registers */ RGSR_EL1, /* Random Allocation Tag Seed Register */ GCR_EL1, /* Tag Control Register */ - TFSR_EL1, /* Tag Fault Status Register (EL1) */ TFSRE0_EL1, /* Tag Fault Status Register (EL0) */ /* 32bit specific registers. 
*/ @@ -381,20 +372,14 @@ enum vcpu_sysreg { DBGVCR32_EL2, /* Debug Vector Catch Register */ /* EL2 registers */ - VPIDR_EL2, /* Virtualization Processor ID Register */ - VMPIDR_EL2, /* Virtualization Multiprocessor ID Register */ SCTLR_EL2, /* System Control Register (EL2) */ ACTLR_EL2, /* Auxiliary Control Register (EL2) */ - HCR_EL2, /* Hypervisor Configuration Register */ MDCR_EL2, /* Monitor Debug Configuration Register (EL2) */ CPTR_EL2, /* Architectural Feature Trap Register (EL2) */ - HSTR_EL2, /* Hypervisor System Trap Register */ HACR_EL2, /* Hypervisor Auxiliary Control Register */ TTBR0_EL2, /* Translation Table Base Register 0 (EL2) */ TTBR1_EL2, /* Translation Table Base Register 1 (EL2) */ TCR_EL2, /* Translation Control Register (EL2) */ - VTTBR_EL2, /* Virtualization Translation Table Base Register */ - VTCR_EL2, /* Virtualization Translation Control Register */ SPSR_EL2, /* EL2 saved program status register */ ELR_EL2, /* EL2 exception link register */ AFSR0_EL2, /* Auxiliary Fault Status Register 0 (EL2) */ @@ -407,7 +392,6 @@ enum vcpu_sysreg { VBAR_EL2, /* Vector Base Address Register (EL2) */ RVBAR_EL2, /* Reset Vector Base Address Register */ CONTEXTIDR_EL2, /* Context ID Register (EL2) */ - TPIDR_EL2, /* EL2 Software Thread ID Register */ CNTHCTL_EL2, /* Counter-timer Hypervisor Control register */ SP_EL2, /* EL2 Stack Pointer */ CNTHP_CTL_EL2, @@ -415,6 +399,42 @@ enum vcpu_sysreg { CNTHV_CTL_EL2, CNTHV_CVAL_EL2, + __VNCR_START__, /* Any VNCR-capable reg goes after this point */ + + VNCR(SCTLR_EL1),/* System Control Register */ + VNCR(ACTLR_EL1),/* Auxiliary Control Register */ + VNCR(CPACR_EL1),/* Coprocessor Access Control */ + VNCR(ZCR_EL1), /* SVE Control */ + VNCR(TTBR0_EL1),/* Translation Table Base Register 0 */ + VNCR(TTBR1_EL1),/* Translation Table Base Register 1 */ + VNCR(TCR_EL1), /* Translation Control Register */ + VNCR(ESR_EL1), /* Exception Syndrome Register */ + VNCR(AFSR0_EL1),/* Auxiliary Fault Status Register 0 */ + VNCR(AFSR1_EL1),/* Auxiliary Fault Status Register 1 */ + VNCR(FAR_EL1), /* Fault Address Register */ + VNCR(MAIR_EL1), /* Memory Attribute Indirection Register */ + VNCR(VBAR_EL1), /* Vector Base Address Register */ + VNCR(CONTEXTIDR_EL1), /* Context ID Register */ + VNCR(AMAIR_EL1),/* Aux Memory Attribute Indirection Register */ + VNCR(MDSCR_EL1),/* Monitor Debug System Control Register */ + VNCR(ELR_EL1), + VNCR(SP_EL1), + VNCR(SPSR_EL1), + VNCR(TFSR_EL1), /* Tag Fault Status Register (EL1) */ + VNCR(VPIDR_EL2),/* Virtualization Processor ID Register */ + VNCR(VMPIDR_EL2),/* Virtualization Multiprocessor ID Register */ + VNCR(HCR_EL2), /* Hypervisor Configuration Register */ + VNCR(HSTR_EL2), /* Hypervisor System Trap Register */ + VNCR(VTTBR_EL2),/* Virtualization Translation Table Base Register */ + VNCR(VTCR_EL2), /* Virtualization Translation Control Register */ + VNCR(TPIDR_EL2),/* EL2 Software Thread ID Register */ + + VNCR(CNTVOFF_EL2), + VNCR(CNTV_CVAL_EL0), + VNCR(CNTV_CTL_EL0), + VNCR(CNTP_CVAL_EL0), + VNCR(CNTP_CTL_EL0), + NR_SYS_REGS /* Nothing after this line! */ }; @@ -431,6 +451,9 @@ struct kvm_cpu_context { u64 sys_regs[NR_SYS_REGS]; struct kvm_vcpu *__hyp_running_vcpu; + + /* This pointer has to be 4kB aligned. */ + u64 *vncr_array; }; struct kvm_host_data { @@ -772,8 +795,19 @@ struct kvm_vcpu_arch { * accessed by a running VCPU. For example, for userspace access or * for system registers that are never context switched, but only * emulated. 
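+ *
+ * With NV, registers at or after __VNCR_START__ live in the VNCR page
+ * whenever the host supports it; the accessor below redirects those
+ * to the vncr_array and leaves everything else in sys_regs.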
+ * + * Don't bother with VNCR-based accesses in the nVHE code, it has no + * business dealing with NV. */ -#define __ctxt_sys_reg(c,r) (&(c)->sys_regs[(r)]) +static inline u64 *__ctxt_sys_reg(const struct kvm_cpu_context *ctxt, int r) +{ +#if !defined (__KVM_NVHE_HYPERVISOR__) + if (unlikely(cpus_have_final_cap(ARM64_HAS_ENHANCED_NESTED_VIRT) && + r >= __VNCR_START__ && ctxt->vncr_array)) + return &ctxt->vncr_array[r - __VNCR_START__]; +#endif + return (u64 *)&ctxt->sys_regs[r]; +} #define ctxt_sys_reg(c,r) (*__ctxt_sys_reg(c,r)) diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps index 40ba95472594..fc29a6400f02 100644 --- a/arch/arm64/tools/cpucaps +++ b/arch/arm64/tools/cpucaps @@ -24,6 +24,7 @@ HAS_DIT HAS_E0PD HAS_ECV HAS_ECV_CNTPOFF +HAS_ENHANCED_NESTED_VIRT HAS_EPAN HAS_GENERIC_AUTH HAS_GENERIC_AUTH_ARCH_QARMA3 From patchwork Wed Apr 5 15:39:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202094 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB519C77B60 for ; Wed, 5 Apr 2023 15:46:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239027AbjDEPqM (ORCPT ); Wed, 5 Apr 2023 11:46:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56026 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239001AbjDEPqH (ORCPT ); Wed, 5 Apr 2023 11:46:07 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A20E140E8 for ; Wed, 5 Apr 2023 08:46:00 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 984736393B for ; Wed, 5 Apr 2023 15:45:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D641BC433D2; Wed, 5 Apr 2023 15:45:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709559; bh=/ioysnnKbOV23f25ZT+oV+7bSjUxiF+BradC4DXgBK8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gAK4ikb4y7BW1B0vZK/vee3otsPA9CJKrX9a7+fFowKyTuRsSYwjZgPKVHMlvNcu6 ceTutiAoZaJjDPZE3wHQk6WYZJN0EOKtpbMtegJfLQ7/8sudoQzcYDCcp8GKCCvT8R t0Ih5djKerteiJwzW4qj8NcWS38wlXNsRizjOqj2l6GoFJvULxY/KUTKd6gTOScUzy g9pQ/ZEi82TWYclX2qCQWslaig7Y76wrRrL8RETgVsVvQhFVToAS8klPsAbw8XjqJn ctkfd0vxflpaim0swgBVOum2BliFXKu4P7MfdUWpsSbDpprDtSaLI8HeG+yC4dJvYP Ds9QYXaU3aaxg== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fn-0062PV-ND; Wed, 05 Apr 2023 16:40:56 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 40/50] KVM: arm64: nv: Move nested vgic state into the sysreg file Date: Wed, 5 Apr 2023 16:39:58 +0100 Message-Id: <20230405154008.3552854-41-maz@kernel.org> X-Mailer: git-send-email 
2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The vgic nested state needs to be accessible from the VNCR page, and thus needs to be part of the normal sysreg file. Let's move it there. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_host.h | 9 +++ arch/arm64/kvm/sys_regs.c | 53 ++++++++++++------ arch/arm64/kvm/vgic/vgic-v3-nested.c | 82 ++++++++++++++-------------- arch/arm64/kvm/vgic/vgic-v3.c | 12 ++-- arch/arm64/kvm/vgic/vgic.h | 10 ++++ include/kvm/arm_vgic.h | 7 --- 6 files changed, 104 insertions(+), 69 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index eb095a9eecba..c100b15b34d1 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -435,6 +435,15 @@ enum vcpu_sysreg { VNCR(CNTP_CVAL_EL0), VNCR(CNTP_CTL_EL0), + VNCR(ICH_LR0_EL2), + ICH_LR15_EL2 = ICH_LR0_EL2 + 15, + VNCR(ICH_AP0R0_EL2), + ICH_AP0R3_EL2 = ICH_AP0R0_EL2 + 3, + VNCR(ICH_AP1R0_EL2), + ICH_AP1R3_EL2 = ICH_AP1R0_EL2 + 3, + VNCR(ICH_HCR_EL2), + VNCR(ICH_VMCR_EL2), + NR_SYS_REGS /* Nothing after this line! 
*/ }; diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 6bd78b58894f..4a1c4a97f702 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2125,17 +2125,17 @@ static bool access_gic_apr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { - struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.nested_vgic_v3; - u32 index, *base; + u64 *base; + u8 index; index = r->Op2; if (r->CRm == 8) - base = cpu_if->vgic_ap0r; + base = __ctxt_sys_reg(&vcpu->arch.ctxt, ICH_AP0R0_EL2); else - base = cpu_if->vgic_ap1r; + base = __ctxt_sys_reg(&vcpu->arch.ctxt, ICH_AP1R0_EL2); if (p->is_write) - base[index] = p->regval; + base[index] = lower_32_bits(p->regval); else p->regval = base[index]; @@ -2146,12 +2146,10 @@ static bool access_gic_hcr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { - struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.nested_vgic_v3; - if (p->is_write) - cpu_if->vgic_hcr = p->regval; + __vcpu_sys_reg(vcpu, ICH_HCR_EL2) = lower_32_bits(p->regval); else - p->regval = cpu_if->vgic_hcr; + p->regval = __vcpu_sys_reg(vcpu, ICH_HCR_EL2); return true; } @@ -2208,12 +2206,19 @@ static bool access_gic_vmcr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { - struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.nested_vgic_v3; - if (p->is_write) - cpu_if->vgic_vmcr = p->regval; + __vcpu_sys_reg(vcpu, ICH_VMCR_EL2) = (p->regval & + (ICH_VMCR_ENG0_MASK | + ICH_VMCR_ENG1_MASK | + ICH_VMCR_PMR_MASK | + ICH_VMCR_BPR0_MASK | + ICH_VMCR_BPR1_MASK | + ICH_VMCR_EOIM_MASK | + ICH_VMCR_CBPR_MASK | + ICH_VMCR_FIQ_EN_MASK | + ICH_VMCR_ACK_CTL_MASK)); else - p->regval = cpu_if->vgic_vmcr; + p->regval = __vcpu_sys_reg(vcpu, ICH_VMCR_EL2); return true; } @@ -2222,17 +2227,29 @@ static bool access_gic_lr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { - struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.nested_vgic_v3; u32 index; + u64 *base; + base = __ctxt_sys_reg(&vcpu->arch.ctxt, ICH_LR0_EL2); index = p->Op2; if (p->CRm == 13) index += 8; - if (p->is_write) - cpu_if->vgic_lr[index] = p->regval; - else - p->regval = cpu_if->vgic_lr[index]; + if (p->is_write) { + u64 mask = (ICH_LR_VIRTUAL_ID_MASK | + ICH_LR_GROUP | + ICH_LR_HW | + ICH_LR_STATE); + + if (p->regval & ICH_LR_HW) + mask |= ICH_LR_PHYS_ID_MASK; + else + mask |= ICH_LR_EOI; + + base[index] = p->regval & mask; + } else { + p->regval = base[index]; + } return true; } diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c index 919275b94625..12937bc86e1c 100644 --- a/arch/arm64/kvm/vgic/vgic-v3-nested.c +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c @@ -15,11 +15,6 @@ #include "vgic.h" -static inline struct vgic_v3_cpu_if *vcpu_nested_if(struct kvm_vcpu *vcpu) -{ - return &vcpu->arch.vgic_cpu.nested_vgic_v3; -} - static inline struct vgic_v3_cpu_if *vcpu_shadow_if(struct kvm_vcpu *vcpu) { return &vcpu->arch.vgic_cpu.shadow_vgic_v3; @@ -32,12 +27,11 @@ static inline bool lr_triggers_eoi(u64 lr) u16 vgic_v3_get_eisr(struct kvm_vcpu *vcpu) { - struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); u16 reg = 0; int i; for (i = 0; i < kvm_vgic_global_state.nr_lr; i++) { - if (lr_triggers_eoi(cpu_if->vgic_lr[i])) + if (lr_triggers_eoi(__vcpu_sys_reg(vcpu, ICH_LRN(i)))) reg |= BIT(i); } @@ -46,12 +40,11 @@ u16 vgic_v3_get_eisr(struct kvm_vcpu *vcpu) u16 vgic_v3_get_elrsr(struct kvm_vcpu *vcpu) { - struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); u16 reg = 0; 
int i; for (i = 0; i < kvm_vgic_global_state.nr_lr; i++) { - if (!(cpu_if->vgic_lr[i] & ICH_LR_STATE)) + if (!(__vcpu_sys_reg(vcpu, ICH_LRN(i)) & ICH_LR_STATE)) reg |= BIT(i); } @@ -60,14 +53,13 @@ u16 vgic_v3_get_elrsr(struct kvm_vcpu *vcpu) u64 vgic_v3_get_misr(struct kvm_vcpu *vcpu) { - struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); int nr_lr = kvm_vgic_global_state.nr_lr; u64 reg = 0; if (vgic_v3_get_eisr(vcpu)) reg |= ICH_MISR_EOI; - if (cpu_if->vgic_hcr & ICH_HCR_UIE) { + if (__vcpu_sys_reg(vcpu, ICH_HCR_EL2) & ICH_HCR_UIE) { int used_lrs; used_lrs = nr_lr - hweight16(vgic_v3_get_elrsr(vcpu)); @@ -86,13 +78,12 @@ u64 vgic_v3_get_misr(struct kvm_vcpu *vcpu) */ static void vgic_v3_create_shadow_lr(struct kvm_vcpu *vcpu) { - struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); struct vgic_v3_cpu_if *s_cpu_if = vcpu_shadow_if(vcpu); struct vgic_irq *irq; int i, used_lrs = 0; for (i = 0; i < kvm_vgic_global_state.nr_lr; i++) { - u64 lr = cpu_if->vgic_lr[i]; + u64 lr = __vcpu_sys_reg(vcpu, ICH_LRN(i)); int l1_irq; if (!(lr & ICH_LR_HW)) @@ -124,31 +115,14 @@ static void vgic_v3_create_shadow_lr(struct kvm_vcpu *vcpu) s_cpu_if->used_lrs = used_lrs; } -/* - * Change the shadow HWIRQ field back to the virtual value before copying over - * the entire shadow struct to the nested state. - */ -static void vgic_v3_fixup_shadow_lr_state(struct kvm_vcpu *vcpu) -{ - struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); - struct vgic_v3_cpu_if *s_cpu_if = vcpu_shadow_if(vcpu); - int lr; - - for (lr = 0; lr < kvm_vgic_global_state.nr_lr; lr++) { - s_cpu_if->vgic_lr[lr] &= ~ICH_LR_PHYS_ID_MASK; - s_cpu_if->vgic_lr[lr] |= cpu_if->vgic_lr[lr] & ICH_LR_PHYS_ID_MASK; - } -} - void vgic_v3_sync_nested(struct kvm_vcpu *vcpu) { - struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); struct vgic_v3_cpu_if *s_cpu_if = vcpu_shadow_if(vcpu); struct vgic_irq *irq; int i; for (i = 0; i < s_cpu_if->used_lrs; i++) { - u64 lr = cpu_if->vgic_lr[i]; + u64 lr = __vcpu_sys_reg(vcpu, ICH_LRN(i)); int l1_irq; if (!(lr & ICH_LR_HW) || !(lr & ICH_LR_STATE)) @@ -172,14 +146,27 @@ void vgic_v3_sync_nested(struct kvm_vcpu *vcpu) } } +void vgic_v3_create_shadow_state(struct kvm_vcpu *vcpu) +{ + struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.shadow_vgic_v3; + int i; + + cpu_if->vgic_hcr = __vcpu_sys_reg(vcpu, ICH_HCR_EL2); + cpu_if->vgic_vmcr = __vcpu_sys_reg(vcpu, ICH_VMCR_EL2); + + for (i = 0; i < 4; i++) { + cpu_if->vgic_ap0r[i] = __vcpu_sys_reg(vcpu, ICH_AP0RN(i)); + cpu_if->vgic_ap1r[i] = __vcpu_sys_reg(vcpu, ICH_AP1RN(i)); + } + + vgic_v3_create_shadow_lr(vcpu); +} + void vgic_v3_load_nested(struct kvm_vcpu *vcpu) { - struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; struct vgic_irq *irq; unsigned long flags; - vgic_cpu->shadow_vgic_v3 = vgic_cpu->nested_vgic_v3; - vgic_v3_create_shadow_lr(vcpu); __vgic_v3_restore_state(vcpu_shadow_if(vcpu)); irq = vgic_get_irq(vcpu->kvm, vcpu, vcpu->kvm->arch.vgic.maint_irq); @@ -193,16 +180,32 @@ void vgic_v3_load_nested(struct kvm_vcpu *vcpu) void vgic_v3_put_nested(struct kvm_vcpu *vcpu) { - struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; + struct vgic_v3_cpu_if *s_cpu_if = vcpu_shadow_if(vcpu); + int i; - __vgic_v3_save_state(vcpu_shadow_if(vcpu)); + __vgic_v3_save_state(s_cpu_if); /* * Translate the shadow state HW fields back to the virtual ones * before copying the shadow struct back to the nested one. 
*/ - vgic_v3_fixup_shadow_lr_state(vcpu); - vgic_cpu->nested_vgic_v3 = vgic_cpu->shadow_vgic_v3; + __vcpu_sys_reg(vcpu, ICH_HCR_EL2) = s_cpu_if->vgic_hcr; + __vcpu_sys_reg(vcpu, ICH_VMCR_EL2) = s_cpu_if->vgic_vmcr; + + for (i = 0; i < 4; i++) { + __vcpu_sys_reg(vcpu, ICH_AP0RN(i)) = s_cpu_if->vgic_ap0r[i]; + __vcpu_sys_reg(vcpu, ICH_AP1RN(i)) = s_cpu_if->vgic_ap1r[i]; + } + + for (i = 0; i < kvm_vgic_global_state.nr_lr; i++) { + u64 val = __vcpu_sys_reg(vcpu, ICH_LRN(i)); + + val &= ~ICH_LR_STATE; + val |= s_cpu_if->vgic_lr[i] & ICH_LR_STATE; + + __vcpu_sys_reg(vcpu, ICH_LRN(i)) = val; + } + irq_set_irqchip_state(kvm_vgic_global_state.maint_irq, IRQCHIP_STATE_ACTIVE, false); } @@ -216,10 +219,9 @@ void vgic_v3_handle_nested_maint_irq(struct kvm_vcpu *vcpu) * again. */ if (vgic_state_is_nested(vcpu)) { - struct vgic_v3_cpu_if *cpu_if = vcpu_nested_if(vcpu); bool state; - state = cpu_if->vgic_hcr & ICH_HCR_EN; + state = __vcpu_sys_reg(vcpu, ICH_HCR_EL2) & ICH_HCR_EN; state &= vgic_v3_get_misr(vcpu); kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id, diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c index 7d4e2fb8f54e..deeac4142961 100644 --- a/arch/arm64/kvm/vgic/vgic-v3.c +++ b/arch/arm64/kvm/vgic/vgic-v3.c @@ -281,10 +281,11 @@ void vgic_v3_enable(struct kvm_vcpu *vcpu) ICC_SRE_EL1_SRE); /* * If nesting is allowed, force GICv3 onto the nested - * guests as well. + * guests as well by setting the shadow state to the + * same value. */ if (vcpu_has_nv(vcpu)) - vcpu->arch.vgic_cpu.nested_vgic_v3.vgic_sre = vgic_v3->vgic_sre; + vcpu->arch.vgic_cpu.shadow_vgic_v3.vgic_sre = vgic_v3->vgic_sre; vcpu->arch.vgic_cpu.pendbaser = INITIAL_PENDBASER_VALUE; } else { vgic_v3->vgic_sre = 0; @@ -734,10 +735,13 @@ void vgic_v3_load(struct kvm_vcpu *vcpu) vcpu->arch.vgic_cpu.current_cpu_if = cpu_if; /* - * vgic_v3_load_nested only affects the LRs in the shadow - * state, so it is fine to pass the nested state around. + * If the vgic is in nested state, populate the shadow state + * from the guest's nested state. As vgic_v3_load_nested() + * will only load LRs, let's deal with the rest of the state + * here as if it was a non-nested state. Cunning. 
*/ if (vgic_state_is_nested(vcpu)) { + vgic_v3_create_shadow_state(vcpu); cpu_if = &vcpu->arch.vgic_cpu.shadow_vgic_v3; vcpu->arch.vgic_cpu.current_cpu_if = cpu_if; } diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h index f9923beedd27..bbf07bde4dc0 100644 --- a/arch/arm64/kvm/vgic/vgic.h +++ b/arch/arm64/kvm/vgic/vgic.h @@ -344,4 +344,14 @@ void vgic_v4_configure_vsgis(struct kvm *kvm); void vgic_v4_get_vlpi_state(struct vgic_irq *irq, bool *val); int vgic_v4_request_vpe_irq(struct kvm_vcpu *vcpu, int irq); +void vgic_v3_sync_nested(struct kvm_vcpu *vcpu); +void vgic_v3_create_shadow_state(struct kvm_vcpu *vcpu); +void vgic_v3_load_nested(struct kvm_vcpu *vcpu); +void vgic_v3_put_nested(struct kvm_vcpu *vcpu); +void vgic_v3_handle_nested_maint_irq(struct kvm_vcpu *vcpu); + +#define ICH_LRN(n) (ICH_LR0_EL2 + (n)) +#define ICH_AP0RN(n) (ICH_AP0R0_EL2 + (n)) +#define ICH_AP1RN(n) (ICH_AP1R0_EL2 + (n)) + #endif diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index 1a2e2e8abd92..9b91a8135dac 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -334,9 +334,6 @@ struct vgic_cpu { struct vgic_irq private_irqs[VGIC_NR_PRIVATE_IRQS]; - /* CPU vif control registers for the virtual GICH interface */ - struct vgic_v3_cpu_if nested_vgic_v3; - /* * The shadow vif control register loaded to the hardware when * running a nested L2 guest with the virtual IMO/FMO bit set. @@ -403,10 +400,6 @@ void kvm_vgic_load(struct kvm_vcpu *vcpu); void kvm_vgic_put(struct kvm_vcpu *vcpu); void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu); -void vgic_v3_sync_nested(struct kvm_vcpu *vcpu); -void vgic_v3_load_nested(struct kvm_vcpu *vcpu); -void vgic_v3_put_nested(struct kvm_vcpu *vcpu); -void vgic_v3_handle_nested_maint_irq(struct kvm_vcpu *vcpu); u16 vgic_v3_get_eisr(struct kvm_vcpu *vcpu); u16 vgic_v3_get_elrsr(struct kvm_vcpu *vcpu); u64 vgic_v3_get_misr(struct kvm_vcpu *vcpu); From patchwork Wed Apr 5 15:39:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202101 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B833C7619A for ; Wed, 5 Apr 2023 15:47:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238997AbjDEPrA (ORCPT ); Wed, 5 Apr 2023 11:47:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56830 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239004AbjDEPqy (ORCPT ); Wed, 5 Apr 2023 11:46:54 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C05E0658D for ; Wed, 5 Apr 2023 08:46:25 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E6078623DE for ; Wed, 5 Apr 2023 15:46:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5A3C3C433D2; Wed, 5 Apr 2023 15:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709583; bh=jexwEcEYoC0w5lw9cBv3BUVfKVxaWNo5BXzlFRvTiKE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=gRE37ppRuH/Tkwl23J0CS1yiyT0L6bnufzl5owffXuf0Q/UzPPjLGyYsWhl8gws8C RsjjQ3oNHlzV44SoWH0qWfnuJUz5sH56XViiPHubJKry0pMMzWbkCaqWz2/8MiyDTx uTxiTQ8Awr950Ogb5gY9zKiovV3ll0PNpQU64isi+ECiADMwGS9WIxKn+MCwvz8RtX 7KVyK9NQy3h0MJeOfrAkSAWU750C5M4364NDe81jkV/+EzDTAqqWPYhckwAzJ9rwqq nkeCBF358TJBkPMqY8sgD1OJwlzy2tns9kpxluGOSI8qaI6eEhQh0cdHoyniMQImu4 6bEPK9cx/uB3Q== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fq-0062PV-AB; Wed, 05 Apr 2023 16:40:59 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 41/50] KVM: arm64: Add FEAT_NV2 cpu feature Date: Wed, 5 Apr 2023 16:39:59 +0100 Message-Id: <20230405154008.3552854-42-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add the detection code for the FEAT_NV2 feature. 
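For reference, the new capability keys off the ID_AA64MMFR2_EL1.NV field (unsigned, bits [27:24]). The following is a minimal sketch of the equivalent raw check; the helper name is made up, and the real code goes through the cpufeature framework below so that the sanitised system-wide value is used rather than a raw per-CPU read:

/* Illustrative sketch only, not part of the patch */
static inline bool __nv2_supported_raw(void)
{
	u64 mmfr2 = read_sysreg_s(SYS_ID_AA64MMFR2_EL1);
	u64 nv = (mmfr2 >> ID_AA64MMFR2_EL1_NV_SHIFT) & 0xf;

	/* 0b0000: no NV, 0b0001: FEAT_NV, 0b0010: FEAT_NV2 */
	return nv >= ID_AA64MMFR2_EL1_NV_NV2;
}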
Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_nested.h | 6 ++++++ arch/arm64/kernel/cpufeature.c | 11 +++++++++++ 2 files changed, 17 insertions(+) diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index 784168dcb0fe..a58c3df12379 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -14,6 +14,12 @@ static inline bool vcpu_has_nv(const struct kvm_vcpu *vcpu) test_bit(KVM_ARM_VCPU_HAS_EL2, vcpu->arch.features)); } +static inline bool vcpu_has_nv2(const struct kvm_vcpu *vcpu) +{ + return cpus_have_final_cap(ARM64_HAS_ENHANCED_NESTED_VIRT) && + vcpu_has_nv(vcpu); +} + /* Translation helpers from non-VHE EL2 to EL1 */ static inline u64 tcr_el2_ps_to_tcr_el1_ips(u64 tcr_el2) { diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index c331c49a7d19..db877f1d3a7b 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -2298,6 +2298,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = { .field_width = 4, .min_field_value = ID_AA64MMFR2_EL1_NV_IMP, }, + { + .desc = "Enhanced Nested Virtualization Support", + .capability = ARM64_HAS_ENHANCED_NESTED_VIRT, + .type = ARM64_CPUCAP_SYSTEM_FEATURE, + .matches = has_nested_virt_support, + .sys_reg = SYS_ID_AA64MMFR2_EL1, + .sign = FTR_UNSIGNED, + .field_pos = ID_AA64MMFR2_EL1_NV_SHIFT, + .field_width = 4, + .min_field_value = ID_AA64MMFR2_EL1_NV_NV2, + }, { .capability = ARM64_HAS_32BIT_EL0_DO_NOT_USE, .type = ARM64_CPUCAP_SYSTEM_FEATURE, From patchwork Wed Apr 5 15:40:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202097 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78883C76188 for ; Wed, 5 Apr 2023 15:46:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238985AbjDEPq2 (ORCPT ); Wed, 5 Apr 2023 11:46:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56254 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239024AbjDEPqM (ORCPT ); Wed, 5 Apr 2023 11:46:12 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 047972139 for ; Wed, 5 Apr 2023 08:46:06 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7904063EFD for ; Wed, 5 Apr 2023 15:46:05 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DDBBCC433D2; Wed, 5 Apr 2023 15:46:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709564; bh=mIGEzqeJ4OWwVwkeI3yoSLFdEg4/R9rCF1HJ3dE3CcU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PXH2v50/FTDjMyqolfO0L1hk5dlmFSbdQ4VVZF9WffokwoLhpcMl3JpCIG9QQ9bRr dUNkTu7rCoQfLOvnrz/AwKr2CdC+2y1p/3G7FKjG8BIMOtKRdswUlOClRx5YIUIwnR V9he/Uvb50TaFfE7bG7GULAIgG/vdf95toDNOIZxYlxslbIK6Q3DiJh91BO99TdBfq X8wbcE0eH6uUlUFzG+2M/J0NPj1xuimP1QNyxynBBH6MBpdShCFVuf9wSylV8+Cb4g 9i1v0L9DnePTxK/+Drz//6kqgQzlFA6VCNrOKvAbQ59CfS1WHLx9ahK5YIxwVMsLqO AnECQOM4uBxgw== Received: from sofa.misterjones.org ([185.219.108.64] 
helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fr-0062PV-Du; Wed, 05 Apr 2023 16:41:00 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 42/50] KVM: arm64: nv: Sync nested timer state with FEAT_NV2 Date: Wed, 5 Apr 2023 16:40:00 +0100 Message-Id: <20230405154008.3552854-43-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Christoffer Dall Emulating the timers with FEAT_NV2 is a bit odd, as the timers can be reconfigured behind our back without the hypervisor even noticing. In the VHE case, that's an actual regression in the architecture... Co-developed-by: Christoffer Dall Signed-off-by: Christoffer Dall Signed-off-by: Marc Zyngier --- arch/arm64/kvm/arch_timer.c | 44 ++++++++++++++++++++++++++++++++++++ arch/arm64/kvm/arm.c | 3 +++ include/kvm/arm_arch_timer.h | 1 + 3 files changed, 48 insertions(+) diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c index c5c8cc3c25ae..2a1d88efada4 100644 --- a/arch/arm64/kvm/arch_timer.c +++ b/arch/arm64/kvm/arch_timer.c @@ -914,6 +914,50 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu) kvm_timer_blocking(vcpu); } +void kvm_timer_sync_nested(struct kvm_vcpu *vcpu) +{ + /* + * When NV2 is on, guest hypervisors have their EL0 timer register + * accesses redirected to the VNCR page. Any guest action taken on + * the timer is postponed until the next exit, leading to a very + * poor quality of emulation. + */ + if (!is_hyp_ctxt(vcpu)) + return; + + if (!vcpu_el2_e2h_is_set(vcpu)) { + /* + * A non-VHE guest hypervisor doesn't have any direct access + * to its timers: the EL2 registers trap (and the HW is + * fully emulated), while the EL0 registers access memory + * despite the access being notionally direct. Boo. + * + * We update the hardware timer registers with the + * latest value written by the guest to the VNCR page + * and let the hardware take care of the rest. + */ + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CTL_EL0), SYS_CNTV_CTL); + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0), SYS_CNTV_CVAL); + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CTL_EL0), SYS_CNTP_CTL); + write_sysreg_el0(__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0), SYS_CNTP_CVAL); + } else { + /* + * For a VHE guest hypervisor, the EL2 state is directly + * stored in the host EL0 timers, while the emulated EL0 + * state is stored in the VNCR page. 
The latter could have + * been updated behind our back, and we must reset the + * emulation of the timers. + */ + struct timer_map map; + get_timer_map(vcpu, &map); + + soft_timer_cancel(&map.emul_vtimer->hrtimer); + soft_timer_cancel(&map.emul_ptimer->hrtimer); + timer_emulate(map.emul_vtimer); + timer_emulate(map.emul_ptimer); + } +} + /* * With a userspace irqchip we have to check if the guest de-asserted the * timer and if so, unmask the timer irq signal on the host interrupt diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 6af9fdf9b44d..c7089e558252 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -985,6 +985,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) if (static_branch_unlikely(&userspace_irqchip_in_use)) kvm_timer_sync_user(vcpu); + if (vcpu_has_nv2(vcpu)) + kvm_timer_sync_nested(vcpu); + kvm_arch_vcpu_ctxsync_fp(vcpu); /* diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h index 52008f5cff06..a8382dda1c2e 100644 --- a/include/kvm/arm_arch_timer.h +++ b/include/kvm/arm_arch_timer.h @@ -98,6 +98,7 @@ int __init kvm_timer_hyp_init(bool has_gic); int kvm_timer_enable(struct kvm_vcpu *vcpu); int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu); void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu); +void kvm_timer_sync_nested(struct kvm_vcpu *vcpu); void kvm_timer_sync_user(struct kvm_vcpu *vcpu); bool kvm_timer_should_notify_user(struct kvm_vcpu *vcpu); void kvm_timer_update_run(struct kvm_vcpu *vcpu); From patchwork Wed Apr 5 15:40:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202105 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1BF84C761A6 for ; Wed, 5 Apr 2023 15:47:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239008AbjDEPrR (ORCPT ); Wed, 5 Apr 2023 11:47:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58038 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238972AbjDEPrP (ORCPT ); Wed, 5 Apr 2023 11:47:15 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 74C105FE1 for ; Wed, 5 Apr 2023 08:46:47 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8546063EFF for ; Wed, 5 Apr 2023 15:46:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EBDC8C433EF; Wed, 5 Apr 2023 15:46:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709605; bh=BRXV6rAWpc0/CXcNW2cHbdZA6l0F6lbjVb8xQjdp5Es=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=RFBRoFAN7xUa3vqYMf/KeC7NMmCy+G35xe1KKuGNOY0n5Y9A75spHh1l2SZPlnsUf oP4IZkm8MRvMEjDK0bDqr9qaWxV2kcX0onmobR3rBbMEx2aJTU9vx9YAtkkx6woVCx u1hvpruBwTgC9GJx44qv5BNUcY9GuWHmTRsSjSCxumhaR63uR2Zovh1DWJTCbbFzXS 1Uw1evoQJ1ufqzF3p+7g40Nu7d6FYssUxEAwLVHqGWp0IwzgSn3/xFY5Nz2O7Ddm91 Cja07pfEs5O8RUn9GvF4SnIdusj6h3I/PKb41C7iExreMNTgN8+5vBiDx/2vgmlGk6 juj07dxMpwv7w== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) 
tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Ft-0062PV-Qs; Wed, 05 Apr 2023 16:41:02 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 43/50] KVM: arm64: nv: Fold GICv3 host trapping requirements into guest setup Date: Wed, 5 Apr 2023 16:40:01 +0100 Message-Id: <20230405154008.3552854-44-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Popular HW that is able to use NV also has a broken vgic implementation that requires trapping. On such HW, propagate the host trap bits into the guest's shadow ICH_HCR_EL2 register, making sure we don't allow an L2 guest to bring the system down. This involves a bit of tweaking so that the emulation code correctly picks up the shadow state as needed, and only partially syncs ICH_HCR_EL2 back with the guest state to capture EOIcount.
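The merge logic is easiest to see in isolation. Below is a condensed sketch of what the hunks in this patch implement; guest_hcr, host_hcr and the helper names are illustrative stand-ins for the guest's ICH_HCR_EL2 view, the host trap configuration, and the shadow register actually loaded on the CPU:

/* Sketch only: on load, fold the host trap bits into the shadow copy */
static u64 sketch_shadow_hcr(u64 guest_hcr, u64 host_hcr)
{
	u64 val = 0;

	if (static_branch_unlikely(&vgic_v3_cpuif_trap))
		val = host_hcr & (ICH_HCR_TALL0 | ICH_HCR_TALL1 |
				  ICH_HCR_TC | ICH_HCR_TDIR);

	return guest_hcr | val;
}

/* Sketch only: on put, let only EOIcount flow back to the guest view */
static u64 sketch_sync_hcr_back(u64 guest_hcr, u64 shadow_hcr)
{
	guest_hcr &= ~ICH_HCR_EOIcount_MASK;
	return guest_hcr | (shadow_hcr & ICH_HCR_EOIcount_MASK);
}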
Signed-off-by: Marc Zyngier --- arch/arm64/kvm/hyp/vgic-v3-sr.c | 4 ++-- arch/arm64/kvm/vgic/vgic-v3-nested.c | 21 ++++++++++++++++++--- 2 files changed, 20 insertions(+), 5 deletions(-) diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c index 75152c1ce646..aaaea35099e5 100644 --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c @@ -484,7 +484,7 @@ static int __vgic_v3_get_group(struct kvm_vcpu *vcpu) static int __vgic_v3_highest_priority_lr(struct kvm_vcpu *vcpu, u32 vmcr, u64 *lr_val) { - unsigned int used_lrs = vcpu->arch.vgic_cpu.vgic_v3.used_lrs; + unsigned int used_lrs = kern_hyp_va(vcpu->arch.vgic_cpu.current_cpu_if)->used_lrs; u8 priority = GICv3_IDLE_PRIORITY; int i, lr = -1; @@ -523,7 +523,7 @@ static int __vgic_v3_highest_priority_lr(struct kvm_vcpu *vcpu, u32 vmcr, static int __vgic_v3_find_active_lr(struct kvm_vcpu *vcpu, int intid, u64 *lr_val) { - unsigned int used_lrs = vcpu->arch.vgic_cpu.vgic_v3.used_lrs; + unsigned int used_lrs = kern_hyp_va(vcpu->arch.vgic_cpu.current_cpu_if)->used_lrs; int i; for (i = 0; i < used_lrs; i++) { diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c index 12937bc86e1c..51f97bd4489d 100644 --- a/arch/arm64/kvm/vgic/vgic-v3-nested.c +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c @@ -149,9 +149,20 @@ void vgic_v3_sync_nested(struct kvm_vcpu *vcpu) void vgic_v3_create_shadow_state(struct kvm_vcpu *vcpu) { struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.shadow_vgic_v3; + struct vgic_v3_cpu_if *host_if = &vcpu->arch.vgic_cpu.vgic_v3; + u64 val = 0; int i; - cpu_if->vgic_hcr = __vcpu_sys_reg(vcpu, ICH_HCR_EL2); + /* + * If we're on a system with a broken vgic that requires + * trapping, propagate the trapping requirements. + * + * Ah, the smell of rotten fruits... + */ + if (static_branch_unlikely(&vgic_v3_cpuif_trap)) + val = host_if->vgic_hcr & (ICH_HCR_TALL0 | ICH_HCR_TALL1 | + ICH_HCR_TC | ICH_HCR_TDIR); + cpu_if->vgic_hcr = __vcpu_sys_reg(vcpu, ICH_HCR_EL2) | val; cpu_if->vgic_vmcr = __vcpu_sys_reg(vcpu, ICH_VMCR_EL2); for (i = 0; i < 4; i++) { @@ -181,6 +192,7 @@ void vgic_v3_load_nested(struct kvm_vcpu *vcpu) void vgic_v3_put_nested(struct kvm_vcpu *vcpu) { struct vgic_v3_cpu_if *s_cpu_if = vcpu_shadow_if(vcpu); + u64 val; int i; __vgic_v3_save_state(s_cpu_if); @@ -189,7 +201,10 @@ void vgic_v3_put_nested(struct kvm_vcpu *vcpu) * Translate the shadow state HW fields back to the virtual ones * before copying the shadow struct back to the nested one. 
*/ - __vcpu_sys_reg(vcpu, ICH_HCR_EL2) = s_cpu_if->vgic_hcr; + val = __vcpu_sys_reg(vcpu, ICH_HCR_EL2); + val &= ~ICH_HCR_EOIcount_MASK; + val |= (s_cpu_if->vgic_hcr & ICH_HCR_EOIcount_MASK); + __vcpu_sys_reg(vcpu, ICH_HCR_EL2) = val; __vcpu_sys_reg(vcpu, ICH_VMCR_EL2) = s_cpu_if->vgic_vmcr; for (i = 0; i < 4; i++) { @@ -198,7 +213,7 @@ void vgic_v3_put_nested(struct kvm_vcpu *vcpu) } for (i = 0; i < kvm_vgic_global_state.nr_lr; i++) { - u64 val = __vcpu_sys_reg(vcpu, ICH_LRN(i)); + val = __vcpu_sys_reg(vcpu, ICH_LRN(i)); val &= ~ICH_LR_STATE; val |= s_cpu_if->vgic_lr[i] & ICH_LR_STATE; From patchwork Wed Apr 5 15:40:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202095 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E72BFC7619A for ; Wed, 5 Apr 2023 15:46:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239032AbjDEPqO (ORCPT ); Wed, 5 Apr 2023 11:46:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55886 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239002AbjDEPqH (ORCPT ); Wed, 5 Apr 2023 11:46:07 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E782B6587 for ; Wed, 5 Apr 2023 08:46:01 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7F61D63EDC for ; Wed, 5 Apr 2023 15:46:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E6343C4339B; Wed, 5 Apr 2023 15:46:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709560; bh=LnE0GKRDFVFvMCAUuZtgI4oqPODsAv/I2wvajHnGEII=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=eI5NqGimGJMiuIfjuyIT/zyqTKUarUjBI0TPC/e84CraQRtUhDwb57OPRE3azBRZa BhBPIQMsDFwW4qRoEhenwkcPz/Y4Ncs/+3hXmorXKeH8w6amkx5vCRL0W5+dAgxABh VoT3i6oR6N82wF6fU061bPhss+DmPbXMW4HJWHXOFoOCbnbvyF3NlZeeDY/o6s7+At IrFtbk5uFAd3RC56mlP+JBSYcmCs+QrNOH6FFMoYdDRkJl+Z7QrdmYJFMz7c3fWhS3 B9NeHqkibMfaFu5AZuAqN6DHYufMKECw3bJtovNqXyIsHN5tiQetvbCoOiYgv6MhIK AZINPIyL3qnhw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fv-0062PV-1y; Wed, 05 Apr 2023 16:41:03 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 44/50] KVM: arm64: nv: Publish emulated timer interrupt state in the in-memory state Date: Wed, 5 Apr 2023 16:40:02 +0100 Message-Id: <20230405154008.3552854-45-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, 
linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org With FEAT_NV2, the EL0 timer state is entirely stored in memory, meaning that the hypervisor can only provide a very poor emulation. The only thing we can really do is to publish the interrupt state in the guest view of CNT{P,V}_CTL_EL0, and defer everything else to the next exit. Only FEAT_ECV will allow us to fix it, at the cost of extra trapping. Suggested-by: Chase Conklin Suggested-by: Ganapatrao Kulkarni Signed-off-by: Marc Zyngier --- arch/arm64/kvm/arch_timer.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c index 2a1d88efada4..98fddb4a846d 100644 --- a/arch/arm64/kvm/arch_timer.c +++ b/arch/arm64/kvm/arch_timer.c @@ -453,6 +453,25 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level, { int ret; + /* + * Paper over NV2 brokenness by publishing the interrupt status + * bit. This still results in a poor quality of emulation (guest + * writes will have no effect until the next exit). + * + * But hey, it's fast, right? + */ + if (vcpu_has_nv2(vcpu) && is_hyp_ctxt(vcpu) && + (timer_ctx == vcpu_vtimer(vcpu) || timer_ctx == vcpu_ptimer(vcpu))) { + u32 ctl = timer_get_ctl(timer_ctx); + + if (new_level) + ctl |= ARCH_TIMER_CTRL_IT_STAT; + else + ctl &= ~ARCH_TIMER_CTRL_IT_STAT; + + timer_set_ctl(timer_ctx, ctl); + } + timer_ctx->irq.level = new_level; trace_kvm_timer_update_irq(vcpu->vcpu_id, timer_irq(timer_ctx), timer_ctx->irq.level); From patchwork Wed Apr 5 15:40:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202108 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9345DC76188 for ; Wed, 5 Apr 2023 15:48:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239101AbjDEPsM (ORCPT ); Wed, 5 Apr 2023 11:48:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59962 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239089AbjDEPsF (ORCPT ); Wed, 5 Apr 2023 11:48:05 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 649156A5D for ; Wed, 5 Apr 2023 08:47:45 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 940DD63F06 for ; Wed, 5 Apr 2023 15:46:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 03735C4339C; Wed, 5 Apr 2023 15:46:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709603; bh=+2JPLukEFDEX0fzy6afrLZR6rneETyuBJB+KMvFzgh4=; 
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WpExVkEELG5OkSDW97GOsdLYhfXOjszCbAjneEGtUIVptFkbz94wE07eqg2d6V7hp 09VplT7MWmdPJaA1bAVUNsrjGkg+8d9UZMtd+Tr9tJbIV7/brHkABfh//efDhHEa8H f9tPRPB4iI7QfMXVRrPUCuLpmxynPAfatOD32e91z85Moum0UEG/yzSLjWJXL1IH8v bg1aXn7UqMIR8Dw2t7iWdJtCZBdj0W5Po0KDeazw0QgXqJkMOF1nqePvex2zZfoBQy o1jDFkBsCbkZk5eIzRysoo+g2CRcI0YUZ1GICxP8ggKvENC6kP+mDR5bhpov+2YIMu kMRX3y9jvEUfA== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fv-0062PV-Bw; Wed, 05 Apr 2023 16:41:03 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 45/50] KVM: arm64: nv: Allocate VNCR page when required Date: Wed, 5 Apr 2023 16:40:03 +0100 Message-Id: <20230405154008.3552854-46-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org If running a NV guest on an ARMv8.4-NV capable system, let's allocate an additional page that will be used by the hypervisor to fulfill system register accesses. 
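The allocation is a single zeroed kernel page whose lifetime is tied to the vcpu. A condensed sketch of the pattern used below (error unwinding trimmed; vncr_array is the real destination field, the rest is illustrative):

/* Sketch only: allocate once, tolerating repeated init calls */
if (!vcpu->arch.ctxt.vncr_array)
	vcpu->arch.ctxt.vncr_array =
		(u64 *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
if (!vcpu->arch.ctxt.vncr_array)
	return -ENOMEM;

/* Sketch only: on teardown, free_page() treats a zero address as a no-op */
free_page((unsigned long)vcpu->arch.ctxt.vncr_array);
vcpu->arch.ctxt.vncr_array = NULL;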
Signed-off-by: Marc Zyngier --- arch/arm64/kvm/nested.c | 10 ++++++++++ arch/arm64/kvm/reset.c | 1 + 2 files changed, 11 insertions(+) diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index 0f535a5ff941..15264ab84ab4 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -38,6 +38,14 @@ int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu) if (!cpus_have_final_cap(ARM64_HAS_NESTED_VIRT)) return -EINVAL; + if (cpus_have_final_cap(ARM64_HAS_ENHANCED_NESTED_VIRT)) { + if (!vcpu->arch.ctxt.vncr_array) + vcpu->arch.ctxt.vncr_array = (u64 *)__get_free_page(GFP_KERNEL | __GFP_ZERO); + + if (!vcpu->arch.ctxt.vncr_array) + return -ENOMEM; + } + mutex_lock(&kvm->lock); /* @@ -67,6 +75,8 @@ int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu) kvm_init_stage2_mmu(kvm, &tmp[num_mmus - 2], 0)) { kvm_free_stage2_pgd(&tmp[num_mmus - 1]); kvm_free_stage2_pgd(&tmp[num_mmus - 2]); + free_page((unsigned long)vcpu->arch.ctxt.vncr_array); + vcpu->arch.ctxt.vncr_array = NULL; } else { kvm->arch.nested_mmus_size = num_mmus; ret = 0; diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index 49a3257dec46..8417d2fa0ebd 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -161,6 +161,7 @@ void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu) if (sve_state) kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu)); kfree(sve_state); + free_page((unsigned long)vcpu->arch.ctxt.vncr_array); kfree(vcpu->arch.ccsidr); } From patchwork Wed Apr 5 15:40:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202104 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04703C77B60 for ; Wed, 5 Apr 2023 15:47:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238974AbjDEPrQ (ORCPT ); Wed, 5 Apr 2023 11:47:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58024 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238928AbjDEPrO (ORCPT ); Wed, 5 Apr 2023 11:47:14 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 216466599 for ; Wed, 5 Apr 2023 08:46:45 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id EBFD262967 for ; Wed, 5 Apr 2023 15:46:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5FEDAC433EF; Wed, 5 Apr 2023 15:46:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709600; bh=YswV5Jz08g+7awQFOqRKDr6YaGFWPKo6rQfKmUnA2EQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SaMUU63rG6I3UJ8EiSKdnwhPKiRozzaZf2pcX3NjbzxFND96C2tSTzTaSuag8i4oy YEYOeMrMIRVMOBjPi7Yw3EVzM7T0YlISLhy9JY3oSIr0e7ME9T3tpIBRlIZMTOp6pW weAMmPlls1JH1j2yoepOgNmfAqqMvnMezVfnlYz0KOPCddlPmYoLmb3qjP0b5/+tdO VZ82GLnVa1YayDSGd/1jLLXc8IKfl9vUvxU3K9kqJaGbVQMBOC13HvTCV3c5+Wn6Wv YEqdeYwIaUv+z+ohCe5r8aQrpxHKqa6l3M4vw6HAEr9Ii9VLmyJiRyvNRTD4A/IiSP lWNw6ve4ofATw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls 
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fv-0062PV-Kq; Wed, 05 Apr 2023 16:41:03 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 46/50] KVM: arm64: nv: Enable ARMv8.4-NV support Date: Wed, 5 Apr 2023 16:40:04 +0100 Message-Id: <20230405154008.3552854-47-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org As all the VNCR-capable system registers are nicely separated from the rest of the crowd, let's set HCR_EL2.NV2 on and let the ball rolling. Signed-off-by: Marc Zyngier --- arch/arm64/include/asm/kvm_arm.h | 1 + arch/arm64/include/asm/kvm_emulate.h | 23 +++++++++++++---------- arch/arm64/include/asm/sysreg.h | 1 + arch/arm64/kvm/hyp/vhe/switch.c | 14 +++++++++++++- 4 files changed, 28 insertions(+), 11 deletions(-) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index d446fccdca5a..6e20fcc142bb 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -20,6 +20,7 @@ #define HCR_AMVOFFEN (UL(1) << 51) #define HCR_FIEN (UL(1) << 47) #define HCR_FWB (UL(1) << 46) +#define HCR_NV2 (UL(1) << 45) #define HCR_AT (UL(1) << 44) #define HCR_NV1 (UL(1) << 43) #define HCR_NV (UL(1) << 42) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index d4721555efc9..98600f2bbcc9 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -249,21 +249,24 @@ static inline bool is_hyp_ctxt(const struct kvm_vcpu *vcpu) static inline u64 __fixup_spsr_el2_write(struct kvm_cpu_context *ctxt, u64 val) { - if (!__vcpu_el2_e2h_is_set(ctxt)) { - /* - * Clear the .M field when writing SPSR to the CPU, so that we - * can detect when the CPU clobbered our SPSR copy during a - * local exception. - */ - val &= ~0xc; - } + struct kvm_vcpu *vcpu = container_of(ctxt, struct kvm_vcpu, arch.ctxt); + + if (vcpu_has_nv2(vcpu) || __vcpu_el2_e2h_is_set(ctxt)) + return val; - return val; + /* + * Clear the .M field when writing SPSR to the CPU, so that we + * can detect when the CPU clobbered our SPSR copy during a + * local exception. 
+ */ + return val &= ~0xc; } static inline u64 __fixup_spsr_el2_read(const struct kvm_cpu_context *ctxt, u64 val) { - if (__vcpu_el2_e2h_is_set(ctxt)) + struct kvm_vcpu *vcpu = container_of(ctxt, struct kvm_vcpu, arch.ctxt); + + if (vcpu_has_nv2(vcpu) || __vcpu_el2_e2h_is_set(ctxt)) return val; /* diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 355480766b4a..5a55dd0141f1 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -432,6 +432,7 @@ #define SYS_TCR_EL2 sys_reg(3, 4, 2, 0, 2) #define SYS_VTTBR_EL2 sys_reg(3, 4, 2, 1, 0) #define SYS_VTCR_EL2 sys_reg(3, 4, 2, 1, 2) +#define SYS_VNCR_EL2 sys_reg(3, 4, 2, 2, 0) #define SYS_TRFCR_EL2 sys_reg(3, 4, 1, 2, 1) #define SYS_HDFGRTR_EL2 sys_reg(3, 4, 3, 1, 4) diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index ff77b37b6f45..7682d2f13eaa 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -47,7 +47,13 @@ static void __activate_traps(struct kvm_vcpu *vcpu) * the EL1 virtual memory control register accesses * as well as the AT S1 operations. */ - hcr |= HCR_TVM | HCR_TRVM | HCR_AT | HCR_TTLB | HCR_NV1; + if (vcpu_has_nv2(vcpu)) { + hcr &= ~HCR_TVM; + } else { + hcr |= HCR_TVM | HCR_TRVM | HCR_TTLB; + } + + hcr |= HCR_AT | HCR_NV1; } else { /* * For a guest hypervisor on v8.1 (VHE), allow to @@ -81,6 +87,12 @@ static void __activate_traps(struct kvm_vcpu *vcpu) if (!vcpu_el2_tge_is_set(vcpu)) hcr |= HCR_AT | HCR_TTLB; } + + if (vcpu_has_nv2(vcpu)) { + hcr |= HCR_AT | HCR_TTLB | HCR_NV2; + write_sysreg_s(vcpu->arch.ctxt.vncr_array, + SYS_VNCR_EL2); + } } else if (vcpu_has_nv(vcpu)) { u64 vhcr_el2 = __vcpu_sys_reg(vcpu, HCR_EL2); From patchwork Wed Apr 5 15:40:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202109 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52A06C76188 for ; Wed, 5 Apr 2023 15:48:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239109AbjDEPsS (ORCPT ); Wed, 5 Apr 2023 11:48:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59960 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239096AbjDEPsJ (ORCPT ); Wed, 5 Apr 2023 11:48:09 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 84D105FC3 for ; Wed, 5 Apr 2023 08:47:51 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B6EFB63F08 for ; Wed, 5 Apr 2023 15:46:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2A48EC433D2; Wed, 5 Apr 2023 15:46:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709610; bh=C193qCDjqiF5HkfsJlWly9Jsvkj9TvtpD7IYpBzHGpI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LZhoGiNOz1zaYAHslZ7h8cxSWTX0VfnaxUr3yFPWPMcMo8gTO6iGqwGPdbx61fSb4 HnxtJ7xUxnRn5fuEnaisKTd/gLcBfnxaay8TWrIa3g5K8r9eTiJSyUWy00q+Truz+C KvSCbt2s3YACaxWuru0JozSu4s6cATY4GaEqGTTGcq15lyHsKSJ9lJnmmmq38lLS/F 
uGhIG1epigcI9mz9klLeje8ZXADc6dn6Xb/749CzBVZIkKW1pCShddFRatJEGJbdcl 2LLbUTekm4WpAzfzS80X9fJth7r888gaOsUr0cuXiU0+yUsssCG3v1notbwoPCJhNH BBDhHG0dthOPw== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fv-0062PV-UY; Wed, 05 Apr 2023 16:41:04 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 47/50] KVM: arm64: nv: Fast-track 'InHost' exception returns Date: Wed, 5 Apr 2023 16:40:05 +0100 Message-Id: <20230405154008.3552854-48-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org A significant part of the ARMv8.3-NV extension is to trap ERET instructions so that the hypervisor gets a chance to switch from a vEL2 L1 guest to an EL1 L2 guest. But this also has the unfortunate consequence of trapping ERET in unsuspecting circumstances, such as staying at vEL2 (interrupt handling while being in the guest hypervisor), or returning to host userspace in the case of a VHE guest. Although we already make some effort to handle these ERETs more quickly by not doing the put/load dance, it is still way too far down the line for it to be efficient enough. For these cases, it would be ideal to ERET directly, no question asked. Of course, we can't do that. But the next best thing is to do it as early as possible, in fixup_guest_exit(), much as we would handle FPSIMD exceptions. Signed-off-by: Marc Zyngier --- arch/arm64/kvm/emulate-nested.c | 29 +++------------------ arch/arm64/kvm/hyp/vhe/switch.c | 46 +++++++++++++++++++++++++++++++++ 2 files changed, 49 insertions(+), 26 deletions(-) diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c index 555771e1260d..e2d23f624115 100644 --- a/arch/arm64/kvm/emulate-nested.c +++ b/arch/arm64/kvm/emulate-nested.c @@ -79,8 +79,7 @@ static u64 kvm_check_illegal_exception_return(struct kvm_vcpu *vcpu, u64 spsr) void kvm_emulate_nested_eret(struct kvm_vcpu *vcpu) { - u64 spsr, elr, mode; - bool direct_eret; + u64 spsr, elr; /* * Forward this trap to the virtual EL2 if the virtual @@ -89,33 +88,11 @@ void kvm_emulate_nested_eret(struct kvm_vcpu *vcpu) if (forward_nv_traps(vcpu)) return; - /- - * Going through the whole put/load motions is a waste of time - * if this is a VHE guest hypervisor returning to its own - * userspace, or the hypervisor performing a local exception - * return. No need to save/restore registers, no need to - * switch S2 MMU.
Just do the canonical ERET. - */ - spsr = vcpu_read_sys_reg(vcpu, SPSR_EL2); - spsr = kvm_check_illegal_exception_return(vcpu, spsr); - - mode = spsr & (PSR_MODE_MASK | PSR_MODE32_BIT); - - direct_eret = (mode == PSR_MODE_EL0t && - vcpu_el2_e2h_is_set(vcpu) && - vcpu_el2_tge_is_set(vcpu)); - direct_eret |= (mode == PSR_MODE_EL2h || mode == PSR_MODE_EL2t); - - if (direct_eret) { - *vcpu_pc(vcpu) = vcpu_read_sys_reg(vcpu, ELR_EL2); - *vcpu_cpsr(vcpu) = spsr; - trace_kvm_nested_eret(vcpu, *vcpu_pc(vcpu), spsr); - return; - } - preempt_disable(); kvm_arch_vcpu_put(vcpu); + spsr = __vcpu_sys_reg(vcpu, SPSR_EL2); + spsr = kvm_check_illegal_exception_return(vcpu, spsr); elr = __vcpu_sys_reg(vcpu, ELR_EL2); trace_kvm_nested_eret(vcpu, elr, spsr); diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index 7682d2f13eaa..238c6613cf47 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -168,6 +168,51 @@ void deactivate_traps_vhe_put(struct kvm_vcpu *vcpu) __deactivate_traps_common(vcpu); } +static bool kvm_hyp_handle_eret(struct kvm_vcpu *vcpu, u64 *exit_code) +{ + struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt; + u64 spsr, mode; + + /* + * Going through the whole put/load motions is a waste of time + * if this is a VHE guest hypervisor returning to its own + * userspace, or the hypervisor performing a local exception + * return. No need to save/restore registers, no need to + * switch S2 MMU. Just do the canonical ERET. + * + * Unless the trap has to be forwarded further down the line, + * of course... + */ + if (__vcpu_sys_reg(vcpu, HCR_EL2) & HCR_NV) + return false; + + spsr = read_sysreg_el1(SYS_SPSR); + spsr = __fixup_spsr_el2_read(ctxt, spsr); + mode = spsr & (PSR_MODE_MASK | PSR_MODE32_BIT); + + switch (mode) { + case PSR_MODE_EL0t: + if (!(vcpu_el2_e2h_is_set(vcpu) && vcpu_el2_tge_is_set(vcpu))) + return false; + break; + case PSR_MODE_EL2t: + mode = PSR_MODE_EL1t; + break; + case PSR_MODE_EL2h: + mode = PSR_MODE_EL1h; + break; + default: + return false; + } + + spsr = (spsr & ~(PSR_MODE_MASK | PSR_MODE32_BIT)) | mode; + + write_sysreg_el2(spsr, SYS_SPSR); + write_sysreg_el2(read_sysreg_el1(SYS_ELR), SYS_ELR); + + return true; +} + static const exit_handler_fn hyp_exit_handlers[] = { [0 ... 
ESR_ELx_EC_MAX] = NULL, [ESR_ELx_EC_CP15_32] = kvm_hyp_handle_cp15_32, @@ -177,6 +222,7 @@ static const exit_handler_fn hyp_exit_handlers[] = { [ESR_ELx_EC_IABT_LOW] = kvm_hyp_handle_iabt_low, [ESR_ELx_EC_DABT_LOW] = kvm_hyp_handle_dabt_low, [ESR_ELx_EC_PAC] = kvm_hyp_handle_ptrauth, + [ESR_ELx_EC_ERET] = kvm_hyp_handle_eret, }; static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm_vcpu *vcpu) From patchwork Wed Apr 5 15:40:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202102 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F33EC77B60 for ; Wed, 5 Apr 2023 15:47:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239025AbjDEPrB (ORCPT ); Wed, 5 Apr 2023 11:47:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57562 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239063AbjDEPqz (ORCPT ); Wed, 5 Apr 2023 11:46:55 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8DCDE40DA for ; Wed, 5 Apr 2023 08:46:27 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2C96E63E1F for ; Wed, 5 Apr 2023 15:46:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 91AF5C433D2; Wed, 5 Apr 2023 15:46:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709586; bh=C+eYkxFuOlWI1J0Ai+7uKAkwuQz44DowGsncqCBujTA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QLwyPTCVUNwX1ra03YiaNfDc3ORLFLTEcXljf4JoPppIdHgp51m5yr5k/J/snkYPs Q9sg49L4y1IX06258+nn1DhGHU9h8P4TDKQtXvRWIsfjgbLUadfQaU3/jK/AE20pqM CfxXSqErplFDkmhWuyBP3YdhURLDIcpO9IuxO7FdDCVBzKxqVz1ER/E3Bm9R2ejd7x yQiG0zQ2GSQ86CxcMDX7GVoCufAqhdiLZqrVKgbIXUJGTqBvNLU5AUc20oJZwerTP9 bzenpExdBzgY8olp6GhsqpX0T8lxiBQGF71fXF4qDwg/J7zqgp/58hxzLF5yGl5LdP oZrztHeZqQZKg== Received: from sofa.misterjones.org ([185.219.108.64] helo=valley-girl.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1pk5Fw-0062PV-71; Wed, 05 Apr 2023 16:41:04 +0100 From: Marc Zyngier To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei , Andre Przywara , Chase Conklin , Christoffer Dall , Ganapatrao Kulkarni , Darren Hart , Jintack Lim , Russell King , Miguel Luis , James Morse , Suzuki K Poulose , Oliver Upton , Zenghui Yu Subject: [PATCH v9 48/50] KVM: arm64: nv: Fast-track EL1 TLBIs for VHE guests Date: Wed, 5 Apr 2023 16:40:06 +0100 Message-Id: <20230405154008.3552854-49-maz@kernel.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230405154008.3552854-1-maz@kernel.org> References: <20230405154008.3552854-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 185.219.108.64 X-SA-Exim-Rcpt-To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, alexandru.elisei@arm.com, andre.przywara@arm.com, chase.conklin@arm.com, christoffer.dall@arm.com, gankulkarni@os.amperecomputing.com, 
darren@os.amperecomputing.com, jintack@cs.columbia.edu, rmk+kernel@armlinux.org.uk, miguel.luis@oracle.com, james.morse@arm.com, suzuki.poulose@arm.com, oliver.upton@linux.dev, yuzenghui@huawei.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Due to the way ARMv8.4-NV suppresses traps when accessing EL2 system registers, we can't track when the guest changes its HCR_EL2.TGE setting. This means we always trap EL1 TLBIs, even if they don't affect any guest. This obviously has a huge impact on performance, as we handle TLBI traps as a normal exit, and a normal VHE host issues thousands of TLBIs when booting (and quite a few when running userspace). A cheap way to reduce the overhead is to handle the limited case of {E2H,TGE}=={1,1} as a guest fixup, as we already have the right mmu configuration in place. Just execute the decoded instruction right away and return to the guest. Signed-off-by: Marc Zyngier --- arch/arm64/kvm/hyp/vhe/switch.c | 43 ++++++++++++++++++++++++++++++++- arch/arm64/kvm/hyp/vhe/tlb.c | 6 +++-- arch/arm64/kvm/sys_regs.c | 12 --------- 3 files changed, 46 insertions(+), 15 deletions(-) diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index 238c6613cf47..a3555b90d9e1 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -168,6 +168,47 @@ void deactivate_traps_vhe_put(struct kvm_vcpu *vcpu) __deactivate_traps_common(vcpu); } +static bool kvm_hyp_handle_tlbi_el1(struct kvm_vcpu *vcpu, u64 *exit_code) +{ + u32 instr; + u64 val; + + /* + * Ideally, we would never trap on EL1 TLB invalidations when the + * guest's HCR_EL2.{E2H,TGE} == {1,1}. But "thanks" to ARMv8.4, we + * don't trap writes to HCR_EL2, meaning that we can't track + * changes to the virtual TGE bit. So we leave HCR_EL2.TTLB set on + * the host. Oopsie... + * + * In order to speed-up EL1 TLBIs from the vEL2 guest when TGE is + * set, try and handle these invalidation as quickly as possible, + * without fully exiting (unless this needs forwarding). + */ + if (!vcpu_has_nv2(vcpu) || + !vcpu_is_el2(vcpu) || + (__vcpu_sys_reg(vcpu, HCR_EL2) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) + return false; + + instr = esr_sys64_to_sysreg(kvm_vcpu_get_esr(vcpu)); + if (sys_reg_Op0(instr) != TLBI_Op0 || + sys_reg_Op1(instr) != TLBI_Op1_EL1) + return false; + + val = vcpu_get_reg(vcpu, kvm_vcpu_sys_get_rt(vcpu)); + __kvm_tlb_el1_instr(NULL, val, instr); + __kvm_skip_instr(vcpu); + + return true; +} + +static bool kvm_hyp_handle_sysreg_vhe(struct kvm_vcpu *vcpu, u64 *exit_code) +{ + if (kvm_hyp_handle_tlbi_el1(vcpu, exit_code)) + return true; + + return kvm_hyp_handle_sysreg(vcpu, exit_code); +} + static bool kvm_hyp_handle_eret(struct kvm_vcpu *vcpu, u64 *exit_code) { struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt; @@ -216,7 +257,7 @@ static bool kvm_hyp_handle_eret(struct kvm_vcpu *vcpu, u64 *exit_code) static const exit_handler_fn hyp_exit_handlers[] = { [0 ... 
ESR_ELx_EC_MAX] = NULL, [ESR_ELx_EC_CP15_32] = kvm_hyp_handle_cp15_32, - [ESR_ELx_EC_SYS64] = kvm_hyp_handle_sysreg, + [ESR_ELx_EC_SYS64] = kvm_hyp_handle_sysreg_vhe, [ESR_ELx_EC_SVE] = kvm_hyp_handle_fpsimd, [ESR_ELx_EC_FP_ASIMD] = kvm_hyp_handle_fpsimd, [ESR_ELx_EC_IABT_LOW] = kvm_hyp_handle_iabt_low, diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c index c4389db4cc22..beb162468c0b 100644 --- a/arch/arm64/kvm/hyp/vhe/tlb.c +++ b/arch/arm64/kvm/hyp/vhe/tlb.c @@ -201,7 +201,8 @@ void __kvm_tlb_el1_instr(struct kvm_s2_mmu *mmu, u64 val, u64 sys_encoding) dsb(ishst); /* Switch to requested VMID */ - __tlb_switch_to_guest(mmu, &cxt); + if (mmu) + __tlb_switch_to_guest(mmu, &cxt); /* * Execute the same instruction as the guest hypervisor did, @@ -240,5 +241,6 @@ void __kvm_tlb_el1_instr(struct kvm_s2_mmu *mmu, u64 val, u64 sys_encoding) dsb(ish); isb(); - __tlb_switch_to_host(&cxt); + if (mmu) + __tlb_switch_to_host(&cxt); } diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 4a1c4a97f702..a671629fb366 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -3113,18 +3113,6 @@ static bool handle_tlbi_el1(struct kvm_vcpu *vcpu, struct sys_reg_params *p, WARN_ON(!vcpu_is_el2(vcpu)); - if ((__vcpu_sys_reg(vcpu, HCR_EL2) & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) { - mutex_lock(&vcpu->kvm->lock); - /* - * ARMv8.4-NV allows the guest to change TGE behind - * our back, so we always trap EL1 TLBIs from vEL2... - */ - __kvm_tlb_el1_instr(&vcpu->kvm->arch.mmu, p->regval, sys_encoding); - mutex_unlock(&vcpu->kvm->lock); - - return true; - } - kvm_s2_mmu_iterate_by_vmid(vcpu->kvm, get_vmid(vttbr), &(union tlbi_info) { .va = { From patchwork Wed Apr 5 15:40:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 13202100 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E62BAC76188 for ; Wed, 5 Apr 2023 15:46:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239056AbjDEPqv (ORCPT ); Wed, 5 Apr 2023 11:46:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57208 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238972AbjDEPqi (ORCPT ); Wed, 5 Apr 2023 11:46:38 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 769837685 for ; Wed, 5 Apr 2023 08:46:17 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id A115E63EFC for ; Wed, 5 Apr 2023 15:46:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 14225C433D2; Wed, 5 Apr 2023 15:46:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1680709576; bh=q9r1hOTNh/oLCfkMuKIIcbe+H8ghVFyameXVN/4crsY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CIbUyv8yMPeDkGi3BJLYFGLnn8Z/wG6rJWWQcBu5hpGwZdFS9bLA5r9Dl/CdDdmpM GKzt/DDKMoHlSWz4R9HBqgm4mhFwYbeY+LrazOYiLHoMy0LbwqA26+AjTgKZwhfj+l EeSEfmiz+TDgCUcnv1k3gKfvkZ7BY33Dc/EeRwe6aJaS1Aeh8chOY733qDcRGoZXXD 
From patchwork Wed Apr 5 15:40:07 2023
From: Marc Zyngier <maz@kernel.org>
To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Subject: [PATCH v9 49/50] KVM: arm64: nv: Use FEAT_ECV to trap access to EL0 timers
Date: Wed, 5 Apr 2023 16:40:07 +0100
Message-Id: <20230405154008.3552854-50-maz@kernel.org>
In-Reply-To: <20230405154008.3552854-1-maz@kernel.org>

Although FEAT_NV2 makes most things fast, it also makes it impossible to correctly emulate the timers, as the sysreg accesses are redirected to memory.

FEAT_ECV addresses this by giving a hypervisor the ability to trap the EL02 sysregs as well as the virtual timer.

Add the required trap setting to make use of the feature, allowing us to elide the ugly resync in the middle of the run loop.

Signed-off-by: Marc Zyngier
---
 arch/arm64/kvm/arch_timer.c          | 37 +++++++++++++++++++++++++---
 include/clocksource/arm_arch_timer.h |  4 ++++
 2 files changed, 38 insertions(+), 3 deletions(-)
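Why two sets of trap bits: an nVHE L1 hypervisor programs the EL0 timer through the _EL0 registers directly, while a VHE L1 (E2H == 1) reaches the same state through the _EL02 aliases, so FEAT_ECV provides a distinct trap control for each route. A minimal sketch of the selection; the bit positions come from the arm_arch_timer.h hunk below, the pairing follows how timer_set_traps() uses them, and the helper name is this editor's invention:

	#include <stdbool.h>
	#include <stdint.h>

	/* From the arm_arch_timer.h change in the diff below */
	#define CNTHCTL_EL1TVT		(UINT64_C(1) << 13)	/* trap CNTV*_EL0 */
	#define CNTHCTL_EL1NVVCT	(UINT64_C(1) << 16)	/* trap CNTV*_EL02 */

	/* Pick the CNTHCTL_EL2 trap bit matching the L1's access route */
	static uint64_t virt_timer_trap_bit(bool l1_is_vhe)
	{
		return l1_is_vhe ? CNTHCTL_EL1NVVCT : CNTHCTL_EL1TVT;
	}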
diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 98fddb4a846d..e3b5bd7cf15c 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -790,7 +790,7 @@ static void kvm_timer_vcpu_load_nested_switch(struct kvm_vcpu *vcpu,
 
 static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
 {
-	bool tpt, tpc;
+	bool tvt, tpt, tvc, tpc, tvt02, tpt02;
 	u64 clr, set;
 
 	/*
@@ -805,7 +805,30 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
 	 * within this function, reality kicks in and we start adding
 	 * traps based on emulation requirements.
 	 */
-	tpt = tpc = false;
+	tvt = tpt = tvc = tpc = false;
+	tvt02 = tpt02 = false;
+
+	/*
+	 * NV2 badly breaks the timer semantics by redirecting accesses to
+	 * the EL0 timer state to memory, so let's call ECV to the rescue if
+	 * available: we trap all CNT{P,V}_{CTL,CVAL,TVAL}_EL0 accesses.
+	 *
+	 * The treatment slightly varies depending on whether we run a nVHE
+	 * or VHE guest: nVHE will use the _EL0 registers directly, while
+	 * VHE will use the _EL02 accessors. This translates into different
+	 * trap bits.
+	 *
+	 * None of the trapping is required when running in non-HYP context,
+	 * unless required by the L1 hypervisor settings once we advertise
+	 * ECV+NV in the guest, or unless we need trapping for other reasons.
+	 */
+	if (cpus_have_final_cap(ARM64_HAS_ECV) &&
+	    vcpu_has_nv2(vcpu) && is_hyp_ctxt(vcpu)) {
+		if (vcpu_el2_e2h_is_set(vcpu))
+			tvt02 = tpt02 = true;
+		else
+			tvt = tpt = true;
+	}
 
 	/*
 	 * We have two possibilities to deal with a physical offset:
@@ -845,6 +868,10 @@ static void timer_set_traps(struct kvm_vcpu *vcpu, struct timer_map *map)
 
 	assign_clear_set_bit(tpt, CNTHCTL_EL1PCEN << 10, set, clr);
 	assign_clear_set_bit(tpc, CNTHCTL_EL1PCTEN << 10, set, clr);
+	assign_clear_set_bit(tvt, CNTHCTL_EL1TVT, clr, set);
+	assign_clear_set_bit(tvc, CNTHCTL_EL1TVCT, clr, set);
+	assign_clear_set_bit(tvt02, CNTHCTL_EL1NVVCT, clr, set);
+	assign_clear_set_bit(tpt02, CNTHCTL_EL1NVPCT, clr, set);
 
 	/* This only happens on VHE, so use the CNTKCTL_EL1 accessor */
 	sysreg_clear_set(cntkctl_el1, clr, set);
@@ -940,8 +967,12 @@ void kvm_timer_sync_nested(struct kvm_vcpu *vcpu)
 	 * accesses redirected to the VNCR page. Any guest action taken on
 	 * the timer is postponed until the next exit, leading to a very
 	 * poor quality of emulation.
+	 *
+	 * This is an unmitigated disaster, only papered over by FEAT_ECV,
+	 * which allows trapping of the timer registers even with NV2.
+	 * Still, this is worse than FEAT_NV on its own. Meh.
 	 */
-	if (!is_hyp_ctxt(vcpu))
+	if (cpus_have_const_cap(ARM64_HAS_ECV) || !is_hyp_ctxt(vcpu))
 		return;
 
 	if (!vcpu_el2_e2h_is_set(vcpu)) {
diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
index cbbc9a6dc571..c62811fb4130 100644
--- a/include/clocksource/arm_arch_timer.h
+++ b/include/clocksource/arm_arch_timer.h
@@ -22,6 +22,10 @@
 #define CNTHCTL_EVNTDIR			(1 << 3)
 #define CNTHCTL_EVNTI			(0xF << 4)
 #define CNTHCTL_ECV			(1 << 12)
+#define CNTHCTL_EL1TVT			(1 << 13)
+#define CNTHCTL_EL1TVCT			(1 << 14)
+#define CNTHCTL_EL1NVPCT		(1 << 15)
+#define CNTHCTL_EL1NVVCT		(1 << 16)
 
 enum arch_timer_reg {
 	ARCH_TIMER_REG_CTRL,
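The assign_clear_set_bit() helper is not shown in the hunk; its assumed semantics (an inference from how it is used here, not a quote of the kernel macro) are sketched below. Note the argument order in the hunk: the legacy EL1PCEN/EL1PCTEN controls are enable bits, so trapping means accumulating them into the clear mask (arguments passed as "set, clr"), while the new ECV controls are trap bits, so trapping means accumulating them into the set mask (arguments passed as "clr, set").

	/*
	 * Assumed semantics: accumulate @_bit into @_set when @_pred is
	 * true, into @_clr otherwise. The caller then applies both masks
	 * to CNTHCTL_EL2 in one go via sysreg_clear_set().
	 */
	#define assign_clear_set_bit(_pred, _bit, _clr, _set)	\
		do {						\
			if (_pred)				\
				(_set) |= (_bit);		\
			else					\
				(_clr) |= (_bit);		\
		} while (0)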
From patchwork Wed Apr 5 15:40:08 2023
From: Marc Zyngier <maz@kernel.org>
To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Subject: [PATCH v9 50/50] KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV is on
Date: Wed, 5 Apr 2023 16:40:08 +0100
Message-Id: <20230405154008.3552854-51-maz@kernel.org>
In-Reply-To: <20230405154008.3552854-1-maz@kernel.org>

Although FEAT_ECV allows us to correctly enable the timers, it also reduces performance pretty badly (an L2 guest doing a lot of virtio emulated in L1 userspace results in a 30% degradation).

Mitigate this by emulating the CTL/CVAL register reads in the inner run loop, without returning to the general kernel. This halves the overhead described above.

Signed-off-by: Marc Zyngier
---
 arch/arm64/kvm/hyp/vhe/switch.c | 49 +++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index a3555b90d9e1..a9ac61505a86 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -201,11 +201,60 @@ static bool kvm_hyp_handle_tlbi_el1(struct kvm_vcpu *vcpu, u64 *exit_code)
 	return true;
 }
 
+static bool kvm_hyp_handle_ecv(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	u64 esr, val;
+
+	/*
+	 * Having FEAT_ECV allows for a better quality of timer emulation.
+	 * However, this comes at a huge cost in terms of traps. Try and
+	 * satisfy the reads without returning to the kernel if we can.
+	 */
+	if (!cpus_have_final_cap(ARM64_HAS_ECV))
+		return false;
+
+	if (!vcpu_has_nv2(vcpu))
+		return false;
+
+	esr = kvm_vcpu_get_esr(vcpu);
+	if ((esr & ESR_ELx_SYS64_ISS_DIR_MASK) != ESR_ELx_SYS64_ISS_DIR_READ)
+		return false;
+
+	switch (esr_sys64_to_sysreg(esr)) {
+	case SYS_CNTP_CTL_EL02:
+	case SYS_CNTP_CTL_EL0:
+		val = __vcpu_sys_reg(vcpu, CNTP_CTL_EL0);
+		break;
+	case SYS_CNTP_CVAL_EL02:
+	case SYS_CNTP_CVAL_EL0:
+		val = __vcpu_sys_reg(vcpu, CNTP_CVAL_EL0);
+		break;
+	case SYS_CNTV_CTL_EL02:
+	case SYS_CNTV_CTL_EL0:
+		val = __vcpu_sys_reg(vcpu, CNTV_CTL_EL0);
+		break;
+	case SYS_CNTV_CVAL_EL02:
+	case SYS_CNTV_CVAL_EL0:
+		val = __vcpu_sys_reg(vcpu, CNTV_CVAL_EL0);
+		break;
+	default:
+		return false;
+	}
+
+	vcpu_set_reg(vcpu, kvm_vcpu_sys_get_rt(vcpu), val);
+	__kvm_skip_instr(vcpu);
+
+	return true;
+}
+
 static bool kvm_hyp_handle_sysreg_vhe(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
 	if (kvm_hyp_handle_tlbi_el1(vcpu, exit_code))
 		return true;
 
+	if (kvm_hyp_handle_ecv(vcpu, exit_code))
+		return true;
+
 	return kvm_hyp_handle_sysreg(vcpu, exit_code);
 }
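What the fast path above buys, in this editor's reading: with NV2, the guest hypervisor's CNT{P,V}_{CTL,CVAL} state already lives in the vcpu's memory-backed register file, so a trapped read can be served by copying the cached value into the destination register and skipping the instruction, with no full exit. Writes fall through to the normal handling (they have to reprogram the emulated timer), as do TVAL reads, which depend on the live counter value. A minimal standalone model, with illustrative types and names rather than kernel code:

	#include <stdbool.h>
	#include <stdint.h>

	/* Cached (memory-backed) timer state, one slot per emulated register */
	enum cached_reg { CACHED_CNTP_CTL, CACHED_CNTP_CVAL,
			  CACHED_CNTV_CTL, CACHED_CNTV_CVAL, NR_CACHED };

	struct vcpu_timer_cache {
		uint64_t regs[NR_CACHED];
	};

	/*
	 * Serve a trapped read from the cache. Returns false for anything
	 * that needs the slow path (writes, TVAL, unrelated sysregs).
	 */
	static bool timer_read_fast_path(const struct vcpu_timer_cache *c,
					 bool is_read, int reg, uint64_t *val)
	{
		if (!is_read || reg < 0 || reg >= NR_CACHED)
			return false;

		*val = c->regs[reg];
		return true;
	}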