From patchwork Thu Dec 10 16:00:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 11965505 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B56F6C1B0E3 for ; Thu, 10 Dec 2020 16:30:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 677B823D39 for ; Thu, 10 Dec 2020 16:30:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391641AbgLJQaJ (ORCPT ); Thu, 10 Dec 2020 11:30:09 -0500 Received: from mail.kernel.org ([198.145.29.99]:33732 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391588AbgLJQFa (ORCPT ); Thu, 10 Dec 2020 11:05:30 -0500 Received: from disco-boy.misterjones.org (disco-boy.misterjones.org [51.254.78.96]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id CCCCA23F34; Thu, 10 Dec 2020 16:04:35 +0000 (UTC) Received: from 78.163-31-62.static.virginmediabusiness.co.uk ([62.31.163.78] helo=why.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94) (envelope-from ) id 1knONt-0008Di-H2; Thu, 10 Dec 2020 16:01:38 +0000 From: Marc Zyngier To: linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org Cc: Andre Przywara , Christoffer Dall , Jintack Lim , Alexandru Elisei , James Morse , Julien Thierry , Suzuki K Poulose , kernel-team@android.com Subject: [PATCH v3 66/66] KVM: arm64: nv: Fast-track EL1 TLBIs for VHE guests Date: Thu, 10 Dec 2020 16:00:02 +0000 Message-Id: <20201210160002.1407373-67-maz@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201210160002.1407373-1-maz@kernel.org> References: <20201210160002.1407373-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 62.31.163.78 X-SA-Exim-Rcpt-To: linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, andre.przywara@arm.com, christoffer.dall@arm.com, jintack@cs.columbia.edu, alexandru.elisei@arm.com, james.morse@arm.com, julien.thierry.kdev@gmail.com, suzuki.poulose@arm.com, kernel-team@android.com X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Due to the way ARMv8.4-NV suppresses traps when accessing EL2 system registers, we can't track when the guest changes its HCR_EL2.TGE setting. This means we always trap EL1 TLBIs, even if they don't affect any guest. This obviously has a huge impact on performance, as we handle TLBI traps as a normal exit, and a normal VHE host issues thousands of TLBIs when booting (and quite a few when running userspace). A cheap way to reduce the overhead is to handle the limited case of {E2H,TGE}=={1,1} as a guest fixup, as we already have the right mmu configuration in place. Just execute the decoded instruction right away and return to the guest. Signed-off-by: Marc Zyngier --- arch/arm64/kvm/hyp/vhe/switch.c | 36 +++++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/vhe/tlb.c | 6 ++++-- arch/arm64/kvm/sys_regs.c | 25 ++++++++--------------- 3 files changed, 48 insertions(+), 19 deletions(-) diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c index c90aed418f73..0b9bc36340a5 100644 --- a/arch/arm64/kvm/hyp/vhe/switch.c +++ b/arch/arm64/kvm/hyp/vhe/switch.c @@ -168,6 +168,39 @@ void deactivate_traps_vhe_put(void) __deactivate_traps_common(); } +static bool __hyp_handle_tlbi_el1(struct kvm_vcpu *vcpu) +{ + u32 instr; + u64 val; + + /* + * Ideally, we would never trap on EL1 TLB invalidations when the + * guest's HCR_EL2.{E2H,TGE} == {1,1}. But "thanks" to ARMv8.4, we + * don't trap writes to HCR_EL2, meaning that we can't track + * changes to the virtual TGE bit. So we leave HCR_EL2.TTLB set on + * the host. Oopsie... + * + * In order to speed-up EL1 TLBIs from the vEL2 guest when TGE is + * set, try and handle these invalidation as quickly as possible, + * without fully exiting (unless this needs forwarding). + */ + if (kvm_vcpu_trap_get_class(vcpu) != ESR_ELx_EC_SYS64 || + !vcpu_mode_el2(vcpu) || + (__vcpu_sys_reg(vcpu, HCR_EL2) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) + return false; + + instr = esr_sys64_to_sysreg(kvm_vcpu_get_esr(vcpu)); + if (sys_reg_Op0(instr) != TLBI_Op0 || + sys_reg_Op1(instr) != TLBI_Op1_EL1) + return false; + + val = vcpu_get_reg(vcpu, kvm_vcpu_sys_get_rt(vcpu)); + __kvm_tlb_el1_instr(NULL, val, instr); + __kvm_skip_instr(vcpu); + + return true; +} + static bool __hyp_handle_eret(struct kvm_vcpu *vcpu) { struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt; @@ -261,6 +294,9 @@ static bool fixup_guest_exit_vhe(struct kvm_vcpu *vcpu, u64 *exit_code, if (*exit_code == ARM_EXCEPTION_TRAP) { if (__hyp_handle_eret(vcpu)) return true; + + if (__hyp_handle_tlbi_el1(vcpu)) + return true; } return fixup_guest_exit(vcpu, exit_code); diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c index 52fda1d61308..6dd7f224e0f3 100644 --- a/arch/arm64/kvm/hyp/vhe/tlb.c +++ b/arch/arm64/kvm/hyp/vhe/tlb.c @@ -200,7 +200,8 @@ void __kvm_tlb_el1_instr(struct kvm_s2_mmu *mmu, u64 val, u64 sys_encoding) dsb(ishst); /* Switch to requested VMID */ - __tlb_switch_to_guest(mmu, &cxt); + if (mmu) + __tlb_switch_to_guest(mmu, &cxt); /* * Execute the same instruction as the guest hypervisor did, @@ -239,5 +240,6 @@ void __kvm_tlb_el1_instr(struct kvm_s2_mmu *mmu, u64 val, u64 sys_encoding) dsb(ish); isb(); - __tlb_switch_to_host(&cxt); + if (mmu) + __tlb_switch_to_host(&cxt); } diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index c7b004982ea9..cf03281f06a3 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -2684,6 +2684,8 @@ static bool handle_tlbi_el1(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) { u32 sys_encoding = sys_insn(p->Op0, p->Op1, p->CRn, p->CRm, p->Op2); + u64 virtual_vttbr = vcpu_read_sys_reg(vcpu, VTTBR_EL2); + struct kvm_s2_mmu *mmu; /* * If we're here, this is because we've trapped on a EL1 TLBI @@ -2702,24 +2704,13 @@ static bool handle_tlbi_el1(struct kvm_vcpu *vcpu, struct sys_reg_params *p, mutex_lock(&vcpu->kvm->lock); - if ((__vcpu_sys_reg(vcpu, HCR_EL2) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) { - u64 virtual_vttbr = vcpu_read_sys_reg(vcpu, VTTBR_EL2); - struct kvm_s2_mmu *mmu; - - mmu = lookup_s2_mmu(vcpu->kvm, virtual_vttbr, HCR_VM); - if (mmu) - __kvm_tlb_el1_instr(mmu, p->regval, sys_encoding); + mmu = lookup_s2_mmu(vcpu->kvm, virtual_vttbr, HCR_VM); + if (mmu) + __kvm_tlb_el1_instr(mmu, p->regval, sys_encoding); - mmu = lookup_s2_mmu(vcpu->kvm, virtual_vttbr, 0); - if (mmu) - __kvm_tlb_el1_instr(mmu, p->regval, sys_encoding); - } else { - /* - * ARMv8.4-NV allows the guest to change TGE behind - * our back, so we always trap EL1 TLBIs from vEL2... - */ - __kvm_tlb_el1_instr(&vcpu->kvm->arch.mmu, p->regval, sys_encoding); - } + mmu = lookup_s2_mmu(vcpu->kvm, virtual_vttbr, 0); + if (mmu) + __kvm_tlb_el1_instr(mmu, p->regval, sys_encoding); mutex_unlock(&vcpu->kvm->lock);