From patchwork Fri Dec 4 18:19:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 11952075 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-22.0 required=3.0 tests=BAYES_00,INCLUDES_PATCH, INCLUDES_PULL_REQUEST,MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING,SPF_HELO_NONE, SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB7A0C433FE for ; Fri, 4 Dec 2020 18:20:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9F78322C9C for ; Fri, 4 Dec 2020 18:20:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730105AbgLDSUl (ORCPT ); Fri, 4 Dec 2020 13:20:41 -0500 Received: from mail.kernel.org ([198.145.29.99]:45412 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729472AbgLDSUl (ORCPT ); Fri, 4 Dec 2020 13:20:41 -0500 Received: from disco-boy.misterjones.org (disco-boy.misterjones.org [51.254.78.96]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id B9FF322B4E; Fri, 4 Dec 2020 18:20:00 +0000 (UTC) Received: from 78.163-31-62.static.virginmediabusiness.co.uk ([62.31.163.78] helo=why.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94) (envelope-from ) id 1klFgU-00G3EZ-ID; Fri, 04 Dec 2020 18:19:58 +0000 From: Marc Zyngier To: Paolo Bonzini Cc: Keqian Zhu , Will Deacon , Yanan Wang , kernel-team@android.com, James Morse , Julien Thierry , Suzuki K Poulose , kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [GIT PULL] KVM/arm64 fixes for 5.10, take #5 Date: Fri, 4 Dec 2020 18:19:11 +0000 Message-Id: <20201204181914.783445-1-maz@kernel.org> X-Mailer: git-send-email 2.28.0 MIME-Version: 1.0 X-SA-Exim-Connect-IP: 62.31.163.78 X-SA-Exim-Rcpt-To: pbonzini@redhat.com, zhukeqian1@huawei.com, will@kernel.org, wangyanan55@huawei.com, kernel-team@android.com, james.morse@arm.com, julien.thierry.kdev@gmail.com, suzuki.poulose@arm.com, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Hi Paolo, A week ago, I was hoping being done with the 5.10 fixes. I should obviously know better. Thanks to Yanan's excellent work, we have another set of page table fixes, all plugging issues introduced with our new page table code. The problems range from memory leak to TLB conflicts, all of which are serious enough to be squashed right away. Are we done yet? Fingers crossed. Please pull, M. The following changes since commit 23bde34771f1ea92fb5e6682c0d8c04304d34b3b: KVM: arm64: vgic-v3: Drop the reporting of GICR_TYPER.Last for userspace (2020-11-17 18:51:09 +0000) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvmarm-fixes-5.10-5 for you to fetch changes up to 7d894834a305568a0168c55d4729216f5f8cb4e6: KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort() (2020-12-02 09:53:29 +0000) ---------------------------------------------------------------- kvm/arm64 fixes for 5.10, take #5 - Don't leak page tables on PTE update - Correctly invalidate TLBs on table to block transition - Only update permissions if the fault level matches the expected mapping size ---------------------------------------------------------------- Yanan Wang (3): KVM: arm64: Fix memory leak on stage2 update of a valid PTE KVM: arm64: Fix handling of merging tables into a block entry KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort() arch/arm64/include/asm/esr.h | 1 + arch/arm64/include/asm/kvm_emulate.h | 5 +++++ arch/arm64/kvm/hyp/pgtable.c | 17 ++++++++++++++++- arch/arm64/kvm/mmu.c | 11 +++++++++-- 4 files changed, 31 insertions(+), 3 deletions(-) From patchwork Fri Dec 4 18:19:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 11952077 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 26532C4361A for ; Fri, 4 Dec 2020 18:20:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E663622C9C for ; Fri, 4 Dec 2020 18:20:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730111AbgLDSUm (ORCPT ); Fri, 4 Dec 2020 13:20:42 -0500 Received: from mail.kernel.org ([198.145.29.99]:45458 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730095AbgLDSUl (ORCPT ); Fri, 4 Dec 2020 13:20:41 -0500 Received: from disco-boy.misterjones.org (disco-boy.misterjones.org [51.254.78.96]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 765C122C9F; Fri, 4 Dec 2020 18:20:01 +0000 (UTC) Received: from 78.163-31-62.static.virginmediabusiness.co.uk ([62.31.163.78] helo=why.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94) (envelope-from ) id 1klFgV-00G3EZ-Mo; Fri, 04 Dec 2020 18:19:59 +0000 From: Marc Zyngier To: Paolo Bonzini Cc: Keqian Zhu , Will Deacon , Yanan Wang , kernel-team@android.com, James Morse , Julien Thierry , Suzuki K Poulose , kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH 2/3] KVM: arm64: Fix handling of merging tables into a block entry Date: Fri, 4 Dec 2020 18:19:13 +0000 Message-Id: <20201204181914.783445-3-maz@kernel.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201204181914.783445-1-maz@kernel.org> References: <20201204181914.783445-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 62.31.163.78 X-SA-Exim-Rcpt-To: pbonzini@redhat.com, zhukeqian1@huawei.com, will@kernel.org, wangyanan55@huawei.com, kernel-team@android.com, james.morse@arm.com, julien.thierry.kdev@gmail.com, suzuki.poulose@arm.com, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Yanan Wang When dirty logging is enabled, we collapse block entries into tables as necessary. If dirty logging gets canceled, we can end-up merging tables back into block entries. When this happens, we must not only free the non-huge page-table pages but also invalidate all the TLB entries that can potentially cover the block. Otherwise, we end-up with multiple possible translations for the same physical page, which can legitimately result in a TLB conflict. To address this, replease the bogus invalidation by IPA with a full VM invalidation. Although this is pretty heavy handed, it happens very infrequently and saves a bunch of invalidations by IPA. Signed-off-by: Yanan Wang [maz: fixup commit message] Signed-off-by: Marc Zyngier Acked-by: Will Deacon Link: https://lore.kernel.org/r/20201201201034.116760-3-wangyanan55@huawei.com --- arch/arm64/kvm/hyp/pgtable.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 2beba1dc40ec..bdf8e55ed308 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -502,7 +502,13 @@ static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level, return 0; kvm_set_invalid_pte(ptep); - kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, 0); + + /* + * Invalidate the whole stage-2, as we may have numerous leaf + * entries below us which would otherwise need invalidating + * individually. + */ + kvm_call_hyp(__kvm_tlb_flush_vmid, data->mmu); data->anchor = ptep; return 0; } From patchwork Fri Dec 4 18:19:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Zyngier X-Patchwork-Id: 11952079 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.0 required=3.0 tests=BAYES_00,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E8923C4167B for ; Fri, 4 Dec 2020 18:20:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B357E22B4E for ; Fri, 4 Dec 2020 18:20:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730119AbgLDSUn (ORCPT ); Fri, 4 Dec 2020 13:20:43 -0500 Received: from mail.kernel.org ([198.145.29.99]:45490 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730112AbgLDSUn (ORCPT ); Fri, 4 Dec 2020 13:20:43 -0500 Received: from disco-boy.misterjones.org (disco-boy.misterjones.org [51.254.78.96]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 0ABF222CA1; Fri, 4 Dec 2020 18:20:02 +0000 (UTC) Received: from 78.163-31-62.static.virginmediabusiness.co.uk ([62.31.163.78] helo=why.lan) by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94) (envelope-from ) id 1klFgW-00G3EZ-9k; Fri, 04 Dec 2020 18:20:00 +0000 From: Marc Zyngier To: Paolo Bonzini Cc: Keqian Zhu , Will Deacon , Yanan Wang , kernel-team@android.com, James Morse , Julien Thierry , Suzuki K Poulose , kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH 3/3] KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort() Date: Fri, 4 Dec 2020 18:19:14 +0000 Message-Id: <20201204181914.783445-4-maz@kernel.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201204181914.783445-1-maz@kernel.org> References: <20201204181914.783445-1-maz@kernel.org> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 62.31.163.78 X-SA-Exim-Rcpt-To: pbonzini@redhat.com, zhukeqian1@huawei.com, will@kernel.org, wangyanan55@huawei.com, kernel-team@android.com, james.morse@arm.com, julien.thierry.kdev@gmail.com, suzuki.poulose@arm.com, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org X-SA-Exim-Mail-From: maz@kernel.org X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Yanan Wang If we get a FSC_PERM fault, just using (logging_active && writable) to determine calling kvm_pgtable_stage2_map(). There will be two more cases we should consider. (1) After logging_active is configged back to false from true. When we get a FSC_PERM fault with write_fault and adjustment of hugepage is needed, we should merge tables back to a block entry. This case is ignored by still calling kvm_pgtable_stage2_relax_perms(), which will lead to an endless loop and guest panic due to soft lockup. (2) We use (FSC_PERM && logging_active && writable) to determine collapsing a block entry into a table by calling kvm_pgtable_stage2_map(). But sometimes we may only need to relax permissions when trying to write to a page other than a block. In this condition,using kvm_pgtable_stage2_relax_perms() will be fine. The ISS filed bit[1:0] in ESR_EL2 regesiter indicates the stage2 lookup level at which a D-abort or I-abort occurred. By comparing granule of the fault lookup level with vma_pagesize, we can strictly distinguish conditions of calling kvm_pgtable_stage2_relax_perms() or kvm_pgtable_stage2_map(), and the above two cases will be well considered. Suggested-by: Keqian Zhu Signed-off-by: Yanan Wang Signed-off-by: Marc Zyngier Acked-by: Will Deacon Link: https://lore.kernel.org/r/20201201201034.116760-4-wangyanan55@huawei.com --- arch/arm64/include/asm/esr.h | 1 + arch/arm64/include/asm/kvm_emulate.h | 5 +++++ arch/arm64/kvm/mmu.c | 11 +++++++++-- 3 files changed, 15 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h index 22c81f1edda2..85a3e49f92f4 100644 --- a/arch/arm64/include/asm/esr.h +++ b/arch/arm64/include/asm/esr.h @@ -104,6 +104,7 @@ /* Shared ISS fault status code(IFSC/DFSC) for Data/Instruction aborts */ #define ESR_ELx_FSC (0x3F) #define ESR_ELx_FSC_TYPE (0x3C) +#define ESR_ELx_FSC_LEVEL (0x03) #define ESR_ELx_FSC_EXTABT (0x10) #define ESR_ELx_FSC_SERROR (0x11) #define ESR_ELx_FSC_ACCESS (0x08) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 5ef2669ccd6c..00bc6f1234ba 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -350,6 +350,11 @@ static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vc return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_TYPE; } +static __always_inline u8 kvm_vcpu_trap_get_fault_level(const struct kvm_vcpu *vcpu) +{ + return kvm_vcpu_get_esr(vcpu) & ESR_ELx_FSC_LEVEL; +} + static __always_inline bool kvm_vcpu_abt_issea(const struct kvm_vcpu *vcpu) { switch (kvm_vcpu_trap_get_fault(vcpu)) { diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 1a01da9fdc99..75814a02d189 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -754,10 +754,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, gfn_t gfn; kvm_pfn_t pfn; bool logging_active = memslot_is_logging(memslot); - unsigned long vma_pagesize; + unsigned long fault_level = kvm_vcpu_trap_get_fault_level(vcpu); + unsigned long vma_pagesize, fault_granule; enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R; struct kvm_pgtable *pgt; + fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level); write_fault = kvm_is_write_fault(vcpu); exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu); VM_BUG_ON(write_fault && exec_fault); @@ -896,7 +898,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC)) prot |= KVM_PGTABLE_PROT_X; - if (fault_status == FSC_PERM && !(logging_active && writable)) { + /* + * Under the premise of getting a FSC_PERM fault, we just need to relax + * permissions only if vma_pagesize equals fault_granule. Otherwise, + * kvm_pgtable_stage2_map() should be called to change block size. + */ + if (fault_status == FSC_PERM && vma_pagesize == fault_granule) { ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot); } else { ret = kvm_pgtable_stage2_map(pgt, fault_ipa, vma_pagesize,