From patchwork Fri Oct 27 11:56:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13438566
From: Ryan Roberts
To: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
 Marc Zyngier, Oliver Upton, Mark Rutland, Anshuman Khandual, Kees Cook,
 Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev
Subject: [RFC PATCH v1 1/3] arm64/mm: Modify range-based tlbi to decrement scale
Date: Fri, 27 Oct 2023 12:56:32 +0100
Message-Id: <20231027115634.1432154-2-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20231027115634.1432154-1-ryan.roberts@arm.com>
References: <20231027115634.1432154-1-ryan.roberts@arm.com>

In preparation for adding support for LPA2 to the tlb invalidation
routines, modify the algorithm used by range-based tlbi to start at the
highest 'scale' and decrement instead of starting at the lowest 'scale'
and incrementing. This new approach makes it possible to maintain 64K
alignment as we work through the range, until the last op (at scale=0).
This is required when LPA2 is enabled. (This part will be added in a
subsequent commit).

This change is separated into its own patch because it will also impact
non-LPA2 systems, and I want to make it easy to bisect in case it leads
to performance regression (see below for benchmarks that suggest this
should not be a problem).
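As a rough illustration of the decrementing walk described above, the following user-space sketch models the arithmetic of a range operation ((num+1)*2^(5*scale+1) pages, with num taken from a 5-bit slice of the page count). The helpers range_pages(), range_num() and simulate_decrementing() are hypothetical names and not kernel code; the sketch assumes 4K pages, a 64K-aligned start and fewer than 2^21 pages.

```c
#include <assert.h>

/* Pages covered by one range operation: (num + 1) * 2^(5 * scale + 1). */
static unsigned long range_pages(int num, int scale)
{
	return (unsigned long)(num + 1) << (5 * scale + 1);
}

/* Candidate 'num' for 'pages' at 'scale'; -1 means no range op fits here. */
static int range_num(unsigned long pages, int scale)
{
	return (int)((pages >> (5 * scale + 1)) & 0x1f) - 1;
}

/*
 * Hypothetical model of the decrementing loop. 'pages' is the number of
 * 4K pages to invalidate, starting at a 64K-aligned address (assumption).
 * Returns the total number of TLBI operations that would be issued.
 */
static int simulate_decrementing(unsigned long pages)
{
	unsigned long page = 0;	/* page offset from the aligned start */
	int scale = 3;
	int ops = 0;

	assert(pages < (1UL << 21));	/* larger ranges are out of scope here */

	while (pages > 0) {
		int num;

		if (pages == 1) {
			/* Last odd page: non-range operation. */
			page++;
			pages--;
			ops++;
			continue;
		}

		num = range_num(pages, scale);
		if (num >= 0) {
			/* 16 pages of 4K make 64K: every range op starts aligned. */
			assert(page % 16 == 0);
			page += range_pages(num, scale);
			pages -= range_pages(num, scale);
			ops++;
		}
		scale--;
	}
	return ops;
}
```

For example, simulate_decrementing(67) models one scale-1 range operation (64 pages), one scale-0 range operation (2 pages) and one non-range operation for the final page, three operations in total, with every range operation starting on a 64K boundary.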
The original commit (d1d3aa98 "arm64: tlb: Use the TLBI RANGE feature in
arm64") stated this as the reason for _incrementing_ scale:

  However, in most scenarios, the pages = 1 when flush_tlb_range() is
  called. Start from scale = 3 or other proper value (such as scale
  =ilog2(pages)), will incur extra overhead. So increase 'scale' from 0
  to maximum.

But pages=1 is already special cased by the non-range invalidation path,
which will take care of it the first time through the loop (both in the
original commit and in my change), so I don't think switching to
decrement scale should have any extra performance impact after all.

Indeed benchmarking kernel compilation, a TLBI-heavy workload, suggests
that this new approach actually _improves_ performance slightly (using a
virtual machine on Apple M2):

Table shows time to execute kernel compilation workload with 8 jobs,
relative to baseline without this patch (more negative number is bigger
speedup). Repeated 9 times across 3 system reboots:

| counter   |       mean |     stdev |
|:----------|-----------:|----------:|
| real-time |      -0.6% |      0.0% |
| kern-time |      -1.6% |      0.5% |
| user-time |      -0.4% |      0.1% |

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/tlbflush.h | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

--
2.25.1

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index b149cf9f91bc..e8153c16fcdf 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -351,14 +351,14 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
  * entries one by one at the granularity of 'stride'. If the TLB
  * range ops are supported, then:
  *
- * 1. If 'pages' is odd, flush the first page through non-range
- *    operations;
+ * 1. The minimum range granularity is decided by 'scale', so multiple range
+ *    TLBI operations may be required. Start from scale = 3, flush the largest
+ *    possible number of pages ((num+1)*2^(5*scale+1)) that fit into the
+ *    requested range, then decrement scale and continue until one or zero pages
+ *    are left.
  *
- * 2. For remaining pages: the minimum range granularity is decided
- *    by 'scale', so multiple range TLBI operations may be required.
- *    Start from scale = 0, flush the corresponding number of pages
- *    ((num+1)*2^(5*scale+1) starting from 'addr'), then increase it
- *    until no pages left.
+ * 2. If there is 1 page remaining, flush it through non-range operations. Range
+ *    operations can only span an even number of pages.
  *
  * Note that certain ranges can be represented by either num = 31 and
  * scale or num = 0 and scale + 1. The loop below favours the latter
@@ -368,12 +368,12 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 				asid, tlb_level, tlbi_user) \
 do { \
 	int num = 0; \
-	int scale = 0; \
+	int scale = 3; \
 	unsigned long addr; \
 	\
 	while (pages > 0) { \
 		if (!system_supports_tlb_range() || \
-		    pages % 2 == 1) { \
+		    pages == 1) { \
 			addr = __TLBI_VADDR(start, asid); \
 			__tlbi_level(op, addr, tlb_level); \
 			if (tlbi_user) \
@@ -393,7 +393,7 @@ do { \
 			start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
 			pages -= __TLBI_RANGE_PAGES(num, scale); \
 		} \
-		scale++; \
+		scale--; \
 	} \
 } while (0)

From patchwork Fri Oct 27 11:56:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13438563
From: Ryan Roberts
To: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
 Marc Zyngier, Oliver Upton, Mark Rutland, Anshuman Khandual, Kees Cook,
 Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev
Subject: [RFC PATCH v1 2/3] arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs
Date: Fri, 27 Oct 2023 12:56:33 +0100
Message-Id: <20231027115634.1432154-3-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20231027115634.1432154-1-ryan.roberts@arm.com>
References: <20231027115634.1432154-1-ryan.roberts@arm.com>

Add stub functions which initially always return false. These provide
the hooks that we need to update the range-based TLBI routines, whose
operands are encoded differently depending on whether lpa2 is enabled or
not.

The kernel and kvm will enable the use of lpa2 asynchronously in future,
and part of that enablement will involve fleshing out their respective
hooks to advertise when they are using lpa2.

Since the kernel's decision to use lpa2 relies on more than just whether
the HW supports the feature, it can't just use the same static key as
kvm. This is another reason to use separate functions.
lpa2_is_enabled() is already implemented as part of Ard's kernel lpa2
series. Since kvm will make its decision solely based on HW support,
kvm_lpa2_is_enabled() will be defined as system_supports_lpa2() once kvm
starts using lpa2.

Signed-off-by: Ryan Roberts
---
 arch/arm64/include/asm/kvm_mmu.h      | 3 +++
 arch/arm64/include/asm/pgtable-prot.h | 2 ++
 2 files changed, 5 insertions(+)

--
2.25.1

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 96a80e8f6226..57d5c2866174 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -314,5 +314,8 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu)
 {
 	return container_of(mmu->arch, struct kvm, arch);
 }
+
+#define kvm_lpa2_is_enabled() false
+
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index eed814b00a38..b4b2b8623769 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -71,6 +71,8 @@ extern bool arm64_use_ng_mappings;
 #define PTE_MAYBE_NG (arm64_use_ng_mappings ? PTE_NG : 0)
 #define PMD_MAYBE_NG (arm64_use_ng_mappings ? PMD_SECT_NG : 0)

+#define lpa2_is_enabled() false
+
 /*
  * If we have userspace only BTI we don't want to mark kernel pages
  * guarded even if the system does support BTI.

From patchwork Fri Oct 27 11:56:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13438565
From: Ryan Roberts
To: Ard Biesheuvel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
 Marc Zyngier, Oliver Upton, Mark Rutland, Anshuman Khandual, Kees Cook,
 Joey Gouly, Suzuki K Poulose, James Morse, Zenghui Yu
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev
Subject: [RFC PATCH v1 3/3] arm64/mm: Update tlb invalidation routines for FEAT_LPA2
Date: Fri, 27 Oct 2023 12:56:34 +0100
Message-Id: <20231027115634.1432154-4-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20231027115634.1432154-1-ryan.roberts@arm.com>
References: <20231027115634.1432154-1-ryan.roberts@arm.com>

FEAT_LPA2 impacts tlb invalidation in 2 ways. Firstly, the TTL field in
the non-range tlbi instructions can now validly take a 0 value as a
level hint for the 4KB granule (this is due to the extra level of
translation) - previously TTL=0b0100 meant no hint and was treated as
0b0000. Secondly, the BADDR field of the range-based tlbi instructions
is specified in 64KB units when LPA2 is in use (TCR.DS=1), whereas it is
in page units otherwise.

Changes are required for tlbi to continue to operate correctly when LPA2
is in use.
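To make the BADDR difference concrete, here is a small sketch of how a range operand could be packed, following the field layout documented in the header comment touched by this patch (ASID [63:48], TG [47:46], SCALE [45:44], NUM [43:39], TTL [38:37], BADDR [36:0]). The function encode_range_operand(), its parameters and the SKETCH_* constants are illustrative names rather than kernel APIs, and a 4K page size is assumed for the non-LPA2 case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12	/* assumption: 4K pages */
#define SKETCH_BADDR_MASK	((UINT64_C(1) << 37) - 1)

/*
 * Pack a range-TLBI operand. With LPA2 the base address is expressed in
 * 64KB units (va >> 16); otherwise it is a page number (va >> PAGE_SHIFT).
 */
static uint64_t encode_range_operand(uint64_t va, unsigned int asid,
				     unsigned int tg, unsigned int scale,
				     unsigned int num, unsigned int ttl,
				     bool lpa2)
{
	unsigned int shift = lpa2 ? 16 : SKETCH_PAGE_SHIFT;
	uint64_t arg = (va >> shift) & SKETCH_BADDR_MASK;	/* BADDR */

	arg |= (uint64_t)(ttl & 0x3) << 37;		/* TTL   */
	arg |= (uint64_t)(num & 0x1f) << 39;		/* NUM   */
	arg |= (uint64_t)(scale & 0x3) << 44;		/* SCALE */
	arg |= (uint64_t)(tg & 0x3) << 46;		/* TG    */
	arg |= (uint64_t)(asid & 0xffff) << 48;		/* ASID  */
	return arg;
}
```

With va = 0x30000, the BADDR field holds 0x30 (a page number) without LPA2 and 0x3 (a count of 64KB units) with LPA2, which is why a start address that is not 64KB aligned cannot be represented in the LPA2 encoding.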
Solve the first problem by always adding the level hint if the level is
in the range [0, 3] (previously anything other than 0 was hinted, which
breaks in the new level -1 case from kvm). When running on non-LPA2 HW,
0 is still safe to hint as the HW will fall back to non-hinted. While we
are at it, we replace the notion of 0 being the non-hinted sentinel with
a macro, TLBI_TTL_UNKNOWN. This means callers won't need updating
if/when translation depth increases in future.

The second issue is more complex: When LPA2 is in use, use the non-range
tlbi instructions to forward align to a 64KB boundary first, then use
range-based tlbi from there on, until we have either invalidated all
pages or we have a single page remaining. If the latter, that is done
with non-range tlbi. We determine whether LPA2 is in use based on
lpa2_is_enabled() (for kernel calls) or kvm_lpa2_is_enabled() (for kvm
calls).

Signed-off-by: Ryan Roberts
Reviewed-by: Catalin Marinas
---
 arch/arm64/include/asm/tlb.h      | 15 ++++--
 arch/arm64/include/asm/tlbflush.h | 90 ++++++++++++++++++++-----------
 2 files changed, 68 insertions(+), 37 deletions(-)

--
2.25.1

diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index 2c29239d05c3..396ba9b4872c 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -22,15 +22,15 @@ static void tlb_flush(struct mmu_gather *tlb);
 #include

 /*
- * get the tlbi levels in arm64. Default value is 0 if more than one
- * of cleared_* is set or neither is set.
- * Arm64 doesn't support p4ds now.
+ * get the tlbi levels in arm64. Default value is TLBI_TTL_UNKNOWN if more than
+ * one of cleared_* is set or neither is set - this elides the level hinting to
+ * the hardware.
  */
 static inline int tlb_get_level(struct mmu_gather *tlb)
 {
 	/* The TTL field is only valid for the leaf entry. */
 	if (tlb->freed_tables)
-		return 0;
+		return TLBI_TTL_UNKNOWN;

 	if (tlb->cleared_ptes && !(tlb->cleared_pmds ||
 				   tlb->cleared_puds ||
@@ -47,7 +47,12 @@ static inline int tlb_get_level(struct mmu_gather *tlb)
 				   tlb->cleared_p4ds))
 		return 1;

-	return 0;
+	if (tlb->cleared_p4ds && !(tlb->cleared_ptes ||
+				   tlb->cleared_pmds ||
+				   tlb->cleared_puds))
+		return 0;
+
+	return TLBI_TTL_UNKNOWN;
 }

 static inline void tlb_flush(struct mmu_gather *tlb)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index e8153c16fcdf..cbc9ad4db32d 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -94,19 +94,22 @@ static inline unsigned long get_trans_granule(void)
  * When ARMv8.4-TTL exists, TLBI operations take an additional hint for
  * the level at which the invalidation must take place. If the level is
  * wrong, no invalidation may take place. In the case where the level
- * cannot be easily determined, a 0 value for the level parameter will
- * perform a non-hinted invalidation.
+ * cannot be easily determined, the value TLBI_TTL_UNKNOWN will perform
+ * a non-hinted invalidation. Any provided level outside the hint range
+ * will also cause fall-back to non-hinted invalidation.
  *
  * For Stage-2 invalidation, use the level values provided to that effect
  * in asm/stage2_pgtable.h.
  */
 #define TLBI_TTL_MASK GENMASK_ULL(47, 44)

+#define TLBI_TTL_UNKNOWN INT_MAX
+
 #define __tlbi_level(op, addr, level) do { \
 	u64 arg = addr; \
 	\
 	if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) && \
-	    level) { \
+	    level >= 0 && level <= 3) { \
 		u64 ttl = level & 3; \
 		ttl |= get_trans_granule() << 2; \
 		arg &= ~TLBI_TTL_MASK; \
@@ -122,28 +125,34 @@ static inline unsigned long get_trans_granule(void)
 } while (0)

 /*
- * This macro creates a properly formatted VA operand for the TLB RANGE.
- * The value bit assignments are:
+ * This macro creates a properly formatted VA operand for the TLB RANGE. The
+ * value bit assignments are:
  *
  * +----------+------+-------+-------+-------+----------------------+
  * |   ASID   |  TG  | SCALE |  NUM  |  TTL  |        BADDR         |
  * +-----------------+-------+-------+-------+----------------------+
  * |63      48|47  46|45   44|43   39|38   37|36                   0|
  *
- * The address range is determined by below formula:
- * [BADDR, BADDR + (NUM + 1) * 2^(5*SCALE + 1) * PAGESIZE)
+ * The address range is determined by below formula: [BADDR, BADDR + (NUM + 1) *
+ * 2^(5*SCALE + 1) * PAGESIZE)
+ *
+ * Note that the first argument, baddr, is pre-shifted; If LPA2 is in use, BADDR
+ * holds addr[52:16]. Else BADDR holds page number. See for example ARM DDI
+ * 0487J.a section C5.5.60 "TLBI VAE1IS, TLBI VAE1ISNXS, TLB Invalidate by VA,
+ * EL1, Inner Shareable".
  *
  */
-#define __TLBI_VADDR_RANGE(addr, asid, scale, num, ttl) \
-	({ \
-		unsigned long __ta = (addr) >> PAGE_SHIFT; \
-		__ta &= GENMASK_ULL(36, 0); \
-		__ta |= (unsigned long)(ttl) << 37; \
-		__ta |= (unsigned long)(num) << 39; \
-		__ta |= (unsigned long)(scale) << 44; \
-		__ta |= get_trans_granule() << 46; \
-		__ta |= (unsigned long)(asid) << 48; \
-		__ta; \
+#define __TLBI_VADDR_RANGE(baddr, asid, scale, num, ttl) \
+	({ \
+		unsigned long __ta = (baddr); \
+		unsigned long __ttl = (ttl >= 1 && ttl <= 3) ? ttl : 0; \
+		__ta &= GENMASK_ULL(36, 0); \
+		__ta |= __ttl << 37; \
+		__ta |= (unsigned long)(num) << 39; \
+		__ta |= (unsigned long)(scale) << 44; \
+		__ta |= get_trans_granule() << 46; \
+		__ta |= (unsigned long)(asid) << 48; \
+		__ta; \
 	})

 /* These macros are used by the TLBI RANGE feature. */
@@ -216,12 +225,16 @@ static inline unsigned long get_trans_granule(void)
  *		CPUs, ensuring that any walk-cache entries associated with the
  *		translation are also invalidated.
  *
- *	__flush_tlb_range(vma, start, end, stride, last_level)
+ *	__flush_tlb_range(vma, start, end, stride, last_level, tlb_level)
  *		Invalidate the virtual-address range '[start, end)' on all
  *		CPUs for the user address space corresponding to 'vma->mm'.
  *		The invalidation operations are issued at a granularity
  *		determined by 'stride' and only affect any walk-cache entries
- *		if 'last_level' is equal to false.
+ *		if 'last_level' is equal to false. tlb_level is the level at
+ *		which the invalidation must take place. If the level is wrong,
+ *		no invalidation may take place. In the case where the level
+ *		cannot be easily determined, the value TLBI_TTL_UNKNOWN will
+ *		perform a non-hinted invalidation.
  *
  *
  *	Finally, take a look at asm/tlb.h to see how tlb_flush() is implemented
@@ -346,34 +359,44 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
  * @tlb_level:	Translation Table level hint, if known
  * @tlbi_user:	If 'true', call an additional __tlbi_user()
  *              (typically for user ASIDs). 'flase' for IPA instructions
+ * @lpa2:	If 'true', the lpa2 scheme is used as set out below
  *
  * When the CPU does not support TLB range operations, flush the TLB
  * entries one by one at the granularity of 'stride'. If the TLB
  * range ops are supported, then:
  *
- * 1. The minimum range granularity is decided by 'scale', so multiple range
+ * 1. If FEAT_LPA2 is in use, the start address of a range operation must be
+ *    64KB aligned, so flush pages one by one until the alignment is reached
+ *    using the non-range operations. This step is skipped if LPA2 is not in
+ *    use.
+ *
+ * 2. The minimum range granularity is decided by 'scale', so multiple range
  *    TLBI operations may be required. Start from scale = 3, flush the largest
  *    possible number of pages ((num+1)*2^(5*scale+1)) that fit into the
  *    requested range, then decrement scale and continue until one or zero pages
- *    are left.
+ *    are left. We must start from highest scale to ensure 64KB start alignment
+ *    is maintained in the LPA2 case.
  *
- * 2. If there is 1 page remaining, flush it through non-range operations. Range
- *    operations can only span an even number of pages.
+ * 3. If there is 1 page remaining, flush it through non-range operations. Range
+ *    operations can only span an even number of pages. We save this for last to
+ *    ensure 64KB start alignment is maintained for the LPA2 case.
  *
  * Note that certain ranges can be represented by either num = 31 and
  * scale or num = 0 and scale + 1. The loop below favours the latter
  * since num is limited to 30 by the __TLBI_RANGE_NUM() macro.
  */
 #define __flush_tlb_range_op(op, start, pages, stride, \
-				asid, tlb_level, tlbi_user) \
+				asid, tlb_level, tlbi_user, lpa2) \
 do { \
 	int num = 0; \
 	int scale = 3; \
+	int shift = lpa2 ? 16 : PAGE_SHIFT; \
 	unsigned long addr; \
 	\
 	while (pages > 0) { \
 		if (!system_supports_tlb_range() || \
-		    pages == 1) { \
+		    pages == 1 || \
+		    (lpa2 && start != ALIGN(start, SZ_64K))) { \
 			addr = __TLBI_VADDR(start, asid); \
 			__tlbi_level(op, addr, tlb_level); \
 			if (tlbi_user) \
@@ -385,8 +408,8 @@ do { \
 		\
 		num = __TLBI_RANGE_NUM(pages, scale); \
 		if (num >= 0) { \
-			addr = __TLBI_VADDR_RANGE(start, asid, scale, \
-						  num, tlb_level); \
+			addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
+						  scale, num, tlb_level); \
 			__tlbi(r##op, addr); \
 			if (tlbi_user) \
 				__tlbi_user(r##op, addr); \
@@ -398,7 +421,7 @@ do { \
 } while (0)

 #define __flush_s2_tlb_range_op(op, start, pages, stride, tlb_level) \
-	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false)
+	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false, kvm_lpa2_is_enabled());

 static inline void __flush_tlb_range(struct vm_area_struct *vma,
 				     unsigned long start, unsigned long end,
@@ -428,9 +451,11 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 	asid = ASID(vma->vm_mm);

 	if (last_level)
-		__flush_tlb_range_op(vale1is, start, pages, stride, asid, tlb_level, true);
+		__flush_tlb_range_op(vale1is, start, pages, stride, asid,
+				     tlb_level, true, lpa2_is_enabled());
 	else
-		__flush_tlb_range_op(vae1is, start, pages, stride, asid, tlb_level, true);
+		__flush_tlb_range_op(vae1is, start, pages, stride, asid,
+				     tlb_level, true, lpa2_is_enabled());

 	dsb(ish);
 	mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, start, end);
@@ -442,9 +467,10 @@ static inline void flush_tlb_range(struct vm_area_struct *vma,
 	/*
 	 * We cannot use leaf-only invalidation here, since we may be invalidating
 	 * table entries as part of collapsing hugepages or moving page tables.
-	 * Set the tlb_level to 0 because we can not get enough information here.
+	 * Set the tlb_level to TLBI_TTL_UNKNOWN because we can not get enough
+	 * information here.
 	 */
-	__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
+	__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
 }

 static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
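
The three-step scheme from the __flush_tlb_range_op() comment (align forward with non-range operations, walk scale downwards with range operations, finish with at most one non-range operation) can be modelled in user space roughly as follows. This is only a sketch: simulate_flush(), struct flush_counts and the SKETCH_* constants are hypothetical names, 4K pages and a stride of one page are assumed, and the range is assumed to hold fewer than 2^21 pages.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT	12		/* assumption: 4K pages */
#define SKETCH_SZ_64K		0x10000UL

struct flush_counts {
	int single_ops;	/* non-range TLBI operations */
	int range_ops;	/* range TLBI operations */
};

static struct flush_counts simulate_flush(unsigned long start,
					  unsigned long pages, bool lpa2)
{
	struct flush_counts c = { 0, 0 };
	int scale = 3;

	assert(pages < (1UL << 21));	/* larger ranges are out of scope here */

	while (pages > 0) {
		int num;

		/* Steps 1 and 3: single page left, or LPA2 start not 64KB aligned. */
		if (pages == 1 || (lpa2 && (start & (SKETCH_SZ_64K - 1)))) {
			c.single_ops++;
			start += 1UL << SKETCH_PAGE_SHIFT;
			pages--;
			continue;
		}

		/* Step 2: largest range op that fits at this scale, if any. */
		num = (int)((pages >> (5 * scale + 1)) & 0x1f) - 1;
		if (num >= 0) {
			unsigned long n = (unsigned long)(num + 1) << (5 * scale + 1);

			if (lpa2)
				assert((start & (SKETCH_SZ_64K - 1)) == 0);
			c.range_ops++;
			start += n << SKETCH_PAGE_SHIFT;
			pages -= n;
		}
		scale--;
	}
	return c;
}
```

Under these assumptions, flushing 67 pages from address 0x3000 with lpa2 set takes 13 non-range operations to reach the 64KB boundary at 0x10000 and then a single scale-0 range operation for the remaining 54 pages. Without lpa2 the same request takes two range operations (64 pages, then 2 pages) and one non-range operation.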