From patchwork Tue Sep 12 14:16:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 13381807 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 73B93CA0EED for ; Tue, 12 Sep 2023 14:21:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:Cc:To:From:Subject:Message-ID: References:Mime-Version:In-Reply-To:Date:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=IHUUiv38Jn/nmMyAT5VfE7ebxeS+XQ5H347m/9itFao=; b=E0W8CAeKlRNnnK86MoiBGWc5Ww 6fBrFE0EKEQdgBREiDAwMeUoLwRumuo20dKrm2IMuryUZIS2UPXUNzk9E5MsQ85UDG1ZVzlvB7mvm 8cAE3k9klGfslOtpAo/hKTb/PzVQaHEjxNas/z61ykFzBdHExCD9ZjWFIgVZtmolEo+fdzwnWvI3l /jvRrW5qisVjWcWuaBwm6doJ5FpdNahvv0DRoxC8YSq20Ho6rMQIExHlxnfH0+BnwdvrMg2OB2INh S+ZevzBSAQ7SLIc13FvuUVoAGhEcCjID6ZIWdOnlN77ycMhNDkAlVc+L2EtYkGdOUfcJKLjLVFdTW tko+/Hhg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qg4Fi-003YJU-0p; Tue, 12 Sep 2023 14:20:30 +0000 Received: from mail-wm1-x349.google.com ([2a00:1450:4864:20::349]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1qg4ET-003XHC-22 for linux-arm-kernel@lists.infradead.org; Tue, 12 Sep 2023 14:19:23 +0000 Received: by mail-wm1-x349.google.com with SMTP id 5b1f17b1804b1-402c46c4a04so43039985e9.2 for ; Tue, 12 Sep 2023 07:19:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1694528351; x=1695133151; darn=lists.infradead.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=3bHaxkXzhxvGKFXmJcqS1wtte4sDGQRztFvukt6wPec=; b=FIDhwLUP/yfdIbSjvWCUgTkHf1HLRJlPHSe2lHj84sdcC/knmKw3geuqd4XSxx/wTe KawRtzQbTxedKZBCFXYbj6lEktmF8r54lZNvWYD1849ZV+ys0s5RvVccqfU7yIu6Wknp R7zTC4kwC8F81UnJMC9AmtZE+JZc0djg0/2YIUN10Fw2kQRCy+Pm4F4BH/JuhJOE1zbx j58KFEmUbzvjNkS7DHxESmfy7Rfch1FmyqZw7mpQ+/LJF/qyCSk+dsVYoljzdAx+Vb5m cKtOZt7jCB1Yr/9zzSLHL9jGHvOxv2kmcPpXuKROc9E9w/tIBMmN65aLujoTyCX+IMfJ 6dBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694528351; x=1695133151; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=3bHaxkXzhxvGKFXmJcqS1wtte4sDGQRztFvukt6wPec=; b=rBoD6UOSCC2NZbaYFV+cnyA4GxAIka3i9hcUF2VGBisreiMdEhtLEg7j7mcHWt4XO3 R21U12cRd6hhvE1/HXdrDVGs+WprZQxQ6wFtrXP2jAcS/aKUeMOWRLMT7y2XDJ09aA0b Xh8OO3h0uaq0WXOqefibYKT0i1PB2FOoEAa5KQAh+2mWS/ikIS6UuzIi9BO8SX+pIySZ OYrCCrY10RkhQOgtG0c5xp+oK8DRvruodk/qR47wPs+7yuKb9CCOjKi8qv+62tLqX6sg uikrvLFU6vb90oXKbsxzFcdteAhbboZ4E1pvEL6u/55Jr8MDtlgHy85dng5hJYN4zjH1 YmrA== X-Gm-Message-State: AOJu0Yyl8CfmXvdcsbWhr839pkNLrPjsPLZ7QjNO7/PWU0V3ji8FXcmK 18rBzut6KHZ+JxOir208P+8957ChXdODU/SOztUOvo3ZVH3uB5LZ+13nSIWd0aPg3e2J785lxSY BGTEIn6Xtc/ZTSJiL35xLVUkT7Rwnu1Mq5DMiaXGua1PbqCPLClzkFoCKey3rWHIl9gw/13fXNg c= X-Google-Smtp-Source: AGHT+IGyCmJND6uGBkZVwgfA2p7zjdcoTsxMlEZq02sUqAmPpEaHHqXlsbTyPBKGWO1y3Ntibvl/KIZT X-Received: from palermo.c.googlers.com ([fda3:e722:ac3:cc00:28:9cb1:c0a8:118a]) (user=ardb job=sendgmr) by 2002:a1c:7914:0:b0:402:eac3:7a5e with SMTP id l20-20020a1c7914000000b00402eac37a5emr216564wme.3.1694528351550; Tue, 12 Sep 2023 07:19:11 -0700 (PDT) Date: Tue, 12 Sep 2023 14:16:38 +0000 In-Reply-To: <20230912141549.278777-63-ardb@google.com> Mime-Version: 1.0 References: <20230912141549.278777-63-ardb@google.com> X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 X-Developer-Signature: v=1; a=openpgp-sha256; l=10782; i=ardb@kernel.org; h=from:subject; bh=uGTAqqb+ogGidSNUTuAXAW/9slzvBPb0KbBnx4xxiPQ=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JIZWh6ISpBkvD0yccejOmrk/bd7Qx8bLob9v6socH2vefv xS2b2FpRykLgxgHg6yYIovA7L/vdp6eKFXrPEsWZg4rE8gQBi5OAZjITx1Ghq/prYczKqbfea/Y klCpeZ5lxyeWVU5J3NLr2efN+aj9jYnhN1vzmj+p24rm33XJyis3vRgw4bqaFnOsnWOVWJbZn2f lLAA= X-Mailer: git-send-email 2.42.0.283.g2d96d420d3-goog Message-ID: <20230912141549.278777-111-ardb@google.com> Subject: [PATCH v4 48/61] arm64: mm: Add definitions to support 5 levels of paging From: Ard Biesheuvel To: linux-arm-kernel@lists.infradead.org Cc: Ard Biesheuvel , Catalin Marinas , Will Deacon , Marc Zyngier , Mark Rutland , Ryan Roberts , Anshuman Khandual , Kees Cook , Joey Gouly X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230912_071913_731986_CE24DB1E X-CRM114-Status: GOOD ( 24.74 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org From: Ard Biesheuvel Add the required types and descriptor accessors to support 5 levels of paging in the common code. This is one of the prerequisites for supporting 52-bit virtual addressing with 4k pages. Note that this does not cover the code that handles kernel mappings or the fixmap. Signed-off-by: Ard Biesheuvel --- arch/arm64/include/asm/pgalloc.h | 41 ++++++++++ arch/arm64/include/asm/pgtable-hwdef.h | 22 +++++- arch/arm64/include/asm/pgtable-types.h | 6 ++ arch/arm64/include/asm/pgtable.h | 82 +++++++++++++++++++- arch/arm64/mm/mmu.c | 31 +++++++- arch/arm64/mm/pgd.c | 15 +++- 6 files changed, 188 insertions(+), 9 deletions(-) diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h index 237224484d0f..cae8c648f462 100644 --- a/arch/arm64/include/asm/pgalloc.h +++ b/arch/arm64/include/asm/pgalloc.h @@ -60,6 +60,47 @@ static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot) } #endif /* CONFIG_PGTABLE_LEVELS > 3 */ +#if CONFIG_PGTABLE_LEVELS > 4 + +static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot) +{ + if (pgtable_l5_enabled()) + set_pgd(pgdp, __pgd(__phys_to_pgd_val(p4dp) | prot)); +} + +static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp) +{ + pgdval_t pgdval = PGD_TYPE_TABLE; + + pgdval |= (mm == &init_mm) ? PGD_TABLE_UXN : PGD_TABLE_PXN; + __pgd_populate(pgdp, __pa(p4dp), pgdval); +} + +static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr) +{ + gfp_t gfp = GFP_PGTABLE_USER; + + if (mm == &init_mm) + gfp = GFP_PGTABLE_KERNEL; + return (p4d_t *)get_zeroed_page(gfp); +} + +static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d) +{ + if (!pgtable_l5_enabled()) + return; + BUG_ON((unsigned long)p4d & (PAGE_SIZE-1)); + free_page((unsigned long)p4d); +} + +#define __p4d_free_tlb(tlb, p4d, addr) p4d_free((tlb)->mm, p4d) +#else +static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot) +{ + BUILD_BUG(); +} +#endif /* CONFIG_PGTABLE_LEVELS > 4 */ + extern pgd_t *pgd_alloc(struct mm_struct *mm); extern void pgd_free(struct mm_struct *mm, pgd_t *pgdp); diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h index 4426f48f2ae0..ef207a0d4f0d 100644 --- a/arch/arm64/include/asm/pgtable-hwdef.h +++ b/arch/arm64/include/asm/pgtable-hwdef.h @@ -26,10 +26,10 @@ #define ARM64_HW_PGTABLE_LEVELS(va_bits) (((va_bits) - 4) / (PAGE_SHIFT - 3)) /* - * Size mapped by an entry at level n ( 0 <= n <= 3) + * Size mapped by an entry at level n ( -1 <= n <= 3) * We map (PAGE_SHIFT - 3) at all translation levels and PAGE_SHIFT bits * in the final page. The maximum number of translation levels supported by - * the architecture is 4. Hence, starting at level n, we have further + * the architecture is 5. Hence, starting at level n, we have further * ((4 - n) - 1) levels of translation excluding the offset within the page. * So, the total number of bits mapped by an entry at level n is : * @@ -62,9 +62,16 @@ #define PTRS_PER_PUD (1 << (PAGE_SHIFT - 3)) #endif +#if CONFIG_PGTABLE_LEVELS > 4 +#define P4D_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(0) +#define P4D_SIZE (_AC(1, UL) << P4D_SHIFT) +#define P4D_MASK (~(P4D_SIZE-1)) +#define PTRS_PER_P4D (1 << (PAGE_SHIFT - 3)) +#endif + /* * PGDIR_SHIFT determines the size a top-level page table entry can map - * (depending on the configuration, this level can be 0, 1 or 2). + * (depending on the configuration, this level can be -1, 0, 1 or 2). */ #define PGDIR_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(4 - CONFIG_PGTABLE_LEVELS) #define PGDIR_SIZE (_AC(1, UL) << PGDIR_SHIFT) @@ -87,6 +94,15 @@ /* * Hardware page table definitions. * + * Level -1 descriptor (PGD). + */ +#define PGD_TYPE_TABLE (_AT(pgdval_t, 3) << 0) +#define PGD_TABLE_BIT (_AT(pgdval_t, 1) << 1) +#define PGD_TYPE_MASK (_AT(pgdval_t, 3) << 0) +#define PGD_TABLE_PXN (_AT(pgdval_t, 1) << 59) +#define PGD_TABLE_UXN (_AT(pgdval_t, 1) << 60) + +/* * Level 0 descriptor (P4D). */ #define P4D_TYPE_TABLE (_AT(p4dval_t, 3) << 0) diff --git a/arch/arm64/include/asm/pgtable-types.h b/arch/arm64/include/asm/pgtable-types.h index b8f158ae2527..6d6d4065b0cb 100644 --- a/arch/arm64/include/asm/pgtable-types.h +++ b/arch/arm64/include/asm/pgtable-types.h @@ -36,6 +36,12 @@ typedef struct { pudval_t pud; } pud_t; #define __pud(x) ((pud_t) { (x) } ) #endif +#if CONFIG_PGTABLE_LEVELS > 4 +typedef struct { p4dval_t p4d; } p4d_t; +#define p4d_val(x) ((x).p4d) +#define __p4d(x) ((p4d_t) { (x) } ) +#endif + typedef struct { pgdval_t pgd; } pgd_t; #define pgd_val(x) ((x).pgd) #define __pgd(x) ((pgd_t) { (x) } ) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 2d1294b532f4..769abda8a4f9 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -800,7 +800,6 @@ static inline pud_t *p4d_pgtable(p4d_t p4d) #else #define p4d_page_paddr(p4d) ({ BUILD_BUG(); 0;}) -#define pgd_page_paddr(pgd) ({ BUILD_BUG(); 0;}) /* Match pud_offset folding in */ #define pud_set_fixmap(addr) NULL @@ -811,6 +810,87 @@ static inline pud_t *p4d_pgtable(p4d_t p4d) #endif /* CONFIG_PGTABLE_LEVELS > 3 */ +#if CONFIG_PGTABLE_LEVELS > 4 + +static __always_inline bool pgtable_l5_enabled(void) +{ + if (!alternative_has_cap_likely(ARM64_ALWAYS_BOOT)) + return vabits_actual == VA_BITS; + return alternative_has_cap_unlikely(ARM64_HAS_VA52); +} + +static inline bool mm_p4d_folded(const struct mm_struct *mm) +{ + return !pgtable_l5_enabled(); +} +#define mm_p4d_folded mm_p4d_folded + +#define p4d_ERROR(e) \ + pr_err("%s:%d: bad p4d %016llx.\n", __FILE__, __LINE__, p4d_val(e)) + +#define pgd_none(pgd) (pgtable_l5_enabled() && !pgd_val(pgd)) +#define pgd_bad(pgd) (pgtable_l5_enabled() && !(pgd_val(pgd) & 2)) +#define pgd_present(pgd) (!pgd_none(pgd)) + +static inline void set_pgd(pgd_t *pgdp, pgd_t pgd) +{ + if (in_swapper_pgdir(pgdp)) { + set_swapper_pgd(pgdp, __pgd(pgd_val(pgd))); + return; + } + + WRITE_ONCE(*pgdp, pgd); + dsb(ishst); + isb(); +} + +static inline void pgd_clear(pgd_t *pgdp) +{ + if (pgtable_l5_enabled()) + set_pgd(pgdp, __pgd(0)); +} + +static inline phys_addr_t pgd_page_paddr(pgd_t pgd) +{ + return __pgd_to_phys(pgd); +} + +#define p4d_index(addr) (((addr) >> P4D_SHIFT) & (PTRS_PER_P4D - 1)) + +static inline p4d_t *pgd_to_folded_p4d(pgd_t *pgdp, unsigned long addr) +{ + return (p4d_t *)PTR_ALIGN_DOWN(pgdp, PAGE_SIZE) + p4d_index(addr); +} + +static inline phys_addr_t p4d_offset_phys(pgd_t *pgdp, unsigned long addr) +{ + BUG_ON(!pgtable_l5_enabled()); + + return pgd_page_paddr(READ_ONCE(*pgdp)) + p4d_index(addr) * sizeof(p4d_t); +} + +static inline +p4d_t *p4d_offset_lockless(pgd_t *pgdp, pgd_t pgd, unsigned long addr) +{ + if (!pgtable_l5_enabled()) + return pgd_to_folded_p4d(pgdp, addr); + return (p4d_t *)__va(pgd_page_paddr(pgd)) + p4d_index(addr); +} +#define p4d_offset_lockless p4d_offset_lockless + +static inline p4d_t *p4d_offset(pgd_t *pgdp, unsigned long addr) +{ + return p4d_offset_lockless(pgdp, READ_ONCE(*pgdp), addr); +} + +#define pgd_page(pgd) pfn_to_page(__phys_to_pfn(__pgd_to_phys(pgd))) + +#else + +static inline bool pgtable_l5_enabled(void) { return false; } + +#endif /* CONFIG_PGTABLE_LEVELS > 4 */ + #define pgd_ERROR(e) \ pr_err("%s:%d: bad pgd %016llx.\n", __FILE__, __LINE__, pgd_val(e)) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 50972763a67e..2212289c9a02 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1022,7 +1022,7 @@ static void free_empty_pud_table(p4d_t *p4dp, unsigned long addr, if (CONFIG_PGTABLE_LEVELS <= 3) return; - if (!pgtable_range_aligned(start, end, floor, ceiling, PGDIR_MASK)) + if (!pgtable_range_aligned(start, end, floor, ceiling, P4D_MASK)) return; /* @@ -1045,8 +1045,8 @@ static void free_empty_p4d_table(pgd_t *pgdp, unsigned long addr, unsigned long end, unsigned long floor, unsigned long ceiling) { - unsigned long next; p4d_t *p4dp, p4d; + unsigned long i, next, start = addr; do { next = p4d_addr_end(addr, end); @@ -1058,6 +1058,27 @@ static void free_empty_p4d_table(pgd_t *pgdp, unsigned long addr, WARN_ON(!p4d_present(p4d)); free_empty_pud_table(p4dp, addr, next, floor, ceiling); } while (addr = next, addr < end); + + if (!pgtable_l5_enabled()) + return; + + if (!pgtable_range_aligned(start, end, floor, ceiling, PGDIR_MASK)) + return; + + /* + * Check whether we can free the p4d page if the rest of the + * entries are empty. Overlap with other regions have been + * handled by the floor/ceiling check. + */ + p4dp = p4d_offset(pgdp, 0UL); + for (i = 0; i < PTRS_PER_P4D; i++) { + if (!p4d_none(READ_ONCE(p4dp[i]))) + return; + } + + pgd_clear(pgdp); + __flush_tlb_kernel_pgtable(start); + free_hotplug_pgtable_page(virt_to_page(p4dp)); } static void free_empty_tables(unsigned long addr, unsigned long end, @@ -1142,6 +1163,12 @@ int pmd_set_huge(pmd_t *pmdp, phys_addr_t phys, pgprot_t prot) return 1; } +#ifndef __PAGETABLE_P4D_FOLDED +void p4d_clear_huge(p4d_t *p4dp) +{ +} +#endif + int pud_clear_huge(pud_t *pudp) { if (!pud_sect(READ_ONCE(*pudp))) diff --git a/arch/arm64/mm/pgd.c b/arch/arm64/mm/pgd.c index 4a64089e5771..3c4f8a279d2b 100644 --- a/arch/arm64/mm/pgd.c +++ b/arch/arm64/mm/pgd.c @@ -17,11 +17,20 @@ static struct kmem_cache *pgd_cache __ro_after_init; +static bool pgdir_is_page_size(void) +{ + if (PGD_SIZE == PAGE_SIZE) + return true; + if (CONFIG_PGTABLE_LEVELS == 5) + return !pgtable_l5_enabled(); + return false; +} + pgd_t *pgd_alloc(struct mm_struct *mm) { gfp_t gfp = GFP_PGTABLE_USER; - if (PGD_SIZE == PAGE_SIZE) + if (pgdir_is_page_size()) return (pgd_t *)__get_free_page(gfp); else return kmem_cache_alloc(pgd_cache, gfp); @@ -29,7 +38,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm) void pgd_free(struct mm_struct *mm, pgd_t *pgd) { - if (PGD_SIZE == PAGE_SIZE) + if (pgdir_is_page_size()) free_page((unsigned long)pgd); else kmem_cache_free(pgd_cache, pgd); @@ -37,7 +46,7 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd) void __init pgtable_cache_init(void) { - if (PGD_SIZE == PAGE_SIZE) + if (pgdir_is_page_size()) return; #ifdef CONFIG_ARM64_PA_BITS_52