From patchwork Mon Oct 14 10:59:01 2024
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13834722
From: Ryan Roberts
To: Andrew Morton, Anshuman Khandual, Ard Biesheuvel, Catalin Marinas,
 David Hildenbrand, Greg Marsden, Ivan Ivanov, Kalesh Singh, Marc Zyngier,
 Mark Rutland, Matthias Brugger, Miroslav Benes, Will Deacon
Cc: Ryan Roberts, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH v1 54/57] arm64: Support runtime folding in
 idmap_kpti_install_ng_mappings
Date: Mon, 14 Oct 2024 11:59:01 +0100
Message-ID: <20241014105912.3207374-54-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20241014105912.3207374-1-ryan.roberts@arm.com>
References: <20241014105514.3206191-1-ryan.roberts@arm.com>
 <20241014105912.3207374-1-ryan.roberts@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

TODO:

Signed-off-by: Ryan Roberts
---

***NOTE***
Any confused maintainers may want to read the cover note here for context:
https://lore.kernel.org/all/20241014105514.3206191-1-ryan.roberts@arm.com/

 arch/arm64/include/asm/assembler.h |   5 ++
 arch/arm64/kernel/cpufeature.c     |  21 +++++-
 arch/arm64/mm/proc.S               | 107 ++++++++++++++++++++++-------
 3 files changed, 108 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 6424fd6be1cbe..0cfa7c3efd214 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -919,6 +919,11 @@ alternative_cb_end
 	value_for_page_size \val, \val, SZ_4K, SZ_16K, SZ_64K
 	.endm
 
+	.macro	get_page_shift, val
+	get_tg0 \val
+	value_for_page_size \val, \val, ARM64_PAGE_SHIFT_4K, ARM64_PAGE_SHIFT_16K, ARM64_PAGE_SHIFT_64K
+	.endm
+
 	.macro	get_page_mask, val
 	get_tg0 \val
 	value_for_page_size \val, \val, (~(SZ_4K-1)), (~(SZ_16K-1)), (~(SZ_64K-1))
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 663cc76569a27..ee94de556d3f0 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1908,11 +1908,27 @@ static phys_addr_t __init kpti_ng_pgd_alloc(int shift)
 	return kpti_ng_temp_alloc;
 }
 
+struct install_ng_pgtable_geometry {
+	unsigned long ptrs_per_pte;
+	unsigned long ptrs_per_pmd;
+	unsigned long ptrs_per_pud;
+	unsigned long ptrs_per_p4d;
+	unsigned long ptrs_per_pgd;
+};
+
 static int __init __kpti_install_ng_mappings(void *__unused)
 {
-	typedef void (kpti_remap_fn)(int, int, phys_addr_t, unsigned long);
+	typedef void (kpti_remap_fn)(int, int, phys_addr_t, unsigned long,
+				     struct install_ng_pgtable_geometry *);
 	extern kpti_remap_fn idmap_kpti_install_ng_mappings;
 	kpti_remap_fn *remap_fn;
+	struct install_ng_pgtable_geometry geometry = {
+		.ptrs_per_pte = PTRS_PER_PTE,
+		.ptrs_per_pmd = PTRS_PER_PMD,
+		.ptrs_per_pud = PTRS_PER_PUD,
+		.ptrs_per_p4d = PTRS_PER_P4D,
+		.ptrs_per_pgd = PTRS_PER_PGD,
+	};
 
 	int cpu = smp_processor_id();
 	int levels = CONFIG_PGTABLE_LEVELS;
@@ -1957,7 +1973,8 @@ static int __init __kpti_install_ng_mappings(void *__unused)
 	}
 
 	cpu_install_idmap();
-	remap_fn(cpu, num_online_cpus(), kpti_ng_temp_pgd_pa, KPTI_NG_TEMP_VA);
+	remap_fn(cpu, num_online_cpus(), kpti_ng_temp_pgd_pa, KPTI_NG_TEMP_VA,
+		 &geometry);
 	cpu_uninstall_idmap();
 
 	if (!cpu) {
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index ab5aa84923524..11bf6ba6dac33 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -190,7 +190,7 @@ SYM_FUNC_ALIAS(__pi_idmap_cpu_replace_ttbr1, idmap_cpu_replace_ttbr1)
 	.pushsection ".idmap.text", "a"
 
 	.macro	pte_to_phys, phys, pte
-	and	\phys, \pte, #PTE_ADDR_LOW
+	and	\phys, \pte, pte_addr_low
 #ifdef CONFIG_ARM64_PA_BITS_52
 	and	\pte, \pte, #PTE_ADDR_HIGH
 	orr	\phys, \phys, \pte, lsl #PTE_ADDR_HIGH_SHIFT
@@ -198,7 +198,8 @@ SYM_FUNC_ALIAS(__pi_idmap_cpu_replace_ttbr1, idmap_cpu_replace_ttbr1)
 	.endm
 
 	.macro	kpti_mk_tbl_ng, type, num_entries
-	add	end_\type\()p, cur_\type\()p, #\num_entries * 8
+	lsl	scratch, \num_entries, #3
+	add	end_\type\()p, cur_\type\()p, scratch
 .Ldo_\type:
 	ldr	\type, [cur_\type\()p], #8		// Load the entry and advance
 	tbz	\type, #0, .Lnext_\type			// Skip invalid and
@@ -220,14 +221,18 @@ SYM_FUNC_ALIAS(__pi_idmap_cpu_replace_ttbr1, idmap_cpu_replace_ttbr1)
 	.macro	kpti_map_pgtbl, type, level
 	str	xzr, [temp_pte, #8 * (\level + 2)]	// break before make
 	dsb	nshst
-	add	pte, temp_pte, #PAGE_SIZE * (\level + 2)
+	mov	scratch, #(\level + 2)
+	mul	scratch, scratch, page_size
+	add	pte, temp_pte, scratch
 	lsr	pte, pte, #12
 	tlbi	vaae1, pte
 	dsb	nsh
 	isb
 
 	phys_to_pte pte, cur_\type\()p
-	add	cur_\type\()p, temp_pte, #PAGE_SIZE * (\level + 2)
+	mov	scratch, #(\level + 2)
+	mul	scratch, scratch, page_size
+	add	cur_\type\()p, temp_pte, scratch
 	orr	pte, pte, pte_flags
 	str	pte, [temp_pte, #8 * (\level + 2)]
 	dsb	nshst
@@ -235,7 +240,8 @@ SYM_FUNC_ALIAS(__pi_idmap_cpu_replace_ttbr1, idmap_cpu_replace_ttbr1)
 
 /*
  * void __kpti_install_ng_mappings(int cpu, int num_secondaries, phys_addr_t temp_pgd,
- *				   unsigned long temp_pte_va)
+ *				   unsigned long temp_pte_va,
+ *				   struct install_ng_pgtable_geometry *geometry)
  *
  * Called exactly once from stop_machine context by each CPU found during boot.
  */
@@ -251,6 +257,8 @@ SYM_TYPED_FUNC_START(idmap_kpti_install_ng_mappings)
 	temp_pgd_phys	.req	x2
 	swapper_ttb	.req	x3
 	flag_ptr	.req	x4
+	geometry	.req	x4
+	scratch		.req	x4
 	cur_pgdp	.req	x5
 	end_pgdp	.req	x6
 	pgd		.req	x7
@@ -264,18 +272,45 @@ SYM_TYPED_FUNC_START(idmap_kpti_install_ng_mappings)
 	valid		.req	x17
 	cur_p4dp	.req	x19
 	end_p4dp	.req	x20
-
-	mov	x5, x3				// preserve temp_pte arg
-	mrs	swapper_ttb, ttbr1_el1
-	adr_l	flag_ptr, __idmap_kpti_flag
+	page_size	.req	x21
+	ptrs_per_pte	.req	x22
+	ptrs_per_pmd	.req	x23
+	ptrs_per_pud	.req	x24
+	ptrs_per_p4d	.req	x25
+	ptrs_per_pgd	.req	x26
+	pte_addr_low	.req	x27
 
 	cbnz	cpu, __idmap_kpti_secondary
 
-#if CONFIG_PGTABLE_LEVELS > 4
-	stp	x29, x30, [sp, #-32]!
+	/* Preserve callee-saved registers */
+	stp	x19, x20, [sp, #-96]!
+	stp	x21, x22, [sp, #80]
+	stp	x23, x24, [sp, #64]
+	stp	x25, x26, [sp, #48]
+	stp	x27, x28, [sp, #32]
+	stp	x29, x30, [sp, #16]
 	mov	x29, sp
-	stp	x19, x20, [sp, #16]
-#endif
+
+	/* Load pgtable geometry parameters */
+	get_page_size	page_size
+	ldr	ptrs_per_pte, [geometry, #0]
+	ldr	ptrs_per_pmd, [geometry, #8]
+	ldr	ptrs_per_pud, [geometry, #16]
+	ldr	ptrs_per_p4d, [geometry, #24]
+	ldr	ptrs_per_pgd, [geometry, #32]
+
+	/* Precalculate pte_addr_low mask */
+	get_page_shift	x0
+	mov	pte_addr_low, #50
+	sub	pte_addr_low, pte_addr_low, x0
+	mov	scratch, #1
+	lsl	pte_addr_low, scratch, pte_addr_low
+	sub	pte_addr_low, pte_addr_low, #1
+	lsl	pte_addr_low, pte_addr_low, x0
+
+	mov	temp_pte, x3
+	mrs	swapper_ttb, ttbr1_el1
+	adr_l	flag_ptr, __idmap_kpti_flag
 
 	/* We're the boot CPU. Wait for the others to catch up */
 	sevl
@@ -290,7 +325,6 @@ SYM_TYPED_FUNC_START(idmap_kpti_install_ng_mappings)
 	msr	ttbr1_el1, temp_pgd_phys
 	isb
 
-	mov	temp_pte, x5
 	mov_q	pte_flags, KPTI_NG_PTE_FLAGS
 
 	/* Everybody is enjoying the idmap, so we can rewrite swapper. */
@@ -320,7 +354,7 @@ alternative_else_nop_endif
 	/* PGD */
 	adrp		cur_pgdp, swapper_pg_dir
 	kpti_map_pgtbl	pgd, -1
-	kpti_mk_tbl_ng	pgd, PTRS_PER_PGD
+	kpti_mk_tbl_ng	pgd, ptrs_per_pgd
 
 	/* Ensure all the updated entries are visible to secondary CPUs */
 	dsb	ishst
@@ -331,21 +365,33 @@ alternative_else_nop_endif
 	isb
 
 	/* Set the flag to zero to indicate that we're all done */
+	adr_l	flag_ptr, __idmap_kpti_flag
 	str	wzr, [flag_ptr]
-#if CONFIG_PGTABLE_LEVELS > 4
-	ldp	x19, x20, [sp, #16]
-	ldp	x29, x30, [sp], #32
-#endif
+
+	/* Restore callee-saved registers */
+	ldp	x29, x30, [sp, #16]
+	ldp	x27, x28, [sp, #32]
+	ldp	x25, x26, [sp, #48]
+	ldp	x23, x24, [sp, #64]
+	ldp	x21, x22, [sp, #80]
+	ldp	x19, x20, [sp], #96
+
 	ret
 
 .Lderef_pgd:
 	/* P4D */
 	.if		CONFIG_PGTABLE_LEVELS > 4
 	p4d		.req	x30
+	cmp		ptrs_per_p4d, #1
+	b.eq		.Lfold_p4d
 	pte_to_phys	cur_p4dp, pgd
 	kpti_map_pgtbl	p4d, 0
-	kpti_mk_tbl_ng	p4d, PTRS_PER_P4D
+	kpti_mk_tbl_ng	p4d, ptrs_per_p4d
 	b		.Lnext_pgd
+.Lfold_p4d:
+	mov		p4d, pgd		// fold to next level
+	mov		cur_p4dp, end_p4dp	// must be equal to terminate loop
+	b		.Lderef_p4d
 	.else		/* CONFIG_PGTABLE_LEVELS <= 4 */
 	p4d		.req	pgd
 	.set		.Lnext_p4d, .Lnext_pgd
@@ -355,10 +401,16 @@ alternative_else_nop_endif
 	/* PUD */
 	.if		CONFIG_PGTABLE_LEVELS > 3
 	pud		.req	x10
+	cmp		ptrs_per_pud, #1
+	b.eq		.Lfold_pud
 	pte_to_phys	cur_pudp, p4d
 	kpti_map_pgtbl	pud, 1
-	kpti_mk_tbl_ng	pud, PTRS_PER_PUD
+	kpti_mk_tbl_ng	pud, ptrs_per_pud
 	b		.Lnext_p4d
+.Lfold_pud:
+	mov		pud, p4d		// fold to next level
+	mov		cur_pudp, end_pudp	// must be equal to terminate loop
+	b		.Lderef_pud
 	.else		/* CONFIG_PGTABLE_LEVELS <= 3 */
 	pud		.req	pgd
 	.set		.Lnext_pud, .Lnext_pgd
@@ -368,10 +420,16 @@ alternative_else_nop_endif
 	/* PMD */
 	.if		CONFIG_PGTABLE_LEVELS > 2
 	pmd		.req	x13
+	cmp		ptrs_per_pmd, #1
+	b.eq		.Lfold_pmd
 	pte_to_phys	cur_pmdp, pud
 	kpti_map_pgtbl	pmd, 2
-	kpti_mk_tbl_ng	pmd, PTRS_PER_PMD
+	kpti_mk_tbl_ng	pmd, ptrs_per_pmd
 	b		.Lnext_pud
+.Lfold_pmd:
+	mov		pmd, pud		// fold to next level
+	mov		cur_pmdp, end_pmdp	// must be equal to terminate loop
+	b		.Lderef_pmd
 	.else		/* CONFIG_PGTABLE_LEVELS <= 2 */
 	pmd		.req	pgd
 	.set		.Lnext_pmd, .Lnext_pgd
@@ -381,7 +439,7 @@ alternative_else_nop_endif
 	/* PTE */
 	pte_to_phys	cur_ptep, pmd
 	kpti_map_pgtbl	pte, 3
-	kpti_mk_tbl_ng	pte, PTRS_PER_PTE
+	kpti_mk_tbl_ng	pte, ptrs_per_pte
 	b		.Lnext_pmd
 
 	.unreq	cpu
@@ -408,6 +466,9 @@ alternative_else_nop_endif
 
 	/* Secondary CPUs end up here */
 __idmap_kpti_secondary:
+	mrs	swapper_ttb, ttbr1_el1
+	adr_l	flag_ptr, __idmap_kpti_flag
+
 	/* Uninstall swapper before surgery begins */
 	__idmap_cpu_set_reserved_ttbr1 x16, x17
 
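
For reviewers who want to sanity-check the arithmetic in the proc.S hunks, the
following is a host-side C sketch, not part of the patch. It mirrors two things
the assembly relies on: the byte offsets used by the "ldr ..., [geometry, #N]"
sequence, and the "Precalculate pte_addr_low mask" computation. The names
geometry_mirror and pte_addr_low_mask() are hypothetical, and the page shifts
12, 14 and 16 are assumed to be what ARM64_PAGE_SHIFT_4K/16K/64K expand to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mirror of struct install_ng_pgtable_geometry. The patch uses
 * unsigned long, which is 8 bytes on arm64; uint64_t is used here so the
 * layout is identical on any build host.
 */
struct geometry_mirror {
	uint64_t ptrs_per_pte;
	uint64_t ptrs_per_pmd;
	uint64_t ptrs_per_pud;
	uint64_t ptrs_per_p4d;
	uint64_t ptrs_per_pgd;
};

/* Offsets hard-coded by the assembly's "ldr ..., [geometry, #N]" loads. */
static_assert(offsetof(struct geometry_mirror, ptrs_per_pte) == 0, "pte offset");
static_assert(offsetof(struct geometry_mirror, ptrs_per_pmd) == 8, "pmd offset");
static_assert(offsetof(struct geometry_mirror, ptrs_per_pud) == 16, "pud offset");
static_assert(offsetof(struct geometry_mirror, ptrs_per_p4d) == 24, "p4d offset");
static_assert(offsetof(struct geometry_mirror, ptrs_per_pgd) == 32, "pgd offset");

/*
 * Same steps as the assembly sequence:
 * ((1 << (50 - page_shift)) - 1) << page_shift, i.e. a mask covering
 * bits [49:page_shift] of a descriptor.
 */
static uint64_t pte_addr_low_mask(unsigned int page_shift)
{
	uint64_t mask = 50 - page_shift;	/* mov #50; sub ..., x0 */

	mask = UINT64_C(1) << mask;		/* mov scratch, #1; lsl */
	mask -= 1;				/* sub ..., #1 */
	return mask << page_shift;		/* lsl ..., x0 */
}
```

Under those assumptions, pte_addr_low_mask(12) evaluates to 0x0003fffffffff000,
pte_addr_low_mask(14) to 0x0003ffffffffc000 and pte_addr_low_mask(16) to
0x0003ffffffff0000. If the structure ever gains or reorders members, the fixed
offsets in proc.S would need to change with it; generating them via
asm-offsets would be one way to avoid that coupling.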