From patchwork Thu Mar 16 13:17:11 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexandre Ghiti
X-Patchwork-Id: 13177627
From: Alexandre Ghiti
To: Catalin Marinas, Will Deacon, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Rob Herring, Frank Rowand, Mike Rapoport,
	Andrew Morton, Anup Patel, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
	devicetree@vger.kernel.org, linux-mm@kvack.org
Cc: Alexandre Ghiti, Rob Herring, Andrew Jones
Subject: [PATCH v8 4/4] riscv: Use PUD/P4D/PGD pages for the linear mapping
Date: Thu, 16 Mar 2023 14:17:11 +0100
Message-Id: <20230316131711.1284451-5-alexghiti@rivosinc.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20230316131711.1284451-1-alexghiti@rivosinc.com>
References: <20230316131711.1284451-1-alexghiti@rivosinc.com>
During the early page table creation, we used to set the mapping for
PAGE_OFFSET to the kernel load address. But the kernel load address is
always offset by PMD_SIZE, which makes it impossible to use PUD/P4D/PGD
pages for the linear mapping since this physical address is not aligned
on PUD/P4D/PGD size (whereas PAGE_OFFSET is).

But actually we don't have to establish this mapping (i.e. set
va_pa_offset) that early in the boot process because:

- first, setup_vm installs a temporary kernel mapping and, among other
  things, discovers the system memory,
- then, setup_vm_final creates the final kernel mapping and takes
  advantage of the discovered system memory to create the linear
  mapping.

During the first phase, we don't know where the system memory starts,
and until the second phase is finished, we can't use the linear mapping
at all: phys_to_virt/virt_to_phys translations must not be used because
they would result in a different translation from the 'real' one once
the final mapping is installed.

So here we simply delay the initialization of va_pa_offset until after
the system memory discovery. To make sure no one uses the linear
mapping before that, we add guards enabled by the DEBUG_VIRTUAL config.

Finally, we can use PUD/P4D/PGD hugepages whenever possible, which
results in better TLB utilization.

Note that:
- this does not apply to rv32 as the kernel mapping lies in the linear
  mapping.
- we rely on the firmware to protect itself using PMP.

Signed-off-by: Alexandre Ghiti
Acked-by: Rob Herring # DT bits
Reviewed-by: Andrew Jones
Reviewed-by: Anup Patel
Tested-by: Anup Patel
---
 arch/riscv/include/asm/page.h | 16 ++++++++++++
 arch/riscv/mm/init.c          | 49 ++++++++++++++++++++++++++++++-----
 arch/riscv/mm/physaddr.c      | 16 ++++++++++++
 drivers/of/fdt.c              | 11 ++++----
 4 files changed, 81 insertions(+), 11 deletions(-)

diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
index 8dc686f549b6..ea1a0e237211 100644
--- a/arch/riscv/include/asm/page.h
+++ b/arch/riscv/include/asm/page.h
@@ -90,6 +90,14 @@ typedef struct page *pgtable_t;
 #define PTE_FMT "%08lx"
 #endif
 
+#ifdef CONFIG_64BIT
+/*
+ * We override this value as its generic definition uses __pa too early in
+ * the boot process (before kernel_map.va_pa_offset is set).
+ */
+#define MIN_MEMBLOCK_ADDR	0
+#endif
+
 #ifdef CONFIG_MMU
 #define ARCH_PFN_OFFSET		(PFN_DOWN((unsigned long)phys_ram_base))
 #else
@@ -121,7 +129,11 @@ extern phys_addr_t phys_ram_base;
 #define is_linear_mapping(x)	\
 	((x) >= PAGE_OFFSET && (!IS_ENABLED(CONFIG_64BIT) || (x) < PAGE_OFFSET + KERN_VIRT_SIZE))
 
+#ifndef CONFIG_DEBUG_VIRTUAL
 #define linear_mapping_pa_to_va(x)	((void *)((unsigned long)(x) + kernel_map.va_pa_offset))
+#else
+void *linear_mapping_pa_to_va(unsigned long x);
+#endif
 #define kernel_mapping_pa_to_va(y)	({				\
 	unsigned long _y = (unsigned long)(y);				\
 	(IS_ENABLED(CONFIG_XIP_KERNEL) && _y < phys_ram_base) ?		\
@@ -130,7 +142,11 @@ extern phys_addr_t phys_ram_base;
 })
 
 #define __pa_to_va_nodebug(x)		linear_mapping_pa_to_va(x)
+#ifndef CONFIG_DEBUG_VIRTUAL
 #define linear_mapping_va_to_pa(x)	((unsigned long)(x) - kernel_map.va_pa_offset)
+#else
+phys_addr_t linear_mapping_va_to_pa(unsigned long x);
+#endif
 #define kernel_mapping_va_to_pa(y) ({					\
 	unsigned long _y = (unsigned long)(y);				\
 	(IS_ENABLED(CONFIG_XIP_KERNEL) && _y < kernel_map.virt_addr + XIP_OFFSET) ?	\
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index cc558d94559a..7af7cd201a9c 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -213,6 +213,14 @@ static void __init setup_bootmem(void)
 	phys_ram_end = memblock_end_of_DRAM();
 	if (!IS_ENABLED(CONFIG_XIP_KERNEL))
 		phys_ram_base = memblock_start_of_DRAM();
+
+	/*
+	 * In 64-bit, any use of __va/__pa before this point is wrong as we
+	 * did not know the start of DRAM before.
+	 */
+	if (IS_ENABLED(CONFIG_64BIT))
+		kernel_map.va_pa_offset = PAGE_OFFSET - phys_ram_base;
+
 	/*
 	 * memblock allocator is not aware of the fact that last 4K bytes of
 	 * the addressable memory can not be mapped because of IS_ERR_VALUE
@@ -667,9 +675,16 @@ void __init create_pgd_mapping(pgd_t *pgdp,
 
 static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
 {
-	/* Upgrade to PMD_SIZE mappings whenever possible */
-	base &= PMD_SIZE - 1;
-	if (!base && size >= PMD_SIZE)
+	if (!(base & (PGDIR_SIZE - 1)) && size >= PGDIR_SIZE)
+		return PGDIR_SIZE;
+
+	if (!(base & (P4D_SIZE - 1)) && size >= P4D_SIZE)
+		return P4D_SIZE;
+
+	if (!(base & (PUD_SIZE - 1)) && size >= PUD_SIZE)
+		return PUD_SIZE;
+
+	if (!(base & (PMD_SIZE - 1)) && size >= PMD_SIZE)
 		return PMD_SIZE;
 
 	return PAGE_SIZE;
@@ -978,11 +993,22 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
 	set_satp_mode();
 #endif
 
-	kernel_map.va_pa_offset = PAGE_OFFSET - kernel_map.phys_addr;
+	/*
+	 * In 64-bit, we defer the setup of va_pa_offset to setup_bootmem,
+	 * where we have the system memory layout: this allows us to align
+	 * the physical and virtual mappings and then make use of PUD/P4D/PGD
+	 * for the linear mapping. This is only possible because the kernel
+	 * mapping lies outside the linear mapping.
+	 * In 32-bit however, as the kernel resides in the linear mapping,
+	 * setup_vm_final can not change the mapping established here,
+	 * otherwise the same kernel addresses would get mapped to different
+	 * physical addresses (if the start of dram is different from the
+	 * kernel physical address start).
+	 */
+	kernel_map.va_pa_offset = IS_ENABLED(CONFIG_64BIT) ?
+				0UL : PAGE_OFFSET - kernel_map.phys_addr;
 	kernel_map.va_kernel_pa_offset = kernel_map.virt_addr - kernel_map.phys_addr;
 
-	phys_ram_base = kernel_map.phys_addr;
-
 	/*
 	 * The default maximal physical memory size is KERN_VIRT_SIZE for 32-bit
 	 * kernel, whereas for 64-bit kernel, the end of the virtual address
@@ -1097,6 +1123,17 @@ static void __init setup_vm_final(void)
 			   __pa_symbol(fixmap_pgd_next),
 			   PGDIR_SIZE, PAGE_TABLE);
 
+#ifdef CONFIG_STRICT_KERNEL_RWX
+	/*
+	 * Isolate the kernel text and rodata linear so they don't
+	 * get mapped with a PUD in the linear mapping.
+	 */
+	memblock_isolate_memory(__pa_symbol(_start),
+				__init_data_begin - _start);
+	memblock_isolate_memory(__pa_symbol(__start_rodata),
+				__start_rodata - _data);
+#endif
+
 	/* Map all memory banks in the linear mapping */
 	for_each_mem_range(i, &start, &end) {
 		if (start >= end)
diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
index 9b18bda74154..18706f457da7 100644
--- a/arch/riscv/mm/physaddr.c
+++ b/arch/riscv/mm/physaddr.c
@@ -33,3 +33,19 @@ phys_addr_t __phys_addr_symbol(unsigned long x)
 	return __va_to_pa_nodebug(x);
 }
 EXPORT_SYMBOL(__phys_addr_symbol);
+
+phys_addr_t linear_mapping_va_to_pa(unsigned long x)
+{
+	BUG_ON(!kernel_map.va_pa_offset);
+
+	return ((unsigned long)(x) - kernel_map.va_pa_offset);
+}
+EXPORT_SYMBOL(linear_mapping_va_to_pa);
+
+void *linear_mapping_pa_to_va(unsigned long x)
+{
+	BUG_ON(!kernel_map.va_pa_offset);
+
+	return ((void *)((unsigned long)(x) + kernel_map.va_pa_offset));
+}
+EXPORT_SYMBOL(linear_mapping_pa_to_va);
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index d1a68b6d03b3..d14735a81301 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -887,12 +887,13 @@ const void * __init of_flat_dt_match_machine(const void *default_match,
 static void __early_init_dt_declare_initrd(unsigned long start,
 					   unsigned long end)
 {
-	/* ARM64 would cause a BUG to occur here when CONFIG_DEBUG_VM is
-	 * enabled since __va() is called too early. ARM64 does make use
-	 * of phys_initrd_start/phys_initrd_size so we can skip this
-	 * conversion.
+	/*
+	 * __va() is not yet available this early on some platforms. In that
+	 * case, the platform uses phys_initrd_start/phys_initrd_size instead
+	 * and does the VA conversion itself.
 	 */
-	if (!IS_ENABLED(CONFIG_ARM64)) {
+	if (!IS_ENABLED(CONFIG_ARM64) &&
+	    !(IS_ENABLED(CONFIG_RISCV) && IS_ENABLED(CONFIG_64BIT))) {
 		initrd_start = (unsigned long)__va(start);
 		initrd_end = (unsigned long)__va(end);
 		initrd_below_start_ok = 1;
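
For illustration (not part of the patch): the new best_map_size() simply
walks the page table levels from largest to smallest and returns the first
size for which both the alignment of 'base' and the remaining 'size' fit.
Below is a minimal host-side sketch of that selection, assuming a 64-bit
host and made-up placeholder constants: SZ_PGDIR/SZ_P4D/SZ_PUD/SZ_PMD stand
in for the kernel's PGDIR_SIZE/P4D_SIZE/PUD_SIZE/PMD_SIZE (which depend on
the configured paging mode), and the example addresses are hypothetical.

#include <stdio.h>

/* Placeholder level sizes, assumed for illustration only (64-bit host). */
#define SZ_PAGE		(1UL << 12)	/* 4 KiB */
#define SZ_PMD		(1UL << 21)	/* 2 MiB */
#define SZ_PUD		(1UL << 30)	/* 1 GiB */
#define SZ_P4D		(1UL << 39)	/* 512 GiB */
#define SZ_PGDIR	(1UL << 48)	/* 256 TiB */

/* Largest page size for which base is aligned and size is big enough. */
static unsigned long best_map_size(unsigned long base, unsigned long size)
{
	if (!(base & (SZ_PGDIR - 1)) && size >= SZ_PGDIR)
		return SZ_PGDIR;
	if (!(base & (SZ_P4D - 1)) && size >= SZ_P4D)
		return SZ_P4D;
	if (!(base & (SZ_PUD - 1)) && size >= SZ_PUD)
		return SZ_PUD;
	if (!(base & (SZ_PMD - 1)) && size >= SZ_PMD)
		return SZ_PMD;
	return SZ_PAGE;
}

int main(void)
{
	/* PUD-aligned DRAM base: gets 1 GiB mappings (prints 0x40000000). */
	printf("%#lx\n", best_map_size(0x80000000UL, 2UL << 30));
	/* PMD-offset base: limited to 2 MiB mappings (prints 0x200000). */
	printf("%#lx\n", best_map_size(0x80200000UL, 2UL << 30));
	return 0;
}

With va_pa_offset now derived from memblock_start_of_DRAM(), a DRAM base
such as the hypothetical 0x80000000 above keeps its 1 GiB alignment in the
linear mapping, so a single PUD entry covers what the old PMD_SIZE-offset
mapping needed 512 PMD entries for.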