From patchwork Fri May 24 17:07:43 2024
X-Patchwork-Submitter: Don Porter
X-Patchwork-Id: 13673342
From: Don Porter <porter@cs.unc.edu>
To: qemu-devel@nongnu.org
Cc: dave@treblig.org, peter.maydell@linaro.org, nadav.amit@gmail.com, richard.henderson@linaro.org, Don Porter
Subject: [PATCH v2 1/6] Add an "info pg" command that prints the current page tables
Date: Fri, 24 May 2024 13:07:43 -0400
Message-Id: <20240524170748.1842030-2-porter@cs.unc.edu>
In-Reply-To: <20240524170748.1842030-1-porter@cs.unc.edu>
References: <20240524170748.1842030-1-porter@cs.unc.edu>

The new "info pg" monitor command prints the current page table, including virtual address ranges, flag bits, and snippets of physical page numbers. Completely filled regions of the page table with compatible flags are "folded", with the result that the complete output for a freshly booted x86-64 Linux VM can fit in a single terminal window.
The output looks like this:

   VPN range             Entry         Flags      Physical page
   [7f0000000-7f0000000] PML4[0fe]     ---DA--UWP
     [7f28c0000-7f28fffff]  PDP[0a3]   ---DA--UWP
       [7f28c4600-7f28c47ff]  PDE[023] ---DA--UWP
         [7f28c4655-7f28c4656]   PTE[055-056] X--D---U-P 0000007f14-0000007f15
         [7f28c465b-7f28c465b]   PTE[05b]     ----A--U-P 0000001cfc
...
   [ff8000000-ff8000000] PML4[1ff]     ---DA--UWP
     [ffff80000-ffffbffff]  PDP[1fe]   ---DA---WP
       [ffff81000-ffff81dff]   PDE[008-00e] -GSDA---WP 0000001000-0000001dff
     [ffffc0000-fffffffff]  PDP[1ff]   ---DA--UWP
       [ffffff400-ffffff5ff]   PDE[1fa] ---DA--UWP
         [ffffff5fb-ffffff5fc]    PTE[1fb-1fc] XG-DACT-WP 00000fec00 00000fee00
       [ffffff600-ffffff7ff]   PDE[1fb] ---DA--UWP
         [ffffff600-ffffff600]    PTE[000] -G-DA--U-P 0000001467

This draws heavy inspiration from Austin Clements' original patch. This also adds a generic page table walker, which other monitor and execution commands will be migrated to in subsequent patches.

Signed-off-by: Don Porter <porter@cs.unc.edu>
---
 hmp-commands-info.hx              |  26 ++
 include/monitor/hmp-target.h      |   1 +
 target/i386/arch_memory_mapping.c | 486 +++++++++++++++++++++++++++++-
 target/i386/cpu.h                 |  16 +
 target/i386/monitor.c             | 380 +++++++++++++++++++++++
 5 files changed, 908 insertions(+), 1 deletion(-)

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 20a9835ea8..918b82015c 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -237,6 +237,32 @@ ERST
         .cmd = hmp_info_mtree,
     },

+#if defined(TARGET_I386)
+    {
+        .name = "pg",
+        .args_type = "",
+        .params = "",
+        .help = "show the page table",
+        .cmd = hmp_info_pg,
+    },
+#endif
+
+SRST
+  ``info pg``
+    Show the active page table.
+ERST
+
+    {
+        .name = "mtree",
+        .args_type = "flatview:-f,dispatch_tree:-d,owner:-o,disabled:-D",
+        .params = "[-f][-d][-o][-D]",
+        .help = "show memory tree (-f: dump flat view for address spaces;"
+                "-d: dump dispatch tree, valid with -f only);"
+                "-o: dump region owners/parents;"
+                "-D: dump disabled regions",
+        .cmd = hmp_info_mtree,
+    },
+
 SRST
   ``info mtree``
     Show memory tree.
diff --git a/include/monitor/hmp-target.h b/include/monitor/hmp-target.h index b679aaebbf..9af72ea58d 100644 --- a/include/monitor/hmp-target.h +++ b/include/monitor/hmp-target.h @@ -50,6 +50,7 @@ CPUState *mon_get_cpu(Monitor *mon); void hmp_info_mem(Monitor *mon, const QDict *qdict); void hmp_info_tlb(Monitor *mon, const QDict *qdict); void hmp_mce(Monitor *mon, const QDict *qdict); +void hmp_info_pg(Monitor *mon, const QDict *qdict); void hmp_info_local_apic(Monitor *mon, const QDict *qdict); void hmp_info_sev(Monitor *mon, const QDict *qdict); void hmp_info_sgx(Monitor *mon, const QDict *qdict); diff --git a/target/i386/arch_memory_mapping.c b/target/i386/arch_memory_mapping.c index d1ff659128..00bf2a2116 100644 --- a/target/i386/arch_memory_mapping.c +++ b/target/i386/arch_memory_mapping.c @@ -15,6 +15,491 @@ #include "cpu.h" #include "sysemu/memory_mapping.h" +/** + ************** code hook implementations for x86 *********** + */ + +#define PML4_ADDR_MASK 0xffffffffff000ULL /* selects bits 51:12 */ + +/** + * mmu_page_table_root - Given a CPUState, return the physical address + * of the current page table root, as well as + * write the height of the tree into *height. + * + * @cs - CPU state + * @height - a pointer to an integer, to store the page table tree height + * + * Returns a hardware address on success. Should not fail (i.e., caller is + * responsible to ensure that a page table is actually present). + */ +static hwaddr mmu_page_table_root(CPUState *cs, int *height) +{ + X86CPU *cpu = X86_CPU(cs); + CPUX86State *env = &cpu->env; + /* + * DEP 5/15/24: Some original page table walking code sets the a20 + * mask as a 32 bit integer and checks it on each level of the + * page table walk; some only checks it against the final result. + * For 64 bits, I think we need to sign extend in the common case + * it is not set (and returns -1), or we will lose bits.
+ */ + int64_t a20_mask; + + assert(cpu_paging_enabled(cs)); + a20_mask = x86_get_a20_mask(env); + + if (env->cr[4] & CR4_PAE_MASK) { +#ifdef TARGET_X86_64 + if (env->hflags & HF_LMA_MASK) { + if (env->cr[4] & CR4_LA57_MASK) { + *height = 5; + } else { + *height = 4; + } + return (env->cr[3] & PML4_ADDR_MASK) & a20_mask; + } else +#endif + { + *height = 3; + return (env->cr[3] & ~0x1f) & a20_mask; + } + } else { + *height = 2; + return (env->cr[3] & ~0xfff) & a20_mask; + } +} + + +/** + * mmu_page_table_entries_per_node - Return the number of + * entries in a page table + * node for the CPU at a given + * height. + * + * @cs - CPU state + * @height - height of the page table tree to query, where the leaves + * are 1. + * + * Returns a value greater than zero on success, -1 on error. + */ +int mmu_page_table_entries_per_node(CPUState *cs, int height) +{ + X86CPU *cpu = X86_CPU(cs); + CPUX86State *env = &cpu->env; + bool pae_enabled = env->cr[4] & CR4_PAE_MASK; + + assert(height < 6); + assert(height > 0); + + switch (height) { +#ifdef TARGET_X86_64 + case 5: + assert(env->cr[4] & CR4_LA57_MASK); + case 4: + assert(env->hflags & HF_LMA_MASK); + assert(pae_enabled); + return 512; +#endif + case 3: + assert(pae_enabled); +#ifdef TARGET_X86_64 + if (env->hflags & HF_LMA_MASK) { + return 512; + } else +#endif + { + return 4; + } + case 2: + case 1: + return pae_enabled ? 512 : 1024; + default: + g_assert_not_reached(); + } + return -1; +} + +/** + * mmu_pte_leaf_page_size - Return the page size of a leaf entry, + * given the height and CPU state + * + * @cs - CPU state + * @height - height of the page table tree to query, where the leaves + * are 1. + * + * Returns a value greater than zero on success, -1 on error. 
+ */ +target_ulong mmu_pte_leaf_page_size(CPUState *cs, int height) +{ + X86CPU *cpu = X86_CPU(cs); + CPUX86State *env = &cpu->env; + bool pae_enabled = env->cr[4] & CR4_PAE_MASK; + + assert(height < 6); + assert(height > 0); + + switch (height) { +#ifdef TARGET_X86_64 + case 5: + assert(pae_enabled); + assert(env->cr[4] & CR4_LA57_MASK); + assert(env->hflags & HF_LMA_MASK); + return 1ULL << 48; + case 4: + assert(pae_enabled); + assert(env->hflags & HF_LMA_MASK); + return 1ULL << 39; +#endif + case 3: + assert(pae_enabled); + return 1 << 30; + case 2: + if (pae_enabled) { + return 1 << 21; + } else { + return 1 << 22; + } + case 1: + return 4096; + default: + g_assert_not_reached(); + } + return -1; +} + +/* + * Given a CPU state and height, return the number of bits + * to shift right/left in going from virtual to PTE index + * and vice versa, the number of useful bits. + */ +static void _mmu_decode_va_parameters(CPUState *cs, int height, + int *shift, int *width) +{ + X86CPU *cpu = X86_CPU(cs); + CPUX86State *env = &cpu->env; + int _shift = 0; + int _width = 0; + bool pae_enabled = env->cr[4] & CR4_PAE_MASK; + + switch (height) { + case 5: + _shift = 48; + _width = 9; + break; + case 4: + _shift = 39; + _width = 9; + break; + case 3: + _shift = 30; + _width = 9; + break; + case 2: + /* 64 bit page tables shift from 30->21 bits here */ + if (pae_enabled) { + _shift = 21; + _width = 9; + } else { + /* 32 bit page tables shift from 32->22 bits */ + _shift = 22; + _width = 10; + } + break; + case 1: + _shift = 12; + if (pae_enabled) { + _width = 9; + } else { + _width = 10; + } + + break; + default: + g_assert_not_reached(); + } + + if (shift) { + *shift = _shift; + } + + if (width) { + *width = _width; + } +} + +/** + * get_pte - Copy the contents of the page table entry at node[i] into pt_entry. + * Optionally, add the relevant bits to the virtual address in + * vaddr_pte. 
+ * + * @cs - CPU state + * @node - physical address of the current page table node + * @i - index (in page table entries, not bytes) of the page table + * entry, within node + * @height - height of node within the tree (leaves are 1, not 0) + * @pt_entry - Pointer to a PTE_t, stores the contents of the page table entry + * @vaddr_parent - The virtual address bits already translated in walking the + * page table to node. Optional: only used if vaddr_pte is set. + * @vaddr_pte - Optional pointer to a variable storing the virtual address bits + * translated by node[i]. + * @pte_paddr - Pointer to the physical address of the PTE within node. + * Optional parameter. + */ + +static void +get_pte(CPUState *cs, hwaddr node, int i, int height, + PTE_t *pt_entry, target_ulong vaddr_parent, target_ulong *vaddr_pte, + hwaddr *pte_paddr) + +{ + X86CPU *cpu = X86_CPU(cs); + CPUX86State *env = &cpu->env; + int32_t a20_mask = x86_get_a20_mask(env); + hwaddr pte; + + if (env->hflags & HF_LMA_MASK) { + /* 64 bit */ + int pte_width = 8; + pte = (node + (i * pte_width)) & a20_mask; + pt_entry->pte64_t = address_space_ldq(cs->as, pte, + MEMTXATTRS_UNSPECIFIED, NULL); + } else { + /* 32 bit */ + int pte_width = 4; + pte = (node + (i * pte_width)) & a20_mask; + pt_entry->pte32_t = address_space_ldl(cs->as, pte, + MEMTXATTRS_UNSPECIFIED, NULL); + } + + if (vaddr_pte) { + int shift = 0; + _mmu_decode_va_parameters(cs, height, &shift, NULL); + *vaddr_pte = vaddr_parent | ((i & 0x1ffULL) << shift); + } + + if (pte_paddr) { + *pte_paddr = pte; + } +} + + +static bool +mmu_pte_check_bits(CPUState *cs, PTE_t *pte, int64_t mask) +{ + X86CPU *cpu = X86_CPU(cs); + CPUX86State *env = &cpu->env; + if (env->hflags & HF_LMA_MASK) { + return pte->pte64_t & mask; + } else { + return pte->pte32_t & mask; + } +} + +/** + * mmu_pte_present - Return true if the pte is + * marked 'present' + */ +static bool +mmu_pte_present(CPUState *cs, PTE_t *pte) +{ + return mmu_pte_check_bits(cs, pte, PG_PRESENT_MASK);
+} + +/** + * mmu_pte_leaf - Return true if the pte is + * a page table leaf, false if + * the pte points to another + * node in the radix tree. + */ +bool +mmu_pte_leaf(CPUState *cs, int height, PTE_t *pte) +{ + return height == 1 || mmu_pte_check_bits(cs, pte, PG_PSE_MASK); +} + +/** + * mmu_pte_child - Returns the physical address + * of a radix tree node pointed to by pte. + * + * @cs - CPU state + * @pte - The page table entry + * @height - The height in the tree of pte + * + * Returns the physical address stored in pte on success, + * -1 on error. + */ +hwaddr +mmu_pte_child(CPUState *cs, PTE_t *pte, int height) +{ + X86CPU *cpu = X86_CPU(cs); + CPUX86State *env = &cpu->env; + bool pae_enabled = env->cr[4] & CR4_PAE_MASK; + int32_t a20_mask = x86_get_a20_mask(env); + + switch (height) { +#ifdef TARGET_X86_64 + case 5: + assert(env->cr[4] & CR4_LA57_MASK); + case 4: + assert(env->hflags & HF_LMA_MASK); + /* assert(pae_enabled); */ + /* Fall through */ +#endif + case 3: + assert(pae_enabled); +#ifdef TARGET_X86_64 + if (env->hflags & HF_LMA_MASK) { + return (pte->pte64_t & PG_ADDRESS_MASK) & a20_mask; + } else +#endif + { + return (pte->pte64_t & ~0xfff) & a20_mask; + } + case 2: + case 1: + if (pae_enabled) { + return (pte->pte64_t & PG_ADDRESS_MASK) & a20_mask; + } else { + return (pte->pte32_t & ~0xfff) & a20_mask; + } + default: + g_assert_not_reached(); + } + return -1; +} + + +/** + ************** generic page table code *********** + */ + +/** + * _for_each_pte - recursive helper function + * + * @cs - CPU state + * @fn(cs, data, pte, vaddr, height) - User-provided function to call on each + * pte. + * * @cs - pass through cs + * * @data - user-provided, opaque pointer + * * @pte - current pte + * * @vaddr - virtual address translated by pte + * * @height - height in the tree of pte + * @data - user-provided, opaque pointer, passed to fn() + * @visit_interior_nodes - if true, call fn() on page table entries in + * interior nodes. 
If false, only call fn() on page + * table entries in leaves. + * @visit_not_present - if true, call fn() on entries that are not present. + * if false, visit only present entries. + * @node - The physical address of the current page table radix tree node + * @vaddr - The virtual address bits translated in walking the page table to + * node + * @height - The height of node in the radix tree + * + * height starts at the max and counts down. + * In a 4 level x86 page table, pml4e is level 4, pdpe is level 3, + * pde is level 2, and pte is level 1 + * + * Returns true on success, false on error. + */ +static bool +_for_each_pte(CPUState *cs, + int (*fn)(CPUState *cs, void *data, PTE_t *pte, + target_ulong vaddr, int height, int offset), + void *data, bool visit_interior_nodes, + bool visit_not_present, hwaddr node, + target_ulong vaddr, int height) +{ + int ptes_per_node; + int i; + + assert(height > 0); + + ptes_per_node = mmu_page_table_entries_per_node(cs, height); + + for (i = 0; i < ptes_per_node; i++) { + PTE_t pt_entry; + target_ulong vaddr_i; + bool pte_present; + + get_pte(cs, node, i, height, &pt_entry, vaddr, &vaddr_i, NULL); + pte_present = mmu_pte_present(cs, &pt_entry); + + if (pte_present || visit_not_present) { + if ((!pte_present) || mmu_pte_leaf(cs, height, &pt_entry)) { + if (fn(cs, data, &pt_entry, vaddr_i, height, i)) { + /* Error */ + return false; + } + } else { /* Non-leaf */ + if (visit_interior_nodes) { + if (fn(cs, data, &pt_entry, vaddr_i, height, i)) { + /* Error */ + return false; + } + } + hwaddr child = mmu_pte_child(cs, &pt_entry, height); + assert(height > 1); + if (!_for_each_pte(cs, fn, data, visit_interior_nodes, + visit_not_present, child, vaddr_i, + height - 1)) { + return false; + } + } + } + } + + return true; +} + +/** + * for_each_pte - iterate over a page table, and + * call fn on each entry + * + * @cs - CPU state + * @fn(cs, data, pte, vaddr, height) - User-provided function to call on each + * pte. 
+ * * @cs - pass through cs + * * @data - user-provided, opaque pointer + * * @pte - current pte + * * @vaddr - virtual address translated by pte + * * @height - height in the tree of pte + * @data - opaque pointer; passed through to fn + * @visit_interior_nodes - if true, call fn() on interior entries in + * page table; if false, visit only leaf entries. + * @visit_not_present - if true, call fn() on entries that are not present. + * if false, visit only present entries. + * + * Returns true on success, false on error. + * + */ +bool for_each_pte(CPUState *cs, + int (*fn)(CPUState *cs, void *data, PTE_t *pte, + target_ulong vaddr, int height, int offset), + void *data, bool visit_interior_nodes, + bool visit_not_present) +{ + int height; + target_ulong vaddr = 0; + hwaddr root; + + if (!cpu_paging_enabled(cs)) { + /* paging is disabled */ + return true; + } + + root = mmu_page_table_root(cs, &height); + + assert(height > 1); + + /* Recursively call a helper to walk the page table */ + return _for_each_pte(cs, fn, data, visit_interior_nodes, visit_not_present, + root, vaddr, height); +} + +/** + * Back to x86 hooks + */ + /* PAE Paging or IA-32e Paging */ static void walk_pte(MemoryMappingList *list, AddressSpace *as, hwaddr pte_start_addr, @@ -313,4 +798,3 @@ bool x86_cpu_get_memory_mapping(CPUState *cs, MemoryMappingList *list, return true; } - diff --git a/target/i386/cpu.h b/target/i386/cpu.h index ccccb62fc3..fc3ae55213 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -2094,6 +2094,12 @@ struct X86CPUClass { ResettablePhases parent_phases; }; +/* Intended to become a generic PTE type */ +typedef union PTE { + uint64_t pte64_t; + uint32_t pte32_t; +} PTE_t; + #ifndef CONFIG_USER_ONLY extern const VMStateDescription vmstate_x86_cpu; #endif @@ -2109,6 +2115,16 @@ int x86_cpu_write_elf64_qemunote(WriteCoreDumpFunction f, CPUState *cpu, int x86_cpu_write_elf32_qemunote(WriteCoreDumpFunction f, CPUState *cpu, DumpState *s); +bool mmu_pte_leaf(CPUState *cs, 
int height, PTE_t *pte); +target_ulong mmu_pte_leaf_page_size(CPUState *cs, int height); +hwaddr mmu_pte_child(CPUState *cs, PTE_t *pte, int height); +int mmu_page_table_entries_per_node(CPUState *cs, int height); +bool for_each_pte(CPUState *cs, + int (*fn)(CPUState *cs, void *data, PTE_t *pte, + target_ulong vaddr, int height, int offset), + void *data, bool visit_interior_nodes, + bool visit_not_present); + bool x86_cpu_get_memory_mapping(CPUState *cpu, MemoryMappingList *list, Error **errp); diff --git a/target/i386/monitor.c b/target/i386/monitor.c index 2d766b2637..d7aae99c73 100644 --- a/target/i386/monitor.c +++ b/target/i386/monitor.c @@ -32,6 +32,201 @@ #include "qapi/qapi-commands-misc-target.h" #include "qapi/qapi-commands-misc.h" +/* Maximum x86 height */ +#define MAX_HEIGHT 5 + +struct mem_print_state { + Monitor *mon; + CPUArchState *env; + int vaw, paw; /* VA and PA width in characters */ + int max_height; + bool (*flusher)(CPUState *cs, struct mem_print_state *state); + bool flush_interior; /* If false, only call flusher() on leaves */ + bool require_physical_contiguity; + /* + * The height at which we started accumulating ranges, i.e., the + * next height we need to print once we hit the end of a + * contiguous range. + */ + int start_height; + /* + * For compressing contiguous ranges, track the + * start and end of the range + */ + hwaddr vstart[MAX_HEIGHT + 1]; /* Starting virt. addr. 
of open pte range */ + hwaddr vend[MAX_HEIGHT + 1]; /* Ending virtual address of open pte range */ + hwaddr pstart; /* Starting physical address of open pte range */ + hwaddr pend; /* Ending physical address of open pte range */ + int64_t ent[MAX_HEIGHT + 1]; /* PTE contents on current root->leaf path */ + int offset[MAX_HEIGHT + 1]; /* PTE range starting offsets */ + int last_offset[MAX_HEIGHT + 1]; /* PTE range ending offsets */ +}; + +/********************* x86 specific hooks for printing page table stuff ****/ + +const char *names[7] = {(char *)NULL, "PTE", "PDE", "PDP", "PML4", "Pml5", + (char *)NULL}; +static char *pg_bits(hwaddr ent) +{ + static char buf[32]; + sprintf(buf, "%c%c%c%c%c%c%c%c%c%c", + ent & PG_NX_MASK ? 'X' : '-', + ent & PG_GLOBAL_MASK ? 'G' : '-', + ent & PG_PSE_MASK ? 'S' : '-', + ent & PG_DIRTY_MASK ? 'D' : '-', + ent & PG_ACCESSED_MASK ? 'A' : '-', + ent & PG_PCD_MASK ? 'C' : '-', + ent & PG_PWT_MASK ? 'T' : '-', + ent & PG_USER_MASK ? 'U' : '-', + ent & PG_RW_MASK ? 'W' : '-', + ent & PG_PRESENT_MASK ? 
'P' : '-'); + return buf; +} + +static bool init_iterator(Monitor *mon, struct mem_print_state *state) +{ + CPUArchState *env; + state->mon = mon; + state->flush_interior = false; + state->require_physical_contiguity = false; + + for (int i = 0; i < MAX_HEIGHT; i++) { + state->vstart[i] = -1; + state->last_offset[i] = 0; + } + state->start_height = 0; + + env = mon_get_cpu_env(mon); + if (!env) { + monitor_printf(mon, "No CPU available\n"); + return false; + } + state->env = env; + + if (!(env->cr[0] & CR0_PG_MASK)) { + monitor_printf(mon, "PG disabled\n"); + return false; + } + + /* set va and pa width */ + if (env->cr[4] & CR4_PAE_MASK) { + state->paw = 13; +#ifdef TARGET_X86_64 + if (env->hflags & HF_LMA_MASK) { + if (env->cr[4] & CR4_LA57_MASK) { + state->vaw = 15; + state->max_height = 5; + } else { + state->vaw = 12; + state->max_height = 4; + } + } else +#endif + { + state->vaw = 8; + state->max_height = 3; + } + } else { + state->max_height = 2; + state->vaw = 8; + state->paw = 8; + } + + return true; +} + +static void pg_print_header(Monitor *mon, struct mem_print_state *state) +{ + /* Header line */ + monitor_printf(mon, "%-*s %-13s %-10s %*s%s\n", + 3 + 2 * (state->vaw - 3), "VPN range", + "Entry", "Flags", + 2 * (state->max_height - 1), "", "Physical page(s)"); +} + + +static void pg_print(CPUState *cs, Monitor *mon, uint64_t pt_ent, + target_ulong vaddr_s, target_ulong vaddr_l, + hwaddr paddr_s, hwaddr paddr_l, + int offset_s, int offset_l, + int height, int max_height, int vaw, int paw, + bool is_leaf) + +{ + char buf[128]; + char *pos = buf, *end = buf + sizeof(buf); + target_ulong size = mmu_pte_leaf_page_size(cs, height); + + /* VFN range */ + pos += sprintf(pos, "%*s[%0*"PRIx64"-%0*"PRIx64"] ", + (max_height - height) * 2, "", + vaw - 3, (uint64_t)vaddr_s >> 12, + vaw - 3, ((uint64_t)vaddr_l + size - 1) >> 12); + + /* Slot */ + if (vaddr_s == vaddr_l) { + pos += sprintf(pos, "%4s[%03x] ", + names[height], offset_s); + } else { + pos += 
sprintf(pos, "%4s[%03x-%03x]", + names[height], offset_s, offset_l); + } + + /* Flags */ + pos += sprintf(pos, " %s", pg_bits(pt_ent)); + + + /* Range-compressed PFN's */ + if (is_leaf) { + if (vaddr_s == vaddr_l) { + pos += snprintf(pos, end - pos, " %0*"PRIx64, + paw - 3, (uint64_t)paddr_s >> 12); + } else { + pos += snprintf(pos, end - pos, " %0*"PRIx64"-%0*"PRIx64, + paw - 3, (uint64_t)paddr_s >> 12, + paw - 3, (uint64_t)paddr_l >> 12); + } + pos = MIN(pos, end); + } + + /* Trim line to fit screen */ + if (pos - buf > 79) { + strcpy(buf + 77, ".."); + } + + monitor_printf(mon, "%s\n", buf); +} + +static inline +int ent2prot(uint64_t prot) +{ + return prot & (PG_USER_MASK | PG_RW_MASK | + PG_PRESENT_MASK); +} + +/* Returns true if it emitted anything */ +static +bool flush_print_pg_state(CPUState *cs, struct mem_print_state *state) +{ + bool ret = false; + for (int i = state->start_height; i > 0; i--) { + if (state->vstart[i] == -1) { + break; + } + PTE_t my_pte; + my_pte.pte64_t = state->ent[i]; + ret = true; + pg_print(cs, state->mon, state->ent[i], + state->vstart[i], state->vend[i], + state->pstart, state->pend, + state->offset[i], state->last_offset[i], + i, state->max_height, state->vaw, state->paw, + mmu_pte_leaf(cs, i, &my_pte)); + } + + return ret; +} + /* Perform linear address sign extension */ static hwaddr addr_canonical(CPUArchState *env, hwaddr addr) { @@ -49,6 +244,191 @@ static hwaddr addr_canonical(CPUArchState *env, hwaddr addr) return addr; } + + +/*************************** Start generic page table monitor code *********/ + +/* Assume only called on present entries */ +static +int compressing_iterator(CPUState *cs, void *data, PTE_t *pte, + target_ulong vaddr, int height, int offset) +{ + struct mem_print_state *state = (struct mem_print_state *) data; + hwaddr paddr = mmu_pte_child(cs, pte, height); + target_ulong size = mmu_pte_leaf_page_size(cs, height); + bool start_new_run = false, flush = false; + bool is_leaf = mmu_pte_leaf(cs, 
height, pte); + + int entries_per_node = mmu_page_table_entries_per_node(cs, height); + + /* Prot of current pte */ + int prot = ent2prot(pte->pte64_t); + + + /* If there is a prior run, first try to extend it. */ + if (state->start_height != 0) { + + /* + * If we aren't flushing interior nodes, raise the start height. + * We don't need to detect non-compressible interior nodes. + */ + if ((!state->flush_interior) && state->start_height < height) { + state->start_height = height; + state->vstart[height] = vaddr; + state->vend[height] = vaddr; + state->ent[height] = pte->pte64_t; + if (offset == 0) { + state->last_offset[height] = entries_per_node - 1; + } else { + state->last_offset[height] = offset - 1; + } + } + + /* Detect when we are walking down the "left edge" of a range */ + if (state->vstart[height] == -1 + && (height + 1) <= state->start_height + && state->vstart[height + 1] == vaddr) { + + state->vstart[height] = vaddr; + state->vend[height] = vaddr; + state->ent[height] = pte->pte64_t; + state->offset[height] = offset; + state->last_offset[height] = offset; + + if (is_leaf) { + state->pstart = paddr; + state->pend = paddr; + } + + /* Detect contiguous entries at same level */ + } else if ((state->vstart[height] != -1) + && (state->start_height >= height) + && ent2prot(state->ent[height]) == prot + && (((state->last_offset[height] + 1) % entries_per_node) + == offset) + && ((!is_leaf) + || (!state->require_physical_contiguity) + || (state->pend + size == paddr))) { + + + /* + * If there are entries at the levels below, make sure we + * completed them. We only compress interior nodes + * without holes in the mappings. 
+ */ + if (height != 1) { + for (int i = height - 1; i >= 1; i--) { + int entries = mmu_page_table_entries_per_node(cs, i); + + /* Stop if we hit large pages before level 1 */ + if (state->vstart[i] == -1) { + break; + } + + if ((state->last_offset[i] + 1) != entries) { + flush = true; + start_new_run = true; + break; + } + } + } + + + if (!flush) { + + /* We can compress these entries */ + state->ent[height] = pte->pte64_t; + state->vend[height] = vaddr; + state->last_offset[height] = offset; + + /* Only update the physical range on leaves */ + if (is_leaf) { + state->pend = paddr; + } + } + /* Let PTEs accumulate... */ + } else { + flush = true; + } + + if (flush) { + /* + * We hit discontiguous permissions or pages. + * Print the old entries, then start accumulating again + * + * Some clients only want the flusher called on a leaf. + * Check that too. + * + * We can infer whether the accumulated range includes a + * leaf based on whether pstart is -1. + */ + if (state->flush_interior || (state->pstart != -1)) { + if (state->flusher(cs, state)) { + start_new_run = true; + } + } else { + start_new_run = true; + } + } + } else { + start_new_run = true; + } + + if (start_new_run) { + /* start a new run with this PTE */ + for (int i = state->start_height; i > 0; i--) { + if (state->vstart[i] != -1) { + state->ent[i] = 0; + state->last_offset[i] = 0; + state->vstart[i] = -1; + } + } + state->pstart = -1; + state->vstart[height] = vaddr; + state->vend[height] = vaddr; + state->ent[height] = pte->pte64_t; + state->offset[height] = offset; + state->last_offset[height] = offset; + if (is_leaf) { + state->pstart = paddr; + state->pend = paddr; + } + state->start_height = height; + } + + return 0; +} + + +void hmp_info_pg(Monitor *mon, const QDict *qdict) +{ + struct mem_print_state state; + + CPUState *cs = mon_get_cpu(mon); + if (!cs) { + monitor_printf(mon, "Unable to get CPUState.
Internal error\n");
+        return;
+    }
+
+    if (!init_iterator(mon, &state)) {
+        return;
+    }
+    state.flush_interior = true;
+    state.require_physical_contiguity = true;
+    state.flusher = &flush_print_pg_state;
+
+    pg_print_header(mon, &state);
+
+    /*
+     * We must visit interior entries to get the hierarchy, but
+     * can skip not present mappings
+     */
+    for_each_pte(cs, &compressing_iterator, &state, true, false);
+
+    /* Print last entry, if one present */
+    flush_print_pg_state(cs, &state);
+}
+
 static void print_pte(Monitor *mon, CPUArchState *env, hwaddr addr,
                       hwaddr pte, hwaddr mask)
 {

From patchwork Fri May 24 17:07:44 2024
X-Patchwork-Submitter: Don Porter
X-Patchwork-Id: 13673340
From: Don Porter
To: qemu-devel@nongnu.org
Cc: dave@treblig.org, peter.maydell@linaro.org, nadav.amit@gmail.com, richard.henderson@linaro.org, Don Porter
Subject: [PATCH v2 2/6] Convert 'info tlb' to use generic iterator
Date: Fri, 24 May 2024 13:07:44 -0400
Message-Id: <20240524170748.1842030-3-porter@cs.unc.edu>
In-Reply-To: <20240524170748.1842030-1-porter@cs.unc.edu>
References: <20240524170748.1842030-1-porter@cs.unc.edu>

Signed-off-by: Don Porter
---
 target/i386/monitor.c | 203 ++++------------------------------------
 1 file changed, 28 insertions(+), 175 deletions(-)

diff --git a/target/i386/monitor.c b/target/i386/monitor.c
index d7aae99c73..adf95edfb4 100644
--- a/target/i386/monitor.c
+++ b/target/i386/monitor.c
@@ -430,201 +430,54 @@ void hmp_info_pg(Monitor *mon, const QDict *qdict)
 }
 
 static void print_pte(Monitor *mon, CPUArchState *env, hwaddr addr,
-                      hwaddr pte, hwaddr mask)
+                      hwaddr pte)
 {
-    addr = addr_canonical(env, addr);
-
-    monitor_printf(mon, HWADDR_FMT_plx ": " HWADDR_FMT_plx
-                   " %c%c%c%c%c%c%c%c%c\n",
-                   addr,
-                   pte & mask,
-                   pte & PG_NX_MASK ? 'X' : '-',
-                   pte & PG_GLOBAL_MASK ? 'G' : '-',
-                   pte & PG_PSE_MASK ? 'P' : '-',
-                   pte & PG_DIRTY_MASK ? 'D' : '-',
-                   pte & PG_ACCESSED_MASK ? 'A' : '-',
-                   pte & PG_PCD_MASK ? 'C' : '-',
-                   pte & PG_PWT_MASK ? 'T' : '-',
-                   pte & PG_USER_MASK ? 'U' : '-',
-                   pte & PG_RW_MASK ? 'W' : '-');
-}
+    char buf[128];
+    char *pos = buf;
 
-static void tlb_info_32(Monitor *mon, CPUArchState *env)
-{
-    unsigned int l1, l2;
-    uint32_t pgd, pde, pte;
+    addr = addr_canonical(env, addr);
 
-    pgd = env->cr[3] & ~0xfff;
-    for(l1 = 0; l1 < 1024; l1++) {
-        cpu_physical_memory_read(pgd + l1 * 4, &pde, 4);
-        pde = le32_to_cpu(pde);
-        if (pde & PG_PRESENT_MASK) {
-            if ((pde & PG_PSE_MASK) && (env->cr[4] & CR4_PSE_MASK)) {
-                /* 4M pages */
-                print_pte(mon, env, (l1 << 22), pde, ~((1 << 21) - 1));
-            } else {
-                for(l2 = 0; l2 < 1024; l2++) {
-                    cpu_physical_memory_read((pde & ~0xfff) + l2 * 4, &pte, 4);
-                    pte = le32_to_cpu(pte);
-                    if (pte & PG_PRESENT_MASK) {
-                        print_pte(mon, env, (l1 << 22) + (l2 << 12),
-                                  pte & ~PG_PSE_MASK,
-                                  ~0xfff);
-                    }
-                }
-            }
-        }
-    }
-}
+    pos += sprintf(pos, HWADDR_FMT_plx ": " HWADDR_FMT_plx " ", addr,
+                   (hwaddr) (pte & PG_ADDRESS_MASK));
 
-static void tlb_info_pae32(Monitor *mon, CPUArchState *env)
-{
-    unsigned int l1, l2, l3;
-    uint64_t pdpe, pde, pte;
-    uint64_t pdp_addr, pd_addr, pt_addr;
+    pos += sprintf(pos, " %s", pg_bits(pte));
 
-    pdp_addr = env->cr[3] & ~0x1f;
-    for (l1 = 0; l1 < 4; l1++) {
-        cpu_physical_memory_read(pdp_addr + l1 * 8, &pdpe, 8);
-        pdpe = le64_to_cpu(pdpe);
-        if (pdpe & PG_PRESENT_MASK) {
-            pd_addr = pdpe & 0x3fffffffff000ULL;
-            for (l2 = 0; l2 < 512; l2++) {
-                cpu_physical_memory_read(pd_addr + l2 * 8, &pde, 8);
-                pde = le64_to_cpu(pde);
-                if (pde & PG_PRESENT_MASK) {
-                    if (pde & PG_PSE_MASK) {
-                        /* 2M pages with PAE, CR4.PSE is ignored */
-                        print_pte(mon, env, (l1 << 30) + (l2 << 21), pde,
-                                  ~((hwaddr)(1 << 20) - 1));
-                    } else {
-                        pt_addr = pde & 0x3fffffffff000ULL;
-                        for (l3 = 0; l3 < 512; l3++) {
-                            cpu_physical_memory_read(pt_addr + l3 * 8, &pte, 8);
-                            pte = le64_to_cpu(pte);
-                            if (pte & PG_PRESENT_MASK) {
-                                print_pte(mon, env, (l1 << 30) + (l2 << 21)
-                                          + (l3 << 12),
-                                          pte & ~PG_PSE_MASK,
-                                          ~(hwaddr)0xfff);
-                            }
-                        }
-                    }
-                }
-            }
-        }
+    /* Trim line to fit screen */
+    if (pos - buf > 79) {
+        strcpy(buf + 77, "..");
     }
-}
 
-#ifdef TARGET_X86_64
-static void tlb_info_la48(Monitor *mon, CPUArchState *env,
-                          uint64_t l0, uint64_t pml4_addr)
-{
-    uint64_t l1, l2, l3, l4;
-    uint64_t pml4e, pdpe, pde, pte;
-    uint64_t pdp_addr, pd_addr, pt_addr;
-
-    for (l1 = 0; l1 < 512; l1++) {
-        cpu_physical_memory_read(pml4_addr + l1 * 8, &pml4e, 8);
-        pml4e = le64_to_cpu(pml4e);
-        if (!(pml4e & PG_PRESENT_MASK)) {
-            continue;
-        }
-
-        pdp_addr = pml4e & 0x3fffffffff000ULL;
-        for (l2 = 0; l2 < 512; l2++) {
-            cpu_physical_memory_read(pdp_addr + l2 * 8, &pdpe, 8);
-            pdpe = le64_to_cpu(pdpe);
-            if (!(pdpe & PG_PRESENT_MASK)) {
-                continue;
-            }
-
-            if (pdpe & PG_PSE_MASK) {
-                /* 1G pages, CR4.PSE is ignored */
-                print_pte(mon, env, (l0 << 48) + (l1 << 39) + (l2 << 30),
-                          pdpe, 0x3ffffc0000000ULL);
-                continue;
-            }
-
-            pd_addr = pdpe & 0x3fffffffff000ULL;
-            for (l3 = 0; l3 < 512; l3++) {
-                cpu_physical_memory_read(pd_addr + l3 * 8, &pde, 8);
-                pde = le64_to_cpu(pde);
-                if (!(pde & PG_PRESENT_MASK)) {
-                    continue;
-                }
-
-                if (pde & PG_PSE_MASK) {
-                    /* 2M pages, CR4.PSE is ignored */
-                    print_pte(mon, env, (l0 << 48) + (l1 << 39) + (l2 << 30) +
-                              (l3 << 21), pde, 0x3ffffffe00000ULL);
-                    continue;
-                }
-
-                pt_addr = pde & 0x3fffffffff000ULL;
-                for (l4 = 0; l4 < 512; l4++) {
-                    cpu_physical_memory_read(pt_addr
-                                             + l4 * 8,
-                                             &pte, 8);
-                    pte = le64_to_cpu(pte);
-                    if (pte & PG_PRESENT_MASK) {
-                        print_pte(mon, env, (l0 << 48) + (l1 << 39) +
-                                  (l2 << 30) + (l3 << 21) + (l4 << 12),
-                                  pte & ~PG_PSE_MASK, 0x3fffffffff000ULL);
-                    }
-                }
-            }
-        }
-    }
+    monitor_printf(mon, "%s\n", buf);
 }
 
-static void tlb_info_la57(Monitor *mon, CPUArchState *env)
+static
+int mem_print_tlb(CPUState *cs, void *data, PTE_t *pte,
+                  target_ulong vaddr, int height, int offset)
 {
-    uint64_t l0;
-    uint64_t pml5e;
-    uint64_t pml5_addr;
-
-    pml5_addr = env->cr[3] & 0x3fffffffff000ULL;
-    for (l0 = 0; l0 < 512; l0++) {
-        cpu_physical_memory_read(pml5_addr + l0 * 8, &pml5e, 8);
-        pml5e = le64_to_cpu(pml5e);
-        if (pml5e & PG_PRESENT_MASK) {
-            tlb_info_la48(mon, env, l0, pml5e & 0x3fffffffff000ULL);
-        }
-    }
+    struct mem_print_state *state = (struct mem_print_state *) data;
+    print_pte(state->mon, state->env, vaddr, pte->pte64_t);
+    return 0;
 }
-#endif /* TARGET_X86_64 */
 
 void hmp_info_tlb(Monitor *mon, const QDict *qdict)
 {
-    CPUArchState *env;
+    struct mem_print_state state;
 
-    env = mon_get_cpu_env(mon);
-    if (!env) {
-        monitor_printf(mon, "No CPU available\n");
+    CPUState *cs = mon_get_cpu(mon);
+    if (!cs) {
+        monitor_printf(mon, "Unable to get CPUState. Internal error\n");
         return;
     }
 
-    if (!(env->cr[0] & CR0_PG_MASK)) {
-        monitor_printf(mon, "PG disabled\n");
+    if (!init_iterator(mon, &state)) {
         return;
     }
-    if (env->cr[4] & CR4_PAE_MASK) {
-#ifdef TARGET_X86_64
-        if (env->hflags & HF_LMA_MASK) {
-            if (env->cr[4] & CR4_LA57_MASK) {
-                tlb_info_la57(mon, env);
-            } else {
-                tlb_info_la48(mon, env, 0, env->cr[3] & 0x3fffffffff000ULL);
-            }
-        } else
-#endif
-        {
-            tlb_info_pae32(mon, env);
-        }
-    } else {
-        tlb_info_32(mon, env);
-    }
+
+    /**
+     * 'info tlb' visits only leaf PTEs marked present.
+     * It does not check other protection bits.
+     */
+    for_each_pte(cs, &mem_print_tlb, &state, false, false);
 }
 
 static void mem_print(Monitor *mon, CPUArchState *env,

From patchwork Fri May 24 17:07:45 2024
X-Patchwork-Submitter: Don Porter
X-Patchwork-Id: 13673339
From: Don Porter
To: qemu-devel@nongnu.org
Cc: dave@treblig.org, peter.maydell@linaro.org, nadav.amit@gmail.com, richard.henderson@linaro.org, Don Porter
Subject: [PATCH v2 3/6] Convert 'info mem' to use generic iterator
Date: Fri, 24 May 2024 13:07:45 -0400
Message-Id: <20240524170748.1842030-4-porter@cs.unc.edu>
In-Reply-To: <20240524170748.1842030-1-porter@cs.unc.edu>
References: <20240524170748.1842030-1-porter@cs.unc.edu>

Signed-off-by: Don Porter
---
 target/i386/monitor.c | 344 +++++------------------------------------
 1 file changed, 35 insertions(+), 309 deletions(-)

diff --git a/target/i386/monitor.c b/target/i386/monitor.c
index adf95edfb4..147743392d 100644
--- a/target/i386/monitor.c
+++ b/target/i386/monitor.c
@@ -480,332 +480,58 @@ void hmp_info_tlb(Monitor *mon, const QDict *qdict)
     for_each_pte(cs, &mem_print_tlb, &state, false, false);
 }
 
-static void mem_print(Monitor *mon, CPUArchState *env,
-                      hwaddr *pstart, int *plast_prot,
-                      hwaddr end, int prot)
-{
-    int prot1;
-    prot1 = *plast_prot;
-    if (prot != prot1) {
-        if (*pstart != -1) {
-            monitor_printf(mon, HWADDR_FMT_plx "-" HWADDR_FMT_plx " "
-                           HWADDR_FMT_plx " %c%c%c\n",
-                           addr_canonical(env, *pstart),
-                           addr_canonical(env, end),
-                           addr_canonical(env, end - *pstart),
-                           prot1 & PG_USER_MASK ? 'u' : '-',
-                           'r',
-                           prot1 & PG_RW_MASK ? 'w' : '-');
-        }
-        if (prot != 0)
-            *pstart = end;
-        else
-            *pstart = -1;
-        *plast_prot = prot;
-    }
-}
-
-static void mem_info_32(Monitor *mon, CPUArchState *env)
+static
+bool mem_print(CPUState *cs, struct mem_print_state *state)
 {
-    unsigned int l1, l2;
-    int prot, last_prot;
-    uint32_t pgd, pde, pte;
-    hwaddr start, end;
-
-    pgd = env->cr[3] & ~0xfff;
-    last_prot = 0;
-    start = -1;
-    for(l1 = 0; l1 < 1024; l1++) {
-        cpu_physical_memory_read(pgd + l1 * 4, &pde, 4);
-        pde = le32_to_cpu(pde);
-        end = l1 << 22;
-        if (pde & PG_PRESENT_MASK) {
-            if ((pde & PG_PSE_MASK) && (env->cr[4] & CR4_PSE_MASK)) {
-                prot = pde & (PG_USER_MASK | PG_RW_MASK | PG_PRESENT_MASK);
-                mem_print(mon, env, &start, &last_prot, end, prot);
-            } else {
-                for(l2 = 0; l2 < 1024; l2++) {
-                    cpu_physical_memory_read((pde & ~0xfff) + l2 * 4, &pte, 4);
-                    pte = le32_to_cpu(pte);
-                    end = (l1 << 22) + (l2 << 12);
-                    if (pte & PG_PRESENT_MASK) {
-                        prot = pte & pde &
-                            (PG_USER_MASK | PG_RW_MASK | PG_PRESENT_MASK);
-                    } else {
-                        prot = 0;
-                    }
-                    mem_print(mon, env, &start, &last_prot, end, prot);
-                }
-            }
-        } else {
-            prot = 0;
-            mem_print(mon, env, &start, &last_prot, end, prot);
-        }
-    }
-    /* Flush last range */
-    mem_print(mon, env, &start, &last_prot, (hwaddr)1 << 32, 0);
-}
+    CPUArchState *env = state->env;
+    int i = 0;
 
-static void mem_info_pae32(Monitor *mon, CPUArchState *env)
-{
-    unsigned int l1, l2, l3;
-    int prot, last_prot;
-    uint64_t pdpe, pde, pte;
-    uint64_t pdp_addr, pd_addr, pt_addr;
-    hwaddr start, end;
-
-    pdp_addr = env->cr[3] & ~0x1f;
-    last_prot = 0;
-    start = -1;
-    for (l1 = 0; l1 < 4; l1++) {
-        cpu_physical_memory_read(pdp_addr + l1 * 8, &pdpe, 8);
-        pdpe = le64_to_cpu(pdpe);
-        end = l1 << 30;
-        if (pdpe & PG_PRESENT_MASK) {
-            pd_addr = pdpe & 0x3fffffffff000ULL;
-            for (l2 = 0; l2 < 512; l2++) {
-                cpu_physical_memory_read(pd_addr + l2 * 8, &pde, 8);
-                pde = le64_to_cpu(pde);
-                end = (l1 << 30) + (l2 << 21);
-                if (pde & PG_PRESENT_MASK) {
-                    if (pde & PG_PSE_MASK) {
-                        prot = pde & (PG_USER_MASK | PG_RW_MASK |
-                                      PG_PRESENT_MASK);
-                        mem_print(mon, env, &start, &last_prot, end, prot);
-                    } else {
-                        pt_addr = pde & 0x3fffffffff000ULL;
-                        for (l3 = 0; l3 < 512; l3++) {
-                            cpu_physical_memory_read(pt_addr + l3 * 8, &pte, 8);
-                            pte = le64_to_cpu(pte);
-                            end = (l1 << 30) + (l2 << 21) + (l3 << 12);
-                            if (pte & PG_PRESENT_MASK) {
-                                prot = pte & pde & (PG_USER_MASK | PG_RW_MASK |
-                                                    PG_PRESENT_MASK);
-                            } else {
-                                prot = 0;
-                            }
-                            mem_print(mon, env, &start, &last_prot, end, prot);
-                        }
-                    }
-                } else {
-                    prot = 0;
-                    mem_print(mon, env, &start, &last_prot, end, prot);
-                }
-            }
-        } else {
-            prot = 0;
-            mem_print(mon, env, &start, &last_prot, end, prot);
-        }
-    }
-    /* Flush last range */
-    mem_print(mon, env, &start, &last_prot, (hwaddr)1 << 32, 0);
-}
-
-
-#ifdef TARGET_X86_64
-static void mem_info_la48(Monitor *mon, CPUArchState *env)
-{
-    int prot, last_prot;
-    uint64_t l1, l2, l3, l4;
-    uint64_t pml4e, pdpe, pde, pte;
-    uint64_t pml4_addr, pdp_addr, pd_addr, pt_addr, start, end;
-
-    pml4_addr = env->cr[3] & 0x3fffffffff000ULL;
-    last_prot = 0;
-    start = -1;
-    for (l1 = 0; l1 < 512; l1++) {
-        cpu_physical_memory_read(pml4_addr + l1 * 8, &pml4e, 8);
-        pml4e = le64_to_cpu(pml4e);
-        end = l1 << 39;
-        if (pml4e & PG_PRESENT_MASK) {
-            pdp_addr = pml4e & 0x3fffffffff000ULL;
-            for (l2 = 0; l2 < 512; l2++) {
-                cpu_physical_memory_read(pdp_addr + l2 * 8, &pdpe, 8);
-                pdpe = le64_to_cpu(pdpe);
-                end = (l1 << 39) + (l2 << 30);
-                if (pdpe & PG_PRESENT_MASK) {
-                    if (pdpe & PG_PSE_MASK) {
-                        prot = pdpe & (PG_USER_MASK | PG_RW_MASK |
-                                       PG_PRESENT_MASK);
-                        prot &= pml4e;
-                        mem_print(mon, env, &start, &last_prot, end, prot);
-                    } else {
-                        pd_addr = pdpe & 0x3fffffffff000ULL;
-                        for (l3 = 0; l3 < 512; l3++) {
-                            cpu_physical_memory_read(pd_addr + l3 * 8, &pde, 8);
-                            pde = le64_to_cpu(pde);
-                            end = (l1 << 39) + (l2 << 30) + (l3 << 21);
-                            if (pde & PG_PRESENT_MASK) {
-                                if (pde & PG_PSE_MASK) {
-                                    prot = pde & (PG_USER_MASK | PG_RW_MASK |
-                                                  PG_PRESENT_MASK);
-                                    prot &= pml4e & pdpe;
-                                    mem_print(mon, env, &start,
-                                              &last_prot, end, prot);
-                                } else {
-                                    pt_addr = pde & 0x3fffffffff000ULL;
-                                    for (l4 = 0; l4 < 512; l4++) {
-                                        cpu_physical_memory_read(pt_addr
-                                                                 + l4 * 8,
-                                                                 &pte, 8);
-                                        pte = le64_to_cpu(pte);
-                                        end = (l1 << 39) + (l2 << 30) +
-                                            (l3 << 21) + (l4 << 12);
-                                        if (pte & PG_PRESENT_MASK) {
-                                            prot = pte & (PG_USER_MASK | PG_RW_MASK |
-                                                          PG_PRESENT_MASK);
-                                            prot &= pml4e & pdpe & pde;
-                                        } else {
-                                            prot = 0;
-                                        }
-                                        mem_print(mon, env, &start,
-                                                  &last_prot, end, prot);
-                                    }
-                                }
-                            } else {
-                                prot = 0;
-                                mem_print(mon, env, &start,
-                                          &last_prot, end, prot);
-                            }
-                        }
-                    }
-                } else {
-                    prot = 0;
-                    mem_print(mon, env, &start, &last_prot, end, prot);
-                }
-            }
-        } else {
-            prot = 0;
-            mem_print(mon, env, &start, &last_prot, end, prot);
+    /* We need to figure out the lowest populated level */
+    for ( ; i < state->max_height; i++) {
+        if (state->vstart[i] != -1) {
+            break;
         }
     }
-    /* Flush last range */
-    mem_print(mon, env, &start, &last_prot, (hwaddr)1 << 48, 0);
-}
 
-static void mem_info_la57(Monitor *mon, CPUArchState *env)
-{
-    int prot, last_prot;
-    uint64_t l0, l1, l2, l3, l4;
-    uint64_t pml5e, pml4e, pdpe, pde, pte;
-    uint64_t pml5_addr, pml4_addr, pdp_addr, pd_addr, pt_addr, start, end;
-
-    pml5_addr = env->cr[3] & 0x3fffffffff000ULL;
-    last_prot = 0;
-    start = -1;
-    for (l0 = 0; l0 < 512; l0++) {
-        cpu_physical_memory_read(pml5_addr + l0 * 8, &pml5e, 8);
-        pml5e = le64_to_cpu(pml5e);
-        end = l0 << 48;
-        if (!(pml5e & PG_PRESENT_MASK)) {
-            prot = 0;
-            mem_print(mon, env, &start, &last_prot, end, prot);
-            continue;
-        }
+    hwaddr vstart = state->vstart[i];
+    hwaddr end = state->vend[i] + mmu_pte_leaf_page_size(cs, i);
+    int prot = ent2prot(state->ent[i]);
 
-        pml4_addr = pml5e & 0x3fffffffff000ULL;
-        for (l1 = 0; l1 < 512; l1++) {
-            cpu_physical_memory_read(pml4_addr + l1 * 8, &pml4e, 8);
-            pml4e = le64_to_cpu(pml4e);
-            end = (l0 << 48) + (l1 << 39);
-            if (!(pml4e & PG_PRESENT_MASK)) {
-                prot = 0;
-                mem_print(mon, env, &start, &last_prot, end, prot);
-                continue;
-            }
-
-            pdp_addr = pml4e & 0x3fffffffff000ULL;
-            for (l2 = 0; l2 < 512; l2++) {
-                cpu_physical_memory_read(pdp_addr + l2 * 8, &pdpe, 8);
-                pdpe = le64_to_cpu(pdpe);
-                end = (l0 << 48) + (l1 << 39) + (l2 << 30);
-                if (!(pdpe & PG_PRESENT_MASK)) {
-                    prot = 0;
-                    mem_print(mon, env, &start, &last_prot, end, prot);
-                    continue;
-                }
-
-                if (pdpe & PG_PSE_MASK) {
-                    prot = pdpe & (PG_USER_MASK | PG_RW_MASK |
-                                   PG_PRESENT_MASK);
-                    prot &= pml5e & pml4e;
-                    mem_print(mon, env, &start, &last_prot, end, prot);
-                    continue;
-                }
-                pd_addr = pdpe & 0x3fffffffff000ULL;
-                for (l3 = 0; l3 < 512; l3++) {
-                    cpu_physical_memory_read(pd_addr + l3 * 8, &pde, 8);
-                    pde = le64_to_cpu(pde);
-                    end = (l0 << 48) + (l1 << 39) + (l2 << 30) + (l3 << 21);
-                    if (!(pde & PG_PRESENT_MASK)) {
-                        prot = 0;
-                        mem_print(mon, env, &start, &last_prot, end, prot);
-                        continue;
-                    }
-
-                    if (pde & PG_PSE_MASK) {
-                        prot = pde & (PG_USER_MASK | PG_RW_MASK |
-                                      PG_PRESENT_MASK);
-                        prot &= pml5e & pml4e & pdpe;
-                        mem_print(mon, env, &start, &last_prot, end, prot);
-                        continue;
-                    }
-
-                    pt_addr = pde & 0x3fffffffff000ULL;
-                    for (l4 = 0; l4 < 512; l4++) {
-                        cpu_physical_memory_read(pt_addr + l4 * 8, &pte, 8);
-                        pte = le64_to_cpu(pte);
-                        end = (l0 << 48) + (l1 << 39) + (l2 << 30) +
-                            (l3 << 21) + (l4 << 12);
-                        if (pte & PG_PRESENT_MASK) {
-                            prot = pte & (PG_USER_MASK | PG_RW_MASK |
-                                          PG_PRESENT_MASK);
-                            prot &= pml5e & pml4e & pdpe & pde;
-                        } else {
-                            prot = 0;
-                        }
-                        mem_print(mon, env, &start, &last_prot, end, prot);
-                    }
-                }
-            }
-        }
-    }
-    /* Flush last range */
-    mem_print(mon, env, &start, &last_prot, (hwaddr)1 << 57, 0);
+    monitor_printf(state->mon, HWADDR_FMT_plx "-" HWADDR_FMT_plx " "
+                   HWADDR_FMT_plx " %c%c%c\n",
+                   addr_canonical(env, vstart),
+                   addr_canonical(env, end),
+                   addr_canonical(env, end - vstart),
+                   prot & PG_USER_MASK ? 'u' : '-',
+                   'r',
+                   prot & PG_RW_MASK ? 'w' : '-');
+    return true;
 }
-#endif /* TARGET_X86_64 */
 
 void hmp_info_mem(Monitor *mon, const QDict *qdict)
 {
-    CPUArchState *env;
+    CPUState *cs;
+    struct mem_print_state state;
 
-    env = mon_get_cpu_env(mon);
-    if (!env) {
-        monitor_printf(mon, "No CPU available\n");
+    if (!init_iterator(mon, &state)) {
         return;
     }
+    state.flusher = mem_print;
 
-    if (!(env->cr[0] & CR0_PG_MASK)) {
-        monitor_printf(mon, "PG disabled\n");
+    cs = mon_get_cpu(mon);
+    if (!cs) {
+        monitor_printf(mon, "Unable to get CPUState. Internal error\n");
         return;
     }
-    if (env->cr[4] & CR4_PAE_MASK) {
-#ifdef TARGET_X86_64
-        if (env->hflags & HF_LMA_MASK) {
-            if (env->cr[4] & CR4_LA57_MASK) {
-                mem_info_la57(mon, env);
-            } else {
-                mem_info_la48(mon, env);
-            }
-        } else
-#endif
-        {
-            mem_info_pae32(mon, env);
-        }
-    } else {
-        mem_info_32(mon, env);
-    }
+
+    /**
+     * We must visit interior entries to update prot
+     */
+    for_each_pte(cs, &compressing_iterator, &state, true, false);
+
+    /* Flush the last entry, if needed */
+    mem_print(cs, &state);
 }
 
 void hmp_mce(Monitor *mon, const QDict *qdict)

From patchwork Fri May 24 17:07:46 2024
X-Patchwork-Submitter: Don Porter
X-Patchwork-Id: 13673341
From: Don Porter
To: qemu-devel@nongnu.org
Cc: dave@treblig.org, peter.maydell@linaro.org, nadav.amit@gmail.com, richard.henderson@linaro.org, Don Porter
Subject: [PATCH v2 4/6] Convert x86_cpu_get_memory_mapping() to use generic iterators
Date: Fri, 24 May 2024 13:07:46 -0400
Message-Id: <20240524170748.1842030-5-porter@cs.unc.edu>
In-Reply-To: <20240524170748.1842030-1-porter@cs.unc.edu>
References: <20240524170748.1842030-1-porter@cs.unc.edu>

Signed-off-by: Don Porter
---
 target/i386/arch_memory_mapping.c | 318 ++++--------------------------
 1 file changed, 40 insertions(+), 278 deletions(-)

diff --git a/target/i386/arch_memory_mapping.c b/target/i386/arch_memory_mapping.c
index 00bf2a2116..040464dd34 100644
--- a/target/i386/arch_memory_mapping.c
+++ b/target/i386/arch_memory_mapping.c
@@ -19,6 +19,7 @@
  ************** code hook implementations for x86 ***********
  */
 
+/* PAE Paging or IA-32e Paging */
 #define PML4_ADDR_MASK 0xffffffffff000ULL /* selects bits 51:12 */
 
 /**
@@ -499,302 +500,63 @@ bool for_each_pte(CPUState *cs,
 /**
  * Back to x86 hooks
  */
+struct memory_mapping_data {
+    MemoryMappingList *list;
+};
 
-/* PAE Paging or IA-32e Paging */
-static void walk_pte(MemoryMappingList *list, AddressSpace *as,
-                     hwaddr pte_start_addr,
-                     int32_t a20_mask, target_ulong start_line_addr)
-{
-    hwaddr pte_addr, start_paddr;
-    uint64_t pte;
-    target_ulong start_vaddr;
-    int i;
-
-    for (i = 0; i < 512; i++) {
-        pte_addr = (pte_start_addr + i * 8) & a20_mask;
-        pte = address_space_ldq(as, pte_addr, MEMTXATTRS_UNSPECIFIED, NULL);
-        if (!(pte & PG_PRESENT_MASK)) {
-            /* not present */
-            continue;
-        }
-
-        start_paddr = (pte & ~0xfff) & ~(0x1ULL << 63);
-        if (cpu_physical_memory_is_io(start_paddr)) {
-            /* I/O region */
-            continue;
-        }
-
-        start_vaddr = start_line_addr | ((i & 0x1ff) << 12);
-        memory_mapping_list_add_merge_sorted(list, start_paddr,
-                                             start_vaddr, 1 << 12);
-    }
-}
-
-/* 32-bit Paging */
-static void walk_pte2(MemoryMappingList *list, AddressSpace *as,
-                      hwaddr pte_start_addr, int32_t a20_mask,
-                      target_ulong start_line_addr)
-{
-    hwaddr pte_addr, start_paddr;
-    uint32_t pte;
-    target_ulong start_vaddr;
-    int i;
-
-    for (i = 0; i < 1024; i++) {
-        pte_addr = (pte_start_addr + i * 4) & a20_mask;
-        pte = address_space_ldl(as, pte_addr, MEMTXATTRS_UNSPECIFIED, NULL);
-        if (!(pte & PG_PRESENT_MASK)) {
-            /* not present */
-            continue;
-        }
-
-        start_paddr = pte & ~0xfff;
-        if (cpu_physical_memory_is_io(start_paddr)) {
-            /* I/O region */
-            continue;
-        }
-
-        start_vaddr = start_line_addr | ((i & 0x3ff) << 12);
-        memory_mapping_list_add_merge_sorted(list, start_paddr,
-                                             start_vaddr, 1 << 12);
-    }
-}
-
-/* PAE Paging or IA-32e Paging */
-#define PLM4_ADDR_MASK 0xffffffffff000ULL /* selects bits 51:12 */
-
-static void walk_pde(MemoryMappingList *list, AddressSpace *as,
-                     hwaddr pde_start_addr,
-                     int32_t a20_mask, target_ulong start_line_addr)
+static int add_memory_mapping_to_list(CPUState *cs, void *data, PTE_t *pte,
+                                      target_ulong vaddr, int height,
+                                      int offset)
 {
-    hwaddr pde_addr, pte_start_addr, start_paddr;
-    uint64_t pde;
-    target_ulong line_addr, start_vaddr;
-    int i;
-
-    for (i = 0; i < 512; i++) {
-        pde_addr = (pde_start_addr + i * 8) & a20_mask;
-        pde = address_space_ldq(as, pde_addr, MEMTXATTRS_UNSPECIFIED, NULL);
-        if (!(pde & PG_PRESENT_MASK)) {
-            /* not present */
-            continue;
-        }
-
-        line_addr = start_line_addr | ((i & 0x1ff) << 21);
-        if (pde & PG_PSE_MASK) {
-            /* 2 MB page */
-            start_paddr = (pde & ~0x1fffff) & ~(0x1ULL << 63);
-            if (cpu_physical_memory_is_io(start_paddr)) {
-                /* I/O region */
-                continue;
-            }
-            start_vaddr = line_addr;
-            memory_mapping_list_add_merge_sorted(list, start_paddr,
-                                                 start_vaddr, 1 << 21);
-            continue;
-        }
-
-        pte_start_addr = (pde & PLM4_ADDR_MASK) & a20_mask;
-        walk_pte(list, as, pte_start_addr, a20_mask, line_addr);
-    }
-}
+    X86CPU *cpu = X86_CPU(cs);
+    CPUX86State *env = &cpu->env;
 
-/* 32-bit Paging */
-static void walk_pde2(MemoryMappingList *list, AddressSpace *as,
-                      hwaddr pde_start_addr, int32_t a20_mask,
-                      bool pse)
-{
-    hwaddr pde_addr, pte_start_addr, start_paddr, high_paddr;
-    uint32_t pde;
-    target_ulong line_addr, start_vaddr;
-    int i;
+    struct memory_mapping_data *mm_data = (struct memory_mapping_data *) data;
 
-    for (i = 0; i < 1024; i++) {
-        pde_addr = (pde_start_addr + i * 4) & a20_mask;
-        pde = address_space_ldl(as, pde_addr, MEMTXATTRS_UNSPECIFIED, NULL);
-        if (!(pde & PG_PRESENT_MASK)) {
-            /* not present */
-            continue;
-        }
+    hwaddr start_paddr = 0;
+    size_t pg_size = mmu_pte_leaf_page_size(cs, height);
+    switch (height) {
+    case 1:
+        start_paddr = pte->pte64_t & ~0xfff;
+        if (env->cr[4] & CR4_PAE_MASK) {
+            start_paddr &= ~(0x1ULL << 63);
         }
-
-        line_addr = (((unsigned int)i & 0x3ff) << 22);
-        if ((pde & PG_PSE_MASK) && pse) {
+        break;
+    case 2:
+        if (env->cr[4] & CR4_PAE_MASK) {
+            start_paddr = (pte->pte64_t & ~0x1fffff) & ~(0x1ULL << 63);
+        } else {
+            assert(!!(env->cr[4] & CR4_PSE_MASK));
             /*
              * 4 MB page:
              * bits 39:32 are bits 20:13 of the PDE
              * bit3 31:22 are bits 31:22 of the PDE
              */
-            high_paddr = ((hwaddr)(pde & 0x1fe000) << 19);
-            start_paddr = (pde & ~0x3fffff) | high_paddr;
-            if (cpu_physical_memory_is_io(start_paddr)) {
-                /* I/O region */
-                continue;
-            }
-            start_vaddr = line_addr;
-            memory_mapping_list_add_merge_sorted(list, start_paddr,
-                                                 start_vaddr, 1 << 22);
-            continue;
-        }
-
-        pte_start_addr = (pde & ~0xfff) & a20_mask;
-        walk_pte2(list, as, pte_start_addr, a20_mask, line_addr);
-    }
-}
-
-/* PAE Paging */
-static void walk_pdpe2(MemoryMappingList *list, AddressSpace *as,
-                       hwaddr pdpe_start_addr, int32_t a20_mask)
-{
-    hwaddr pdpe_addr, pde_start_addr;
-    uint64_t pdpe;
-    target_ulong line_addr;
-    int i;
-
-    for (i = 0; i < 4; i++) {
-        pdpe_addr = (pdpe_start_addr + i * 8) & a20_mask;
-        pdpe = address_space_ldq(as, pdpe_addr, MEMTXATTRS_UNSPECIFIED, NULL);
-        if (!(pdpe & PG_PRESENT_MASK)) {
-            /* not present */
-            continue;
+            hwaddr high_paddr = ((hwaddr)(pte->pte64_t & 0x1fe000) << 19);
+            start_paddr = (pte->pte64_t & ~0x3fffff) | high_paddr;
         }
-
-        line_addr = (((unsigned int)i & 0x3) << 30);
-        pde_start_addr = (pdpe & ~0xfff) & a20_mask;
-        walk_pde(list, as, pde_start_addr, a20_mask, line_addr);
-    }
-}
-
-#ifdef TARGET_X86_64
-/* IA-32e Paging */
-static void walk_pdpe(MemoryMappingList *list, AddressSpace *as,
-                      hwaddr pdpe_start_addr, int32_t a20_mask,
-                      target_ulong start_line_addr)
-{
-    hwaddr pdpe_addr, pde_start_addr, start_paddr;
-    uint64_t pdpe;
-    target_ulong line_addr, start_vaddr;
-    int i;
-
-    for (i = 0; 
i < 512; i++) { - pdpe_addr = (pdpe_start_addr + i * 8) & a20_mask; - pdpe = address_space_ldq(as, pdpe_addr, MEMTXATTRS_UNSPECIFIED, NULL); - if (!(pdpe & PG_PRESENT_MASK)) { - /* not present */ - continue; - } - - line_addr = start_line_addr | ((i & 0x1ffULL) << 30); - if (pdpe & PG_PSE_MASK) { - /* 1 GB page */ - start_paddr = (pdpe & ~0x3fffffff) & ~(0x1ULL << 63); - if (cpu_physical_memory_is_io(start_paddr)) { - /* I/O region */ - continue; - } - start_vaddr = line_addr; - memory_mapping_list_add_merge_sorted(list, start_paddr, - start_vaddr, 1 << 30); - continue; - } - - pde_start_addr = (pdpe & PLM4_ADDR_MASK) & a20_mask; - walk_pde(list, as, pde_start_addr, a20_mask, line_addr); + break; + case 3: + /* Select bits 30--51 */ + start_paddr = (pte->pte64_t & 0xfffffc0000000); + break; + default: + g_assert_not_reached(); } -} - -/* IA-32e Paging */ -static void walk_pml4e(MemoryMappingList *list, AddressSpace *as, - hwaddr pml4e_start_addr, int32_t a20_mask, - target_ulong start_line_addr) -{ - hwaddr pml4e_addr, pdpe_start_addr; - uint64_t pml4e; - target_ulong line_addr; - int i; - for (i = 0; i < 512; i++) { - pml4e_addr = (pml4e_start_addr + i * 8) & a20_mask; - pml4e = address_space_ldq(as, pml4e_addr, MEMTXATTRS_UNSPECIFIED, - NULL); - if (!(pml4e & PG_PRESENT_MASK)) { - /* not present */ - continue; - } - - line_addr = start_line_addr | ((i & 0x1ffULL) << 39); - pdpe_start_addr = (pml4e & PLM4_ADDR_MASK) & a20_mask; - walk_pdpe(list, as, pdpe_start_addr, a20_mask, line_addr); + /* This hook skips mappings for the I/O region */ + if (cpu_physical_memory_is_io(start_paddr)) { + /* I/O region */ + return 0; } -} -static void walk_pml5e(MemoryMappingList *list, AddressSpace *as, - hwaddr pml5e_start_addr, int32_t a20_mask) -{ - hwaddr pml5e_addr, pml4e_start_addr; - uint64_t pml5e; - target_ulong line_addr; - int i; - - for (i = 0; i < 512; i++) { - pml5e_addr = (pml5e_start_addr + i * 8) & a20_mask; - pml5e = address_space_ldq(as, pml5e_addr, 
MEMTXATTRS_UNSPECIFIED, - NULL); - if (!(pml5e & PG_PRESENT_MASK)) { - /* not present */ - continue; - } - - line_addr = (0x7fULL << 57) | ((i & 0x1ffULL) << 48); - pml4e_start_addr = (pml5e & PLM4_ADDR_MASK) & a20_mask; - walk_pml4e(list, as, pml4e_start_addr, a20_mask, line_addr); - } + memory_mapping_list_add_merge_sorted(mm_data->list, start_paddr, + vaddr, pg_size); + return 0; } -#endif bool x86_cpu_get_memory_mapping(CPUState *cs, MemoryMappingList *list, Error **errp) { - X86CPU *cpu = X86_CPU(cs); - CPUX86State *env = &cpu->env; - int32_t a20_mask; - - if (!cpu_paging_enabled(cs)) { - /* paging is disabled */ - return true; - } - - a20_mask = x86_get_a20_mask(env); - if (env->cr[4] & CR4_PAE_MASK) { -#ifdef TARGET_X86_64 - if (env->hflags & HF_LMA_MASK) { - if (env->cr[4] & CR4_LA57_MASK) { - hwaddr pml5e_addr; - - pml5e_addr = (env->cr[3] & PLM4_ADDR_MASK) & a20_mask; - walk_pml5e(list, cs->as, pml5e_addr, a20_mask); - } else { - hwaddr pml4e_addr; - - pml4e_addr = (env->cr[3] & PLM4_ADDR_MASK) & a20_mask; - walk_pml4e(list, cs->as, pml4e_addr, a20_mask, - 0xffffULL << 48); - } - } else -#endif - { - hwaddr pdpe_addr; - - pdpe_addr = (env->cr[3] & ~0x1f) & a20_mask; - walk_pdpe2(list, cs->as, pdpe_addr, a20_mask); - } - } else { - hwaddr pde_addr; - bool pse; - - pde_addr = (env->cr[3] & ~0xfff) & a20_mask; - pse = !!(env->cr[4] & CR4_PSE_MASK); - walk_pde2(list, cs->as, pde_addr, a20_mask, pse); - } - - return true; + return for_each_pte(cs, &add_memory_mapping_to_list, list, false, false); } From patchwork Fri May 24 17:07:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Don Porter X-Patchwork-Id: 13673337 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate 
From: Don Porter
To: qemu-devel@nongnu.org
Cc: dave@treblig.org, peter.maydell@linaro.org, nadav.amit@gmail.com, richard.henderson@linaro.org, Don Porter
Subject: [PATCH v2 5/6] Move tcg implementation of x86 get_physical_address into common helper code.
Date: Fri, 24 May 2024 13:07:47 -0400 Message-Id: <20240524170748.1842030-6-porter@cs.unc.edu> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240524170748.1842030-1-porter@cs.unc.edu> References: <20240524170748.1842030-1-porter@cs.unc.edu> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::c2f; envelope-from=porter@cs.unc.edu; helo=mail-oo1-xc2f.google.com X-Spam_score_int: -19 X-Spam_score: -2.0 X-Spam_bar: -- X-Spam_report: (-2.0 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Signed-off-by: Don Porter --- target/i386/cpu.h | 41 ++ target/i386/helper.c | 515 +++++++++++++++++++++++++ target/i386/tcg/sysemu/excp_helper.c | 555 +-------------------------- 3 files changed, 561 insertions(+), 550 deletions(-) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index fc3ae55213..39ce49e61f 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -2094,6 +2094,42 @@ struct X86CPUClass { ResettablePhases parent_phases; }; +typedef struct X86TranslateParams { + target_ulong addr; + target_ulong cr3; + int pg_mode; + int mmu_idx; + int ptw_idx; + MMUAccessType access_type; +} X86TranslateParams; + +typedef struct X86TranslateResult { + hwaddr paddr; + int prot; + int page_size; +} X86TranslateResult; + +typedef enum X86TranslateFaultStage2 { + S2_NONE, + S2_GPA, + S2_GPT, +} X86TranslateFaultStage2; + +typedef struct X86TranslateFault { + int exception_index; + int error_code; + target_ulong cr2; + X86TranslateFaultStage2 stage2; +} X86TranslateFault; + +typedef struct X86PTETranslate { + CPUX86State *env; 
+ X86TranslateFault *err; + int ptw_idx; + void *haddr; + hwaddr gaddr; +} X86PTETranslate; + /* Intended to become a generic PTE type */ typedef union PTE { uint64_t pte64_t; @@ -2137,6 +2173,11 @@ void x86_cpu_list(void); int cpu_x86_support_mca_broadcast(CPUX86State *env); #ifndef CONFIG_USER_ONLY +bool x86_cpu_get_physical_address(CPUX86State *env, vaddr addr, + MMUAccessType access_type, int mmu_idx, + X86TranslateResult *out, + X86TranslateFault *err, uint64_t ra); + hwaddr x86_cpu_get_phys_page_attrs_debug(CPUState *cpu, vaddr addr, MemTxAttrs *attrs); int cpu_get_pic_interrupt(CPUX86State *s); diff --git a/target/i386/helper.c b/target/i386/helper.c index 48d1513a35..21445e84b2 100644 --- a/target/i386/helper.c +++ b/target/i386/helper.c @@ -26,6 +26,7 @@ #include "sysemu/hw_accel.h" #include "monitor/monitor.h" #include "kvm/kvm_i386.h" +#include "exec/cpu_ldst.h" #endif #include "qemu/log.h" #ifdef CONFIG_TCG @@ -227,6 +228,520 @@ void cpu_x86_update_cr4(CPUX86State *env, uint32_t new_cr4) } #if !defined(CONFIG_USER_ONLY) + +static inline uint32_t ptw_ldl(const X86PTETranslate *in, uint64_t ra) +{ + if (likely(in->haddr)) { + return ldl_p(in->haddr); + } + return cpu_ldl_mmuidx_ra(in->env, in->gaddr, in->ptw_idx, ra); +} + +static inline uint64_t ptw_ldq(const X86PTETranslate *in, uint64_t ra) +{ + if (likely(in->haddr)) { + return ldq_p(in->haddr); + } + return cpu_ldq_mmuidx_ra(in->env, in->gaddr, in->ptw_idx, ra); +} +/* + * Note that we can use a 32-bit cmpxchg for all page table entries, + * even 64-bit ones, because PG_PRESENT_MASK, PG_ACCESSED_MASK and + * PG_DIRTY_MASK are all in the low 32 bits. + */ +static bool ptw_setl_slow(const X86PTETranslate *in, uint32_t old, uint32_t new) +{ + uint32_t cmp; + + /* Does x86 really perform a rmw cycle on mmio for ptw? 
*/ + start_exclusive(); + cmp = cpu_ldl_mmuidx_ra(in->env, in->gaddr, in->ptw_idx, 0); + if (cmp == old) { + cpu_stl_mmuidx_ra(in->env, in->gaddr, new, in->ptw_idx, 0); + } + end_exclusive(); + return cmp == old; +} + +static inline bool ptw_setl(const X86PTETranslate *in, uint32_t old, + uint32_t set) +{ + if (set & ~old) { + uint32_t new = old | set; + if (likely(in->haddr)) { + old = cpu_to_le32(old); + new = cpu_to_le32(new); + return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old; + } + return ptw_setl_slow(in, old, new); + } + return true; +} + + +static bool ptw_translate(X86PTETranslate *inout, hwaddr addr, uint64_t ra) +{ + CPUTLBEntryFull *full; + int flags; + + inout->gaddr = addr; + flags = probe_access_full(inout->env, addr, 0, MMU_DATA_STORE, + inout->ptw_idx, true, &inout->haddr, &full, ra); + + if (unlikely(flags & TLB_INVALID_MASK)) { + X86TranslateFault *err = inout->err; + + assert(inout->ptw_idx == MMU_NESTED_IDX); + *err = (X86TranslateFault){ + .error_code = inout->env->error_code, + .cr2 = addr, + .stage2 = S2_GPT, + }; + return false; + } + return true; +} + +static bool x86_mmu_translate(CPUX86State *env, const X86TranslateParams *in, + X86TranslateResult *out, + X86TranslateFault *err, uint64_t ra) +{ + const target_ulong addr = in->addr; + const int pg_mode = in->pg_mode; + const bool is_user = is_mmu_index_user(in->mmu_idx); + const MMUAccessType access_type = in->access_type; + uint64_t ptep, pte, rsvd_mask; + X86PTETranslate pte_trans = { + .env = env, + .err = err, + .ptw_idx = in->ptw_idx, + }; + hwaddr pte_addr, paddr; + uint32_t pkr; + int page_size; + int error_code; + + restart_all: + rsvd_mask = ~MAKE_64BIT_MASK(0, env_archcpu(env)->phys_bits); + rsvd_mask &= PG_ADDRESS_MASK; + if (!(pg_mode & PG_MODE_NXE)) { + rsvd_mask |= PG_NX_MASK; + } + + if (pg_mode & PG_MODE_PAE) { +#ifdef TARGET_X86_64 + if (pg_mode & PG_MODE_LMA) { + if (pg_mode & PG_MODE_LA57) { + /* + * Page table level 5 + */ + pte_addr = (in->cr3 & ~0xfff) 
+ (((addr >> 48) & 0x1ff) << 3); + if (!ptw_translate(&pte_trans, pte_addr, ra)) { + return false; + } + restart_5: + pte = ptw_ldq(&pte_trans, ra); + if (!(pte & PG_PRESENT_MASK)) { + goto do_fault; + } + if (pte & (rsvd_mask | PG_PSE_MASK)) { + goto do_fault_rsvd; + } + if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { + goto restart_5; + } + ptep = pte ^ PG_NX_MASK; + } else { + pte = in->cr3; + ptep = PG_NX_MASK | PG_USER_MASK | PG_RW_MASK; + } + + /* + * Page table level 4 + */ + pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 39) & 0x1ff) << 3); + if (!ptw_translate(&pte_trans, pte_addr, ra)) { + return false; + } + restart_4: + pte = ptw_ldq(&pte_trans, ra); + if (!(pte & PG_PRESENT_MASK)) { + goto do_fault; + } + if (pte & (rsvd_mask | PG_PSE_MASK)) { + goto do_fault_rsvd; + } + if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { + goto restart_4; + } + ptep &= pte ^ PG_NX_MASK; + + /* + * Page table level 3 + */ + pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 30) & 0x1ff) << 3); + if (!ptw_translate(&pte_trans, pte_addr, ra)) { + return false; + } + restart_3_lma: + pte = ptw_ldq(&pte_trans, ra); + if (!(pte & PG_PRESENT_MASK)) { + goto do_fault; + } + if (pte & rsvd_mask) { + goto do_fault_rsvd; + } + if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { + goto restart_3_lma; + } + ptep &= pte ^ PG_NX_MASK; + if (pte & PG_PSE_MASK) { + /* 1 GB page */ + page_size = 1024 * 1024 * 1024; + goto do_check_protect; + } + } else +#endif + { + /* + * Page table level 3 + */ + pte_addr = (in->cr3 & 0xffffffe0ULL) + ((addr >> 27) & 0x18); + if (!ptw_translate(&pte_trans, pte_addr, ra)) { + return false; + } + rsvd_mask |= PG_HI_USER_MASK; + restart_3_nolma: + pte = ptw_ldq(&pte_trans, ra); + if (!(pte & PG_PRESENT_MASK)) { + goto do_fault; + } + if (pte & (rsvd_mask | PG_NX_MASK)) { + goto do_fault_rsvd; + } + if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { + goto restart_3_nolma; + } + ptep = PG_NX_MASK | PG_USER_MASK | PG_RW_MASK; + } + + /* + * Page table 
level 2 + */ + pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 21) & 0x1ff) << 3); + if (!ptw_translate(&pte_trans, pte_addr, ra)) { + return false; + } + restart_2_pae: + pte = ptw_ldq(&pte_trans, ra); + if (!(pte & PG_PRESENT_MASK)) { + goto do_fault; + } + if (pte & rsvd_mask) { + goto do_fault_rsvd; + } + if (pte & PG_PSE_MASK) { + /* 2 MB page */ + page_size = 2048 * 1024; + ptep &= pte ^ PG_NX_MASK; + goto do_check_protect; + } + if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { + goto restart_2_pae; + } + ptep &= pte ^ PG_NX_MASK; + + /* + * Page table level 1 + */ + pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 12) & 0x1ff) << 3); + if (!ptw_translate(&pte_trans, pte_addr, ra)) { + return false; + } + pte = ptw_ldq(&pte_trans, ra); + if (!(pte & PG_PRESENT_MASK)) { + goto do_fault; + } + if (pte & rsvd_mask) { + goto do_fault_rsvd; + } + /* combine pde and pte nx, user and rw protections */ + ptep &= pte ^ PG_NX_MASK; + page_size = 4096; + } else { + /* + * Page table level 2 + */ + pte_addr = (in->cr3 & 0xfffff000ULL) + ((addr >> 20) & 0xffc); + if (!ptw_translate(&pte_trans, pte_addr, ra)) { + return false; + } + restart_2_nopae: + pte = ptw_ldl(&pte_trans, ra); + if (!(pte & PG_PRESENT_MASK)) { + goto do_fault; + } + ptep = pte | PG_NX_MASK; + + /* if PSE bit is set, then we use a 4MB page */ + if ((pte & PG_PSE_MASK) && (pg_mode & PG_MODE_PSE)) { + page_size = 4096 * 1024; + /* + * Bits 20-13 provide bits 39-32 of the address, bit 21 is reserved. + * Leave bits 20-13 in place for setting accessed/dirty bits below. 
+ */ + pte = (uint32_t)pte | ((pte & 0x1fe000LL) << (32 - 13)); + rsvd_mask = 0x200000; + goto do_check_protect_pse36; + } + if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { + goto restart_2_nopae; + } + + /* + * Page table level 1 + */ + pte_addr = (pte & ~0xfffu) + ((addr >> 10) & 0xffc); + if (!ptw_translate(&pte_trans, pte_addr, ra)) { + return false; + } + pte = ptw_ldl(&pte_trans, ra); + if (!(pte & PG_PRESENT_MASK)) { + goto do_fault; + } + /* combine pde and pte user and rw protections */ + ptep &= pte | PG_NX_MASK; + page_size = 4096; + rsvd_mask = 0; + } + +do_check_protect: + rsvd_mask |= (page_size - 1) & PG_ADDRESS_MASK & ~PG_PSE_PAT_MASK; +do_check_protect_pse36: + if (pte & rsvd_mask) { + goto do_fault_rsvd; + } + ptep ^= PG_NX_MASK; + + /* can the page can be put in the TLB? prot will tell us */ + if (is_user && !(ptep & PG_USER_MASK)) { + goto do_fault_protect; + } + + int prot = 0; + if (!is_mmu_index_smap(in->mmu_idx) || !(ptep & PG_USER_MASK)) { + prot |= PAGE_READ; + if ((ptep & PG_RW_MASK) || !(is_user || (pg_mode & PG_MODE_WP))) { + prot |= PAGE_WRITE; + } + } + if (!(ptep & PG_NX_MASK) && + (is_user || + !((pg_mode & PG_MODE_SMEP) && (ptep & PG_USER_MASK)))) { + prot |= PAGE_EXEC; + } + + if (ptep & PG_USER_MASK) { + pkr = pg_mode & PG_MODE_PKE ? env->pkru : 0; + } else { + pkr = pg_mode & PG_MODE_PKS ? env->pkrs : 0; + } + if (pkr) { + uint32_t pk = (pte & PG_PKRU_MASK) >> PG_PKRU_BIT; + uint32_t pkr_ad = (pkr >> pk * 2) & 1; + uint32_t pkr_wd = (pkr >> pk * 2) & 2; + uint32_t pkr_prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC; + + if (pkr_ad) { + pkr_prot &= ~(PAGE_READ | PAGE_WRITE); + } else if (pkr_wd && (is_user || (pg_mode & PG_MODE_WP))) { + pkr_prot &= ~PAGE_WRITE; + } + if ((pkr_prot & (1 << access_type)) == 0) { + goto do_fault_pk_protect; + } + prot &= pkr_prot; + } + + if ((prot & (1 << access_type)) == 0) { + goto do_fault_protect; + } + + /* yes, it can! 
*/ + { + uint32_t set = PG_ACCESSED_MASK; + if (access_type == MMU_DATA_STORE) { + set |= PG_DIRTY_MASK; + } else if (!(pte & PG_DIRTY_MASK)) { + /* + * Only set write access if already dirty... + * otherwise wait for dirty access. + */ + prot &= ~PAGE_WRITE; + } + if (!ptw_setl(&pte_trans, pte, set)) { + /* + * We can arrive here from any of 3 levels and 2 formats. + * The only safe thing is to restart the entire lookup. + */ + goto restart_all; + } + } + + /* merge offset within page */ + paddr = (pte & PG_ADDRESS_MASK & ~(page_size - 1)) | + (addr & (page_size - 1)); + + /* + * Note that NPT is walked (for both paging structures and final guest + * addresses) using the address with the A20 bit set. + */ + if (in->ptw_idx == MMU_NESTED_IDX) { + CPUTLBEntryFull *full; + int flags, nested_page_size; + + flags = probe_access_full(env, paddr, 0, access_type, + MMU_NESTED_IDX, true, + &pte_trans.haddr, &full, 0); + if (unlikely(flags & TLB_INVALID_MASK)) { + *err = (X86TranslateFault){ + .error_code = env->error_code, + .cr2 = paddr, + .stage2 = S2_GPA, + }; + return false; + } + + /* Merge stage1 & stage2 protection bits. */ + prot &= full->prot; + + /* Re-verify resulting protection. */ + if ((prot & (1 << access_type)) == 0) { + goto do_fault_protect; + } + + /* Merge stage1 & stage2 addresses to final physical address. */ + nested_page_size = 1 << full->lg_page_size; + paddr = (full->phys_addr & ~(nested_page_size - 1)) + | (paddr & (nested_page_size - 1)); + + /* + * Use the larger of stage1 & stage2 page sizes, so that + * invalidation works. 
+ */ + if (nested_page_size > page_size) { + page_size = nested_page_size; + } + } + + out->paddr = paddr & x86_get_a20_mask(env); + out->prot = prot; + out->page_size = page_size; + return true; + + do_fault_rsvd: + error_code = PG_ERROR_RSVD_MASK; + goto do_fault_cont; + do_fault_protect: + error_code = PG_ERROR_P_MASK; + goto do_fault_cont; + do_fault_pk_protect: + assert(access_type != MMU_INST_FETCH); + error_code = PG_ERROR_PK_MASK | PG_ERROR_P_MASK; + goto do_fault_cont; + do_fault: + error_code = 0; + do_fault_cont: + if (is_user) { + error_code |= PG_ERROR_U_MASK; + } + switch (access_type) { + case MMU_DATA_LOAD: + break; + case MMU_DATA_STORE: + error_code |= PG_ERROR_W_MASK; + break; + case MMU_INST_FETCH: + if (pg_mode & (PG_MODE_NXE | PG_MODE_SMEP)) { + error_code |= PG_ERROR_I_D_MASK; + } + break; + } + *err = (X86TranslateFault){ + .exception_index = EXCP0E_PAGE, + .error_code = error_code, + .cr2 = addr, + }; + return false; +} + +bool x86_cpu_get_physical_address(CPUX86State *env, vaddr addr, + MMUAccessType access_type, int mmu_idx, + X86TranslateResult *out, + X86TranslateFault *err, uint64_t ra) +{ + X86TranslateParams in; + bool use_stage2 = env->hflags2 & HF2_NPT_MASK; + + in.addr = addr; + in.access_type = access_type; + + switch (mmu_idx) { + case MMU_PHYS_IDX: + break; + + case MMU_NESTED_IDX: + if (likely(use_stage2)) { + in.cr3 = env->nested_cr3; + in.pg_mode = env->nested_pg_mode; + in.mmu_idx = + env->nested_pg_mode & PG_MODE_LMA ? + MMU_USER64_IDX : MMU_USER32_IDX; + in.ptw_idx = MMU_PHYS_IDX; + + if (!x86_mmu_translate(env, &in, out, err, ra)) { + err->stage2 = S2_GPA; + return false; + } + return true; + } + break; + + default: + if (is_mmu_index_32(mmu_idx)) { + addr = (uint32_t)addr; + } + + if (likely(env->cr[0] & CR0_PG_MASK)) { + in.cr3 = env->cr[3]; + in.mmu_idx = mmu_idx; + in.ptw_idx = use_stage2 ? 
MMU_NESTED_IDX : MMU_PHYS_IDX; + in.pg_mode = get_pg_mode(env); + + if (in.pg_mode & PG_MODE_LMA) { + /* test virtual address sign extension */ + int shift = in.pg_mode & PG_MODE_LA57 ? 56 : 47; + int64_t sext = (int64_t)addr >> shift; + if (sext != 0 && sext != -1) { + *err = (X86TranslateFault){ + .exception_index = EXCP0D_GPF, + .cr2 = addr, + }; + return false; + } + } + return x86_mmu_translate(env, &in, out, err, ra); + } + break; + } + + /* No translation needed. */ + out->paddr = addr & x86_get_a20_mask(env); + out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC; + out->page_size = TARGET_PAGE_SIZE; + return true; +} + hwaddr x86_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr, MemTxAttrs *attrs) { diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c index 8fb05b1f53..4c48e5a68b 100644 --- a/target/i386/tcg/sysemu/excp_helper.c +++ b/target/i386/tcg/sysemu/excp_helper.c @@ -24,487 +24,7 @@ #include "exec/page-protection.h" #include "tcg/helper-tcg.h" -typedef struct TranslateParams { - target_ulong addr; - target_ulong cr3; - int pg_mode; - int mmu_idx; - int ptw_idx; - MMUAccessType access_type; -} TranslateParams; - -typedef struct TranslateResult { - hwaddr paddr; - int prot; - int page_size; -} TranslateResult; - -typedef enum TranslateFaultStage2 { - S2_NONE, - S2_GPA, - S2_GPT, -} TranslateFaultStage2; - -typedef struct TranslateFault { - int exception_index; - int error_code; - target_ulong cr2; - TranslateFaultStage2 stage2; -} TranslateFault; - -typedef struct PTETranslate { - CPUX86State *env; - TranslateFault *err; - int ptw_idx; - void *haddr; - hwaddr gaddr; -} PTETranslate; - -static bool ptw_translate(PTETranslate *inout, hwaddr addr, uint64_t ra) -{ - CPUTLBEntryFull *full; - int flags; - - inout->gaddr = addr; - flags = probe_access_full(inout->env, addr, 0, MMU_DATA_STORE, - inout->ptw_idx, true, &inout->haddr, &full, ra); - - if (unlikely(flags & TLB_INVALID_MASK)) { - TranslateFault *err = 
inout->err; - - assert(inout->ptw_idx == MMU_NESTED_IDX); - *err = (TranslateFault){ - .error_code = inout->env->error_code, - .cr2 = addr, - .stage2 = S2_GPT, - }; - return false; - } - return true; -} - -static inline uint32_t ptw_ldl(const PTETranslate *in, uint64_t ra) -{ - if (likely(in->haddr)) { - return ldl_p(in->haddr); - } - return cpu_ldl_mmuidx_ra(in->env, in->gaddr, in->ptw_idx, ra); -} - -static inline uint64_t ptw_ldq(const PTETranslate *in, uint64_t ra) -{ - if (likely(in->haddr)) { - return ldq_p(in->haddr); - } - return cpu_ldq_mmuidx_ra(in->env, in->gaddr, in->ptw_idx, ra); -} - -/* - * Note that we can use a 32-bit cmpxchg for all page table entries, - * even 64-bit ones, because PG_PRESENT_MASK, PG_ACCESSED_MASK and - * PG_DIRTY_MASK are all in the low 32 bits. - */ -static bool ptw_setl_slow(const PTETranslate *in, uint32_t old, uint32_t new) -{ - uint32_t cmp; - - /* Does x86 really perform a rmw cycle on mmio for ptw? */ - start_exclusive(); - cmp = cpu_ldl_mmuidx_ra(in->env, in->gaddr, in->ptw_idx, 0); - if (cmp == old) { - cpu_stl_mmuidx_ra(in->env, in->gaddr, new, in->ptw_idx, 0); - } - end_exclusive(); - return cmp == old; -} - -static inline bool ptw_setl(const PTETranslate *in, uint32_t old, uint32_t set) -{ - if (set & ~old) { - uint32_t new = old | set; - if (likely(in->haddr)) { - old = cpu_to_le32(old); - new = cpu_to_le32(new); - return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old; - } - return ptw_setl_slow(in, old, new); - } - return true; -} - -static bool mmu_translate(CPUX86State *env, const TranslateParams *in, - TranslateResult *out, TranslateFault *err, - uint64_t ra) -{ - const target_ulong addr = in->addr; - const int pg_mode = in->pg_mode; - const bool is_user = is_mmu_index_user(in->mmu_idx); - const MMUAccessType access_type = in->access_type; - uint64_t ptep, pte, rsvd_mask; - PTETranslate pte_trans = { - .env = env, - .err = err, - .ptw_idx = in->ptw_idx, - }; - hwaddr pte_addr, paddr; - uint32_t pkr; - 
int page_size; - int error_code; - - restart_all: - rsvd_mask = ~MAKE_64BIT_MASK(0, env_archcpu(env)->phys_bits); - rsvd_mask &= PG_ADDRESS_MASK; - if (!(pg_mode & PG_MODE_NXE)) { - rsvd_mask |= PG_NX_MASK; - } - - if (pg_mode & PG_MODE_PAE) { -#ifdef TARGET_X86_64 - if (pg_mode & PG_MODE_LMA) { - if (pg_mode & PG_MODE_LA57) { - /* - * Page table level 5 - */ - pte_addr = (in->cr3 & ~0xfff) + (((addr >> 48) & 0x1ff) << 3); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - restart_5: - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - if (pte & (rsvd_mask | PG_PSE_MASK)) { - goto do_fault_rsvd; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_5; - } - ptep = pte ^ PG_NX_MASK; - } else { - pte = in->cr3; - ptep = PG_NX_MASK | PG_USER_MASK | PG_RW_MASK; - } - - /* - * Page table level 4 - */ - pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 39) & 0x1ff) << 3); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - restart_4: - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - if (pte & (rsvd_mask | PG_PSE_MASK)) { - goto do_fault_rsvd; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_4; - } - ptep &= pte ^ PG_NX_MASK; - - /* - * Page table level 3 - */ - pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 30) & 0x1ff) << 3); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - restart_3_lma: - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - if (pte & rsvd_mask) { - goto do_fault_rsvd; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_3_lma; - } - ptep &= pte ^ PG_NX_MASK; - if (pte & PG_PSE_MASK) { - /* 1 GB page */ - page_size = 1024 * 1024 * 1024; - goto do_check_protect; - } - } else -#endif - { - /* - * Page table level 3 - */ - pte_addr = (in->cr3 & 0xffffffe0ULL) + ((addr >> 27) & 0x18); - if (!ptw_translate(&pte_trans, 
pte_addr, ra)) { - return false; - } - rsvd_mask |= PG_HI_USER_MASK; - restart_3_nolma: - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - if (pte & (rsvd_mask | PG_NX_MASK)) { - goto do_fault_rsvd; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_3_nolma; - } - ptep = PG_NX_MASK | PG_USER_MASK | PG_RW_MASK; - } - - /* - * Page table level 2 - */ - pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 21) & 0x1ff) << 3); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - restart_2_pae: - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - if (pte & rsvd_mask) { - goto do_fault_rsvd; - } - if (pte & PG_PSE_MASK) { - /* 2 MB page */ - page_size = 2048 * 1024; - ptep &= pte ^ PG_NX_MASK; - goto do_check_protect; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_2_pae; - } - ptep &= pte ^ PG_NX_MASK; - - /* - * Page table level 1 - */ - pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 12) & 0x1ff) << 3); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - if (pte & rsvd_mask) { - goto do_fault_rsvd; - } - /* combine pde and pte nx, user and rw protections */ - ptep &= pte ^ PG_NX_MASK; - page_size = 4096; - } else { - /* - * Page table level 2 - */ - pte_addr = (in->cr3 & 0xfffff000ULL) + ((addr >> 20) & 0xffc); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - restart_2_nopae: - pte = ptw_ldl(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - ptep = pte | PG_NX_MASK; - - /* if PSE bit is set, then we use a 4MB page */ - if ((pte & PG_PSE_MASK) && (pg_mode & PG_MODE_PSE)) { - page_size = 4096 * 1024; - /* - * Bits 20-13 provide bits 39-32 of the address, bit 21 is reserved. - * Leave bits 20-13 in place for setting accessed/dirty bits below. 
- */ - pte = (uint32_t)pte | ((pte & 0x1fe000LL) << (32 - 13)); - rsvd_mask = 0x200000; - goto do_check_protect_pse36; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_2_nopae; - } - - /* - * Page table level 1 - */ - pte_addr = (pte & ~0xfffu) + ((addr >> 10) & 0xffc); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - pte = ptw_ldl(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - /* combine pde and pte user and rw protections */ - ptep &= pte | PG_NX_MASK; - page_size = 4096; - rsvd_mask = 0; - } - -do_check_protect: - rsvd_mask |= (page_size - 1) & PG_ADDRESS_MASK & ~PG_PSE_PAT_MASK; -do_check_protect_pse36: - if (pte & rsvd_mask) { - goto do_fault_rsvd; - } - ptep ^= PG_NX_MASK; - - /* can the page can be put in the TLB? prot will tell us */ - if (is_user && !(ptep & PG_USER_MASK)) { - goto do_fault_protect; - } - - int prot = 0; - if (!is_mmu_index_smap(in->mmu_idx) || !(ptep & PG_USER_MASK)) { - prot |= PAGE_READ; - if ((ptep & PG_RW_MASK) || !(is_user || (pg_mode & PG_MODE_WP))) { - prot |= PAGE_WRITE; - } - } - if (!(ptep & PG_NX_MASK) && - (is_user || - !((pg_mode & PG_MODE_SMEP) && (ptep & PG_USER_MASK)))) { - prot |= PAGE_EXEC; - } - - if (ptep & PG_USER_MASK) { - pkr = pg_mode & PG_MODE_PKE ? env->pkru : 0; - } else { - pkr = pg_mode & PG_MODE_PKS ? env->pkrs : 0; - } - if (pkr) { - uint32_t pk = (pte & PG_PKRU_MASK) >> PG_PKRU_BIT; - uint32_t pkr_ad = (pkr >> pk * 2) & 1; - uint32_t pkr_wd = (pkr >> pk * 2) & 2; - uint32_t pkr_prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC; - - if (pkr_ad) { - pkr_prot &= ~(PAGE_READ | PAGE_WRITE); - } else if (pkr_wd && (is_user || (pg_mode & PG_MODE_WP))) { - pkr_prot &= ~PAGE_WRITE; - } - if ((pkr_prot & (1 << access_type)) == 0) { - goto do_fault_pk_protect; - } - prot &= pkr_prot; - } - - if ((prot & (1 << access_type)) == 0) { - goto do_fault_protect; - } - - /* yes, it can! 
*/ - { - uint32_t set = PG_ACCESSED_MASK; - if (access_type == MMU_DATA_STORE) { - set |= PG_DIRTY_MASK; - } else if (!(pte & PG_DIRTY_MASK)) { - /* - * Only set write access if already dirty... - * otherwise wait for dirty access. - */ - prot &= ~PAGE_WRITE; - } - if (!ptw_setl(&pte_trans, pte, set)) { - /* - * We can arrive here from any of 3 levels and 2 formats. - * The only safe thing is to restart the entire lookup. - */ - goto restart_all; - } - } - - /* merge offset within page */ - paddr = (pte & PG_ADDRESS_MASK & ~(page_size - 1)) | (addr & (page_size - 1)); - - /* - * Note that NPT is walked (for both paging structures and final guest - * addresses) using the address with the A20 bit set. - */ - if (in->ptw_idx == MMU_NESTED_IDX) { - CPUTLBEntryFull *full; - int flags, nested_page_size; - - flags = probe_access_full(env, paddr, 0, access_type, - MMU_NESTED_IDX, true, - &pte_trans.haddr, &full, 0); - if (unlikely(flags & TLB_INVALID_MASK)) { - *err = (TranslateFault){ - .error_code = env->error_code, - .cr2 = paddr, - .stage2 = S2_GPA, - }; - return false; - } - - /* Merge stage1 & stage2 protection bits. */ - prot &= full->prot; - - /* Re-verify resulting protection. */ - if ((prot & (1 << access_type)) == 0) { - goto do_fault_protect; - } - - /* Merge stage1 & stage2 addresses to final physical address. */ - nested_page_size = 1 << full->lg_page_size; - paddr = (full->phys_addr & ~(nested_page_size - 1)) - | (paddr & (nested_page_size - 1)); - - /* - * Use the larger of stage1 & stage2 page sizes, so that - * invalidation works. 
- */ - if (nested_page_size > page_size) { - page_size = nested_page_size; - } - } - - out->paddr = paddr & x86_get_a20_mask(env); - out->prot = prot; - out->page_size = page_size; - return true; - - do_fault_rsvd: - error_code = PG_ERROR_RSVD_MASK; - goto do_fault_cont; - do_fault_protect: - error_code = PG_ERROR_P_MASK; - goto do_fault_cont; - do_fault_pk_protect: - assert(access_type != MMU_INST_FETCH); - error_code = PG_ERROR_PK_MASK | PG_ERROR_P_MASK; - goto do_fault_cont; - do_fault: - error_code = 0; - do_fault_cont: - if (is_user) { - error_code |= PG_ERROR_U_MASK; - } - switch (access_type) { - case MMU_DATA_LOAD: - break; - case MMU_DATA_STORE: - error_code |= PG_ERROR_W_MASK; - break; - case MMU_INST_FETCH: - if (pg_mode & (PG_MODE_NXE | PG_MODE_SMEP)) { - error_code |= PG_ERROR_I_D_MASK; - } - break; - } - *err = (TranslateFault){ - .exception_index = EXCP0E_PAGE, - .error_code = error_code, - .cr2 = addr, - }; - return false; -} - -static G_NORETURN void raise_stage2(CPUX86State *env, TranslateFault *err, +static G_NORETURN void raise_stage2(CPUX86State *env, X86TranslateFault *err, uintptr_t retaddr) { uint64_t exit_info_1 = err->error_code; @@ -526,82 +46,17 @@ static G_NORETURN void raise_stage2(CPUX86State *env, TranslateFault *err, cpu_vmexit(env, SVM_EXIT_NPF, exit_info_1, retaddr); } -static bool get_physical_address(CPUX86State *env, vaddr addr, - MMUAccessType access_type, int mmu_idx, - TranslateResult *out, TranslateFault *err, - uint64_t ra) -{ - TranslateParams in; - bool use_stage2 = env->hflags2 & HF2_NPT_MASK; - - in.addr = addr; - in.access_type = access_type; - - switch (mmu_idx) { - case MMU_PHYS_IDX: - break; - - case MMU_NESTED_IDX: - if (likely(use_stage2)) { - in.cr3 = env->nested_cr3; - in.pg_mode = env->nested_pg_mode; - in.mmu_idx = - env->nested_pg_mode & PG_MODE_LMA ? 
MMU_USER64_IDX : MMU_USER32_IDX; - in.ptw_idx = MMU_PHYS_IDX; - - if (!mmu_translate(env, &in, out, err, ra)) { - err->stage2 = S2_GPA; - return false; - } - return true; - } - break; - - default: - if (is_mmu_index_32(mmu_idx)) { - addr = (uint32_t)addr; - } - - if (likely(env->cr[0] & CR0_PG_MASK)) { - in.cr3 = env->cr[3]; - in.mmu_idx = mmu_idx; - in.ptw_idx = use_stage2 ? MMU_NESTED_IDX : MMU_PHYS_IDX; - in.pg_mode = get_pg_mode(env); - - if (in.pg_mode & PG_MODE_LMA) { - /* test virtual address sign extension */ - int shift = in.pg_mode & PG_MODE_LA57 ? 56 : 47; - int64_t sext = (int64_t)addr >> shift; - if (sext != 0 && sext != -1) { - *err = (TranslateFault){ - .exception_index = EXCP0D_GPF, - .cr2 = addr, - }; - return false; - } - } - return mmu_translate(env, &in, out, err, ra); - } - break; - } - - /* No translation needed. */ - out->paddr = addr & x86_get_a20_mask(env); - out->prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC; - out->page_size = TARGET_PAGE_SIZE; - return true; -} bool x86_cpu_tlb_fill(CPUState *cs, vaddr addr, int size, MMUAccessType access_type, int mmu_idx, bool probe, uintptr_t retaddr) { CPUX86State *env = cpu_env(cs); - TranslateResult out; - TranslateFault err; + X86TranslateResult out; + X86TranslateFault err; - if (get_physical_address(env, addr, access_type, mmu_idx, &out, &err, - retaddr)) { + if (x86_cpu_get_physical_address(env, addr, access_type, mmu_idx, &out, + &err, retaddr)) { /* * Even if 4MB pages, we map only one 4KB page in the cache to * avoid filling it too fast. 
From patchwork Fri May 24 17:07:48 2024
X-Patchwork-Submitter: Don Porter
X-Patchwork-Id: 13673338
From: Don Porter
To: qemu-devel@nongnu.org
Cc: dave@treblig.org, peter.maydell@linaro.org, nadav.amit@gmail.com,
    richard.henderson@linaro.org, Don Porter
Subject: [PATCH v2 6/6] Convert x86_mmu_translate() to use common code.
Date: Fri, 24 May 2024 13:07:48 -0400
Message-Id: <20240524170748.1842030-7-porter@cs.unc.edu>
In-Reply-To: <20240524170748.1842030-1-porter@cs.unc.edu>
References: <20240524170748.1842030-1-porter@cs.unc.edu>

Signed-off-by: Don Porter
---
 target/i386/arch_memory_mapping.c    |  37 ++-
 target/i386/cpu.h                    |  11 +-
 target/i386/helper.c                 | 371 ++++++---------------
 target/i386/tcg/sysemu/excp_helper.c |   2 +-
 4 files changed, 128 insertions(+), 293 deletions(-)

diff --git a/target/i386/arch_memory_mapping.c b/target/i386/arch_memory_mapping.c
index 040464dd34..9ea5aeff16 100644
--- a/target/i386/arch_memory_mapping.c
+++ b/target/i386/arch_memory_mapping.c
@@ -33,7 +33,7 @@
  * Returns a hardware address on success. Should not fail (i.e., caller is
  * responsible to ensure that a page table is actually present).
*/ -static hwaddr mmu_page_table_root(CPUState *cs, int *height) +hwaddr mmu_page_table_root(CPUState *cs, int *height) { X86CPU *cpu = X86_CPU(cs); CPUX86State *env = &cpu->env; @@ -228,6 +228,35 @@ static void _mmu_decode_va_parameters(CPUState *cs, int height, } } +/** + * mmu_virtual_to_pte_index - Given a virtual address and height in the + * page table radix tree, return the index that should be used + * to look up the next page table entry (pte) in translating an + * address. + * + * @cs - CPU state + * @vaddr - The virtual address to translate + * @height - height of node within the tree (leaves are 1, not 0). + * + * Example: In 32-bit x86 page tables, the virtual address is split + * into 10 bits at height 2, 10 bits at height 1, and 12 offset bits. + * So a call with VA and height 2 would return the first 10 bits of va, + * right shifted by 22. + */ + +int mmu_virtual_to_pte_index(CPUState *cs, target_ulong vaddr, int height) +{ + int shift = 0; + int width = 0; + int mask = 0; + + _mmu_decode_va_parameters(cs, height, &shift, &width); + + mask = (1 << width) - 1; + + return (vaddr >> shift) & mask; +} + /** * get_pte - Copy the contents of the page table entry at node[i] into pt_entry. * Optionally, add the relevant bits to the virtual address in @@ -247,7 +276,7 @@ static void _mmu_decode_va_parameters(CPUState *cs, int height, * Optional parameter. 
*/ -static void +void get_pte(CPUState *cs, hwaddr node, int i, int height, PTE_t *pt_entry, target_ulong vaddr_parent, target_ulong *vaddr_pte, hwaddr *pte_paddr) @@ -284,7 +313,7 @@ get_pte(CPUState *cs, hwaddr node, int i, int height, } -static bool +bool mmu_pte_check_bits(CPUState *cs, PTE_t *pte, int64_t mask) { X86CPU *cpu = X86_CPU(cs); @@ -300,7 +329,7 @@ mmu_pte_check_bits(CPUState *cs, PTE_t *pte, int64_t mask) * mmu_pte_presetn - Return true if the pte is * marked 'present' */ -static bool +bool mmu_pte_present(CPUState *cs, PTE_t *pte) { return mmu_pte_check_bits(cs, pte, PG_PRESENT_MASK); diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 39ce49e61f..51d4a55e6b 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -2151,15 +2151,23 @@ int x86_cpu_write_elf64_qemunote(WriteCoreDumpFunction f, CPUState *cpu, int x86_cpu_write_elf32_qemunote(WriteCoreDumpFunction f, CPUState *cpu, DumpState *s); +hwaddr mmu_page_table_root(CPUState *cs, int *height); +bool mmu_pte_check_bits(CPUState *cs, PTE_t *pte, int64_t mask); +bool mmu_pte_present(CPUState *cs, PTE_t *pte); bool mmu_pte_leaf(CPUState *cs, int height, PTE_t *pte); target_ulong mmu_pte_leaf_page_size(CPUState *cs, int height); hwaddr mmu_pte_child(CPUState *cs, PTE_t *pte, int height); int mmu_page_table_entries_per_node(CPUState *cs, int height); +int mmu_virtual_to_pte_index(CPUState *cs, target_ulong vaddr, int height); bool for_each_pte(CPUState *cs, int (*fn)(CPUState *cs, void *data, PTE_t *pte, target_ulong vaddr, int height, int offset), void *data, bool visit_interior_nodes, bool visit_not_present); +void get_pte(CPUState *cs, hwaddr node, int i, int height, PTE_t *pt_entry, + target_ulong vaddr_parent, target_ulong *vaddr_pte, + hwaddr *pte_paddr); + bool x86_cpu_get_memory_mapping(CPUState *cpu, MemoryMappingList *list, Error **errp); @@ -2176,7 +2184,8 @@ int cpu_x86_support_mca_broadcast(CPUX86State *env); bool x86_cpu_get_physical_address(CPUX86State *env, vaddr addr, 
MMUAccessType access_type, int mmu_idx, X86TranslateResult *out, - X86TranslateFault *err, uint64_t ra); + X86TranslateFault *err, uint64_t ra, + bool read_only); hwaddr x86_cpu_get_phys_page_attrs_debug(CPUState *cpu, vaddr addr, MemTxAttrs *attrs); diff --git a/target/i386/helper.c b/target/i386/helper.c index 21445e84b2..17ffba200d 100644 --- a/target/i386/helper.c +++ b/target/i386/helper.c @@ -304,7 +304,8 @@ static bool ptw_translate(X86PTETranslate *inout, hwaddr addr, uint64_t ra) static bool x86_mmu_translate(CPUX86State *env, const X86TranslateParams *in, X86TranslateResult *out, - X86TranslateFault *err, uint64_t ra) + X86TranslateFault *err, uint64_t ra, + bool read_only) { const target_ulong addr = in->addr; const int pg_mode = in->pg_mode; @@ -320,6 +321,9 @@ static bool x86_mmu_translate(CPUX86State *env, const X86TranslateParams *in, uint32_t pkr; int page_size; int error_code; + CPUState *cs = env_cpu(env); + int height; + bool pae_enabled = env->cr[4] & CR4_PAE_MASK; restart_all: rsvd_mask = ~MAKE_64BIT_MASK(0, env_archcpu(env)->phys_bits); @@ -328,194 +332,85 @@ static bool x86_mmu_translate(CPUX86State *env, const X86TranslateParams *in, rsvd_mask |= PG_NX_MASK; } - if (pg_mode & PG_MODE_PAE) { -#ifdef TARGET_X86_64 - if (pg_mode & PG_MODE_LMA) { - if (pg_mode & PG_MODE_LA57) { - /* - * Page table level 5 - */ - pte_addr = (in->cr3 & ~0xfff) + (((addr >> 48) & 0x1ff) << 3); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - restart_5: - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - if (pte & (rsvd_mask | PG_PSE_MASK)) { - goto do_fault_rsvd; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_5; - } - ptep = pte ^ PG_NX_MASK; - } else { - pte = in->cr3; - ptep = PG_NX_MASK | PG_USER_MASK | PG_RW_MASK; - } + /* Get the root of the page table */ - /* - * Page table level 4 - */ - pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 39) & 0x1ff) << 3); - if 
(!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - restart_4: - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - if (pte & (rsvd_mask | PG_PSE_MASK)) { - goto do_fault_rsvd; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_4; - } - ptep &= pte ^ PG_NX_MASK; + /* + * ptep is really an accumulator for the permission bits. + * Thus, the xor-ing totally trashes the high bits, and that is + * ok - we only care about the low ones. + */ + ptep = PG_NX_MASK | PG_USER_MASK | PG_RW_MASK; + hwaddr pt_node = mmu_page_table_root(cs, &height); - /* - * Page table level 3 - */ - pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 30) & 0x1ff) << 3); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - restart_3_lma: - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - if (pte & rsvd_mask) { - goto do_fault_rsvd; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_3_lma; - } - ptep &= pte ^ PG_NX_MASK; - if (pte & PG_PSE_MASK) { - /* 1 GB page */ - page_size = 1024 * 1024 * 1024; - goto do_check_protect; - } - } else -#endif - { - /* - * Page table level 3 - */ - pte_addr = (in->cr3 & 0xffffffe0ULL) + ((addr >> 27) & 0x18); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - rsvd_mask |= PG_HI_USER_MASK; - restart_3_nolma: - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - if (pte & (rsvd_mask | PG_NX_MASK)) { - goto do_fault_rsvd; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_3_nolma; - } - ptep = PG_NX_MASK | PG_USER_MASK | PG_RW_MASK; - } + /* Special case for PAE paging */ + if (height == 3 && pg_mode & PG_MODE_PAE) { + rsvd_mask |= PG_HI_USER_MASK; + } - /* - * Page table level 2 - */ - pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 21) & 0x1ff) << 3); + int i = height; + do { + int index = mmu_virtual_to_pte_index(cs, addr, i); + PTE_t 
pt_entry; + uint64_t my_rsvd_mask = rsvd_mask; + + get_pte(cs, pt_node, index, i, &pt_entry, 0, NULL, &pte_addr); + /* Check that we can access the page table entry */ if (!ptw_translate(&pte_trans, pte_addr, ra)) { return false; } - restart_2_pae: - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { + + restart: + if (!mmu_pte_present(cs, &pt_entry)) { goto do_fault; } - if (pte & rsvd_mask) { - goto do_fault_rsvd; - } - if (pte & PG_PSE_MASK) { - /* 2 MB page */ - page_size = 2048 * 1024; - ptep &= pte ^ PG_NX_MASK; - goto do_check_protect; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_2_pae; - } - ptep &= pte ^ PG_NX_MASK; - /* - * Page table level 1 - */ - pte_addr = (pte & PG_ADDRESS_MASK) + (((addr >> 12) & 0x1ff) << 3); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - pte = ptw_ldq(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; + /* For height > 3, check and reject PSE mask */ + if (i > 3) { + my_rsvd_mask |= PG_PSE_MASK; } - if (pte & rsvd_mask) { + + if (mmu_pte_check_bits(cs, &pt_entry, my_rsvd_mask)) { goto do_fault_rsvd; } - /* combine pde and pte nx, user and rw protections */ - ptep &= pte ^ PG_NX_MASK; - page_size = 4096; - } else { - /* - * Page table level 2 - */ - pte_addr = (in->cr3 & 0xfffff000ULL) + ((addr >> 20) & 0xffc); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; - } - restart_2_nopae: - pte = ptw_ldl(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; - } - ptep = pte | PG_NX_MASK; - /* if PSE bit is set, then we use a 4MB page */ - if ((pte & PG_PSE_MASK) && (pg_mode & PG_MODE_PSE)) { - page_size = 4096 * 1024; - /* - * Bits 20-13 provide bits 39-32 of the address, bit 21 is reserved. - * Leave bits 20-13 in place for setting accessed/dirty bits below. 
- */ - pte = (uint32_t)pte | ((pte & 0x1fe000LL) << (32 - 13)); - rsvd_mask = 0x200000; - goto do_check_protect_pse36; - } - if (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK)) { - goto restart_2_nopae; - } + pte = pt_entry.pte64_t; - /* - * Page table level 1 - */ - pte_addr = (pte & ~0xfffu) + ((addr >> 10) & 0xffc); - if (!ptw_translate(&pte_trans, pte_addr, ra)) { - return false; + /* Check if we have hit a leaf. Won't happen (yet) at heights > 3. */ + if (mmu_pte_leaf(cs, i, &pt_entry)) { + assert(i < 4); + page_size = mmu_pte_leaf_page_size(cs, i); + ptep &= pte ^ PG_NX_MASK; + + if (!pae_enabled) { + if (i == 2) { + /* + * Bits 20-13 provide bits 39-32 of the address, + * bit 21 is reserved. Leave bits 20-13 in place + * for setting accessed/dirty bits below. + */ + pte = (uint32_t)pte | ((pte & 0x1fe000LL) << (32 - 13)); + rsvd_mask = 0x200000; + goto do_check_protect_pse36; + } else if (i == 1) { + rsvd_mask = 0; + } + } + break; /* goto do_check_protect; */ } - pte = ptw_ldl(&pte_trans, ra); - if (!(pte & PG_PRESENT_MASK)) { - goto do_fault; + + if ((!read_only) && + (!ptw_setl(&pte_trans, pte, PG_ACCESSED_MASK))) { + goto restart; } - /* combine pde and pte user and rw protections */ - ptep &= pte | PG_NX_MASK; - page_size = 4096; - rsvd_mask = 0; - } -do_check_protect: + ptep &= pte ^ PG_NX_MASK; + + /* Move to the child node */ + assert(i > 1); + pt_node = mmu_pte_child(cs, &pt_entry, i - 1); + i--; + } while (i > 0); + rsvd_mask |= (page_size - 1) & PG_ADDRESS_MASK & ~PG_PSE_PAT_MASK; do_check_protect_pse36: if (pte & rsvd_mask) { @@ -675,10 +570,16 @@ do_check_protect_pse36: return false; } +/** + * The read-only argument indicates whether this access should + * trigger exceptions or otherwise disrupt TLB/MMU state. + * It should be true for monitor access, and false for tcg access. 
+ */ bool x86_cpu_get_physical_address(CPUX86State *env, vaddr addr, MMUAccessType access_type, int mmu_idx, X86TranslateResult *out, - X86TranslateFault *err, uint64_t ra) + X86TranslateFault *err, uint64_t ra, + bool read_only) { X86TranslateParams in; bool use_stage2 = env->hflags2 & HF2_NPT_MASK; @@ -699,7 +600,7 @@ bool x86_cpu_get_physical_address(CPUX86State *env, vaddr addr, MMU_USER64_IDX : MMU_USER32_IDX; in.ptw_idx = MMU_PHYS_IDX; - if (!x86_mmu_translate(env, &in, out, err, ra)) { + if (!x86_mmu_translate(env, &in, out, err, ra, read_only)) { err->stage2 = S2_GPA; return false; } @@ -730,7 +631,7 @@ bool x86_cpu_get_physical_address(CPUX86State *env, vaddr addr, return false; } } - return x86_mmu_translate(env, &in, out, err, ra); + return x86_mmu_translate(env, &in, out, err, ra, read_only); } break; } @@ -747,123 +648,19 @@ hwaddr x86_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr, { X86CPU *cpu = X86_CPU(cs); CPUX86State *env = &cpu->env; - target_ulong pde_addr, pte_addr; - uint64_t pte; - int32_t a20_mask; - uint32_t page_offset; - int page_size; + X86TranslateResult out; + X86TranslateFault err; *attrs = cpu_get_mem_attrs(env); - a20_mask = x86_get_a20_mask(env); - if (!(env->cr[0] & CR0_PG_MASK)) { - pte = addr & a20_mask; - page_size = 4096; - } else if (env->cr[4] & CR4_PAE_MASK) { - target_ulong pdpe_addr; - uint64_t pde, pdpe; - -#ifdef TARGET_X86_64 - if (env->hflags & HF_LMA_MASK) { - bool la57 = env->cr[4] & CR4_LA57_MASK; - uint64_t pml5e_addr, pml5e; - uint64_t pml4e_addr, pml4e; - int32_t sext; - - /* test virtual address sign extension */ - sext = la57 ? 
(int64_t)addr >> 56 : (int64_t)addr >> 47; - if (sext != 0 && sext != -1) { - return -1; - } - - if (la57) { - pml5e_addr = ((env->cr[3] & ~0xfff) + - (((addr >> 48) & 0x1ff) << 3)) & a20_mask; - pml5e = x86_ldq_phys(cs, pml5e_addr); - if (!(pml5e & PG_PRESENT_MASK)) { - return -1; - } - } else { - pml5e = env->cr[3]; - } - - pml4e_addr = ((pml5e & PG_ADDRESS_MASK) + - (((addr >> 39) & 0x1ff) << 3)) & a20_mask; - pml4e = x86_ldq_phys(cs, pml4e_addr); - if (!(pml4e & PG_PRESENT_MASK)) { - return -1; - } - pdpe_addr = ((pml4e & PG_ADDRESS_MASK) + - (((addr >> 30) & 0x1ff) << 3)) & a20_mask; - pdpe = x86_ldq_phys(cs, pdpe_addr); - if (!(pdpe & PG_PRESENT_MASK)) { - return -1; - } - if (pdpe & PG_PSE_MASK) { - page_size = 1024 * 1024 * 1024; - pte = pdpe; - goto out; - } - - } else -#endif - { - pdpe_addr = ((env->cr[3] & ~0x1f) + ((addr >> 27) & 0x18)) & - a20_mask; - pdpe = x86_ldq_phys(cs, pdpe_addr); - if (!(pdpe & PG_PRESENT_MASK)) - return -1; - } - - pde_addr = ((pdpe & PG_ADDRESS_MASK) + - (((addr >> 21) & 0x1ff) << 3)) & a20_mask; - pde = x86_ldq_phys(cs, pde_addr); - if (!(pde & PG_PRESENT_MASK)) { - return -1; - } - if (pde & PG_PSE_MASK) { - /* 2 MB page */ - page_size = 2048 * 1024; - pte = pde; - } else { - /* 4 KB page */ - pte_addr = ((pde & PG_ADDRESS_MASK) + - (((addr >> 12) & 0x1ff) << 3)) & a20_mask; - page_size = 4096; - pte = x86_ldq_phys(cs, pte_addr); - } - if (!(pte & PG_PRESENT_MASK)) { - return -1; - } - } else { - uint32_t pde; - - /* page directory entry */ - pde_addr = ((env->cr[3] & ~0xfff) + ((addr >> 20) & 0xffc)) & a20_mask; - pde = x86_ldl_phys(cs, pde_addr); - if (!(pde & PG_PRESENT_MASK)) - return -1; - if ((pde & PG_PSE_MASK) && (env->cr[4] & CR4_PSE_MASK)) { - pte = pde | ((pde & 0x1fe000LL) << (32 - 13)); - page_size = 4096 * 1024; - } else { - /* page directory entry */ - pte_addr = ((pde & ~0xfff) + ((addr >> 10) & 0xffc)) & a20_mask; - pte = x86_ldl_phys(cs, pte_addr); - if (!(pte & PG_PRESENT_MASK)) { - return -1; - } - 
page_size = 4096; - } - pte = pte & a20_mask; + /* This function merges the offset bits for us */ + if (!x86_cpu_get_physical_address(env, addr, MMU_DATA_LOAD, + cpu_mmu_index(cs, false), + &out, &err, 0, true)) { + return -1; } -#ifdef TARGET_X86_64 -out: -#endif - pte &= PG_ADDRESS_MASK & ~(page_size - 1); - page_offset = (addr & TARGET_PAGE_MASK) & (page_size - 1); - return pte | page_offset; + return out.paddr; } typedef struct MCEInjectionParams { diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c index 4c48e5a68b..c85db11f05 100644 --- a/target/i386/tcg/sysemu/excp_helper.c +++ b/target/i386/tcg/sysemu/excp_helper.c @@ -56,7 +56,7 @@ bool x86_cpu_tlb_fill(CPUState *cs, vaddr addr, int size, X86TranslateFault err; if (x86_cpu_get_physical_address(env, addr, access_type, mmu_idx, &out, - &err, retaddr)) { + &err, retaddr, false)) { /* * Even if 4MB pages, we map only one 4KB page in the cache to * avoid filling it too fast.