Message ID | 174453620439.31282.5525507256376485910.tip-bot2@tip-bot2 (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [tip:,x86/urgent] x86/e820: Discard high memory that can't be addressed by 32-bit systems | expand |
On 4/13/25 02:23, tip-bot2 for Mike Rapoport (Microsoft) wrote: > + /* > + * 32-bit systems are limited to 4BG of memory even with HIGHMEM and > + * to even less without it. > + * Discard memory after max_pfn - the actual limit detected at runtime. > + */ > + if (IS_ENABLED(CONFIG_X86_32)) > + memblock_remove(PFN_PHYS(max_pfn), -1); Mike, thanks for the quick fix! I did verify that this gets my silly test VM booting again. The patch obviously _works_. But in the case I was hitting max_pfn was set MAX_NONPAE_PFN. The unfortunate part about this hunk is that it's far away from the related warning: > if (max_pfn > MAX_NONPAE_PFN) { > max_pfn = MAX_NONPAE_PFN; > printk(KERN_WARNING MSG_HIGHMEM_TRIMMED); > } and it's logically doing the same thing: truncating memory at MAX_NONPAE_PFN. How about we reuse 'MAX_NONPAE_PFN' like this: if (IS_ENABLED(CONFIG_X86_32)) memblock_remove(PFN_PHYS(MAX_NONPAE_PFN), -1); Would that make the connection more obvious?
On Mon, Apr 14, 2025 at 07:19:02AM -0700, Dave Hansen wrote: > On 4/13/25 02:23, tip-bot2 for Mike Rapoport (Microsoft) wrote: > > + /* > > + * 32-bit systems are limited to 4BG of memory even with HIGHMEM and > > + * to even less without it. > > + * Discard memory after max_pfn - the actual limit detected at runtime. > > + */ > > + if (IS_ENABLED(CONFIG_X86_32)) > > + memblock_remove(PFN_PHYS(max_pfn), -1); > > Mike, thanks for the quick fix! I did verify that this gets my silly > test VM booting again. > > The patch obviously _works_. But in the case I was hitting max_pfn was > set MAX_NONPAE_PFN. The unfortunate part about this hunk is that it's > far away from the related warning: Yeah, my first instinct was to put memblock_remove() in the same 'if', but there's no memblock there yet :) > > if (max_pfn > MAX_NONPAE_PFN) { > > max_pfn = MAX_NONPAE_PFN; > > printk(KERN_WARNING MSG_HIGHMEM_TRIMMED); > > } > > and it's logically doing the same thing: truncating memory at > MAX_NONPAE_PFN. > > How about we reuse 'MAX_NONPAE_PFN' like this: > > if (IS_ENABLED(CONFIG_X86_32)) > memblock_remove(PFN_PHYS(MAX_NONPAE_PFN), -1); > > Would that make the connection more obvious? Yes, that's better. Here's the updated patch: From a235764221e4a849fa274a546ff2a3d9f15da2a9 Mon Sep 17 00:00:00 2001 From: "Mike Rapoport (Microsoft)" <rppt@kernel.org> Date: Sun, 13 Apr 2025 10:36:17 +0300 Subject: [PATCH v2] x86/e820: discard high memory that can't be addressed by 32-bit systems Dave Hansen reports the following crash on a 32-bit system with CONFIG_HIGHMEM=y and CONFIG_X86_PAE=y: > 0xf75fe000 is the mem_map[] entry for the first page >4GB. It > obviously wasn't allocated, thus the oops. BUG: unable to handle page fault for address: f75fe000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page *pdpt = 0000000002da2001 *pde = 000000000300c067 *pte = 0000000000000000 Oops: Oops: 0002 [#1] SMP NOPTI CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc1-00288-ge618ee89561b-dirty #311 PREEMPT(undef) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 EIP: __free_pages_core+0x3c/0x74 Code: c3 d3 e6 83 ec 10 89 44 24 08 89 74 24 04 c7 04 24 c6 32 3a c2 89 55 f4 e8 a9 11 45 fe 85 f6 8b 55 f4 74 19 89 d8 31 c9 66 90 <0f> ba 30 0d c7 40 1c 00 00 00 00 41 83 c0 28 39 ce 75 ed 8b EAX: f75fe000 EBX: f75fe000 ECX: 00000000 EDX: 0000000a ESI: 00000400 EDI: 00500000 EBP: c247becc ESP: c247beb4 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00210046 CR0: 80050033 CR2: f75fe000 CR3: 02da6000 CR4: 000000b0 Call Trace: memblock_free_pages+0x11/0x2c memblock_free_all+0x2ce/0x3a0 mm_core_init+0xf5/0x320 start_kernel+0x296/0x79c ? set_init_arg+0x70/0x70 ? load_ucode_bsp+0x13c/0x1a8 i386_start_kernel+0xad/0xb0 startup_32_smp+0x151/0x154 Modules linked in: CR2: 00000000f75fe000 The mem_map[] is allocated up to the end of ZONE_HIGHMEM which is defined by max_pfn. Before 6faea3422e3b ("arch, mm: streamline HIGHMEM freeing") freeing of high memory was also clamped to the end of ZONE_HIGHMEM but after 6faea3422e3b memblock_free_all() tries to free memory above the of ZONE_HIGHMEM as well and that causes access to mem_map[] entries beyond the end of the memory map. Discard the memory after MAX_NONPAE_PFN from memblock on 32-bit systems so that core MM would be aware only of actually usable memory. Reported-by: Dave Hansen <dave.hansen@intel.com> Tested-by: Arnd Bergmann <arnd@kernel.org> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> --- arch/x86/kernel/e820.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index 57120f0749cc..5e6b1034e6f1 100644 --- a/arch/x86/kernel/e820.c +++ b/arch/x86/kernel/e820.c @@ -1300,6 +1300,13 @@ void __init e820__memblock_setup(void) memblock_add(entry->addr, entry->size); } + /* + * Discard memory above 4GB because 32-bit systems are limited to 4GB + * of memory even with HIGHMEM. + */ + if (IS_ENABLED(CONFIG_X86_32)) + memblock_remove(PFN_PHYS(MAX_NONPAE_PFN), -1); + /* Throw away partial pages: */ memblock_trim_memory(PAGE_SIZE);
On 4/15/25 00:18, Mike Rapoport wrote: >> How about we reuse 'MAX_NONPAE_PFN' like this: >> >> if (IS_ENABLED(CONFIG_X86_32)) >> memblock_remove(PFN_PHYS(MAX_NONPAE_PFN), -1); >> >> Would that make the connection more obvious? > Yes, that's better. Here's the updated patch: Looks, great. Thanks for the update and the quick turnaround on the first one after the bug report! Tested-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Dave Hansen <dave.hansen@intel.com>
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index 9d8dd8d..9920122 100644 --- a/arch/x86/kernel/e820.c +++ b/arch/x86/kernel/e820.c @@ -1299,6 +1299,14 @@ void __init e820__memblock_setup(void) memblock_add(entry->addr, entry->size); } + /* + * 32-bit systems are limited to 4BG of memory even with HIGHMEM and + * to even less without it. + * Discard memory after max_pfn - the actual limit detected at runtime. + */ + if (IS_ENABLED(CONFIG_X86_32)) + memblock_remove(PFN_PHYS(max_pfn), -1); + /* Throw away partial pages: */ memblock_trim_memory(PAGE_SIZE);