[v4,1/3] crash: Fix x86_32 crash memory reserve dead loop bug

Message ID 20240719095735.1912878-2-ruanjinjie@huawei.com (mailing list archive)
State Handled Elsewhere
Series crash: Fix x86_32 memory reserve dead loop bug

Checks

Context Check Description
conchuod/vmtest-for-next-PR success PR summary
conchuod/patch-1-test-1 success .github/scripts/patches/tests/build_rv32_defconfig.sh
conchuod/patch-1-test-2 success .github/scripts/patches/tests/build_rv64_clang_allmodconfig.sh
conchuod/patch-1-test-3 success .github/scripts/patches/tests/build_rv64_gcc_allmodconfig.sh
conchuod/patch-1-test-4 success .github/scripts/patches/tests/build_rv64_nommu_k210_defconfig.sh
conchuod/patch-1-test-5 success .github/scripts/patches/tests/build_rv64_nommu_virt_defconfig.sh
conchuod/patch-1-test-6 success .github/scripts/patches/tests/checkpatch.sh
conchuod/patch-1-test-7 success .github/scripts/patches/tests/dtb_warn_rv64.sh
conchuod/patch-1-test-8 success .github/scripts/patches/tests/header_inline.sh
conchuod/patch-1-test-9 success .github/scripts/patches/tests/kdoc.sh
conchuod/patch-1-test-10 success .github/scripts/patches/tests/module_param.sh
conchuod/patch-1-test-11 success .github/scripts/patches/tests/verify_fixes.sh
conchuod/patch-1-test-12 success .github/scripts/patches/tests/verify_signedoff.sh

Commit Message

Jinjie Ruan July 19, 2024, 9:57 a.m. UTC
On an x86_32 QEMU machine with 1GB of memory, the cmdline
"crashkernel=1G,high" causes the system to stall as below:

	ACPI: Reserving FACP table memory at [mem 0x3ffe18b8-0x3ffe192b]
	ACPI: Reserving DSDT table memory at [mem 0x3ffe0040-0x3ffe18b7]
	ACPI: Reserving FACS table memory at [mem 0x3ffe0000-0x3ffe003f]
	ACPI: Reserving APIC table memory at [mem 0x3ffe192c-0x3ffe19bb]
	ACPI: Reserving HPET table memory at [mem 0x3ffe19bc-0x3ffe19f3]
	ACPI: Reserving WAET table memory at [mem 0x3ffe19f4-0x3ffe1a1b]
	143MB HIGHMEM available.
	879MB LOWMEM available.
	  mapped low ram: 0 - 36ffe000
	  low ram: 0 - 36ffe000
	 (stall here)

The reason is that CRASH_ADDR_LOW_MAX is equal to CRASH_ADDR_HIGH_MAX on
x86_32. The first high crash kernel memory reservation fails, then the code
goes into the "retry" loop and never comes out, as below.

-> reserve_crashkernel_generic() with high == true
 -> alloc at [CRASH_ADDR_LOW_MAX, CRASH_ADDR_HIGH_MAX] fails
    -> alloc at [0, CRASH_ADDR_LOW_MAX] fails and is retried forever
       (because CRASH_ADDR_LOW_MAX == CRASH_ADDR_HIGH_MAX).
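
For illustration only, the call chain above can be modelled in userspace
roughly as follows. This is a simplified sketch, not the kernel code: struct
crash_limits, alloc_range_stub() and count_high_attempts() are hypothetical
names, the allocator is a stub that always fails (as a 1G request does on a
1G machine), and the attempt cap exists only so the model terminates where
the real loop would not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for CRASH_ADDR_LOW_MAX / CRASH_ADDR_HIGH_MAX. */
struct crash_limits {
	uint64_t low_max;
	uint64_t high_max;
};

/* Stub allocator (assumption): always fails, returning 0 like a NULL base. */
static uint64_t alloc_range_stub(uint64_t size, uint64_t base, uint64_t end)
{
	(void)size;
	(void)base;
	(void)end;
	return 0;
}

/*
 * Simplified model of the "retry" path taken for crashkernel=X,high.
 * Returns the number of allocation attempts, capped at max_attempts.
 */
static unsigned int count_high_attempts(const struct crash_limits *lim,
					uint64_t size,
					unsigned int max_attempts)
{
	uint64_t search_base = lim->low_max;
	uint64_t search_end = lim->high_max;
	unsigned int attempts = 0;

	while (attempts < max_attempts) {
		attempts++;
		if (alloc_range_stub(size, search_base, search_end))
			break;	/* reservation succeeded */

		/* High attempt failed: fall back to low memory and retry. */
		if (search_end == lim->high_max) {
			search_base = 0;
			search_end = lim->low_max;
			continue;
		}
		break;	/* low attempt failed as well: give up */
	}
	return attempts;
}
```

With distinct limits the model gives up after the second attempt. With equal
limits the fallback leaves search_end equal to high_max, so the "retry"
condition stays true and only the artificial cap stops the loop.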

Fix it by preventing crashkernel=,high from being parsed successfully on
32-bit systems, using an architecture-defined macro.

After this patch, parsing 'crashkernel=,high' can't succeed on 32-bit
systems, so there is no chance to call reserve_crashkernel_generic() for it;
therefore this issue on x86_32 is solved.
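
As a rough illustration of the gating, the sketch below shows how a
compile-time macro can keep the ",high" suffix from ever being accepted.
parse_crashkernel_sketch() is a hypothetical, heavily simplified stand-in and
not the real parse_crashkernel(); only the macro name is taken from this
patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Define this to model an architecture that supports crashkernel=,high. */
/* #define HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH */

/*
 * Hypothetical parser sketch: returns 0 and sets *high when a ",high"
 * suffix is accepted, or -ENOENT when nothing is recognised.
 */
static int parse_crashkernel_sketch(const char *cmdline, bool *high)
{
	int ret = -ENOENT;	/* pretend no plain crashkernel=X was found */

	*high = false;
#ifdef HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
	/* Only architectures that define the macro try the ",high" form. */
	if (strstr(cmdline, "crashkernel=") && strstr(cmdline, ",high")) {
		*high = true;
		ret = 0;
	}
#else
	(void)cmdline;	/* ",high" handling is compiled out entirely */
#endif
	return ret;
}
```

Without the macro defined, the sketch returns -ENOENT for
"crashkernel=1G,high" and leaves *high false, so a caller modelled on the
arch setup code would bail out before attempting any reservation.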

Fixes: 9c08a2a139fe ("x86: kdump: use generic interface to simplify crashkernel reservation code")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Suggested-by: Baoquan He <bhe@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
---
v4:
- Add the missing macro for loongarch.
- Only define the macro for 64bit RISCV.
- Signed-off-by -> Suggested-by as suggested.
- Remove the Tested-by as suggested.
- Add acked-by.
v3:
- Fix it as Baoquan suggested.
- Update the commit message.
v2:
- Peel off the other two patches.
- Update the commit message and fix tag.
---
 arch/arm64/include/asm/crash_reserve.h     | 2 ++
 arch/loongarch/include/asm/crash_reserve.h | 2 ++
 arch/riscv/include/asm/crash_reserve.h     | 4 ++++
 arch/x86/include/asm/crash_reserve.h       | 1 +
 kernel/crash_reserve.c                     | 2 +-
 5 files changed, 10 insertions(+), 1 deletion(-)

Patch

diff --git a/arch/arm64/include/asm/crash_reserve.h b/arch/arm64/include/asm/crash_reserve.h
index 4afe027a4e7b..bf362c1a612f 100644
--- a/arch/arm64/include/asm/crash_reserve.h
+++ b/arch/arm64/include/asm/crash_reserve.h
@@ -7,4 +7,6 @@ 
 
 #define CRASH_ADDR_LOW_MAX              arm64_dma_phys_limit
 #define CRASH_ADDR_HIGH_MAX             (PHYS_MASK + 1)
+
+#define HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
 #endif
diff --git a/arch/loongarch/include/asm/crash_reserve.h b/arch/loongarch/include/asm/crash_reserve.h
index a1d9b84b1c7d..2d02517c2127 100644
--- a/arch/loongarch/include/asm/crash_reserve.h
+++ b/arch/loongarch/include/asm/crash_reserve.h
@@ -7,6 +7,8 @@ 
 #define CRASH_ADDR_LOW_MAX		SZ_4G
 #define CRASH_ADDR_HIGH_MAX		memblock_end_of_DRAM()
 
+#define HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
+
 extern phys_addr_t memblock_end_of_DRAM(void);
 
 #endif
diff --git a/arch/riscv/include/asm/crash_reserve.h b/arch/riscv/include/asm/crash_reserve.h
index 013962e63587..216402ea5b7c 100644
--- a/arch/riscv/include/asm/crash_reserve.h
+++ b/arch/riscv/include/asm/crash_reserve.h
@@ -7,5 +7,9 @@ 
 #define CRASH_ADDR_LOW_MAX		dma32_phys_limit
 #define CRASH_ADDR_HIGH_MAX		memblock_end_of_DRAM()
 
+#ifdef CONFIG_64BIT
+#define HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
+#endif
+
 extern phys_addr_t memblock_end_of_DRAM(void);
 #endif
diff --git a/arch/x86/include/asm/crash_reserve.h b/arch/x86/include/asm/crash_reserve.h
index 7835b2cdff04..24c2327f9a16 100644
--- a/arch/x86/include/asm/crash_reserve.h
+++ b/arch/x86/include/asm/crash_reserve.h
@@ -26,6 +26,7 @@  extern unsigned long swiotlb_size_or_default(void);
 #else
 # define CRASH_ADDR_LOW_MAX     SZ_4G
 # define CRASH_ADDR_HIGH_MAX    SZ_64T
+#define HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
 #endif
 
 # define DEFAULT_CRASH_KERNEL_LOW_SIZE crash_low_size_default()
diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
index 5b2722a93a48..c5213f123e19 100644
--- a/kernel/crash_reserve.c
+++ b/kernel/crash_reserve.c
@@ -306,7 +306,7 @@  int __init parse_crashkernel(char *cmdline,
 	/* crashkernel=X[@offset] */
 	ret = __parse_crashkernel(cmdline, system_ram, crash_size,
 				crash_base, NULL);
-#ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
+#ifdef HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
 	/*
 	 * If non-NULL 'high' passed in and no normal crashkernel
 	 * setting detected, try parsing crashkernel=,high|low.