Message ID | 20240718035444.2977105-2-ruanjinjie@huawei.com (mailing list archive) |
---|---|
State | Superseded |
Series | crash: Fix x86_32 memory reserve dead loop bug |
On 07/18/24 at 11:54am, Jinjie Ruan wrote:
> On an x86_32 QEMU machine with 1GB memory, the cmdline "crashkernel=1G,high"
> causes a system stall as below:
>
>     ACPI: Reserving FACP table memory at [mem 0x3ffe18b8-0x3ffe192b]
>     ACPI: Reserving DSDT table memory at [mem 0x3ffe0040-0x3ffe18b7]
>     ACPI: Reserving FACS table memory at [mem 0x3ffe0000-0x3ffe003f]
>     ACPI: Reserving APIC table memory at [mem 0x3ffe192c-0x3ffe19bb]
>     ACPI: Reserving HPET table memory at [mem 0x3ffe19bc-0x3ffe19f3]
>     ACPI: Reserving WAET table memory at [mem 0x3ffe19f4-0x3ffe1a1b]
>     143MB HIGHMEM available.
>     879MB LOWMEM available.
>     mapped low ram: 0 - 36ffe000
>     low ram: 0 - 36ffe000
>     (stall here)
>
> The reason is that CRASH_ADDR_LOW_MAX is equal to CRASH_ADDR_HIGH_MAX
> on x86_32: the first high crash kernel memory reservation fails, then
> the code goes into the "retry" loop and never comes out, as below.
>
> -> reserve_crashkernel_generic() and high is true
>    -> alloc at [CRASH_ADDR_LOW_MAX, CRASH_ADDR_HIGH_MAX] fail
>       -> alloc at [0, CRASH_ADDR_LOW_MAX] fail and repeatedly
>          (because CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX).
>
> Fix it by preventing crashkernel=,high from being parsed successfully on
> 32-bit systems with an architecture-defined macro.
>
> After this patch, 'crashkernel=,high' can't succeed on 32-bit systems,
> and it has no chance to call reserve_crashkernel_generic(), therefore this
> issue on x86_32 is solved.
>
> Fixes: 9c08a2a139fe ("x86: kdump: use generic interface to simplify crashkernel reservation code")
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> Signed-off-by: Baoquan He <bhe@redhat.com>

Just adding my Suggested-by is fine. If multiple people cooperate on one
patch, the Co-developed-by tag is needed. As a maintainer, I prefer to
have the Suggested-by tag in this case.

> Tested-by: Jinjie Ruan <ruanjinjie@huawei.com>

You can't add a Tested-by tag for your own patch. When you post a patch,
testing it is your obligation.

Other than these tag adding concerns, this patch looks good to me. You
can post v4 to update and add my:

Acked-by: Baoquan He <bhe@redhat.com>

> ---
> v3:
> - Fix it as Baoquan suggested.
> - Update the commit message.
> v2:
> - Peel off the other two patches.
> - Update the commit message and fix tag.
> ---
>  arch/arm64/include/asm/crash_reserve.h | 2 ++
>  arch/riscv/include/asm/crash_reserve.h | 2 ++
>  arch/x86/include/asm/crash_reserve.h   | 1 +
>  kernel/crash_reserve.c                 | 2 +-
>  4 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/crash_reserve.h b/arch/arm64/include/asm/crash_reserve.h
> index 4afe027a4e7b..bf362c1a612f 100644
> --- a/arch/arm64/include/asm/crash_reserve.h
> +++ b/arch/arm64/include/asm/crash_reserve.h
> @@ -7,4 +7,6 @@
>
>  #define CRASH_ADDR_LOW_MAX arm64_dma_phys_limit
>  #define CRASH_ADDR_HIGH_MAX (PHYS_MASK + 1)
> +
> +#define HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
>  #endif
> diff --git a/arch/riscv/include/asm/crash_reserve.h b/arch/riscv/include/asm/crash_reserve.h
> index 013962e63587..8d7a8fc1d459 100644
> --- a/arch/riscv/include/asm/crash_reserve.h
> +++ b/arch/riscv/include/asm/crash_reserve.h
> @@ -7,5 +7,7 @@
>  #define CRASH_ADDR_LOW_MAX dma32_phys_limit
>  #define CRASH_ADDR_HIGH_MAX memblock_end_of_DRAM()
>
> +#define HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
> +
>  extern phys_addr_t memblock_end_of_DRAM(void);
>  #endif
> diff --git a/arch/x86/include/asm/crash_reserve.h b/arch/x86/include/asm/crash_reserve.h
> index 7835b2cdff04..24c2327f9a16 100644
> --- a/arch/x86/include/asm/crash_reserve.h
> +++ b/arch/x86/include/asm/crash_reserve.h
> @@ -26,6 +26,7 @@ extern unsigned long swiotlb_size_or_default(void);
>  #else
>  # define CRASH_ADDR_LOW_MAX SZ_4G
>  # define CRASH_ADDR_HIGH_MAX SZ_64T
> +#define HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
>  #endif
>
>  # define DEFAULT_CRASH_KERNEL_LOW_SIZE crash_low_size_default()
> diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
> index 5b2722a93a48..c5213f123e19 100644
> --- a/kernel/crash_reserve.c
> +++ b/kernel/crash_reserve.c
> @@ -306,7 +306,7 @@ int __init parse_crashkernel(char *cmdline,
>  	/* crashkernel=X[@offset] */
>  	ret = __parse_crashkernel(cmdline, system_ram, crash_size,
>  				crash_base, NULL);
> -#ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
> +#ifdef HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
>  	/*
>  	 * If non-NULL 'high' passed in and no normal crashkernel
>  	 * setting detected, try parsing crashkernel=,high|low.
> --
> 2.34.1
>
On 2024/7/18 19:10, Baoquan He wrote:
> On 07/18/24 at 11:54am, Jinjie Ruan wrote:
[...]
>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>> Signed-off-by: Baoquan He <bhe@redhat.com>
>
> Just adding my Suggested-by is fine. If multiple people cooperate on one
> patch, the Co-developed-by tag is needed. As a maintainer, I prefer to
> have the Suggested-by tag in this case.

OK, thank you very much!

>> Tested-by: Jinjie Ruan <ruanjinjie@huawei.com>
>
> You can't add a Tested-by tag for your own patch. When you post a patch,
> testing it is your obligation.
>
> Other than these tag adding concerns, this patch looks good to me. You
> can post v4 to update and add my:

Thank you, I'll fix it in the next version.

> Acked-by: Baoquan He <bhe@redhat.com>
[...]
On 2024/7/18 19:10, Baoquan He wrote:
> On 07/18/24 at 11:54am, Jinjie Ruan wrote:
[...]
>> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
>> Signed-off-by: Baoquan He <bhe@redhat.com>
>
> Just adding my Suggested-by is fine. If multiple people cooperate on one
> patch, the Co-developed-by tag is needed. As a maintainer, I prefer to
> have the Suggested-by tag in this case.

Hi, Baoquan,

I wonder if riscv32 has a similar problem, but I'm not sure.

[...]
diff --git a/arch/arm64/include/asm/crash_reserve.h b/arch/arm64/include/asm/crash_reserve.h
index 4afe027a4e7b..bf362c1a612f 100644
--- a/arch/arm64/include/asm/crash_reserve.h
+++ b/arch/arm64/include/asm/crash_reserve.h
@@ -7,4 +7,6 @@
 
 #define CRASH_ADDR_LOW_MAX arm64_dma_phys_limit
 #define CRASH_ADDR_HIGH_MAX (PHYS_MASK + 1)
+
+#define HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
 #endif
diff --git a/arch/riscv/include/asm/crash_reserve.h b/arch/riscv/include/asm/crash_reserve.h
index 013962e63587..8d7a8fc1d459 100644
--- a/arch/riscv/include/asm/crash_reserve.h
+++ b/arch/riscv/include/asm/crash_reserve.h
@@ -7,5 +7,7 @@
 #define CRASH_ADDR_LOW_MAX dma32_phys_limit
 #define CRASH_ADDR_HIGH_MAX memblock_end_of_DRAM()
 
+#define HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
+
 extern phys_addr_t memblock_end_of_DRAM(void);
 #endif
diff --git a/arch/x86/include/asm/crash_reserve.h b/arch/x86/include/asm/crash_reserve.h
index 7835b2cdff04..24c2327f9a16 100644
--- a/arch/x86/include/asm/crash_reserve.h
+++ b/arch/x86/include/asm/crash_reserve.h
@@ -26,6 +26,7 @@ extern unsigned long swiotlb_size_or_default(void);
 #else
 # define CRASH_ADDR_LOW_MAX SZ_4G
 # define CRASH_ADDR_HIGH_MAX SZ_64T
+#define HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
 #endif
 
 # define DEFAULT_CRASH_KERNEL_LOW_SIZE crash_low_size_default()
diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
index 5b2722a93a48..c5213f123e19 100644
--- a/kernel/crash_reserve.c
+++ b/kernel/crash_reserve.c
@@ -306,7 +306,7 @@ int __init parse_crashkernel(char *cmdline,
 	/* crashkernel=X[@offset] */
 	ret = __parse_crashkernel(cmdline, system_ram, crash_size,
 				crash_base, NULL);
-#ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
+#ifdef HAVE_ARCH_CRASHKERNEL_RESERVATION_HIGH
 	/*
 	 * If non-NULL 'high' passed in and no normal crashkernel
 	 * setting detected, try parsing crashkernel=,high|low.