[v11,3/5] arm64: kdump: reimplement crashkernel=X

Message ID 20200801130856.86625-4-chenzhou10@huawei.com (mailing list archive)
State New, archived
Series: support reserving crashkernel above 4G on arm64 kdump

Commit Message

chenzhou Aug. 1, 2020, 1:08 p.m. UTC
There are the following issues in arm64 kdump:
1. We use crashkernel=X to reserve the crashkernel below 4G, which
will fail when there is not enough low memory.
2. If the crashkernel is reserved above 4G, the crash dump
kernel will fail to boot because there is no low memory available
for allocation.
3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
if the memory reserved for the crash dump kernel falls in ZONE_DMA32,
devices in the crash dump kernel that need ZONE_DMA will fail to
allocate.

To solve these issues, change the behavior of crashkernel=X.
crashkernel=X tries low allocation in ZONE_DMA, and falls back to
high allocation if it fails.

If the required size X is too large and leaves very little free memory
in ZONE_DMA after low allocation, the system may not work normally.
So add a threshold and go for high allocation directly if the required
size is too large. The threshold is set to half of
the low memory.

If crash_base is outside ZONE_DMA, try to allocate at least 256M in
ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
low memory of a specified size.

For non-RPi4 platforms, replace ZONE_DMA mentioned above with ZONE_DMA32.

Another minor change: there may be two regions reserved for the crash
dump kernel. To distinguish the low region from the high region without
affecting existing kexec-tools, rename the low region to
"Crash kernel (low)".

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
---
 arch/arm64/include/asm/kexec.h |  4 +++
 arch/arm64/kernel/setup.c      |  8 +++++-
 arch/arm64/mm/init.c           | 51 ++++++++++++++++++++++++++++++----
 3 files changed, 57 insertions(+), 6 deletions(-)
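
The placement policy described in the commit message can be sketched in plain C as below. This is only an illustration under stated assumptions, not the kernel code: the names crash_placement, place_crashkernel() and needs_low_region() are hypothetical, and the free-memory figures are passed in as simple numbers instead of being looked up through memblock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome labels for this sketch; not kernel names. */
enum crash_placement { CRASH_NONE, CRASH_LOW, CRASH_HIGH };

/*
 * Decide where a crashkernel=X request of `size` bytes would land.
 *   low_total: total low memory (below the DMA limit), in bytes
 *   low_free:  largest free block in low memory, in bytes
 *   high_free: largest free block in high memory, in bytes
 * The threshold is half of the total low memory, as in the commit message.
 */
static enum crash_placement place_crashkernel(uint64_t size,
					      uint64_t low_total,
					      uint64_t low_free,
					      uint64_t high_free)
{
	uint64_t threshold = low_total / 2;

	assert(low_free <= low_total);

	/* Try low memory only when the request is within the threshold. */
	if (size <= threshold && size <= low_free)
		return CRASH_LOW;

	/* Too large for low memory, or the low attempt failed: go high. */
	if (size <= high_free)
		return CRASH_HIGH;

	return CRASH_NONE;
}

/* A high placement also needs a separate low region for DMA-limited devices. */
static bool needs_low_region(enum crash_placement p)
{
	return p == CRASH_HIGH;
}
```

With 4G of low memory the threshold in this sketch is 2G, so a 512M request is tried low first, while a 3G request skips the low attempt and goes straight to high memory, which in turn requires the extra low region (256M by default, or the size given with "crashkernel=Y,low").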

Comments

Catalin Marinas Sept. 2, 2020, 5:09 p.m. UTC | #1
On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
> There are following issues in arm64 kdump:
> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> will fail when there is no enough low memory.
> 2. If reserving crashkernel above 4G, in this case, crash dump
> kernel will boot failure because there is no low memory available
> for allocation.
> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
> if the memory reserved for crash dump kernel falled in ZONE_DMA32,
> the devices in crash dump kernel need to use ZONE_DMA will alloc
> fail.
> 
> To solve these issues, change the behavior of crashkernel=X.
> crashkernel=X tries low allocation in ZONE_DMA, and fall back to
> high allocation if it fails.
> 
> If requized size X is too large and leads to very little free memory
> in ZONE_DMA after low allocation, the system may not work normally.
> So add a threshold and go for high allocation directly if the required
> size is too large. The value of threshold is set as the half of
> the low memory.
> 
> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
> specified size low memory.

Except for the threshold to keep zone ZONE_DMA memory,
reserve_crashkernel() looks very close to the x86 version. Shall we try
to make this generic as well? In the first instance, you could avoid the
threshold check if it takes an explicit ",high" option.
chenzhou Sept. 3, 2020, 11:26 a.m. UTC | #2
Hi Catalin,


On 2020/9/3 1:09, Catalin Marinas wrote:
> On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
>> There are following issues in arm64 kdump:
>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>> will fail when there is no enough low memory.
>> 2. If reserving crashkernel above 4G, in this case, crash dump
>> kernel will boot failure because there is no low memory available
>> for allocation.
>> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
>> if the memory reserved for crash dump kernel falled in ZONE_DMA32,
>> the devices in crash dump kernel need to use ZONE_DMA will alloc
>> fail.
>>
>> To solve these issues, change the behavior of crashkernel=X.
>> crashkernel=X tries low allocation in ZONE_DMA, and fall back to
>> high allocation if it fails.
>>
>> If requized size X is too large and leads to very little free memory
>> in ZONE_DMA after low allocation, the system may not work normally.
>> So add a threshold and go for high allocation directly if the required
>> size is too large. The value of threshold is set as the half of
>> the low memory.
>>
>> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
>> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
>> specified size low memory.
> Except for the threshold to keep zone ZONE_DMA memory,
> reserve_crashkernel() looks very close to the x86 version. Shall we try
> to make this generic as well? In the first instance, you could avoid the
> threshold check if it takes an explicit ",high" option.
OK, I will try to do this.

I looked into the x86 reserve_crashkernel() and found that the start address passed to
memblock_find_in_range() is CRASH_ALIGN, which is different from arm64.

I can't figure out why x86 starts at CRASH_ALIGN. Is there any specific reason?

Thanks,
Chen Zhou
chenzhou Sept. 3, 2020, 1:18 p.m. UTC | #3
On 2020/9/3 19:26, chenzhou wrote:
> Hi Catalin,
>
>
> On 2020/9/3 1:09, Catalin Marinas wrote:
>> On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
>>> There are following issues in arm64 kdump:
>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>>> will fail when there is no enough low memory.
>>> 2. If reserving crashkernel above 4G, in this case, crash dump
>>> kernel will boot failure because there is no low memory available
>>> for allocation.
>>> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
>>> if the memory reserved for crash dump kernel falled in ZONE_DMA32,
>>> the devices in crash dump kernel need to use ZONE_DMA will alloc
>>> fail.
>>>
>>> To solve these issues, change the behavior of crashkernel=X.
>>> crashkernel=X tries low allocation in ZONE_DMA, and fall back to
>>> high allocation if it fails.
>>>
>>> If requized size X is too large and leads to very little free memory
>>> in ZONE_DMA after low allocation, the system may not work normally.
>>> So add a threshold and go for high allocation directly if the required
>>> size is too large. The value of threshold is set as the half of
>>> the low memory.
>>>
>>> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
>>> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
>>> specified size low memory.
>> Except for the threshold to keep zone ZONE_DMA memory,
>> reserve_crashkernel() looks very close to the x86 version. Shall we try
>> to make this generic as well? In the first instance, you could avoid the
>> threshold check if it takes an explicit ",high" option.
> Ok, i will try to do this.
>
> I look into the function reserve_crashkernel() of x86 and found the start address is
> CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
>
> I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?
Besides, in function reserve_crashkernel_low() of x86, the start address is 0.

>
> Thanks,
> Chen Zhou
Dave Young Sept. 4, 2020, 3:04 a.m. UTC | #4
On 09/03/20 at 07:26pm, chenzhou wrote:
> Hi Catalin,
> 
> 
> On 2020/9/3 1:09, Catalin Marinas wrote:
> > On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
> >> There are following issues in arm64 kdump:
> >> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> >> will fail when there is no enough low memory.
> >> 2. If reserving crashkernel above 4G, in this case, crash dump
> >> kernel will boot failure because there is no low memory available
> >> for allocation.
> >> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
> >> if the memory reserved for crash dump kernel falled in ZONE_DMA32,
> >> the devices in crash dump kernel need to use ZONE_DMA will alloc
> >> fail.
> >>
> >> To solve these issues, change the behavior of crashkernel=X.
> >> crashkernel=X tries low allocation in ZONE_DMA, and fall back to
> >> high allocation if it fails.
> >>
> >> If requized size X is too large and leads to very little free memory
> >> in ZONE_DMA after low allocation, the system may not work normally.
> >> So add a threshold and go for high allocation directly if the required
> >> size is too large. The value of threshold is set as the half of
> >> the low memory.
> >>
> >> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
> >> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
> >> specified size low memory.
> > Except for the threshold to keep zone ZONE_DMA memory,
> > reserve_crashkernel() looks very close to the x86 version. Shall we try
> > to make this generic as well? In the first instance, you could avoid the
> > threshold check if it takes an explicit ",high" option.
> Ok, i will try to do this.
> 
> I look into the function reserve_crashkernel() of x86 and found the start address is
> CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
> 
> I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?

Hmm, took another look at the option CONFIG_PHYSICAL_ALIGN
config PHYSICAL_ALIGN
        hex "Alignment value to which kernel should be aligned"
        default "0x200000"
        range 0x2000 0x1000000 if X86_32
        range 0x200000 0x1000000 if X86_64

According to the above, I think the 16M comes from the largest allowed value.
But the default value is 2M, and with a smaller value the reservation has a
better chance of succeeding.

It seems we still need an arch-specific CRASH_ALIGN, but in the initial
version you added #ifdefs for the different arches. Can you move the
macro to arch-specific headers?

Thanks
Dave
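
The point about a smaller alignment giving the reservation a better chance can be illustrated with a small stand-alone C sketch. The helpers align_up() and fit_aligned() are hypothetical names for this example only; they model a single free range and are not the memblock implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Return the lowest aligned base in the free range [start, end) that can
 * hold `size` bytes, or 0 if the range cannot hold it. (Using 0 as the
 * failure value assumes the range does not start at address 0.)
 */
static uint64_t fit_aligned(uint64_t start, uint64_t end,
			    uint64_t size, uint64_t align)
{
	uint64_t base = align_up(start, align);

	/* base < start catches wrap-around in align_up(). */
	if (base < start || base >= end || end - base < size)
		return 0;
	return base;
}
```

For a free range from 17M to 33M and a 14M request, 2M alignment yields a base at 18M with 15M of room, so the request fits. With 16M alignment the first aligned base is 32M, which leaves only 1M, so the same request fails.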
Dave Young Sept. 4, 2020, 3:10 a.m. UTC | #5
On 09/04/20 at 11:04am, Dave Young wrote:
> On 09/03/20 at 07:26pm, chenzhou wrote:
> > Hi Catalin,
> > 
> > 
> > On 2020/9/3 1:09, Catalin Marinas wrote:
> > > On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
> > >> There are following issues in arm64 kdump:
> > >> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> > >> will fail when there is no enough low memory.
> > >> 2. If reserving crashkernel above 4G, in this case, crash dump
> > >> kernel will boot failure because there is no low memory available
> > >> for allocation.
> > >> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
> > >> if the memory reserved for crash dump kernel falled in ZONE_DMA32,
> > >> the devices in crash dump kernel need to use ZONE_DMA will alloc
> > >> fail.
> > >>
> > >> To solve these issues, change the behavior of crashkernel=X.
> > >> crashkernel=X tries low allocation in ZONE_DMA, and fall back to
> > >> high allocation if it fails.
> > >>
> > >> If requized size X is too large and leads to very little free memory
> > >> in ZONE_DMA after low allocation, the system may not work normally.
> > >> So add a threshold and go for high allocation directly if the required
> > >> size is too large. The value of threshold is set as the half of
> > >> the low memory.
> > >>
> > >> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
> > >> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
> > >> specified size low memory.
> > > Except for the threshold to keep zone ZONE_DMA memory,
> > > reserve_crashkernel() looks very close to the x86 version. Shall we try
> > > to make this generic as well? In the first instance, you could avoid the
> > > threshold check if it takes an explicit ",high" option.
> > Ok, i will try to do this.
> > 
> > I look into the function reserve_crashkernel() of x86 and found the start address is
> > CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
> > 
> > I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?
> 
> Hmm, took another look at the option CONFIG_PHYSICAL_ALIGN
> config PHYSICAL_ALIGN
>         hex "Alignment value to which kernel should be aligned"
>         default "0x200000"
>         range 0x2000 0x1000000 if X86_32
>         range 0x200000 0x1000000 if X86_64
> 
> According to above, I think the 16M should come from the largest value
> But the default value is 2M,  with smaller value reservation can have
> more chance to succeed.
> 
> It seems we still need arch specific CRASH_ALIGN, but the initial
> version you added the #ifdef for different arches, can you move the
> macro to arch specific headers?

And just keep the x86 align value as is. I can try to change the x86
value to CONFIG_PHYSICAL_ALIGN later; that way this series can be
cleaner.

> 
> Thanks
> Dave
chenzhou Sept. 4, 2020, 4:02 a.m. UTC | #6
On 2020/9/4 11:10, Dave Young wrote:
> On 09/04/20 at 11:04am, Dave Young wrote:
>> On 09/03/20 at 07:26pm, chenzhou wrote:
>>> Hi Catalin,
>>>
>>>
>>> On 2020/9/3 1:09, Catalin Marinas wrote:
>>>> On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
>>>>> There are following issues in arm64 kdump:
>>>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>>>>> will fail when there is no enough low memory.
>>>>> 2. If reserving crashkernel above 4G, in this case, crash dump
>>>>> kernel will boot failure because there is no low memory available
>>>>> for allocation.
>>>>> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
>>>>> if the memory reserved for crash dump kernel falled in ZONE_DMA32,
>>>>> the devices in crash dump kernel need to use ZONE_DMA will alloc
>>>>> fail.
>>>>>
>>>>> To solve these issues, change the behavior of crashkernel=X.
>>>>> crashkernel=X tries low allocation in ZONE_DMA, and fall back to
>>>>> high allocation if it fails.
>>>>>
>>>>> If requized size X is too large and leads to very little free memory
>>>>> in ZONE_DMA after low allocation, the system may not work normally.
>>>>> So add a threshold and go for high allocation directly if the required
>>>>> size is too large. The value of threshold is set as the half of
>>>>> the low memory.
>>>>>
>>>>> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
>>>>> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
>>>>> specified size low memory.
>>>> Except for the threshold to keep zone ZONE_DMA memory,
>>>> reserve_crashkernel() looks very close to the x86 version. Shall we try
>>>> to make this generic as well? In the first instance, you could avoid the
>>>> threshold check if it takes an explicit ",high" option.
>>> Ok, i will try to do this.
>>>
>>> I look into the function reserve_crashkernel() of x86 and found the start address is
>>> CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
>>>
>>> I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?
>> Hmm, took another look at the option CONFIG_PHYSICAL_ALIGN
>> config PHYSICAL_ALIGN
>>         hex "Alignment value to which kernel should be aligned"
>>         default "0x200000"
>>         range 0x2000 0x1000000 if X86_32
>>         range 0x200000 0x1000000 if X86_64
>>
>> According to above, I think the 16M should come from the largest value
>> But the default value is 2M,  with smaller value reservation can have
>> more chance to succeed.
>>
>> It seems we still need arch specific CRASH_ALIGN, but the initial
>> version you added the #ifdef for different arches, can you move the
>> macro to arch specific headers?
> And just keep the x86 align value as is, I can try to change the x86
> value later to CONFIG_PHYSICAL_ALIGN, in this way this series can be
> cleaner.
OK. My question is not about the value of the CRASH_ALIGN macro,
but about the lower bound passed to memblock_find_in_range().

For x86, reserve_crashkernel() restricts the lower bound of the range to CRASH_ALIGN:
    ...
    crash_base = memblock_find_in_range(CRASH_ALIGN,
                                                CRASH_ADDR_LOW_MAX,
                                                crash_size, CRASH_ALIGN);
    ...
   
while reserve_crashkernel_low() has no such restriction:
    ...
    low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
    ...

How about making all memblock_find_in_range() calls search from the start of memory?
If that is OK, I will do it this way in the generic version.

Thanks,
Chen Zhou
>
>> Thanks
>> Dave
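
The effect of the lower bound discussed above can be explored with a toy range search in C. This is a sketch under explicit assumptions: struct free_region and find_in_range() are hypothetical, the search is bottom-up over a sorted list of free regions, and 0 is returned on failure. The real memblock search direction and internals are not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free region [start, end) for this sketch. */
struct free_region {
	uint64_t start;
	uint64_t end;
};

/*
 * Find the lowest `align`-aligned base within [lo, hi) that lies inside
 * one of the free regions and has room for `size` bytes.
 * Returns 0 when nothing fits.
 */
static uint64_t find_in_range(const struct free_region *r, size_t n,
			      uint64_t lo, uint64_t hi,
			      uint64_t size, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);

	for (size_t i = 0; i < n; i++) {
		/* Clamp the region to the requested search window. */
		uint64_t s = r[i].start > lo ? r[i].start : lo;
		uint64_t e = r[i].end < hi ? r[i].end : hi;
		uint64_t base = (s + align - 1) & ~(align - 1);

		if (base >= s && base < e && e - base >= size)
			return base;
	}
	return 0;
}
```

In this toy model, with free memory from 0 to 4M and a 2M request aligned to 2M, a lower bound of 0 produces base 0, which the caller cannot tell apart from failure. A lower bound of 2M (the arm64 CRASH_ALIGN value) produces base 2M instead. When free memory starts above the alignment value, both lower bounds give the same result.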
Dave Young Sept. 4, 2020, 4:16 a.m. UTC | #7
On 09/04/20 at 12:02pm, chenzhou wrote:
> 
> 
> On 2020/9/4 11:10, Dave Young wrote:
> > On 09/04/20 at 11:04am, Dave Young wrote:
> >> On 09/03/20 at 07:26pm, chenzhou wrote:
> >>> Hi Catalin,
> >>>
> >>>
> >>> On 2020/9/3 1:09, Catalin Marinas wrote:
> >>>> On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
> >>>>> There are following issues in arm64 kdump:
> >>>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
> >>>>> will fail when there is no enough low memory.
> >>>>> 2. If reserving crashkernel above 4G, in this case, crash dump
> >>>>> kernel will boot failure because there is no low memory available
> >>>>> for allocation.
> >>>>> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
> >>>>> if the memory reserved for crash dump kernel falled in ZONE_DMA32,
> >>>>> the devices in crash dump kernel need to use ZONE_DMA will alloc
> >>>>> fail.
> >>>>>
> >>>>> To solve these issues, change the behavior of crashkernel=X.
> >>>>> crashkernel=X tries low allocation in ZONE_DMA, and fall back to
> >>>>> high allocation if it fails.
> >>>>>
> >>>>> If requized size X is too large and leads to very little free memory
> >>>>> in ZONE_DMA after low allocation, the system may not work normally.
> >>>>> So add a threshold and go for high allocation directly if the required
> >>>>> size is too large. The value of threshold is set as the half of
> >>>>> the low memory.
> >>>>>
> >>>>> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
> >>>>> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
> >>>>> specified size low memory.
> >>>> Except for the threshold to keep zone ZONE_DMA memory,
> >>>> reserve_crashkernel() looks very close to the x86 version. Shall we try
> >>>> to make this generic as well? In the first instance, you could avoid the
> >>>> threshold check if it takes an explicit ",high" option.
> >>> Ok, i will try to do this.
> >>>
> >>> I look into the function reserve_crashkernel() of x86 and found the start address is
> >>> CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
> >>>
> >>> I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?
> >> Hmm, took another look at the option CONFIG_PHYSICAL_ALIGN
> >> config PHYSICAL_ALIGN
> >>         hex "Alignment value to which kernel should be aligned"
> >>         default "0x200000"
> >>         range 0x2000 0x1000000 if X86_32
> >>         range 0x200000 0x1000000 if X86_64
> >>
> >> According to above, I think the 16M should come from the largest value
> >> But the default value is 2M,  with smaller value reservation can have
> >> more chance to succeed.
> >>
> >> It seems we still need arch specific CRASH_ALIGN, but the initial
> >> version you added the #ifdef for different arches, can you move the
> >> macro to arch specific headers?
> > And just keep the x86 align value as is, I can try to change the x86
> > value later to CONFIG_PHYSICAL_ALIGN, in this way this series can be
> > cleaner.
> Ok. I have no question about the value of macro CRASH_ALIGN,
> instead the lower bound of memblock_find_in_range().
> 
> For x86, in reserve_crashkernel(),restrict the lower bound of the range to CRASH_ALIGN,
>     ...
>     crash_base = memblock_find_in_range(CRASH_ALIGN,
>                                                 CRASH_ADDR_LOW_MAX,
>                                                 crash_size, CRASH_ALIGN);
>     ...
>    
> in reserve_crashkernel_low(),with no this restriction.
>     ...
>     low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
>     ...
> 
> How about all making memblock_find_in_range() search from the start of memory?
> If it is ok, i will do like this in the generic version.

I feel starting with CRASH_ALIGN sounds better. Can you just search from
CRASH_ALIGN in the generic version?

Thanks
Dave
chenzhou Sept. 4, 2020, 6:39 a.m. UTC | #8
On 2020/9/4 12:16, Dave Young wrote:
> On 09/04/20 at 12:02pm, chenzhou wrote:
>>
>> On 2020/9/4 11:10, Dave Young wrote:
>>> On 09/04/20 at 11:04am, Dave Young wrote:
>>>> On 09/03/20 at 07:26pm, chenzhou wrote:
>>>>> Hi Catalin,
>>>>>
>>>>>
>>>>> On 2020/9/3 1:09, Catalin Marinas wrote:
>>>>>> On Sat, Aug 01, 2020 at 09:08:54PM +0800, Chen Zhou wrote:
>>>>>>> There are following issues in arm64 kdump:
>>>>>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which
>>>>>>> will fail when there is no enough low memory.
>>>>>>> 2. If reserving crashkernel above 4G, in this case, crash dump
>>>>>>> kernel will boot failure because there is no low memory available
>>>>>>> for allocation.
>>>>>>> 3. Since commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32"),
>>>>>>> if the memory reserved for crash dump kernel falled in ZONE_DMA32,
>>>>>>> the devices in crash dump kernel need to use ZONE_DMA will alloc
>>>>>>> fail.
>>>>>>>
>>>>>>> To solve these issues, change the behavior of crashkernel=X.
>>>>>>> crashkernel=X tries low allocation in ZONE_DMA, and fall back to
>>>>>>> high allocation if it fails.
>>>>>>>
>>>>>>> If requized size X is too large and leads to very little free memory
>>>>>>> in ZONE_DMA after low allocation, the system may not work normally.
>>>>>>> So add a threshold and go for high allocation directly if the required
>>>>>>> size is too large. The value of threshold is set as the half of
>>>>>>> the low memory.
>>>>>>>
>>>>>>> If crash_base is outside ZONE_DMA, try to allocate at least 256M in
>>>>>>> ZONE_DMA automatically. "crashkernel=Y,low" can be used to allocate
>>>>>>> specified size low memory.
>>>>>> Except for the threshold to keep zone ZONE_DMA memory,
>>>>>> reserve_crashkernel() looks very close to the x86 version. Shall we try
>>>>>> to make this generic as well? In the first instance, you could avoid the
>>>>>> threshold check if it takes an explicit ",high" option.
>>>>> Ok, i will try to do this.
>>>>>
>>>>> I look into the function reserve_crashkernel() of x86 and found the start address is
>>>>> CRASH_ALIGN in function memblock_find_in_range(), which is different with arm64.
>>>>>
>>>>> I don't figure out why is CRASH_ALIGN in x86, is there any specific reason?
>>>> Hmm, took another look at the option CONFIG_PHYSICAL_ALIGN
>>>> config PHYSICAL_ALIGN
>>>>         hex "Alignment value to which kernel should be aligned"
>>>>         default "0x200000"
>>>>         range 0x2000 0x1000000 if X86_32
>>>>         range 0x200000 0x1000000 if X86_64
>>>>
>>>> According to above, I think the 16M should come from the largest value
>>>> But the default value is 2M,  with smaller value reservation can have
>>>> more chance to succeed.
>>>>
>>>> It seems we still need arch specific CRASH_ALIGN, but the initial
>>>> version you added the #ifdef for different arches, can you move the
>>>> macro to arch specific headers?
>>> And just keep the x86 align value as is, I can try to change the x86
>>> value later to CONFIG_PHYSICAL_ALIGN, in this way this series can be
>>> cleaner.
>> Ok. I have no question about the value of macro CRASH_ALIGN,
>> instead the lower bound of memblock_find_in_range().
>>
>> For x86, in reserve_crashkernel(),restrict the lower bound of the range to CRASH_ALIGN,
>>     ...
>>     crash_base = memblock_find_in_range(CRASH_ALIGN,
>>                                                 CRASH_ADDR_LOW_MAX,
>>                                                 crash_size, CRASH_ALIGN);
>>     ...
>>    
>> in reserve_crashkernel_low(),with no this restriction.
>>     ...
>>     low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
>>     ...
>>
>> How about all making memblock_find_in_range() search from the start of memory?
>> If it is ok, i will do like this in the generic version.
> I feel starting with CRASH_ALIGN sounds better, can you just search from
> CRASH_ALIGN in generic version?
ok.
>
> Thanks
> Dave
Patch

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 1a2f27f12794..92ed53d0bf21 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -28,7 +28,11 @@ 
 /* 2M alignment for crash kernel regions */
 #define CRASH_ALIGN	SZ_2M
 
+#ifdef CONFIG_ZONE_DMA
+#define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
+#else
 #define CRASH_ADDR_LOW_MAX	arm64_dma32_phys_limit
+#endif
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 93b3844cf442..4dc51a2ac012 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -238,7 +238,13 @@  static void __init request_standard_resources(void)
 		    kernel_data.end <= res->end)
 			request_resource(res, &kernel_data);
 #ifdef CONFIG_KEXEC_CORE
-		/* Userspace will find "Crash kernel" region in /proc/iomem. */
+		/*
+		 * Userspace will find "Crash kernel" region in /proc/iomem.
+		 * Note: the low region is renamed as Crash kernel (low).
+		 */
+		if (crashk_low_res.end && crashk_low_res.start >= res->start &&
+				crashk_low_res.end <= res->end)
+			request_resource(res, &crashk_low_res);
 		if (crashk_res.end && crashk_res.start >= res->start &&
 		    crashk_res.end <= res->end)
 			request_resource(res, &crashk_res);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index a3d0193f6a0a..53c8916fd32f 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -70,6 +70,14 @@  phys_addr_t arm64_dma_phys_limit __ro_after_init;
 phys_addr_t arm64_dma32_phys_limit __ro_after_init;
 
 #ifdef CONFIG_KEXEC_CORE
+
+/*
+ * Add a threshold for required memory size of crashkernel. If required memory
+ * size is greater than threshold, just go for high allocation directly. The
+ * value of threshold is set as half of the total low memory.
+ */
+#define REQUIRED_MEMORY_THRESHOLD	(memblock_mem_size(CRASH_ADDR_LOW_MAX >> \
+			PAGE_SHIFT) >> 1)
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -90,11 +98,22 @@  static void __init reserve_crashkernel(void)
 
 	crash_size = PAGE_ALIGN(crash_size);
 
-	if (crash_base == 0) {
-		/* Current arm64 boot protocol requires 2MB alignment */
-		crash_base = memblock_find_in_range(0, CRASH_ADDR_LOW_MAX,
-				crash_size, CRASH_ALIGN);
-		if (crash_base == 0) {
+	if (!crash_base) {
+		/*
+		 * Current arm64 boot protocol requires 2MB alignment.
+		 * If the required memory size is greater than the threshold,
+		 * go for high allocation directly.
+		 * If the required memory size is less than or equal to the
+		 * threshold, try low allocation first, and then fall back to
+		 * high allocation if it fails.
+		 */
+		if (crash_size <= REQUIRED_MEMORY_THRESHOLD)
+			crash_base = memblock_find_in_range(0, CRASH_ADDR_LOW_MAX,
+					crash_size, CRASH_ALIGN);
+		if (!crash_base)
+			crash_base = memblock_find_in_range(0, MEMBLOCK_ALLOC_ACCESSIBLE,
+					crash_size, SZ_2M);
+		if (!crash_base) {
 			pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
 				crash_size);
 			return;
@@ -118,6 +137,28 @@  static void __init reserve_crashkernel(void)
 	}
 	memblock_reserve(crash_base, crash_size);
 
+	if (crash_base >= CRASH_ADDR_LOW_MAX) {
+		const char *rename = "Crash kernel (low)";
+
+		if (reserve_crashkernel_low()) {
+			memblock_free(crash_base, crash_size);
+			return;
+		}
+
+		/*
+		 * To distinguish the low region from the high region without
+		 * affecting existing kexec-tools, rename the low region to
+		 * "Crash kernel (low)".
+		 */
+		crashk_low_res.name = rename;
+		/*
+		 * The low region is intended for devices in the crash dump
+		 * kernel, so simply mark it as "nomap".
+		 */
+		memblock_mark_nomap(crashk_low_res.start,
+				    resource_size(&crashk_low_res));
+	}
+
 	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
 		crash_base, crash_base + crash_size, crash_size >> 20);
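
Since the patch registers the low region in /proc/iomem under the name "Crash kernel (low)", a userspace tool can tell the two regions apart by name. The C sketch below shows one way to classify a single /proc/iomem-style line. The names crash_res and classify_iomem_line() are hypothetical, and how kexec-tools actually matches these names is not shown in this thread, so treat the exact-match approach as an assumption.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification used only in this sketch. */
enum crash_res { RES_OTHER, RES_CRASH, RES_CRASH_LOW };

/*
 * Parse one line in "start-end : name" form (hex addresses, optional
 * leading indentation) and report whether it names a crash kernel region.
 * On a successful parse, *start and *end hold the region bounds.
 */
static enum crash_res classify_iomem_line(const char *line,
					  uint64_t *start, uint64_t *end)
{
	char name[64];

	assert(line && start && end);

	if (sscanf(line, " %" SCNx64 "-%" SCNx64 " : %63[^\n]",
		   start, end, name) != 3)
		return RES_OTHER;

	/* Exact match: the renamed low region is not mistaken for the main one. */
	if (strcmp(name, "Crash kernel") == 0)
		return RES_CRASH;
	if (strcmp(name, "Crash kernel (low)") == 0)
		return RES_CRASH_LOW;
	return RES_OTHER;
}
```

A caller would read /proc/iomem line by line (for example with fgets()) and pass each line to this function. Note that reading real addresses from /proc/iomem usually requires root, since unprivileged readers see zeroed ranges.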