Message ID | 20190507035058.63992-1-chenzhou10@huawei.com (mailing list archive) |
---|---|
Series | support reserving crashkernel above 4G on arm64 kdump |
+Cc kexec-list.

Hi Chen,

I think we are still in the quiet period of the merge cycle, but this is a change which will be useful for systems like HPE Apollo where we are looking at reserving crashkernel across a larger range.

Some comments inline and in the respective patch threads.

On 05/07/2019 09:20 AM, Chen Zhou wrote:
> This patch series enable reserving crashkernel on high memory in arm64.

Please fix the patch subject, it should be v5.
Also please Cc the kexec-list (kexec@lists.infradead.org) for future versions to allow wider review of the patchset.

> We use crashkernel=X to reserve crashkernel below 4G, which will fail
> when there is no enough memory. Currently, crashkernel=Y@X can be used
> to reserve crashkernel above 4G, in this case, if swiotlb or DMA buffers
> are requierd, capture kernel will boot failure because of no low memory.

s/requierd/required/

s/capture kernel will boot failure because of no low memory./capture kernel boot will fail because there is no low memory available for allocation.

> When crashkernel is reserved above 4G in memory, kernel should reserve
> some amount of low memory for swiotlb and some DMA buffers. So there may
> be two crash kernel regions, one is below 4G, the other is above 4G. Then
> Crash dump kernel reads more than one crash kernel regions via a dtb
> property under node /chosen,
> linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>.

Please use consistent naming for the second kernel, better to use crash dump kernel.

I have tested this on my HPE Apollo machine, and with the crashkernel=886M,high syntax I can get the board to reserve a larger memory range for the crashkernel (i.e. 886M):

    # dmesg | grep -i crash
    [ 0.000000] kexec_core: Reserving 256MB of low memory at 3560MB for crashkernel (System low RAM: 2029MB)
    [ 0.000000] crashkernel reserved: 0x0000000bc5a00000 - 0x0000000bfd000000 (886 MB)

kexec/kdump also works fine on the board.

So, with the changes suggested in this cover letter and individual patches, please feel free to add:

Reviewed-and-Tested-by: Bhupesh Sharma <bhsharma@redhat.com>

Thanks,
Bhupesh

> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])
>
> I post this patch series about one month ago. The previous changes and
> discussions can be retrived from:
>
> Changes since [v4]
> - reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
>
> Changes since [v3]
> - Add memblock_cap_memory_ranges back for multiple ranges.
> - Fix some compiling warnings.
>
> Changes since [v2]
> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> patch.
>
> Changes since [v1]:
> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> in fdt_enforce_memory_region().
> There are at most two crash kernel regions, for two crash kernel regions
> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> and then remove the memory range in the middle.
>
> [1]: http://lists.infradead.org/pipermail/kexec/2019-April/022792.html
> [v1]: https://lkml.org/lkml/2019/4/2/1174
> [v2]: https://lkml.org/lkml/2019/4/9/86
> [v3]: https://lkml.org/lkml/2019/4/9/306
> [v4]: https://lkml.org/lkml/2019/4/15/273
>
> Chen Zhou (3):
>   x86: kdump: move reserve_crashkernel_low() into kexec_core.c
>   arm64: kdump: support reserving crashkernel above 4G
>   kdump: update Documentation about crashkernel on arm64
>
> Mike Rapoport (1):
>   memblock: extend memblock_cap_memory_range to multiple ranges
>
>  Documentation/admin-guide/kernel-parameters.txt |  6 +--
>  arch/arm64/include/asm/kexec.h                  |  3 ++
>  arch/arm64/kernel/setup.c                       |  3 ++
>  arch/arm64/mm/init.c                            | 72 +++++++++++++++++++------
>  arch/x86/include/asm/kexec.h                    |  3 ++
>  arch/x86/kernel/setup.c                         | 66 +++--------------------
>  include/linux/kexec.h                           |  5 ++
>  include/linux/memblock.h                        |  2 +-
>  kernel/kexec_core.c                             | 56 +++++++++++++++++++
>  mm/memblock.c                                   | 44 +++++++--------
>  10 files changed, 157 insertions(+), 103 deletions(-)
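The numbers in the dmesg output above, and the "cap to [min start, max end], then remove the middle" scheme described in the v1 changelog, can be checked with a small model. The Python below is only a sketch under stated assumptions: `clip`, `remove`, `cap_to_crash_regions` and the RAM layout in `ram` are hypothetical names and values chosen for illustration, and none of this is the kernel's memblock code. Only the two crash kernel regions are taken from the test report in this thread.

```python
MiB = 1 << 20
GiB = 1 << 30
ADDRESS_SPACE_END = 1 << 64  # sentinel upper bound for "everything above"


def clip(ranges, start, end):
    """Return the parts of `ranges` (list of (base, size)) inside [start, end)."""
    out = []
    for base, size in ranges:
        lo, hi = max(base, start), min(base + size, end)
        if lo < hi:
            out.append((lo, hi - lo))
    return out


def remove(ranges, start, end):
    """Return `ranges` with the span [start, end) cut out."""
    return sorted(clip(ranges, 0, start) + clip(ranges, end, ADDRESS_SPACE_END))


def cap_to_crash_regions(memory, regions):
    """Toy version of the v1 scheme: cap memory to the span covering all
    crash regions, then drop the gap between the two regions."""
    regions = sorted(regions)
    start = regions[0][0]
    end = max(base + size for base, size in regions)
    capped = clip(memory, start, end)
    if len(regions) == 2:
        gap_start = regions[0][0] + regions[0][1]
        gap_end = regions[1][0]
        capped = remove(capped, gap_start, gap_end)
    return capped


# Crash kernel regions as reported by dmesg in the test above.
low = (3560 * MiB, 256 * MiB)
high_start, high_end = 0x0000000BC5A00000, 0x0000000BFD000000
high = (high_start, high_end - high_start)

# ASSUMPTION: a made-up RAM layout, chosen only so both regions lie in RAM.
ram = [(0, 4 * GiB), (0x880000000, 16 * GiB)]

usable = cap_to_crash_regions(ram, [low, high])
```

With these inputs the high region works out to exactly 886 MiB and sits above 4G, the low region ends below 4G, and `usable` contains just the two crash kernel regions, which is what a `linux,usable-memory-range` property with two BASE/SIZE pairs is meant to describe.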
Hi Bhupesh,

On 2019/5/15 13:06, Bhupesh Sharma wrote:
> [...]
> So, with the changes suggested in this cover letter and individual patches, please feel free to add:
>
> Reviewed-and-Tested-by: Bhupesh Sharma <bhsharma@redhat.com>
>
> Thanks,
> Bhupesh

Thanks for your review and testing. I will fix these later.

Thanks,
Chen Zhou
Hi Catalin,

Sorry to ping you. What's your suggestion about this patch series?
I am looking forward to your reply.

Thanks,
Chen Zhou

On 2019/5/16 11:19, Chen Zhou wrote:
> [...]
Hi!

On 07/05/2019 04:50, Chen Zhou wrote:
> We use crashkernel=X to reserve crashkernel below 4G, which will fail
> when there is no enough memory. Currently, crashkernel=Y@X can be used
> to reserve crashkernel above 4G, in this case, if swiotlb or DMA buffers
> are requierd, capture kernel will boot failure because of no low memory.

> When crashkernel is reserved above 4G in memory, kernel should reserve
> some amount of low memory for swiotlb and some DMA buffers. So there may
> be two crash kernel regions, one is below 4G, the other is above 4G.

This is a good argument for supporting the 'crashkernel=...,low' version.
What is the 'crashkernel=...,high' version for?

Wouldn't it be simpler to relax the ARCH_LOW_ADDRESS_LIMIT if we see 'crashkernel=...,low' in the kernel cmdline?

I don't see what the 'crashkernel=...,high' variant is giving us, it just complicates the flow of reserve_crashkernel().

If we called reserve_crashkernel_low() at the beginning of reserve_crashkernel() we could use crashk_low_res.end to change some limit variable from ARCH_LOW_ADDRESS_LIMIT to memblock_end_of_DRAM().
I think this is a simpler change that gives you what you want.

> Then
> Crash dump kernel reads more than one crash kernel regions via a dtb
> property under node /chosen,
> linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>.

Won't this break if your kdump kernel doesn't know what the extra parameters are?
Or if it expects two ranges, but only gets one? These DT properties should be treated as ABI between kernel versions, we can't really change it like this.

I think the 'low' region is an optional extra that is never mapped by the first kernel. I think the simplest thing to do is to add a 'linux,low-memory-range' that we memblock_add() after memblock_cap_memory_range() has been called.
If it's missing, or the new kernel doesn't know what it's for, everything keeps working.

> Besides, we need to modify kexec-tools:
> arm64: support more than one crash kernel regions(see [1])

> I post this patch series about one month ago. The previous changes and
> discussions can be retrived from:

Ah, this wasn't obvious as you've stopped numbering the series. Please label the next one 'v6' so that we can describe this as 'v5'. (duplicate numbering would be even more confusing!)

Thanks,

James
On 2019/6/6 0:32, James Morse wrote:
> Hi!
>
> On 07/05/2019 04:50, Chen Zhou wrote:
>> We use crashkernel=X to reserve crashkernel below 4G, which will fail
>> when there is no enough memory. Currently, crashkernel=Y@X can be used
>> to reserve crashkernel above 4G, in this case, if swiotlb or DMA buffers
>> are requierd, capture kernel will boot failure because of no low memory.
>
>> When crashkernel is reserved above 4G in memory, kernel should reserve
>> some amount of low memory for swiotlb and some DMA buffers. So there may
>> be two crash kernel regions, one is below 4G, the other is above 4G.
>
> This is a good argument for supporting the 'crashkernel=...,low' version.
> What is the 'crashkernel=...,high' version for?
>
> Wouldn't it be simpler to relax the ARCH_LOW_ADDRESS_LIMIT if we see 'crashkernel=...,low'
> in the kernel cmdline?
>
> I don't see what the 'crashkernel=...,high' variant is giving us, it just complicates the
> flow of reserve_crashkernel().
>
> If we called reserve_crashkernel_low() at the beginning of reserve_crashkernel() we could
> use crashk_low_res.end to change some limit variable from ARCH_LOW_ADDRESS_LIMIT to
> memblock_end_of_DRAM().
> I think this is a simpler change that gives you what you want.

According to your suggestions, we should do it like this:
1. call reserve_crashkernel_low() at the beginning of reserve_crashkernel()
2. mark the low region as 'nomap'
3. use crashk_low_res.end to change some limit variable from ARCH_LOW_ADDRESS_LIMIT to
   memblock_end_of_DRAM()
4. rename crashk_low_res as "Crash kernel (low)" for arm64
5. add a 'linux,low-memory-range' node in DT

Do I understand correctly?

>> Then
>> Crash dump kernel reads more than one crash kernel regions via a dtb
>> property under node /chosen,
>> linux,usable-memory-range = <BASE1 SIZE1 [BASE2 SIZE2]>.
>
> Won't this break if your kdump kernel doesn't know what the extra parameters are?
> Or if it expects two ranges, but only gets one? These DT properties should be treated as
> ABI between kernel versions, we can't really change it like this.
>
> I think the 'low' region is an optional-extra, that is never mapped by the first kernel. I
> think the simplest thing to do is to add an 'linux,low-memory-range' that we
> memblock_add() after memblock_cap_memory_range() has been called.
> If its missing, or the new kernel doesn't know what its for, everything keeps working.
>
>> Besides, we need to modify kexec-tools:
>> arm64: support more than one crash kernel regions(see [1])
>
>> I post this patch series about one month ago. The previous changes and
>> discussions can be retrived from:
>
> Ah, this wasn't obvious as you've stopped numbering the series. Please label the next one
> 'v6' so that we can describe this as 'v5'. (duplicate numbering would be even more confusing!)

OK.

> Thanks,
>
> James

Thanks,
Chen Zhou
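The first and third steps of the flow discussed above (handle the low reservation first, then pick the allocation limit for the main region) can be modelled in a few lines. This Python is a rough sketch, not the arm64 implementation: `parse_size`, `parse_crashkernel` and `crash_alloc_limit` are hypothetical helpers, the parser ignores the range syntax that the real `crashkernel=` option accepts, and `ARCH_LOW_ADDRESS_LIMIT` is simply assumed to be 4 GiB here.

```python
MiB = 1 << 20
GiB = 1 << 30

# ASSUMPTION: illustrative value only; the real limit is architecture-defined.
ARCH_LOW_ADDRESS_LIMIT = 4 * GiB


def parse_size(text):
    """Parse sizes such as '256M' or '1G' as written on the command line."""
    units = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
    suffix = text[-1].upper()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)


def parse_crashkernel(cmdline):
    """Return (main_size, low_size) in bytes.

    Handles crashkernel=X, crashkernel=X@offset, crashkernel=X,high and
    crashkernel=Y,low. Anything else (e.g. range syntax) is out of scope.
    """
    main = low = 0
    for word in cmdline.split():
        if not word.startswith("crashkernel="):
            continue
        value = word[len("crashkernel="):]
        size_text, _, variant = value.partition(",")
        size = parse_size(size_text.split("@")[0])
        if variant == "low":
            low = size
        else:
            main = size
    return main, low


def crash_alloc_limit(cmdline, end_of_dram):
    """Upper bound for placing the main crash kernel region.

    If a low region was requested (and so reserved first), the main region
    may go anywhere up to the end of DRAM; otherwise it stays in low memory.
    """
    _, low = parse_crashkernel(cmdline)
    return end_of_dram if low > 0 else ARCH_LOW_ADDRESS_LIMIT
```

For example, `crash_alloc_limit("crashkernel=886M crashkernel=256M,low", 48 * GiB)` yields the end of DRAM, while the same call without the `,low` option falls back to `ARCH_LOW_ADDRESS_LIMIT`, so existing command lines keep their current behaviour.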
Hi Chen Zhou,

On 13/06/2019 12:27, Chen Zhou wrote:
> On 2019/6/6 0:32, James Morse wrote:
>> On 07/05/2019 04:50, Chen Zhou wrote:
>>> We use crashkernel=X to reserve crashkernel below 4G, which will fail
>>> when there is no enough memory. Currently, crashkernel=Y@X can be used
>>> to reserve crashkernel above 4G, in this case, if swiotlb or DMA buffers
>>> are requierd, capture kernel will boot failure because of no low memory.
>>
>>> When crashkernel is reserved above 4G in memory, kernel should reserve
>>> some amount of low memory for swiotlb and some DMA buffers. So there may
>>> be two crash kernel regions, one is below 4G, the other is above 4G.
>>
>> This is a good argument for supporting the 'crashkernel=...,low' version.
>> What is the 'crashkernel=...,high' version for?
>>
>> Wouldn't it be simpler to relax the ARCH_LOW_ADDRESS_LIMIT if we see 'crashkernel=...,low'
>> in the kernel cmdline?
>>
>> I don't see what the 'crashkernel=...,high' variant is giving us, it just complicates the
>> flow of reserve_crashkernel().
>>
>> If we called reserve_crashkernel_low() at the beginning of reserve_crashkernel() we could
>> use crashk_low_res.end to change some limit variable from ARCH_LOW_ADDRESS_LIMIT to
>> memblock_end_of_DRAM().
>> I think this is a simpler change that gives you what you want.
>
> According to your suggestions, we should do like this:
> 1. call reserve_crashkernel_low() at the beginning of reserve_crashkernel()
> 2. mark the low region as 'nomap'
> 3. use crashk_low_res.end to change some limit variable from ARCH_LOW_ADDRESS_LIMIT to
> memblock_end_of_DRAM()
> 4. rename crashk_low_res as "Crash kernel (low)" for arm64
> 5. add an 'linux,low-memory-range' node in DT

(This bit would happen in kexec-tools)

> Do i understand correctly?

Yes, I think this is simpler and still gives you what you want. It also leaves the existing behaviour unchanged, which helps with keeping compatibility with existing user-space and older kdump kernels.

Thanks,

James
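The compatibility argument for a separate `linux,low-memory-range` property can be illustrated with a toy model of how a crash dump kernel might consume `/chosen`. This is a sketch only: `usable_memory` and the dictionary representation of `/chosen` are assumptions made for illustration, and `linux,low-memory-range` is the name proposed in this thread, not a binding that is claimed to exist. The region values reuse the reservation reported earlier in the thread.

```python
MiB = 1 << 20

HIGH = (0x0000000BC5A00000, 886 * MiB)   # main crash kernel region
LOW = (3560 * MiB, 256 * MiB)            # optional low region


def usable_memory(chosen, understands_low_range):
    """Memory a crash dump kernel would end up using.

    `chosen` maps property names to (base, size) tuples.
    `understands_low_range` models whether the kernel has code for the
    proposed 'linux,low-memory-range' property.
    """
    base, size = chosen["linux,usable-memory-range"]
    memory = [(base, size)]                  # stands in for memblock_cap_memory_range()
    low = chosen.get("linux,low-memory-range")
    if understands_low_range and low is not None:
        memory.append(tuple(low))            # stands in for memblock_add() after the cap
    return sorted(memory)


chosen_with_low = {
    "linux,usable-memory-range": HIGH,
    "linux,low-memory-range": LOW,
}
chosen_without_low = {"linux,usable-memory-range": HIGH}

old_kernel_view = usable_memory(chosen_with_low, understands_low_range=False)
new_kernel_view = usable_memory(chosen_with_low, understands_low_range=True)
new_kernel_old_tools = usable_memory(chosen_without_low, understands_low_range=True)
```

In this model an older kernel ignores the extra property and still sees the single range it expects, a newer kernel with older kexec-tools also keeps working with one range, and only the combination of both sides being updated yields the additional low region.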