Message ID | 1515080048-9154-1-git-send-email-poonam.aggrwal@nxp.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Poonam, On 04/01/18 15:34, Poonam Aggrwal wrote: > Address for both crashkernel memory and elfcorehdr can be assigned > statically. For crashkernel memory it is via crashkernel=SIZE@ADDRESS > and elfcorehdr_addr via by kexec-utils in dump kernel device tree. There are some crashkernel=SIZE@ADDRESS values that the user can supply that we must reject. The obvious one is if it overlaps with the kernel text. (this patch won't spot this). We need to read the hardware's reserved regions from the DT before we allocate the crashkernel region, for example if the bootloader reserved a chunk of memory for a frame-buffer, I shouldn't be able to use that as crashkernel memory. (you shouldn't need to specify an address, doing so prevents the kernel from picking memory it can use) > So memory should be reserved for both the above areas before any > other memory reservations are done. Otherwise overlaps may happen and > memory allocations will fail leading to failure in booting the > dump capture kernel. > Signed-off-by: Guanhua <guanhua.gao@nxp.com> The first signed-off-by should be the patch author. If you want to credit your colleagues you can use 'CC', or they can give reviewed-by/acked-by to the patch. > Signed-off-by: Poonam Aggrwal <poonam.aggrwal@nxp.com> > Signed-off-by: Abhimanyu Saini <abhimanyu.saini@nxp.com> And the last signed-of-by should be from the person posting to the mailing list. > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > index 00e7b90..24ce15d 100644 > --- a/arch/arm64/mm/init.c > +++ b/arch/arm64/mm/init.c > @@ -453,6 +453,14 @@ void __init arm64_memblock_init(void) > * Register the kernel text, kernel data, initrd, and initial > * pagetables with memblock. > */ Please don't insert this code between 'memblock_reserve(__pa_symbol(_text), _end - _text);' and the comment that describes it. > + > + /* make these the first reservations to avoid chances of > + * overlap > + */ Nit: comment style > + reserve_elfcorehdr(); (Moving reserve_elfcorehdr() makes sense, but..) > + reserve_crashkernel(); reserve_crashkernel() does the allocation for the crashkernels reserved memory. I expect this to always fail in the kdump kernel because there isn't enough memory. (fdt_enforce_memory_region() at the top of this function calls memblock_cap_memory_range()). Moving this allocation above the early_init_fdt_scan_reserved_mem() below means we may allocate memory for the crashdump that is in use by firmware/hardware and described as reserved in the DT. > memblock_reserve(__pa_symbol(_text), _end - _text); > #ifdef CONFIG_BLK_DEV_INITRD > if (initrd_start) { > @@ -472,10 +480,6 @@ void __init arm64_memblock_init(void) > else > arm64_dma_phys_limit = PHYS_MASK + 1; > > - reserve_crashkernel(); > - > - reserve_elfcorehdr(); > - > high_memory = __va(memblock_end_of_DRAM() - 1) + 1; > > dma_contiguous_reserve(arm64_dma_phys_limit); > Thanks, James
Thanks James for the feedback. Please find replies inline. Regards Poonam > -----Original Message----- > From: James Morse [mailto:james.morse@arm.com] > Sent: Friday, January 05, 2018 4:37 PM > To: Poonam Aggrwal <poonam.aggrwal@nxp.com>; linux-arm- > kernel@lists.infradead.org > Cc: takahiro.akashi@linaro.org; will.deacon@arm.com; G.h. Gao > <guanhua.gao@nxp.com>; Abhimanyu Saini <abhimanyu.saini@nxp.com> > Subject: Re: [PATCH][v2] arm64: Allocate elfcorehdr & crashkernel mem before > any reservation > > Hi Poonam, > > On 04/01/18 15:34, Poonam Aggrwal wrote: > > Address for both crashkernel memory and elfcorehdr can be assigned > > statically. For crashkernel memory it is via crashkernel=SIZE@ADDRESS > > and elfcorehdr_addr via by kexec-utils in dump kernel device tree. > > There are some crashkernel=SIZE@ADDRESS values that the user can supply that > we must reject. The obvious one is if it overlaps with the kernel text. (this patch > won't spot this). We need to read the hardware's reserved regions from the DT > before we allocate the crashkernel region, for example if the bootloader > reserved a chunk of memory for a frame-buffer, I shouldn't be able to use that > as crashkernel memory. > > (you shouldn't need to specify an address, doing so prevents the kernel from > picking memory it can use) > > > > So memory should be reserved for both the above areas before any other > > memory reservations are done. Otherwise overlaps may happen and memory > > allocations will fail leading to failure in booting the dump capture > > kernel. > > > Signed-off-by: Guanhua <guanhua.gao@nxp.com> > > The first signed-off-by should be the patch author. > If you want to credit your colleagues you can use 'CC', or they can give > reviewed-by/acked-by to the patch. > > > > Signed-off-by: Poonam Aggrwal <poonam.aggrwal@nxp.com> > > Signed-off-by: Abhimanyu Saini <abhimanyu.saini@nxp.com> > > And the last signed-of-by should be from the person posting to the mailing list. > Sure will correct this. > > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index > > 00e7b90..24ce15d 100644 > > --- a/arch/arm64/mm/init.c > > +++ b/arch/arm64/mm/init.c > > @@ -453,6 +453,14 @@ void __init arm64_memblock_init(void) > > * Register the kernel text, kernel data, initrd, and initial > > * pagetables with memblock. > > */ > > Please don't insert this code between 'memblock_reserve(__pa_symbol(_text), > _end > - _text);' and the comment that describes it. Thanks for catching this, will fix it. > > > > + > > + /* make these the first reservations to avoid chances of > > + * overlap > > + */ > > Nit: comment style Right, will fix it. > > > > + reserve_elfcorehdr(); > > (Moving reserve_elfcorehdr() makes sense, but..) > > > > + reserve_crashkernel(); > > reserve_crashkernel() does the allocation for the crashkernels reserved memory. > I expect this to always fail in the kdump kernel because there isn't enough > memory. (fdt_enforce_memory_region() at the top of this function calls > memblock_cap_memory_range()). > > Moving this allocation above the early_init_fdt_scan_reserved_mem() below > means we may allocate memory for the crashdump that is in use by > firmware/hardware and described as reserved in the DT. Yeah, this is a good point. So ideally the address of the crash kernel should be diligently provided by the user based on the system. > > > > memblock_reserve(__pa_symbol(_text), _end - _text); #ifdef > > CONFIG_BLK_DEV_INITRD > > if (initrd_start) { > > @@ -472,10 +480,6 @@ void __init arm64_memblock_init(void) > > else > > arm64_dma_phys_limit = PHYS_MASK + 1; > > > > - reserve_crashkernel(); > > - > > - reserve_elfcorehdr(); > > - > > high_memory = __va(memblock_end_of_DRAM() - 1) + 1; > > > > dma_contiguous_reserve(arm64_dma_phys_limit); > > > > > Thanks, > > James
Hi Poonam, On 08/01/18 04:31, Poonam Aggrwal wrote: > James Morse wrote: >> On 04/01/18 15:34, Poonam Aggrwal wrote: >>> Address for both crashkernel memory and elfcorehdr can be assigned >>> statically. For crashkernel memory it is via crashkernel=SIZE@ADDRESS >>> and elfcorehdr_addr via by kexec-utils in dump kernel device tree. >> >> There are some crashkernel=SIZE@ADDRESS values that the user can supply that >> we must reject. The obvious one is if it overlaps with the kernel text. (this patch >> won't spot this). We need to read the hardware's reserved regions from the DT >> before we allocate the crashkernel region, for example if the bootloader >> reserved a chunk of memory for a frame-buffer, I shouldn't be able to use that >> as crashkernel memory. >> >> (you shouldn't need to specify an address, doing so prevents the kernel from >> picking memory it can use) >> >> >>> So memory should be reserved for both the above areas before any other >>> memory reservations are done. Otherwise overlaps may happen and memory >>> allocations will fail leading to failure in booting the dump capture >>> kernel. >>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index >>> 00e7b90..24ce15d 100644 >>> --- a/arch/arm64/mm/init.c >>> +++ b/arch/arm64/mm/init.c >>> @@ -453,6 +453,14 @@ void __init arm64_memblock_init(void) >>> + reserve_elfcorehdr(); >> >> (Moving reserve_elfcorehdr() makes sense, but..) >> >> >>> + reserve_crashkernel(); >> >> reserve_crashkernel() does the allocation for the crashkernels reserved memory. >> I expect this to always fail in the kdump kernel because there isn't enough >> memory. (fdt_enforce_memory_region() at the top of this function calls >> memblock_cap_memory_range()). >> >> Moving this allocation above the early_init_fdt_scan_reserved_mem() below >> means we may allocate memory for the crashdump that is in use by >> firmware/hardware and described as reserved in the DT. > Yeah, this is a good point. So ideally the address of the crash kernel should > be diligently provided by the user based on the system. Even better: the region to store the crash kernel in should be chosen by the kernel. When using kdump I boot with 'crashkernel=1G', the kernel chooses where to place the reserved region. Even if I specified a reasonable physical address, the efistub may relocate the kernel over the top during boot as part of its KASLR work. (Why does anyone ever need to specify an offset here?) Thanks, James
Hi James Regards Poonam > -----Original Message----- > From: James Morse [mailto:james.morse@arm.com] > Sent: Friday, January 12, 2018 11:29 PM > To: Poonam Aggrwal <poonam.aggrwal@nxp.com> > Cc: linux-arm-kernel@lists.infradead.org; takahiro.akashi@linaro.org; > will.deacon@arm.com; G.h. Gao <guanhua.gao@nxp.com>; Abhimanyu Saini > <abhimanyu.saini@nxp.com>; kexec@lists.infradead.org > Subject: Re: [PATCH][v2] arm64: Allocate elfcorehdr & crashkernel mem before > any reservation > > Hi Poonam, > > On 08/01/18 04:31, Poonam Aggrwal wrote: > > James Morse wrote: > >> On 04/01/18 15:34, Poonam Aggrwal wrote: > >>> Address for both crashkernel memory and elfcorehdr can be assigned > >>> statically. For crashkernel memory it is via > >>> crashkernel=SIZE@ADDRESS and elfcorehdr_addr via by kexec-utils in dump > kernel device tree. > >> > >> There are some crashkernel=SIZE@ADDRESS values that the user can > >> supply that we must reject. The obvious one is if it overlaps with > >> the kernel text. (this patch won't spot this). We need to read the > >> hardware's reserved regions from the DT before we allocate the > >> crashkernel region, for example if the bootloader reserved a chunk of > >> memory for a frame-buffer, I shouldn't be able to use that as crashkernel > memory. > >> > >> (you shouldn't need to specify an address, doing so prevents the > >> kernel from picking memory it can use) > >> > >> > >>> So memory should be reserved for both the above areas before any > >>> other memory reservations are done. Otherwise overlaps may happen > >>> and memory allocations will fail leading to failure in booting the > >>> dump capture kernel. > > >>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index > >>> 00e7b90..24ce15d 100644 > >>> --- a/arch/arm64/mm/init.c > >>> +++ b/arch/arm64/mm/init.c > >>> @@ -453,6 +453,14 @@ void __init arm64_memblock_init(void) > > >>> + reserve_elfcorehdr(); > >> > >> (Moving reserve_elfcorehdr() makes sense, but..) > >> > >> > >>> + reserve_crashkernel(); > >> > >> reserve_crashkernel() does the allocation for the crashkernels reserved > memory. > >> I expect this to always fail in the kdump kernel because there isn't > >> enough memory. (fdt_enforce_memory_region() at the top of this > >> function calls memblock_cap_memory_range()). > >> > >> Moving this allocation above the early_init_fdt_scan_reserved_mem() > >> below means we may allocate memory for the crashdump that is in use > >> by firmware/hardware and described as reserved in the DT. > > > Yeah, this is a good point. So ideally the address of the crash kernel > > should be diligently provided by the user based on the system. > > Even better: the region to store the crash kernel in should be chosen by the > kernel. > When using kdump I boot with 'crashkernel=1G', the kernel chooses where to > place the reserved region. Right. This is a commonly used. Even if I specified a reasonable physical address, the > efistub may relocate the kernel over the top during boot as part of its KASLR > work. Agree > > (Why does anyone ever need to specify an offset here?) offset is normally an optional argument. Request Takahiro to provide his inputs. Does this imply any updates in kexec design/implementation/documentation? > > > Thanks, > > James
Hi Poonam, James On Sat, Jan 13, 2018 at 8:37 AM, Poonam Aggrwal <poonam.aggrwal@nxp.com> wrote: > Hi James > > Regards > Poonam > >> -----Original Message----- >> From: James Morse [mailto:james.morse@arm.com] >> Sent: Friday, January 12, 2018 11:29 PM >> To: Poonam Aggrwal <poonam.aggrwal@nxp.com> >> Cc: linux-arm-kernel@lists.infradead.org; takahiro.akashi@linaro.org; >> will.deacon@arm.com; G.h. Gao <guanhua.gao@nxp.com>; Abhimanyu Saini >> <abhimanyu.saini@nxp.com>; kexec@lists.infradead.org >> Subject: Re: [PATCH][v2] arm64: Allocate elfcorehdr & crashkernel mem before >> any reservation >> >> Hi Poonam, >> >> On 08/01/18 04:31, Poonam Aggrwal wrote: >> > James Morse wrote: >> >> On 04/01/18 15:34, Poonam Aggrwal wrote: >> >>> Address for both crashkernel memory and elfcorehdr can be assigned >> >>> statically. For crashkernel memory it is via >> >>> crashkernel=SIZE@ADDRESS and elfcorehdr_addr via by kexec-utils in dump >> kernel device tree. >> >> >> >> There are some crashkernel=SIZE@ADDRESS values that the user can >> >> supply that we must reject. The obvious one is if it overlaps with >> >> the kernel text. (this patch won't spot this). We need to read the >> >> hardware's reserved regions from the DT before we allocate the >> >> crashkernel region, for example if the bootloader reserved a chunk of >> >> memory for a frame-buffer, I shouldn't be able to use that as crashkernel >> memory. >> >> >> >> (you shouldn't need to specify an address, doing so prevents the >> >> kernel from picking memory it can use) >> >> >> >> >> >>> So memory should be reserved for both the above areas before any >> >>> other memory reservations are done. Otherwise overlaps may happen >> >>> and memory allocations will fail leading to failure in booting the >> >>> dump capture kernel. >> >> >>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index >> >>> 00e7b90..24ce15d 100644 >> >>> --- a/arch/arm64/mm/init.c >> >>> +++ b/arch/arm64/mm/init.c >> >>> @@ -453,6 +453,14 @@ void __init arm64_memblock_init(void) >> >> >>> + reserve_elfcorehdr(); >> >> >> >> (Moving reserve_elfcorehdr() makes sense, but..) >> >> >> >> >> >>> + reserve_crashkernel(); >> >> >> >> reserve_crashkernel() does the allocation for the crashkernels reserved >> memory. >> >> I expect this to always fail in the kdump kernel because there isn't >> >> enough memory. (fdt_enforce_memory_region() at the top of this >> >> function calls memblock_cap_memory_range()). >> >> >> >> Moving this allocation above the early_init_fdt_scan_reserved_mem() >> >> below means we may allocate memory for the crashdump that is in use >> >> by firmware/hardware and described as reserved in the DT. >> >> > Yeah, this is a good point. So ideally the address of the crash kernel >> > should be diligently provided by the user based on the system. >> >> Even better: the region to store the crash kernel in should be chosen by the >> kernel. >> When using kdump I boot with 'crashkernel=1G', the kernel chooses where to >> place the reserved region. > Right. This is a commonly used. > Even if I specified a reasonable physical address, the >> efistub may relocate the kernel over the top during boot as part of its KASLR >> work. > Agree >> >> (Why does anyone ever need to specify an offset here?) > offset is normally an optional argument. Request Takahiro to provide his inputs. Does this imply any updates in kexec design/implementation/documentation? offset is a optional argument. For relocatable kernels (and kernels which support KASLR), specifying offset is normally not needed. Please refer to the 'Extended crashkernel syntax' documentation (<https://github.com/torvalds/linux/blob/master/Documentation/kdump/kdump.txt#L259>): Extended crashkernel syntax =========================== While the "crashkernel=size[@offset]" syntax is sufficient for most configurations, sometimes it's handy to have the reserved memory dependent on the value of System RAM -- that's mostly for distributors that pre-setup the kernel command line to avoid a unbootable system after some memory has been removed from the machine. The syntax is: crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset] range=start-[end] For example: crashkernel=512M-2G:64M,2G-:128M This would mean: 1) if the RAM is smaller than 512M, then don't reserve anything (this is the "rescue" case) 2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M 3) if the RAM size is larger than 2G, then reserve 128M As James mentioned for arm64, in case of relocatable/kaslr kernels, the efistub may relocate the kernel over the top during boot as part of its KASLR. So, the offset field may make more sense for non-relocatable/static kernels, but for newer kernels, its better to use the 'Extended crashkernel syntax' syntax which is also supported by newer distribution versions. For e.g. see ubuntu trusty kdump-config man page - <http://manpages.ubuntu.com/manpages/trusty/man8/kdump-config.8.html>: kdump kernel relocation address does not match crashkernel= parameter: For non-relocatable architectures, the kdump kernel must be built with a predetermined start address. This message indicates that the start address of the kdump kernel and the start address in the crashkernel= parameter do not match. Regards, Bhupesh
On Mon, Jan 15, 2018 at 10:14:05AM +0530, Bhupesh SHARMA wrote: > Hi Poonam, James > > On Sat, Jan 13, 2018 at 8:37 AM, Poonam Aggrwal <poonam.aggrwal@nxp.com> wrote: > > Hi James > > > > Regards > > Poonam > > > >> -----Original Message----- > >> From: James Morse [mailto:james.morse@arm.com] > >> Sent: Friday, January 12, 2018 11:29 PM > >> To: Poonam Aggrwal <poonam.aggrwal@nxp.com> > >> Cc: linux-arm-kernel@lists.infradead.org; takahiro.akashi@linaro.org; > >> will.deacon@arm.com; G.h. Gao <guanhua.gao@nxp.com>; Abhimanyu Saini > >> <abhimanyu.saini@nxp.com>; kexec@lists.infradead.org > >> Subject: Re: [PATCH][v2] arm64: Allocate elfcorehdr & crashkernel mem before > >> any reservation > >> > >> Hi Poonam, > >> > >> On 08/01/18 04:31, Poonam Aggrwal wrote: > >> > James Morse wrote: > >> >> On 04/01/18 15:34, Poonam Aggrwal wrote: > >> >>> Address for both crashkernel memory and elfcorehdr can be assigned > >> >>> statically. For crashkernel memory it is via > >> >>> crashkernel=SIZE@ADDRESS and elfcorehdr_addr via by kexec-utils in dump > >> kernel device tree. > >> >> > >> >> There are some crashkernel=SIZE@ADDRESS values that the user can > >> >> supply that we must reject. The obvious one is if it overlaps with > >> >> the kernel text. (this patch won't spot this). We need to read the > >> >> hardware's reserved regions from the DT before we allocate the > >> >> crashkernel region, for example if the bootloader reserved a chunk of > >> >> memory for a frame-buffer, I shouldn't be able to use that as crashkernel > >> memory. > >> >> > >> >> (you shouldn't need to specify an address, doing so prevents the > >> >> kernel from picking memory it can use) > >> >> > >> >> > >> >>> So memory should be reserved for both the above areas before any > >> >>> other memory reservations are done. Otherwise overlaps may happen > >> >>> and memory allocations will fail leading to failure in booting the > >> >>> dump capture kernel. > >> > >> >>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index > >> >>> 00e7b90..24ce15d 100644 > >> >>> --- a/arch/arm64/mm/init.c > >> >>> +++ b/arch/arm64/mm/init.c > >> >>> @@ -453,6 +453,14 @@ void __init arm64_memblock_init(void) > >> > >> >>> + reserve_elfcorehdr(); > >> >> > >> >> (Moving reserve_elfcorehdr() makes sense, but..) > >> >> > >> >> > >> >>> + reserve_crashkernel(); > >> >> > >> >> reserve_crashkernel() does the allocation for the crashkernels reserved > >> memory. > >> >> I expect this to always fail in the kdump kernel because there isn't > >> >> enough memory. (fdt_enforce_memory_region() at the top of this > >> >> function calls memblock_cap_memory_range()). > >> >> > >> >> Moving this allocation above the early_init_fdt_scan_reserved_mem() > >> >> below means we may allocate memory for the crashdump that is in use > >> >> by firmware/hardware and described as reserved in the DT. > >> > >> > Yeah, this is a good point. So ideally the address of the crash kernel > >> > should be diligently provided by the user based on the system. > >> > >> Even better: the region to store the crash kernel in should be chosen by the > >> kernel. > >> When using kdump I boot with 'crashkernel=1G', the kernel chooses where to > >> place the reserved region. > > Right. This is a commonly used. > > Even if I specified a reasonable physical address, the > >> efistub may relocate the kernel over the top during boot as part of its KASLR > >> work. > > Agree > >> > >> (Why does anyone ever need to specify an offset here?) > > offset is normally an optional argument. Request Takahiro to provide his inputs. Does this imply any updates in kexec design/implementation/documentation? > > offset is a optional argument. For relocatable kernels (and kernels > which support KASLR), specifying offset is normally not needed. > > Please refer to the 'Extended crashkernel syntax' documentation > (<https://github.com/torvalds/linux/blob/master/Documentation/kdump/kdump.txt#L259>): > > Extended crashkernel syntax > =========================== > > While the "crashkernel=size[@offset]" syntax is sufficient for most > configurations, sometimes it's handy to have the reserved memory dependent > on the value of System RAM -- that's mostly for distributors that pre-setup > the kernel command line to avoid a unbootable system after some memory has > been removed from the machine. > > The syntax is: > > crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset] > range=start-[end] > > For example: > > crashkernel=512M-2G:64M,2G-:128M > > This would mean: > > 1) if the RAM is smaller than 512M, then don't reserve anything > (this is the "rescue" case) > 2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M > 3) if the RAM size is larger than 2G, then reserve 128M > > As James mentioned for arm64, in case of relocatable/kaslr kernels, > the efistub may relocate the kernel over the top during boot as part > of its KASLR. It would be sad if we couldn't specify kaslr and kdump at the same time. Since kaslr will skip any of memory regions whose attributes are not CONVENTIONAL_MEMORY for allocating a relocated kernel image, we will be able to have a dedicated range of memory reserved for kdump. In this case, using an "offset" in "crashkernel=" will be crucial. (I don't know how we can notify uefi of the region though.) > So, the offset field may make more sense for > non-relocatable/static kernels, but for newer kernels, its better to > use the 'Extended crashkernel syntax' syntax which is also supported > by newer distribution versions. How better is it for the case? I don't know exactly what you mean by "newer kernel/distribution", but kdump on arm64 supports this feature from the day one. (It is basically independent from architectures.) Thanks, -Takahiro AKASHI > For e.g. see ubuntu trusty kdump-config man page - > <http://manpages.ubuntu.com/manpages/trusty/man8/kdump-config.8.html>: > > kdump kernel relocation address does not match crashkernel= parameter: > For non-relocatable architectures, the kdump kernel must be > built with a predetermined start address. This message > indicates that the start address of the kdump kernel and the > start address in the crashkernel= parameter do not match. > > Regards, > Bhupesh
Hello, On 16/01/18 07:07, takahiro.akashi@linaro.org wrote: > On Mon, Jan 15, 2018 at 10:14:05AM +0530, Bhupesh SHARMA wrote: >> On Sat, Jan 13, 2018 at 8:37 AM, Poonam Aggrwal <poonam.aggrwal@nxp.com> wrote: >>>> On 08/01/18 04:31, Poonam Aggrwal wrote: >>>>> Yeah, this is a good point. So ideally the address of the crash kernel >>>>> should be diligently provided by the user based on the system. >>>> Even better: the region to store the crash kernel in should be chosen by the >>>> kernel. >>>> When using kdump I boot with 'crashkernel=1G', the kernel chooses where to >>>> place the reserved region. >>>> Even if I specified a reasonable physical address, the >>>> efistub may relocate the kernel over the top during boot as part of its KASLR >>>> work. >>> Agree >>>> >>>> (Why does anyone ever need to specify an offset here?) >>> offset is normally an optional argument. Request Takahiro to provide his inputs. Does this imply any updates in kexec design/implementation/documentation? >> >> offset is a optional argument. For relocatable kernels (and kernels >> which support KASLR), specifying offset is normally not needed. >> >> Please refer to the 'Extended crashkernel syntax' documentation >> (<https://github.com/torvalds/linux/blob/master/Documentation/kdump/kdump.txt#L259>): >> >> Extended crashkernel syntax >> =========================== >> >> While the "crashkernel=size[@offset]" syntax is sufficient for most >> configurations, sometimes it's handy to have the reserved memory dependent >> on the value of System RAM -- that's mostly for distributors that pre-setup >> the kernel command line to avoid a unbootable system after some memory has >> been removed from the machine. This is to let the distribution provide a single value that works on machines with very little RAM, and machines that need gigabytes in order to boot. Where does specifying the offset/absolute-address come in? >> As James mentioned for arm64, in case of relocatable/kaslr kernels, >> the efistub may relocate the kernel over the top during boot as part >> of its KASLR. > > It would be sad if we couldn't specify kaslr and kdump at the same time. We can. The problem comes when you specify an absolute-address that should be reserved for user-space to eventually load the kdump kernel into. This is fragile for a number of reasons. > Since kaslr will skip any of memory regions whose attributes are not > CONVENTIONAL_MEMORY for allocating a relocated kernel image, we will be > able to have a dedicated range of memory reserved for kdump. > In this case, using an "offset" in "crashkernel=" will be crucial. > > (I don't know how we can notify uefi of the region though.) I'm confused. panic()->kdump:boot doesn't go via UEFI. It passes information in the DT:/chosen that may have been generated by the EFIStub, but it doesn't (and must not) change the EFI memory map. We need to decide where the crashdump kernel region is when the first-kernel generates its page tables, as the protect/unprotect mechanism wants to be able to unmap them. There is no reason for UEFI to know about the kdump region, BootServices are long-gone by the time its location has to be decided. >> So, the offset field may make more sense for >> non-relocatable/static kernels, but for newer kernels, its better to >> use the 'Extended crashkernel syntax' syntax which is also supported >> by newer distribution versions. (this extended crashkernel syntax looks like a tangent: its about specifying one-value string to specify a reservation-size on both small-memory and large-memory machines). I don't think 'relocatable kernels' are relevant here. The KASLR series changed the kernel to no longer run from the linear map, so where in the linear map we allocate memory for the crash-kernel to boot from can't matter. These changes were merged before kexec/kdump support was added. Even before this change, (I recall that:) the kernel would discard memory below its text. This isn't a problem as kexec-tools (typically) locates the kernel at the bottom of the region, and physical memory outside this range isn't accessible anyway because of the "linux,usable-memory" property. When we do want to access it, we remap it using the vmcore helpers. I think this @offset must be for kernels that have to run from a physical/virtual address that is known at compile time. We don't have this problem on arm64, and specifying @offset makes kdump less reliable: | cannot reserve crashkernel: region overlaps reserved memory > How better is it for the case? > > I don't know exactly what you mean by "newer kernel/distribution", but > kdump on arm64 supports this feature from the day one. > (It is basically independent from architectures.) Support @offset? Yes, its core code allowing this. >> For e.g. see ubuntu trusty kdump-config man page - >> <http://manpages.ubuntu.com/manpages/trusty/man8/kdump-config.8.html>: >> >> kdump kernel relocation address does not match crashkernel= parameter: >> For non-relocatable architectures, the kdump kernel must be >> built with a predetermined start address. This message >> indicates that the start address of the kdump kernel and the >> start address in the crashkernel= parameter do not match. arm64 doesn't have this issue. The 'predetermined start address' is a relative value stored in the header. This is so the bootloader can place the kernel anywhere in memory and still have it boot. I think we're in the weeds here: adding @offset to your 'crashkernel=' cmdline option tells the kernel you know this address will be free and not-reserved when it comes to reservation time. This isn't generally true. Unless you wrote the DT and the bootloader, you can't know this. Wasn't this patch moving the elfcorehdr reservation up to be before any dynamic reservations, to prevent them overlapping? Thanks, James
Hi, On Fri, Jan 19, 2018 at 5:46 PM, James Morse <james.morse@arm.com> wrote: > Hello, > > On 16/01/18 07:07, takahiro.akashi@linaro.org wrote: >> On Mon, Jan 15, 2018 at 10:14:05AM +0530, Bhupesh SHARMA wrote: >>> On Sat, Jan 13, 2018 at 8:37 AM, Poonam Aggrwal <poonam.aggrwal@nxp.com> wrote: >>>>> On 08/01/18 04:31, Poonam Aggrwal wrote: >>>>>> Yeah, this is a good point. So ideally the address of the crash kernel >>>>>> should be diligently provided by the user based on the system. > >>>>> Even better: the region to store the crash kernel in should be chosen by the >>>>> kernel. >>>>> When using kdump I boot with 'crashkernel=1G', the kernel chooses where to >>>>> place the reserved region. >>>>> Even if I specified a reasonable physical address, the >>>>> efistub may relocate the kernel over the top during boot as part of its KASLR >>>>> work. > >>>> Agree >>>>> >>>>> (Why does anyone ever need to specify an offset here?) >>>> offset is normally an optional argument. Request Takahiro to provide his inputs. Does this imply any updates in kexec design/implementation/documentation? >>> >>> offset is a optional argument. For relocatable kernels (and kernels >>> which support KASLR), specifying offset is normally not needed. >>> >>> Please refer to the 'Extended crashkernel syntax' documentation >>> (<https://github.com/torvalds/linux/blob/master/Documentation/kdump/kdump.txt#L259>): >>> >>> Extended crashkernel syntax >>> =========================== >>> >>> While the "crashkernel=size[@offset]" syntax is sufficient for most >>> configurations, sometimes it's handy to have the reserved memory dependent >>> on the value of System RAM -- that's mostly for distributors that pre-setup >>> the kernel command line to avoid a unbootable system after some memory has >>> been removed from the machine. > > This is to let the distribution provide a single value that works on machines > with very little RAM, and machines that need gigabytes in order to boot. > > Where does specifying the offset/absolute-address come in? It comes into picture where we know at compile time where we want to place the crashkernel at. I remember using it on a few arm 32-bit machines but they had a static and well defined layout of the memory map from the available system RAM, and required crashkernel to be placed at predefined offset to prevent 'undefined' behaviour. I don't think this is relevant though now with the newer KASLR enabled kernels where we can RANDOMIZE the _text and also the module load area. So using a static offset value in this case would be risky to say the least ... > >>> As James mentioned for arm64, in case of relocatable/kaslr kernels, >>> the efistub may relocate the kernel over the top during boot as part >>> of its KASLR. >> >> It would be sad if we couldn't specify kaslr and kdump at the same time. > > We can. The problem comes when you specify an absolute-address that should be > reserved for user-space to eventually load the kdump kernel into. This is > fragile for a number of reasons. ... Fedora allows using kdump and KASLR both, so I am not sure that's a problem. >> Since kaslr will skip any of memory regions whose attributes are not >> CONVENTIONAL_MEMORY for allocating a relocated kernel image, we will be >> able to have a dedicated range of memory reserved for kdump. >> In this case, using an "offset" in "crashkernel=" will be crucial. >> >> (I don't know how we can notify uefi of the region though.) > > I'm confused. panic()->kdump:boot doesn't go via UEFI. It passes information in > the DT:/chosen that may have been generated by the EFIStub, but it doesn't (and > must not) change the EFI memory map. > > We need to decide where the crashdump kernel region is when the first-kernel > generates its page tables, as the protect/unprotect mechanism wants to be able > to unmap them. > There is no reason for UEFI to know about the kdump region, BootServices are > long-gone by the time its location has to be decided. >>> So, the offset field may make more sense for >>> non-relocatable/static kernels, but for newer kernels, its better to >>> use the 'Extended crashkernel syntax' syntax which is also supported >>> by newer distribution versions. > > (this extended crashkernel syntax looks like a tangent: its about specifying > one-value string to specify a reservation-size on both small-memory and > large-memory machines). > > I don't think 'relocatable kernels' are relevant here. The KASLR series changed > the kernel to no longer run from the linear map, so where in the linear map we > allocate memory for the crash-kernel to boot from can't matter. These changes > were merged before kexec/kdump support was added. Agreed. > Even before this change, (I recall that:) the kernel would discard memory below > its text. This isn't a problem as kexec-tools (typically) locates the kernel at > the bottom of the region, and physical memory outside this range isn't > accessible anyway because of the "linux,usable-memory" property. When we do want > to access it, we remap it using the vmcore helpers. > > I think this @offset must be for kernels that have to run from a > physical/virtual address that is known at compile time. We don't have this > problem on arm64, and specifying @offset makes kdump less reliable: > | cannot reserve crashkernel: region overlaps reserved memory > > >> How better is it for the case? >> >> I don't know exactly what you mean by "newer kernel/distribution", but >> kdump on arm64 supports this feature from the day one. >> (It is basically independent from architectures.) > > Support @offset? Yes, its core code allowing this. > > >>> For e.g. see ubuntu trusty kdump-config man page - >>> <http://manpages.ubuntu.com/manpages/trusty/man8/kdump-config.8.html>: >>> >>> kdump kernel relocation address does not match crashkernel= parameter: >>> For non-relocatable architectures, the kdump kernel must be >>> built with a predetermined start address. This message >>> indicates that the start address of the kdump kernel and the >>> start address in the crashkernel= parameter do not match. > > arm64 doesn't have this issue. The 'predetermined start address' is a relative > value stored in the header. This is so the bootloader can place the kernel > anywhere in memory and still have it boot. > > I think we're in the weeds here: adding @offset to your 'crashkernel=' cmdline > option tells the kernel you know this address will be free and not-reserved when > it comes to reservation time. This isn't generally true. > Unless you wrote the DT and the bootloader, you can't know this. Agreed. See my comments above. What I meant was that using @offset with newer kernels/distributions which support KASLR can lead to undefined behavior. What I tried to capture from the ubuntu trusty man-page was that "kdump kernel must be built with a predetermined start address for non-relocatable architectures". Normally we use the same primary kernel as the kdump kernel, so if the primary kernels is Relocatable + KASLR enabled, the kdump kernel would support the same as well. In that case assuming a fixed offset at which the kdump kernel can be loaded at or assuming a static start address for the kdump kernel can lead to issues. Regards, Bhupesh > Wasn't this patch moving the elfcorehdr reservation up to be before any dynamic > reservations, to prevent them overlapping? > > > Thanks, > > James
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 00e7b90..24ce15d 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -453,6 +453,14 @@ void __init arm64_memblock_init(void) * Register the kernel text, kernel data, initrd, and initial * pagetables with memblock. */ + + /* make these the first reservations to avoid chances of + * overlap + */ + reserve_elfcorehdr(); + + reserve_crashkernel(); + memblock_reserve(__pa_symbol(_text), _end - _text); #ifdef CONFIG_BLK_DEV_INITRD if (initrd_start) { @@ -472,10 +480,6 @@ void __init arm64_memblock_init(void) else arm64_dma_phys_limit = PHYS_MASK + 1; - reserve_crashkernel(); - - reserve_elfcorehdr(); - high_memory = __va(memblock_end_of_DRAM() - 1) + 1; dma_contiguous_reserve(arm64_dma_phys_limit);