diff mbox series

[v2] arm64: Expand the static memblock memory table

Message ID 20220517114309.10228-1-zhouguanghui1@huawei.com (mailing list archive)
State New
Headers show
Series [v2] arm64: Expand the static memblock memory table | expand

Commit Message

Zhou Guanghui May 17, 2022, 11:43 a.m. UTC
In a system using HBM, a multi-bit ECC error occurs, and the BIOS
saves the corresponding area (for example, 2 MB). When the system
restarts next time, these areas are isolated and not reported or
reported as EFI_UNUSABLE_MEMORY. Both of them lead to an increase
in the number of memblocks, whereas EFI_UNUSABLE_MEMORY leads to
a larger number of memblocks.

For example, if the EFI_UNUSABLE_MEMORY type is reported:
...
memory[0x92]    [0x0000200834a00000-0x0000200835bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
memory[0x93]    [0x0000200835c00000-0x0000200835dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
memory[0x94]    [0x0000200835e00000-0x00002008367fffff], 0x0000000000a00000 bytes on node 7 flags: 0x0
memory[0x95]    [0x0000200836800000-0x00002008369fffff], 0x0000000000200000 bytes on node 7 flags: 0x4
memory[0x96]    [0x0000200836a00000-0x0000200837bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
memory[0x97]    [0x0000200837c00000-0x0000200837dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
memory[0x98]    [0x0000200837e00000-0x000020087fffffff], 0x0000000048200000 bytes on node 7 flags: 0x0
memory[0x99]    [0x0000200880000000-0x0000200bcfffffff], 0x0000000350000000 bytes on node 6 flags: 0x0
memory[0x9a]    [0x0000200bd0000000-0x0000200bd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
memory[0x9b]    [0x0000200bd0200000-0x0000200bd07fffff], 0x0000000000600000 bytes on node 6 flags: 0x0
memory[0x9c]    [0x0000200bd0800000-0x0000200bd09fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
memory[0x9d]    [0x0000200bd0a00000-0x0000200fcfffffff], 0x00000003ff600000 bytes on node 6 flags: 0x0
memory[0x9e]    [0x0000200fd0000000-0x0000200fd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
memory[0x9f]    [0x0000200fd0200000-0x0000200fffffffff], 0x000000002fe00000 bytes on node 6 flags: 0x0
...

If the size of the init memblock regions is exceeded before the
array size can be resized, the excess memory will be lost.

Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
---
 arch/arm64/include/asm/memory.h |  9 +++++++++
 mm/memblock.c                   | 14 +++++++++-----
 2 files changed, 18 insertions(+), 5 deletions(-)

Comments

Mike Rapoport May 17, 2022, 12:38 p.m. UTC | #1
Hi,

On Tue, May 17, 2022 at 11:43:09AM +0000, Zhou Guanghui wrote:
> In a system using HBM, a multi-bit ECC error occurs, and the BIOS
> saves the corresponding area (for example, 2 MB). When the system
> restarts next time, these areas are isolated and not reported or
> reported as EFI_UNUSABLE_MEMORY. Both of them lead to an increase
> in the number of memblocks, whereas EFI_UNUSABLE_MEMORY leads to
> a larger number of memblocks.

I'd slightly rephrase the description here:

In a system using HBM, a multi-bit ECC error may occur, and the BIOS will
mark the corresponding area (for example, 2 MB)i as unusable.  When the
system restarts next time, these areas are not reported or reported as
EFI_UNUSABLE_MEMORY. Both cases lead to an increase in the number of
memblocks, whereas EFI_UNUSABLE_MEMORY leads to a larger number of
memblocks.
 
> For example, if the EFI_UNUSABLE_MEMORY type is reported:
> ...
> memory[0x92]    [0x0000200834a00000-0x0000200835bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
> memory[0x93]    [0x0000200835c00000-0x0000200835dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
> memory[0x94]    [0x0000200835e00000-0x00002008367fffff], 0x0000000000a00000 bytes on node 7 flags: 0x0
> memory[0x95]    [0x0000200836800000-0x00002008369fffff], 0x0000000000200000 bytes on node 7 flags: 0x4
> memory[0x96]    [0x0000200836a00000-0x0000200837bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
> memory[0x97]    [0x0000200837c00000-0x0000200837dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
> memory[0x98]    [0x0000200837e00000-0x000020087fffffff], 0x0000000048200000 bytes on node 7 flags: 0x0
> memory[0x99]    [0x0000200880000000-0x0000200bcfffffff], 0x0000000350000000 bytes on node 6 flags: 0x0
> memory[0x9a]    [0x0000200bd0000000-0x0000200bd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
> memory[0x9b]    [0x0000200bd0200000-0x0000200bd07fffff], 0x0000000000600000 bytes on node 6 flags: 0x0
> memory[0x9c]    [0x0000200bd0800000-0x0000200bd09fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
> memory[0x9d]    [0x0000200bd0a00000-0x0000200fcfffffff], 0x00000003ff600000 bytes on node 6 flags: 0x0
> memory[0x9e]    [0x0000200fd0000000-0x0000200fd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
> memory[0x9f]    [0x0000200fd0200000-0x0000200fffffffff], 0x000000002fe00000 bytes on node 6 flags: 0x0
> ...
> 
> If the size of the init memblock regions is exceeded before the
> array size can be resized, the excess memory will be lost.

And here I'd make it something like:

The EFI memory map is parsed to construct the memblock arrays before the
memblock arrays can be resized. As the result, memory regions beyond
INIT_MEMBLOCK_REGIONS are lost.

Allow overriding memblock.memory array size with architecture defined
INIT_MEMBLOCK_MEMORY_REGIONS and make arm64 to set
INIT_MEMBLOCK_MEMORY_REGIONS to 1024 when CONFIG_EFI is enabled.

> 
> Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>

With changelog updates along those lines

Acked-by: Mike Rapoport <rppt@linux.ibm.com>

> ---
>  arch/arm64/include/asm/memory.h |  9 +++++++++
>  mm/memblock.c                   | 14 +++++++++-----
>  2 files changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 0af70d9abede..eda61c0389c4 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -364,6 +364,15 @@ void dump_mem_limit(void);
>  # define INIT_MEMBLOCK_RESERVED_REGIONS	(INIT_MEMBLOCK_REGIONS + NR_CPUS + 1)
>  #endif
>  
> +/*
> + * memory regions which marked with flag MEMBLOCK_NOMAP may divide a continuous
> + * memory block into multiple parts. As a result, the number of memory regions
> + * is large.
> + */
> +#ifdef CONFIG_EFI
> +#define INIT_MEMBLOCK_MEMORY_REGIONS	1024
> +#endif
> +
>  #include <asm-generic/memory_model.h>
>  
>  #endif /* __ASM_MEMORY_H */
> diff --git a/mm/memblock.c b/mm/memblock.c
> index e4f03a6e8e56..7c63571a69d7 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -29,6 +29,10 @@
>  # define INIT_MEMBLOCK_RESERVED_REGIONS		INIT_MEMBLOCK_REGIONS
>  #endif
>  
> +#ifndef INIT_MEMBLOCK_MEMORY_REGIONS
> +#define INIT_MEMBLOCK_MEMORY_REGIONS		INIT_MEMBLOCK_REGIONS
> +#endif
> +
>  /**
>   * DOC: memblock overview
>   *
> @@ -55,9 +59,9 @@
>   * the allocator metadata. The "memory" and "reserved" types are nicely
>   * wrapped with struct memblock. This structure is statically
>   * initialized at build time. The region arrays are initially sized to
> - * %INIT_MEMBLOCK_REGIONS for "memory" and %INIT_MEMBLOCK_RESERVED_REGIONS
> - * for "reserved". The region array for "physmem" is initially sized to
> - * %INIT_PHYSMEM_REGIONS.
> + * %INIT_MEMBLOCK_MEMORY_REGIONS for "memory" and
> + * %INIT_MEMBLOCK_RESERVED_REGIONS for "reserved". The region array
> + * for "physmem" is initially sized to %INIT_PHYSMEM_REGIONS.
>   * The memblock_allow_resize() enables automatic resizing of the region
>   * arrays during addition of new regions. This feature should be used
>   * with care so that memory allocated for the region array will not
> @@ -102,7 +106,7 @@ unsigned long min_low_pfn;
>  unsigned long max_pfn;
>  unsigned long long max_possible_pfn;
>  
> -static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;
> +static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_MEMORY_REGIONS] __initdata_memblock;
>  static struct memblock_region memblock_reserved_init_regions[INIT_MEMBLOCK_RESERVED_REGIONS] __initdata_memblock;
>  #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
>  static struct memblock_region memblock_physmem_init_regions[INIT_PHYSMEM_REGIONS];
> @@ -111,7 +115,7 @@ static struct memblock_region memblock_physmem_init_regions[INIT_PHYSMEM_REGIONS
>  struct memblock memblock __initdata_memblock = {
>  	.memory.regions		= memblock_memory_init_regions,
>  	.memory.cnt		= 1,	/* empty dummy entry */
> -	.memory.max		= INIT_MEMBLOCK_REGIONS,
> +	.memory.max		= INIT_MEMBLOCK_MEMORY_REGIONS,
>  	.memory.name		= "memory",
>  
>  	.reserved.regions	= memblock_reserved_init_regions,
> -- 
> 2.17.1
>
Anshuman Khandual May 18, 2022, 4:39 a.m. UTC | #2
Hi Zhou,

A small nit.

This changes generic memblock to accommodate arm64 specific scenario.
Keeping the subject line as 'mm/memblock: ...' might be better.

On 5/17/22 17:13, Zhou Guanghui wrote:
> In a system using HBM, a multi-bit ECC error occurs, and the BIOS
> saves the corresponding area (for example, 2 MB). When the system
> restarts next time, these areas are isolated and not reported or
> reported as EFI_UNUSABLE_MEMORY. Both of them lead to an increase

Which cases dont get reported rather than as EFI_UNUSABLE_MEMORY ? Is
this supported on arm64 platform via mainline kernel ?

> in the number of memblocks, whereas EFI_UNUSABLE_MEMORY leads to
> a larger number of memblocks.
> 
> For example, if the EFI_UNUSABLE_MEMORY type is reported:
> ...
> memory[0x92]    [0x0000200834a00000-0x0000200835bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
> memory[0x93]    [0x0000200835c00000-0x0000200835dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
> memory[0x94]    [0x0000200835e00000-0x00002008367fffff], 0x0000000000a00000 bytes on node 7 flags: 0x0
> memory[0x95]    [0x0000200836800000-0x00002008369fffff], 0x0000000000200000 bytes on node 7 flags: 0x4
> memory[0x96]    [0x0000200836a00000-0x0000200837bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
> memory[0x97]    [0x0000200837c00000-0x0000200837dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
> memory[0x98]    [0x0000200837e00000-0x000020087fffffff], 0x0000000048200000 bytes on node 7 flags: 0x0
> memory[0x99]    [0x0000200880000000-0x0000200bcfffffff], 0x0000000350000000 bytes on node 6 flags: 0x0
> memory[0x9a]    [0x0000200bd0000000-0x0000200bd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
> memory[0x9b]    [0x0000200bd0200000-0x0000200bd07fffff], 0x0000000000600000 bytes on node 6 flags: 0x0
> memory[0x9c]    [0x0000200bd0800000-0x0000200bd09fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
> memory[0x9d]    [0x0000200bd0a00000-0x0000200fcfffffff], 0x00000003ff600000 bytes on node 6 flags: 0x0
> memory[0x9e]    [0x0000200fd0000000-0x0000200fd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
> memory[0x9f]    [0x0000200fd0200000-0x0000200fffffffff], 0x000000002fe00000 bytes on node 6 flags: 0x0

Got it.

> ...
> 
> If the size of the init memblock regions is exceeded before the
> array size can be resized, the excess memory will be lost.

Could you please elaborate more on why additional memblock regions can
not be accommodated via memblock array resizing ?

> 
> Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
> ---
>  arch/arm64/include/asm/memory.h |  9 +++++++++
>  mm/memblock.c                   | 14 +++++++++-----
>  2 files changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 0af70d9abede..eda61c0389c4 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -364,6 +364,15 @@ void dump_mem_limit(void);
>  # define INIT_MEMBLOCK_RESERVED_REGIONS	(INIT_MEMBLOCK_REGIONS + NR_CPUS + 1)
>  #endif
>  
> +/*
> + * memory regions which marked with flag MEMBLOCK_NOMAP may divide a continuous
> + * memory block into multiple parts. As a result, the number of memory regions
> + * is large.
> + */
> +#ifdef CONFIG_EFI

Could not memblock regions tagged with MEMBLOCK_NOMAP flag not present
on non-EFI systems ? Just wondering, are there not some other scenarios
which will also require expanded static memblock array.

> +#define INIT_MEMBLOCK_MEMORY_REGIONS	1024
> +#endif
> +
>  #include <asm-generic/memory_model.h>
>  
>  #endif /* __ASM_MEMORY_H */
> diff --git a/mm/memblock.c b/mm/memblock.c
> index e4f03a6e8e56..7c63571a69d7 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -29,6 +29,10 @@
>  # define INIT_MEMBLOCK_RESERVED_REGIONS		INIT_MEMBLOCK_REGIONS
>  #endif
>  
> +#ifndef INIT_MEMBLOCK_MEMORY_REGIONS
> +#define INIT_MEMBLOCK_MEMORY_REGIONS		INIT_MEMBLOCK_REGIONS
> +#endif

Why create an additional macro INIT_MEMBLOCK_MEMORY_REGIONS ? Why cannot
INIT_MEMBLOCK_REGIONS be defined in the platform directly like the other
macro INIT_MEMBLOCK_RESERVED_REGIONS ?

> +
>  /**
>   * DOC: memblock overview
>   *
> @@ -55,9 +59,9 @@
>   * the allocator metadata. The "memory" and "reserved" types are nicely
>   * wrapped with struct memblock. This structure is statically
>   * initialized at build time. The region arrays are initially sized to
> - * %INIT_MEMBLOCK_REGIONS for "memory" and %INIT_MEMBLOCK_RESERVED_REGIONS
> - * for "reserved". The region array for "physmem" is initially sized to
> - * %INIT_PHYSMEM_REGIONS.
> + * %INIT_MEMBLOCK_MEMORY_REGIONS for "memory" and
> + * %INIT_MEMBLOCK_RESERVED_REGIONS for "reserved". The region array
> + * for "physmem" is initially sized to %INIT_PHYSMEM_REGIONS.
>   * The memblock_allow_resize() enables automatic resizing of the region
>   * arrays during addition of new regions. This feature should be used
>   * with care so that memory allocated for the region array will not
> @@ -102,7 +106,7 @@ unsigned long min_low_pfn;
>  unsigned long max_pfn;
>  unsigned long long max_possible_pfn;
>  
> -static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;
> +static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_MEMORY_REGIONS] __initdata_memblock;
>  static struct memblock_region memblock_reserved_init_regions[INIT_MEMBLOCK_RESERVED_REGIONS] __initdata_memblock;
>  #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
>  static struct memblock_region memblock_physmem_init_regions[INIT_PHYSMEM_REGIONS];
> @@ -111,7 +115,7 @@ static struct memblock_region memblock_physmem_init_regions[INIT_PHYSMEM_REGIONS
>  struct memblock memblock __initdata_memblock = {
>  	.memory.regions		= memblock_memory_init_regions,
>  	.memory.cnt		= 1,	/* empty dummy entry */
> -	.memory.max		= INIT_MEMBLOCK_REGIONS,
> +	.memory.max		= INIT_MEMBLOCK_MEMORY_REGIONS,
>  	.memory.name		= "memory",
>  
>  	.reserved.regions	= memblock_reserved_init_regions,

- Anshuman
Zhou Guanghui May 27, 2022, 8:56 a.m. UTC | #3
Hi Anshuman,

在 2022/5/18 12:40, Anshuman Khandual 写道:
> Hi Zhou,
> 
> A small nit.
> 
> This changes generic memblock to accommodate arm64 specific scenario.
> Keeping the subject line as 'mm/memblock: ...' might be better.
> 

I will add memblock to the subject line.

> On 5/17/22 17:13, Zhou Guanghui wrote:
>> In a system using HBM, a multi-bit ECC error occurs, and the BIOS
>> saves the corresponding area (for example, 2 MB). When the system
>> restarts next time, these areas are isolated and not reported or
>> reported as EFI_UNUSABLE_MEMORY. Both of them lead to an increase
> 
> Which cases dont get reported rather than as EFI_UNUSABLE_MEMORY ? Is
> this supported on arm64 platform via mainline kernel ?
> 

The BIOS determines how to report the memory area that cannot be used to 
the kernel. Do not report the memory area to the kernel or inform the 
kernel that the memory area is unusable.

>> in the number of memblocks, whereas EFI_UNUSABLE_MEMORY leads to
>> a larger number of memblocks.
>>
>> For example, if the EFI_UNUSABLE_MEMORY type is reported:
>> ...
>> memory[0x92]    [0x0000200834a00000-0x0000200835bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
>> memory[0x93]    [0x0000200835c00000-0x0000200835dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
>> memory[0x94]    [0x0000200835e00000-0x00002008367fffff], 0x0000000000a00000 bytes on node 7 flags: 0x0
>> memory[0x95]    [0x0000200836800000-0x00002008369fffff], 0x0000000000200000 bytes on node 7 flags: 0x4
>> memory[0x96]    [0x0000200836a00000-0x0000200837bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
>> memory[0x97]    [0x0000200837c00000-0x0000200837dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
>> memory[0x98]    [0x0000200837e00000-0x000020087fffffff], 0x0000000048200000 bytes on node 7 flags: 0x0
>> memory[0x99]    [0x0000200880000000-0x0000200bcfffffff], 0x0000000350000000 bytes on node 6 flags: 0x0
>> memory[0x9a]    [0x0000200bd0000000-0x0000200bd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
>> memory[0x9b]    [0x0000200bd0200000-0x0000200bd07fffff], 0x0000000000600000 bytes on node 6 flags: 0x0
>> memory[0x9c]    [0x0000200bd0800000-0x0000200bd09fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
>> memory[0x9d]    [0x0000200bd0a00000-0x0000200fcfffffff], 0x00000003ff600000 bytes on node 6 flags: 0x0
>> memory[0x9e]    [0x0000200fd0000000-0x0000200fd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
>> memory[0x9f]    [0x0000200fd0200000-0x0000200fffffffff], 0x000000002fe00000 bytes on node 6 flags: 0x0
> 
> Got it.
> 
>> ...
>>
>> If the size of the init memblock regions is exceeded before the
>> array size can be resized, the excess memory will be lost.
> 
> Could you please elaborate more on why additional memblock regions can
> not be accommodated via memblock array resizing ?
> 

As described in the memblock_double_array function: We don't allow 
resizing until we know about the reserved regions of memory that aren' 
not suitable for allocation.

>>
>> Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
>> ---
>>   arch/arm64/include/asm/memory.h |  9 +++++++++
>>   mm/memblock.c                   | 14 +++++++++-----
>>   2 files changed, 18 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>> index 0af70d9abede..eda61c0389c4 100644
>> --- a/arch/arm64/include/asm/memory.h
>> +++ b/arch/arm64/include/asm/memory.h
>> @@ -364,6 +364,15 @@ void dump_mem_limit(void);
>>   # define INIT_MEMBLOCK_RESERVED_REGIONS	(INIT_MEMBLOCK_REGIONS + NR_CPUS + 1)
>>   #endif
>>   
>> +/*
>> + * memory regions which marked with flag MEMBLOCK_NOMAP may divide a continuous
>> + * memory block into multiple parts. As a result, the number of memory regions
>> + * is large.
>> + */
>> +#ifdef CONFIG_EFI
> 
> Could not memblock regions tagged with MEMBLOCK_NOMAP flag not present
> on non-EFI systems ? Just wondering, are there not some other scenarios
> which will also require expanded static memblock array.

Systems using devicetree can also have "no-map" memory. However, in this 
case, the expanded static memblock array is required only when a large 
number of such no-map reserved memories are manually added. I don't know 
if any users will do that.

Documentation/devicetree/bindings/reserved-memory/reserved-memory.yaml

As to whether other scenarios also require expanded static memblock 
arrays, I really don't know.

> 
>> +#define INIT_MEMBLOCK_MEMORY_REGIONS	1024
>> +#endif
>> +
>>   #include <asm-generic/memory_model.h>
>>   
>>   #endif /* __ASM_MEMORY_H */
>> diff --git a/mm/memblock.c b/mm/memblock.c
>> index e4f03a6e8e56..7c63571a69d7 100644
>> --- a/mm/memblock.c
>> +++ b/mm/memblock.c
>> @@ -29,6 +29,10 @@
>>   # define INIT_MEMBLOCK_RESERVED_REGIONS		INIT_MEMBLOCK_REGIONS
>>   #endif
>>   
>> +#ifndef INIT_MEMBLOCK_MEMORY_REGIONS
>> +#define INIT_MEMBLOCK_MEMORY_REGIONS		INIT_MEMBLOCK_REGIONS
>> +#endif
> 
> Why create an additional macro INIT_MEMBLOCK_MEMORY_REGIONS ? Why cannot
> INIT_MEMBLOCK_REGIONS be defined in the platform directly like the other
> macro INIT_MEMBLOCK_RESERVED_REGIONS ?
> 

The number of reserved memblocks does not need to be increased.

>> +
>>   /**
>>    * DOC: memblock overview
>>    *
>> @@ -55,9 +59,9 @@
>>    * the allocator metadata. The "memory" and "reserved" types are nicely
>>    * wrapped with struct memblock. This structure is statically
>>    * initialized at build time. The region arrays are initially sized to
>> - * %INIT_MEMBLOCK_REGIONS for "memory" and %INIT_MEMBLOCK_RESERVED_REGIONS
>> - * for "reserved". The region array for "physmem" is initially sized to
>> - * %INIT_PHYSMEM_REGIONS.
>> + * %INIT_MEMBLOCK_MEMORY_REGIONS for "memory" and
>> + * %INIT_MEMBLOCK_RESERVED_REGIONS for "reserved". The region array
>> + * for "physmem" is initially sized to %INIT_PHYSMEM_REGIONS.
>>    * The memblock_allow_resize() enables automatic resizing of the region
>>    * arrays during addition of new regions. This feature should be used
>>    * with care so that memory allocated for the region array will not
>> @@ -102,7 +106,7 @@ unsigned long min_low_pfn;
>>   unsigned long max_pfn;
>>   unsigned long long max_possible_pfn;
>>   
>> -static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;
>> +static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_MEMORY_REGIONS] __initdata_memblock;
>>   static struct memblock_region memblock_reserved_init_regions[INIT_MEMBLOCK_RESERVED_REGIONS] __initdata_memblock;
>>   #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
>>   static struct memblock_region memblock_physmem_init_regions[INIT_PHYSMEM_REGIONS];
>> @@ -111,7 +115,7 @@ static struct memblock_region memblock_physmem_init_regions[INIT_PHYSMEM_REGIONS
>>   struct memblock memblock __initdata_memblock = {
>>   	.memory.regions		= memblock_memory_init_regions,
>>   	.memory.cnt		= 1,	/* empty dummy entry */
>> -	.memory.max		= INIT_MEMBLOCK_REGIONS,
>> +	.memory.max		= INIT_MEMBLOCK_MEMORY_REGIONS,
>>   	.memory.name		= "memory",
>>   
>>   	.reserved.regions	= memblock_reserved_init_regions,
> 
> - Anshuman
> 

Thanks, Anshuman
Anshuman Khandual June 3, 2022, 10:54 a.m. UTC | #4
On 5/27/22 14:26, Zhouguanghui (OS Kernel) wrote:
> Hi Anshuman,
> 
> 在 2022/5/18 12:40, Anshuman Khandual 写道:
>> Hi Zhou,
>>
>> A small nit.
>>
>> This changes generic memblock to accommodate arm64 specific scenario.
>> Keeping the subject line as 'mm/memblock: ...' might be better.
>>
> 
> I will add memblock to the subject line.
> 
>> On 5/17/22 17:13, Zhou Guanghui wrote:
>>> In a system using HBM, a multi-bit ECC error occurs, and the BIOS
>>> saves the corresponding area (for example, 2 MB). When the system
>>> restarts next time, these areas are isolated and not reported or
>>> reported as EFI_UNUSABLE_MEMORY. Both of them lead to an increase
>>
>> Which cases dont get reported rather than as EFI_UNUSABLE_MEMORY ? Is
>> this supported on arm64 platform via mainline kernel ?
>>
> 
> The BIOS determines how to report the memory area that cannot be used to 
> the kernel. Do not report the memory area to the kernel or inform the 
> kernel that the memory area is unusable.

Right, but just curious whether there are real systems in the field with
this feature running mainline kernel ? OR this is just being future proof.

> 
>>> in the number of memblocks, whereas EFI_UNUSABLE_MEMORY leads to
>>> a larger number of memblocks.
>>>
>>> For example, if the EFI_UNUSABLE_MEMORY type is reported:
>>> ...
>>> memory[0x92]    [0x0000200834a00000-0x0000200835bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
>>> memory[0x93]    [0x0000200835c00000-0x0000200835dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
>>> memory[0x94]    [0x0000200835e00000-0x00002008367fffff], 0x0000000000a00000 bytes on node 7 flags: 0x0
>>> memory[0x95]    [0x0000200836800000-0x00002008369fffff], 0x0000000000200000 bytes on node 7 flags: 0x4
>>> memory[0x96]    [0x0000200836a00000-0x0000200837bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
>>> memory[0x97]    [0x0000200837c00000-0x0000200837dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
>>> memory[0x98]    [0x0000200837e00000-0x000020087fffffff], 0x0000000048200000 bytes on node 7 flags: 0x0
>>> memory[0x99]    [0x0000200880000000-0x0000200bcfffffff], 0x0000000350000000 bytes on node 6 flags: 0x0
>>> memory[0x9a]    [0x0000200bd0000000-0x0000200bd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
>>> memory[0x9b]    [0x0000200bd0200000-0x0000200bd07fffff], 0x0000000000600000 bytes on node 6 flags: 0x0
>>> memory[0x9c]    [0x0000200bd0800000-0x0000200bd09fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
>>> memory[0x9d]    [0x0000200bd0a00000-0x0000200fcfffffff], 0x00000003ff600000 bytes on node 6 flags: 0x0
>>> memory[0x9e]    [0x0000200fd0000000-0x0000200fd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
>>> memory[0x9f]    [0x0000200fd0200000-0x0000200fffffffff], 0x000000002fe00000 bytes on node 6 flags: 0x0
>>
>> Got it.
>>
>>> ...
>>>
>>> If the size of the init memblock regions is exceeded before the
>>> array size can be resized, the excess memory will be lost.
>>
>> Could you please elaborate more on why additional memblock regions can
>> not be accommodated via memblock array resizing ?
>>
> 
> As described in the memblock_double_array function: We don't allow 
> resizing until we know about the reserved regions of memory that aren' 
> not suitable for allocation.
> 
>>>
>>> Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
>>> ---
>>>   arch/arm64/include/asm/memory.h |  9 +++++++++
>>>   mm/memblock.c                   | 14 +++++++++-----
>>>   2 files changed, 18 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>>> index 0af70d9abede..eda61c0389c4 100644
>>> --- a/arch/arm64/include/asm/memory.h
>>> +++ b/arch/arm64/include/asm/memory.h
>>> @@ -364,6 +364,15 @@ void dump_mem_limit(void);
>>>   # define INIT_MEMBLOCK_RESERVED_REGIONS	(INIT_MEMBLOCK_REGIONS + NR_CPUS + 1)
>>>   #endif
>>>   
>>> +/*
>>> + * memory regions which marked with flag MEMBLOCK_NOMAP may divide a continuous
>>> + * memory block into multiple parts. As a result, the number of memory regions
>>> + * is large.
>>> + */
>>> +#ifdef CONFIG_EFI
>>
>> Could not memblock regions tagged with MEMBLOCK_NOMAP flag not present
>> on non-EFI systems ? Just wondering, are there not some other scenarios
>> which will also require expanded static memblock array.
> 
> Systems using devicetree can also have "no-map" memory. However, in this 
> case, the expanded static memblock array is required only when a large 
> number of such no-map reserved memories are manually added. I don't know 
> if any users will do that.
> 
> Documentation/devicetree/bindings/reserved-memory/reserved-memory.yaml
> 
> As to whether other scenarios also require expanded static memblock 
> arrays, I really don't know.

In that case could this comment here be more specific about this increased
static array size, being applicable only for MEMBLOCK_NOMAP regions on EFI
system with EFI_UNUSABLE_MEMORY tagging support. Is there an way to narrow
this down further wrt EFI_UNUSABLE_MEMORY, rather than blanket EFI ?

+/*
+ * memory regions which marked with flag MEMBLOCK_NOMAP may divide a continuous
+ * memory block into multiple parts. As a result, the number of memory regions
+ * is large.
+ */
+#ifdef CONFIG_EFI
+#define INIT_MEMBLOCK_MEMORY_REGIONS	1024
+#endif
+

> 
>>
>>> +#define INIT_MEMBLOCK_MEMORY_REGIONS	1024
>>> +#endif
>>> +
>>>   #include <asm-generic/memory_model.h>
>>>   
>>>   #endif /* __ASM_MEMORY_H */
>>> diff --git a/mm/memblock.c b/mm/memblock.c
>>> index e4f03a6e8e56..7c63571a69d7 100644
>>> --- a/mm/memblock.c
>>> +++ b/mm/memblock.c
>>> @@ -29,6 +29,10 @@
>>>   # define INIT_MEMBLOCK_RESERVED_REGIONS		INIT_MEMBLOCK_REGIONS
>>>   #endif
>>>   
>>> +#ifndef INIT_MEMBLOCK_MEMORY_REGIONS
>>> +#define INIT_MEMBLOCK_MEMORY_REGIONS		INIT_MEMBLOCK_REGIONS
>>> +#endif
>>
>> Why create an additional macro INIT_MEMBLOCK_MEMORY_REGIONS ? Why cannot
>> INIT_MEMBLOCK_REGIONS be defined in the platform directly like the other
>> macro INIT_MEMBLOCK_RESERVED_REGIONS ?
>>
> 
> The number of reserved memblocks does not need to be increased.

Got it.
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 0af70d9abede..eda61c0389c4 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -364,6 +364,15 @@  void dump_mem_limit(void);
 # define INIT_MEMBLOCK_RESERVED_REGIONS	(INIT_MEMBLOCK_REGIONS + NR_CPUS + 1)
 #endif
 
+/*
+ * memory regions which marked with flag MEMBLOCK_NOMAP may divide a continuous
+ * memory block into multiple parts. As a result, the number of memory regions
+ * is large.
+ */
+#ifdef CONFIG_EFI
+#define INIT_MEMBLOCK_MEMORY_REGIONS	1024
+#endif
+
 #include <asm-generic/memory_model.h>
 
 #endif /* __ASM_MEMORY_H */
diff --git a/mm/memblock.c b/mm/memblock.c
index e4f03a6e8e56..7c63571a69d7 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -29,6 +29,10 @@ 
 # define INIT_MEMBLOCK_RESERVED_REGIONS		INIT_MEMBLOCK_REGIONS
 #endif
 
+#ifndef INIT_MEMBLOCK_MEMORY_REGIONS
+#define INIT_MEMBLOCK_MEMORY_REGIONS		INIT_MEMBLOCK_REGIONS
+#endif
+
 /**
  * DOC: memblock overview
  *
@@ -55,9 +59,9 @@ 
  * the allocator metadata. The "memory" and "reserved" types are nicely
  * wrapped with struct memblock. This structure is statically
  * initialized at build time. The region arrays are initially sized to
- * %INIT_MEMBLOCK_REGIONS for "memory" and %INIT_MEMBLOCK_RESERVED_REGIONS
- * for "reserved". The region array for "physmem" is initially sized to
- * %INIT_PHYSMEM_REGIONS.
+ * %INIT_MEMBLOCK_MEMORY_REGIONS for "memory" and
+ * %INIT_MEMBLOCK_RESERVED_REGIONS for "reserved". The region array
+ * for "physmem" is initially sized to %INIT_PHYSMEM_REGIONS.
  * The memblock_allow_resize() enables automatic resizing of the region
  * arrays during addition of new regions. This feature should be used
  * with care so that memory allocated for the region array will not
@@ -102,7 +106,7 @@  unsigned long min_low_pfn;
 unsigned long max_pfn;
 unsigned long long max_possible_pfn;
 
-static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;
+static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_MEMORY_REGIONS] __initdata_memblock;
 static struct memblock_region memblock_reserved_init_regions[INIT_MEMBLOCK_RESERVED_REGIONS] __initdata_memblock;
 #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
 static struct memblock_region memblock_physmem_init_regions[INIT_PHYSMEM_REGIONS];
@@ -111,7 +115,7 @@  static struct memblock_region memblock_physmem_init_regions[INIT_PHYSMEM_REGIONS
 struct memblock memblock __initdata_memblock = {
 	.memory.regions		= memblock_memory_init_regions,
 	.memory.cnt		= 1,	/* empty dummy entry */
-	.memory.max		= INIT_MEMBLOCK_REGIONS,
+	.memory.max		= INIT_MEMBLOCK_MEMORY_REGIONS,
 	.memory.name		= "memory",
 
 	.reserved.regions	= memblock_reserved_init_regions,