[11/11] x86, mem_hotplug: Allocate memory near kernel image before SRAT is parsed.

Message ID 1377596268-31552-12-git-send-email-tangchen@cn.fujitsu.com (mailing list archive)
State Not Applicable, archived

Commit Message

tangchen Aug. 27, 2013, 9:37 a.m. UTC
Before SRAT is parsed, the kernel does not know which memory ranges are
hotpluggable, so early memblock allocations should stay near the kernel
image. This patch does the following:

1. After memblock is ready, make memblock allocate memory from low addresses
   to high, and set the lower limit to the end of the kernel image.
2. After SRAT is parsed, restore memblock's default behavior: allocate memory
   from high addresses to low, and reset the lower limit to 0.

This behavior is controlled by the movablenode boot option.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
---
 arch/x86/kernel/setup.c |   37 +++++++++++++++++++++++++++++++++++++
 1 files changed, 37 insertions(+), 0 deletions(-)
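
For illustration only, the allocation behavior described in the commit
message can be modeled in user space. This is a minimal sketch, not kernel
code: the toy_* names, the single free range, and the addresses are
assumptions made up for this example. Only the idea comes from the patch:
allocate low-to-high starting at the end of the kernel image before SRAT is
parsed, then high-to-low with a lower limit of 0 afterwards.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the patch's MEMBLOCK_ORDER_* values. */
enum toy_order { TOY_ORDER_HIGH_TO_LOW, TOY_ORDER_LOW_TO_HIGH };

#define TOY_ALLOC_FAILED UINT64_MAX

/* Toy allocator state: one free range [bottom, top). */
struct toy_memblock {
	uint64_t bottom;	/* lowest free address */
	uint64_t top;		/* one past the highest free address */
	uint64_t limit_low;	/* never allocate below this address */
	enum toy_order order;	/* current allocation direction */
};

/* Models the first block of the patch: stay just above the kernel image. */
static void toy_set_pre_srat(struct toy_memblock *mb, uint64_t kernel_end)
{
	mb->order = TOY_ORDER_LOW_TO_HIGH;
	mb->limit_low = kernel_end;
}

/* Models the second block of the patch: back to the default behavior. */
static void toy_set_default(struct toy_memblock *mb)
{
	mb->order = TOY_ORDER_HIGH_TO_LOW;
	mb->limit_low = 0;
}

/* Returns the base address of the allocation, or TOY_ALLOC_FAILED. */
static uint64_t toy_alloc(struct toy_memblock *mb, uint64_t size)
{
	uint64_t lo = mb->bottom > mb->limit_low ? mb->bottom : mb->limit_low;
	uint64_t base;

	assert(size > 0);
	if (mb->top <= lo || mb->top - lo < size)
		return TOY_ALLOC_FAILED;

	if (mb->order == TOY_ORDER_LOW_TO_HIGH) {
		/* Toy simplification: any hole below limit_low is dropped. */
		base = lo;
		mb->bottom = lo + size;
	} else {
		base = mb->top - size;
		mb->top = base;
	}
	return base;
}
```

With a free range of [0x1000, 0x100000000) and a kernel image ending at
0x2000000, the first 0x1000-byte allocation after toy_set_pre_srat() lands
at 0x2000000. After toy_set_default(), the same request is served from the
top of the range, at 0xfffff000.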

Comments

Toshi Kani Sept. 4, 2013, 7:40 p.m. UTC | #1
On Tue, 2013-08-27 at 17:37 +0800, Tang Chen wrote:
> After memblock is ready, before SRAT is parsed, we should allocate memory
> near the kernel image. So this patch does the following:
> 
> 1. After memblock is ready, make memblock allocate memory from low address
>    to high, and set the lowest limit to the end of kernel image.
> 2. After SRAT is parsed, make memblock behave as default, allocate memory
>    from high address to low, and reset the lowest limit to 0.
> 
> This behavior is controlled by movablenode boot option.
> 
> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
> ---
>  arch/x86/kernel/setup.c |   37 +++++++++++++++++++++++++++++++++++++
>  1 files changed, 37 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index fa7b5f0..0b35bbd 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -1087,6 +1087,31 @@ void __init setup_arch(char **cmdline_p)
>  	trim_platform_memory_ranges();
>  	trim_low_memory_range();
>  
> +#ifdef CONFIG_MOVABLE_NODE
> +	if (movablenode_enable_srat) {
> +		/*
> +		 * Memory used by the kernel cannot be hot-removed because Linux cannot
> +		 * migrate the kernel pages. When memory hotplug is enabled, we should
> +		 * prevent memblock from allocating memory for the kernel.
> +		 *
> +		 * ACPI SRAT records all hotpluggable memory ranges. But before SRAT is
> +		 * parsed, we don't know about it.
> +		 *
> +		 * The kernel image is loaded into memory at very early time. We cannot
> +		 * prevent this anyway. So on NUMA system, we set any node the kernel
> +		 * resides in as un-hotpluggable.
> +		 *
> +		 * Since on modern servers, one node could have double-digit gigabytes
> +		 * memory, we can assume the memory around the kernel image is also

Memory hotplug can be supported in virtualized environments, and we
should allow using SRAT on them as a next step.  In such environments,
memory hotplug will be performed on a per-memory-device-object basis for
workload balancing, and double-digit gigabytes is unlikely to be the case
for now.  So, I'd suggest it should instead state that all allocations are
kept small until SRAT is parsed.

> +		 * un-hotpluggable. So before SRAT is parsed, just allocate memory near
> +		 * the kernel image to try the best to keep the kernel away from
> +		 * hotpluggable memory.
> +		 */
> +		memblock_set_current_order(MEMBLOCK_ORDER_LOW_TO_HIGH);
> +		memblock_set_current_limit_low(__pa_symbol(_end));
> +	}
> +#endif /* CONFIG_MOVABLE_NODE */

Should the above block be put into init_mem_mapping() since it is
memblock initialization?  It is good to have some concise comments here,
though.

> +
>  	init_mem_mapping();
>  
>  	early_trap_pf_init();
> @@ -1127,6 +1152,18 @@ void __init setup_arch(char **cmdline_p)
>  	early_acpi_boot_init();
>  
>  	initmem_init();
> +
> +#ifdef CONFIG_MOVABLE_NODE
> +	if (movablenode_enable_srat) {
> +		/*
> +		 * When ACPI SRAT is parsed, which is done in initmem_init(), set
> +		 * memblock back to the default behavior.
> +		 */
> +		memblock_set_current_order(MEMBLOCK_ORDER_DEFAULT);
> +		memblock_set_current_limit_low(0);
> +	}
> +#endif /* CONFIG_MOVABLE_NODE */

Similarly, should this block be put into initmem_init() with some
comment here?

Thanks,
-Toshi
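
As a rough illustration of the concern about small hotpluggable ranges
raised in this review: whether the early allocations stay clear of
hotpluggable memory depends on how much is allocated before SRAT is parsed
and on where the nearest hotpluggable range begins. The sketch below is a
user-space model, not kernel code. The toy_* names and all addresses are
hypothetical, and it assumes early allocations are packed contiguously
right after the kernel image.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open physical address range [start, end). */
struct toy_range {
	uint64_t start;
	uint64_t end;
};

static bool toy_ranges_overlap(struct toy_range a, struct toy_range b)
{
	return a.start < b.end && b.start < a.end;
}

/*
 * Returns true if early allocations occupying
 * [kernel_end, kernel_end + early_bytes) would touch any of the
 * n hotpluggable ranges.
 */
static bool toy_early_allocs_hit_hotplug(uint64_t kernel_end,
					 uint64_t early_bytes,
					 const struct toy_range *hotplug,
					 size_t n)
{
	struct toy_range early = { kernel_end, kernel_end + early_bytes };
	size_t i;

	assert(early.end >= early.start);	/* no address overflow */
	for (i = 0; i < n; i++) {
		if (toy_ranges_overlap(early, hotplug[i]))
			return true;
	}
	return false;
}
```

With a kernel image ending at 0x2000000 and a hypothetical hotpluggable
range of [0x40000000, 0x80000000), 16 MiB of early allocations stay clear
of that range, while 2 GiB of early allocations would run into it.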


Patch

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index fa7b5f0..0b35bbd 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1087,6 +1087,31 @@  void __init setup_arch(char **cmdline_p)
 	trim_platform_memory_ranges();
 	trim_low_memory_range();
 
+#ifdef CONFIG_MOVABLE_NODE
+	if (movablenode_enable_srat) {
+		/*
+		 * Memory used by the kernel cannot be hot-removed because Linux cannot
+		 * migrate the kernel pages. When memory hotplug is enabled, we should
+		 * prevent memblock from allocating memory for the kernel.
+		 *
+		 * ACPI SRAT records all hotpluggable memory ranges. But before SRAT is
+		 * parsed, we don't know about it.
+		 *
+		 * The kernel image is loaded into memory at very early time. We cannot
+		 * prevent this anyway. So on NUMA system, we set any node the kernel
+		 * resides in as un-hotpluggable.
+		 *
+		 * Since on modern servers, one node could have double-digit gigabytes
+		 * memory, we can assume the memory around the kernel image is also
+		 * un-hotpluggable. So before SRAT is parsed, just allocate memory near
+		 * the kernel image to try the best to keep the kernel away from
+		 * hotpluggable memory.
+		 */
+		memblock_set_current_order(MEMBLOCK_ORDER_LOW_TO_HIGH);
+		memblock_set_current_limit_low(__pa_symbol(_end));
+	}
+#endif /* CONFIG_MOVABLE_NODE */
+
 	init_mem_mapping();
 
 	early_trap_pf_init();
@@ -1127,6 +1152,18 @@  void __init setup_arch(char **cmdline_p)
 	early_acpi_boot_init();
 
 	initmem_init();
+
+#ifdef CONFIG_MOVABLE_NODE
+	if (movablenode_enable_srat) {
+		/*
+		 * When ACPI SRAT is parsed, which is done in initmem_init(), set
+		 * memblock back to the default behavior.
+		 */
+		memblock_set_current_order(MEMBLOCK_ORDER_DEFAULT);
+		memblock_set_current_limit_low(0);
+	}
+#endif /* CONFIG_MOVABLE_NODE */
+
 	memblock_find_dma_reserve();
 
 #ifdef CONFIG_KVM_GUEST