diff mbox

[1/2] arm64: avoid alloc memory on offline node

Message ID 87lgbk59gs.fsf@e105922-lin.cambridge.arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Punit Agrawal June 12, 2018, 3:08 p.m. UTC
Michal Hocko <mhocko@kernel.org> writes:

> On Mon 11-06-18 08:43:03, Bjorn Helgaas wrote:
>> On Mon, Jun 11, 2018 at 08:32:10PM +0800, Xie XiuQi wrote:
>> > Hi Michal,
>> > 
>> > On 2018/6/11 16:52, Michal Hocko wrote:
>> > > On Mon 11-06-18 11:23:18, Xie XiuQi wrote:
>> > >> Hi Michal,
>> > >>
>> > >> On 2018/6/7 20:21, Michal Hocko wrote:
>> > >>> On Thu 07-06-18 19:55:53, Hanjun Guo wrote:
>> > >>>> On 2018/6/7 18:55, Michal Hocko wrote:
>> > >>> [...]
>> > >>>>> I am not sure I have the full context but pci_acpi_scan_root calls
>> > >>>>> kzalloc_node(sizeof(*info), GFP_KERNEL, node)
>> > >>>>> and that should fall back to whatever node that is online. Offline node
>> > >>>>> shouldn't keep any pages behind. So there must be something else going
>> > >>>>> on here and the patch is not the right way to handle it. What does
>> > >>>>> faddr2line __alloc_pages_nodemask+0xf0 tells on this kernel?
>> > >>>>
>> > >>>> The whole context is:
>> > >>>>
>> > >>>> The system is booted with a NUMA node has no memory attaching to it
>> > >>>> (memory-less NUMA node), also with NR_CPUS less than CPUs presented
>> > >>>> in MADT, so CPUs on this memory-less node are not brought up, and
>> > >>>> this NUMA node will not be online (but SRAT presents this NUMA node);
>> > >>>>
>> > >>>> Devices attaching to this NUMA node such as PCI host bridge still
>> > >>>> return the valid NUMA node via _PXM, but actually that valid NUMA node
>> > >>>> is not online which lead to this issue.
>> > >>>
>> > >>> But we should have other numa nodes on the zonelists so the allocator
>> > >>> should fall back to other node. If the zonelist is not intiailized
>> > >>> properly, though, then this can indeed show up as a problem. Knowing
>> > >>> which exact place has blown up would help get a better picture...
>> > >>>
>> > >>
>> > >> I specific a non-exist node to allocate memory using kzalloc_node,
>> > >> and got this following error message.
>> > >>
>> > >> And I found out there is just a VM_WARN, but it does not prevent the memory
>> > >> allocation continue.
>> > >>
>> > >> This nid would be use to access NODE_DADA(nid), so if nid is invalid,
>> > >> it would cause oops here.
>> > >>
>> > >> 459 /*
>> > >> 460  * Allocate pages, preferring the node given as nid. The node must be valid and
>> > >> 461  * online. For more general interface, see alloc_pages_node().
>> > >> 462  */
>> > >> 463 static inline struct page *
>> > >> 464 __alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
>> > >> 465 {
>> > >> 466         VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
>> > >> 467         VM_WARN_ON(!node_online(nid));
>> > >> 468
>> > >> 469         return __alloc_pages(gfp_mask, order, nid);
>> > >> 470 }
>> > >> 471
>> > >>
>> > >> (I wrote a ko, to allocate memory on a non-exist node using kzalloc_node().)
>> > > 
>> > > OK, so this is an artificialy broken code, right. You shouldn't get a
>> > > non-existent node via standard APIs AFAICS. The original report was
>> > > about an existing node which is offline AFAIU. That would be a different
>> > > case. If I am missing something and there are legitimate users that try
>> > > to allocate from non-existing nodes then we should handle that in
>> > > node_zonelist.
>> > 
>> > I think hanjun's comments may help to understood this question:
>> >  - NUMA node will be built if CPUs and (or) memory are valid on this NUMA
>> >  node;
>> > 
>> >  - But if we boot the system with memory-less node and also with
>> >  CONFIG_NR_CPUS less than CPUs in SRAT, for example, 64 CPUs total with 4
>> >  NUMA nodes, 16 CPUs on each NUMA node, if we boot with
>> >  CONFIG_NR_CPUS=48, then we will not built numa node for node 3, but with
>> >  devices on that numa node, alloc memory will be panic because NUMA node
>> >  3 is not a valid node.
>
> Hmm, but this is not a memory-less node. It sounds like a misconfigured
> kernel to me or the broken initialization. Each CPU should have a
> fallback numa node to be used.
>
>> > I triggered this BUG on arm64 platform, and I found a similar bug has
>> > been fixed on x86 platform. So I sent a similar patch for this bug.
>> > 
>> > Or, could we consider to fix it in the mm subsystem?
>> 
>> The patch below (b755de8dfdfe) seems like totally the wrong direction.
>> I don't think we want every caller of kzalloc_node() to have check for
>> node_online().
>
> absolutely.
>
>> Why would memory on an off-line node even be in the allocation pool?
>> I wouldn't expect that memory to be put in the pool until the node
>> comes online and the memory is accessible, so this sounds like some
>> kind of setup issue.
>> 
>> But I'm definitely not an mm person.
>
> Well, the standard way to handle memory less NUMA nodes is to simply
> fallback to the closest NUMA node. We even have an API for that
> (numa_mem_id).

CONFIG_HAVE_MEMORYLESS node is not enabled on arm64 which means we end
up returning the original node in the fallback path.

Xie, does the below patch help? I can submit a proper patch if this
fixes the issue for you.

-- >8 --
Subject: [PATCH] arm64/numa: Enable memoryless numa nodes

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
---
 arch/arm64/Kconfig   | 4 ++++
 arch/arm64/mm/numa.c | 2 ++
 2 files changed, 6 insertions(+)

Comments

Michal Hocko June 12, 2018, 3:20 p.m. UTC | #1
On Tue 12-06-18 16:08:03, Punit Agrawal wrote:
> Michal Hocko <mhocko@kernel.org> writes:
[...]
> > Well, the standard way to handle memory less NUMA nodes is to simply
> > fallback to the closest NUMA node. We even have an API for that
> > (numa_mem_id).
> 
> CONFIG_HAVE_MEMORYLESS node is not enabled on arm64 which means we end
> up returning the original node in the fallback path.

Yes this makes more sense.
Punit Agrawal June 13, 2018, 5:39 p.m. UTC | #2
Punit Agrawal <punit.agrawal@arm.com> writes:


[...]

>
> CONFIG_HAVE_MEMORYLESS node is not enabled on arm64 which means we end
> up returning the original node in the fallback path.
>
> Xie, does the below patch help? I can submit a proper patch if this
> fixes the issue for you.
>
> -- >8 --
> Subject: [PATCH] arm64/numa: Enable memoryless numa nodes
>
> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
> ---
>  arch/arm64/Kconfig   | 4 ++++
>  arch/arm64/mm/numa.c | 2 ++
>  2 files changed, 6 insertions(+)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index eb2cf4938f6d..5317e9aa93ab 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -756,6 +756,10 @@ config USE_PERCPU_NUMA_NODE_ID
>  	def_bool y
>  	depends on NUMA
>  
> +config HAVE_MEMORYLESS_NODES
> +       def_bool y
> +       depends on NUMA
> +
>  config HAVE_SETUP_PER_CPU_AREA
>  	def_bool y
>  	depends on NUMA
> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> index dad128ba98bf..c699dcfe93de 100644
> --- a/arch/arm64/mm/numa.c
> +++ b/arch/arm64/mm/numa.c
> @@ -73,6 +73,8 @@ EXPORT_SYMBOL(cpumask_of_node);
>  static void map_cpu_to_node(unsigned int cpu, int nid)
>  {
>  	set_cpu_numa_node(cpu, nid);
> +	set_numa_mem(local_memory_node(nid));

Argh, this should be

        set_cpu_numa_mem(cpu, local_memory_node(nid));

There is not guarantee that map_cpu_to_node() will be called on the
local cpu.

Hanjun, Xie - can you try with the update please?

Thanks,
Punit

> +
>  	if (nid >= 0)
>  		cpumask_set_cpu(cpu, node_to_cpumask_map[nid]);
>  }
Hanjun Guo June 14, 2018, 6:23 a.m. UTC | #3
Hi Punit,

On 2018/6/14 1:39, Punit Agrawal wrote:
> Punit Agrawal <punit.agrawal@arm.com> writes:
> 
> 
> [...]
> 
>>
>> CONFIG_HAVE_MEMORYLESS node is not enabled on arm64 which means we end
>> up returning the original node in the fallback path.
>>
>> Xie, does the below patch help? I can submit a proper patch if this
>> fixes the issue for you.
>>
>> -- >8 --
>> Subject: [PATCH] arm64/numa: Enable memoryless numa nodes
>>
>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
>> ---
>>  arch/arm64/Kconfig   | 4 ++++
>>  arch/arm64/mm/numa.c | 2 ++
>>  2 files changed, 6 insertions(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index eb2cf4938f6d..5317e9aa93ab 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -756,6 +756,10 @@ config USE_PERCPU_NUMA_NODE_ID
>>  	def_bool y
>>  	depends on NUMA
>>  
>> +config HAVE_MEMORYLESS_NODES
>> +       def_bool y
>> +       depends on NUMA
>> +
>>  config HAVE_SETUP_PER_CPU_AREA
>>  	def_bool y
>>  	depends on NUMA
>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
>> index dad128ba98bf..c699dcfe93de 100644
>> --- a/arch/arm64/mm/numa.c
>> +++ b/arch/arm64/mm/numa.c
>> @@ -73,6 +73,8 @@ EXPORT_SYMBOL(cpumask_of_node);
>>  static void map_cpu_to_node(unsigned int cpu, int nid)
>>  {
>>  	set_cpu_numa_node(cpu, nid);
>> +	set_numa_mem(local_memory_node(nid));
> 
> Argh, this should be
> 
>         set_cpu_numa_mem(cpu, local_memory_node(nid));
> 
> There is not guarantee that map_cpu_to_node() will be called on the
> local cpu.
> 
> Hanjun, Xie - can you try with the update please?

Thanks for looking into this, we will try this tomorrow
(the hardware is occupied now) and update here.

Thanks
Hanjun
Xie XiuQi June 19, 2018, 12:03 p.m. UTC | #4
Hi Punit,


On 2018/6/14 1:39, Punit Agrawal wrote:
> Punit Agrawal <punit.agrawal@arm.com> writes:
> 
> 
> [...]
> 
>>
>> CONFIG_HAVE_MEMORYLESS node is not enabled on arm64 which means we end
>> up returning the original node in the fallback path.
>>
>> Xie, does the below patch help? I can submit a proper patch if this
>> fixes the issue for you.
>>
>> -- >8 --
>> Subject: [PATCH] arm64/numa: Enable memoryless numa nodes
>>
>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
>> ---
>>  arch/arm64/Kconfig   | 4 ++++
>>  arch/arm64/mm/numa.c | 2 ++
>>  2 files changed, 6 insertions(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index eb2cf4938f6d..5317e9aa93ab 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -756,6 +756,10 @@ config USE_PERCPU_NUMA_NODE_ID
>>  	def_bool y
>>  	depends on NUMA
>>  
>> +config HAVE_MEMORYLESS_NODES
>> +       def_bool y
>> +       depends on NUMA
>> +
>>  config HAVE_SETUP_PER_CPU_AREA
>>  	def_bool y
>>  	depends on NUMA
>> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
>> index dad128ba98bf..c699dcfe93de 100644
>> --- a/arch/arm64/mm/numa.c
>> +++ b/arch/arm64/mm/numa.c
>> @@ -73,6 +73,8 @@ EXPORT_SYMBOL(cpumask_of_node);
>>  static void map_cpu_to_node(unsigned int cpu, int nid)
>>  {
>>  	set_cpu_numa_node(cpu, nid);
>> +	set_numa_mem(local_memory_node(nid));
> 
> Argh, this should be
> 
>         set_cpu_numa_mem(cpu, local_memory_node(nid));
> 
> There is not guarantee that map_cpu_to_node() will be called on the
> local cpu.
> 
> Hanjun, Xie - can you try with the update please?

I've tested this patch, but it does not help.
The boot message is attached.

I tested on a arm board with 128 cores 4 numa nodes, but I set CONFIG_NR_CPUS=72.
Then node 3 is not be created, because node 3 has no memory, and no cpu.
But some pci device may related to node 3, which be set in ACPI table.

165 /* Interface called from ACPI code to setup PCI host controller */
166 struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root *root)
167 {
168         int node = acpi_get_node(root->device->handle);
169         struct acpi_pci_generic_root_info *ri;
170         struct pci_bus *bus, *child;
171         struct acpi_pci_root_ops *root_ops;
172
            // this node may be not created.
177         ri = kzalloc_node(sizeof(*ri), GFP_KERNEL, node);
178         if (!ri)
179                 return NULL;
180
181         root_ops = kzalloc_node(sizeof(*root_ops), GFP_KERNEL, node);
182         if (!root_ops) {
183                 kfree(ri);
184                 return NULL;
185         }
186
187         ri->cfg = pci_acpi_setup_ecam_mapping(root);
188         if (!ri->cfg) {
189                 kfree(ri);
190                 kfree(root_ops);
191                 return NULL;
192         }


> 
> Thanks,
> Punit
> 
>> +
>>  	if (nid >= 0)
>>  		cpumask_set_cpu(cpu, node_to_cpumask_map[nid]);
>>  }
> 
> .
>
Michal Hocko June 19, 2018, 12:07 p.m. UTC | #5
On Tue 19-06-18 20:03:07, Xie XiuQi wrote:
[...]
> I tested on a arm board with 128 cores 4 numa nodes, but I set CONFIG_NR_CPUS=72.
> Then node 3 is not be created, because node 3 has no memory, and no cpu.
> But some pci device may related to node 3, which be set in ACPI table.

Could you double check that zonelists for node 3 are generated
correctly?
Xie XiuQi June 19, 2018, 12:40 p.m. UTC | #6
Hi Michal,

On 2018/6/19 20:07, Michal Hocko wrote:
> On Tue 19-06-18 20:03:07, Xie XiuQi wrote:
> [...]
>> I tested on a arm board with 128 cores 4 numa nodes, but I set CONFIG_NR_CPUS=72.
>> Then node 3 is not be created, because node 3 has no memory, and no cpu.
>> But some pci device may related to node 3, which be set in ACPI table.
> 
> Could you double check that zonelists for node 3 are generated
> correctly?
> 

zonelists for node 3 is not created at all.

Kernel parse SRAT table to create node info, but in this case,
SRAT table is parsed not completed. Only the first 72 items are parsed.
In SRAT table, we haven't seen node 3 information yet, because cpu_to_node_map[72] is too small.

[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30000 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30001 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30002 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30003 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30100 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30101 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30102 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30103 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30200 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30201 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30202 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30203 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30300 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30301 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30302 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30303 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30400 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30401 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30402 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30403 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30500 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30501 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30502 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30503 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30600 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30601 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30602 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30603 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30700 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30701 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30702 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x30703 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10000 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10001 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10002 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10003 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10100 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10101 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10102 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10103 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10200 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10201 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10202 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10203 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10300 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10301 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10302 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10303 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10400 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10401 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10402 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10403 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10500 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10501 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10502 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10503 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10600 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10601 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10602 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10603 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10700 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10701 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10702 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> MPIDR 0x10703 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x70000 -> Node 2
[    0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x70001 -> Node 2
[    0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x70002 -> Node 2
[    0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x70003 -> Node 2
[    0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x70100 -> Node 2
[    0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x70101 -> Node 2
[    0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x70102 -> Node 2
[    0.000000] ACPI: NUMA: SRAT: PXM 2 -> MPIDR 0x70103 -> Node 2
[    0.000000] ACPI: NUMA: SRAT: cpu_to_node_map[72] is too small, may not be able to use all cpus
[    0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x2080000000-0x23ffffffff]
[    0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]
[    0.000000] ACPI: SRAT: Node 2 PXM 2 [mem 0x402000000000-0x4023ffffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x23ffffe780-0x23ffffffff]
[    0.000000] NUMA: Initmem setup node 1 [<memory-less node>]
[    0.000000] NUMA: NODE_DATA [mem 0x4023fffed780-0x4023fffeefff]
[    0.000000] NUMA: NODE_DATA(1) on node 2
[    0.000000] NUMA: NODE_DATA [mem 0x4023fffebf00-0x4023fffed77f]
[    0.000000] Zone ranges:
[    0.000000]   DMA32    [mem 0x0000000000000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x00004023ffffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000003942ffff]
[    0.000000]   node   0: [mem 0x0000000039430000-0x000000003956ffff]
[    0.000000]   node   0: [mem 0x0000000039570000-0x000000003963ffff]
[    0.000000]   node   0: [mem 0x0000000039640000-0x00000000396fffff]
[    0.000000]   node   0: [mem 0x0000000039700000-0x000000003971ffff]
[    0.000000]   node   0: [mem 0x0000000039720000-0x0000000039b6ffff]
[    0.000000]   node   0: [mem 0x0000000039b70000-0x000000003eb5ffff]
[    0.000000]   node   0: [mem 0x000000003eb60000-0x000000003eb8ffff]
[    0.000000]   node   0: [mem 0x000000003eb90000-0x000000003fbfffff]
[    0.000000]   node   0: [mem 0x0000002080000000-0x00000023ffffffff]
[    0.000000]   node   2: [mem 0x0000402000000000-0x00004023ffffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x00000023ffffffff]
[    0.000000] Could not find start_pfn for node 1
[    0.000000] Initmem setup node 1 [mem 0x0000000000000000-0x0000000000000000]
[    0.000000] Initmem setup node 2 [mem 0x0000402000000000-0x00004023ffffffff]
diff mbox

Patch

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index eb2cf4938f6d..5317e9aa93ab 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -756,6 +756,10 @@  config USE_PERCPU_NUMA_NODE_ID
 	def_bool y
 	depends on NUMA
 
+config HAVE_MEMORYLESS_NODES
+       def_bool y
+       depends on NUMA
+
 config HAVE_SETUP_PER_CPU_AREA
 	def_bool y
 	depends on NUMA
diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
index dad128ba98bf..c699dcfe93de 100644
--- a/arch/arm64/mm/numa.c
+++ b/arch/arm64/mm/numa.c
@@ -73,6 +73,8 @@  EXPORT_SYMBOL(cpumask_of_node);
 static void map_cpu_to_node(unsigned int cpu, int nid)
 {
 	set_cpu_numa_node(cpu, nid);
+	set_numa_mem(local_memory_node(nid));
+
 	if (nid >= 0)
 		cpumask_set_cpu(cpu, node_to_cpumask_map[nid]);
 }