arm64: Add memory hotplug support

Message ID 0a0c7d3b8fbe1b2b399eeb6889027031335d2af1.1544453739.git.robin.murphy@arm.com (mailing list archive)
State: New, archived
Series: arm64: Add memory hotplug support

Commit Message

Robin Murphy Dec. 10, 2018, 3:29 p.m. UTC
Wire up the basic support for hot-adding memory. Since memory hotplug
is fairly tightly coupled to sparsemem, we tweak pfn_valid() to also
cross-check the presence of a section in the manner of the generic
implementation, before falling back to memblock to check for no-map
regions within a present section as before. By having arch_add_memory()
create the linear mapping first, this then makes everything work in the
way that __add_section() expects.

We expect hotplug to be ACPI-driven, so the swapper_pg_dir updates
should be safe from races by virtue of the global device hotplug lock.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

Looks like I'm not going to have the whole pte_devmap story figured out
in time to land any ZONE_DEVICE support this cycle, but since this patch
also stands alone as a complete feature (and has ended up remarkably
simple and self-contained), I hope we might consider getting it merged
on its own merit.

Robin.

 arch/arm64/Kconfig   |  3 +++
 arch/arm64/mm/init.c |  8 ++++++++
 arch/arm64/mm/mmu.c  | 12 ++++++++++++
 arch/arm64/mm/numa.c | 10 ++++++++++
 4 files changed, 33 insertions(+)
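
(For context, a sketch of the core hot-add path this patch slots into,
based on the v4.20-era core code; the call chain below comes from
mm/memory_hotplug.c and mm/sparse.c, not from this patch:)

	/*
	 * add_memory()                        e.g. from the ACPI memory
	 *   -> add_memory_resource()          hotplug driver
	 *     -> arch_add_memory()            this patch:
	 *          __create_pgd_mapping()     linear mapping created first
	 *          __add_pages()
	 *            -> __add_section()
	 *              -> sparse_add_one_section()  memmap set up for the
	 *                                           new section
	 */

With the linear mapping already in place by the time __add_pages() runs,
the section initialisation in the core code can proceed as it expects.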

Comments

Jonathan Cameron Dec. 11, 2018, 3:38 p.m. UTC | #1
On Mon, 10 Dec 2018 15:29:01 +0000
Robin Murphy <robin.murphy@arm.com> wrote:

> Wire up the basic support for hot-adding memory. Since memory hotplug
> is fairly tightly coupled to sparsemem, we tweak pfn_valid() to also
> cross-check the presence of a section in the manner of the generic
> implementation, before falling back to memblock to check for no-map
> regions within a present section as before. By having arch_add_memory()
> create the linear mapping first, this then makes everything work in the
> way that __add_section() expects.
> 
> We expect hotplug to be ACPI-driven, so the swapper_pg_dir updates
> should be safe from races by virtue of the global device hotplug lock.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

This did come out even shorter than I expected, and the "workaround" for
the pfn_valid issue is neat and tidy.

I've done some really quick checks bolting this in place of the old
hotplug code we were abusing with our NUMA handling (which is ripped out
of the x86 code). It seems to work well, though I wouldn't describe this
as proper testing since it was just a few quick cycles. It's somewhat of
a pain to test until you add remove support, as I have to keep rebooting :)

The one case I have left, which I'll try to run tomorrow, is the one where
we probe memory on a NUMA node that is empty at boot.
There used to be a nasty workaround needed for that path. The core code
has changed a fair bit since then, so it may be fine now; I'd just
like to sanity check it. So for today I'll go with a resounding
'seems good'.

Thanks.

Jonathan
> ---
> 
> Looks like I'm not going to have the whole pte_devmap story figured out
> in time to land any ZONE_DEVICE support this cycle, but since this patch
> also stands alone as a complete feature (and has ended up remarkably
> simple and self-contained), I hope we might consider getting it merged
> on its own merit.
> 
> Robin.
> 
>  arch/arm64/Kconfig   |  3 +++
>  arch/arm64/mm/init.c |  8 ++++++++
>  arch/arm64/mm/mmu.c  | 12 ++++++++++++
>  arch/arm64/mm/numa.c | 10 ++++++++++
>  4 files changed, 33 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 6d2b25f51bb3..7b855ae45747 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -261,6 +261,9 @@ config ZONE_DMA32
>  config HAVE_GENERIC_GUP
>  	def_bool y
>  
> +config ARCH_ENABLE_MEMORY_HOTPLUG
> +	def_bool y
> +
>  config SMP
>  	def_bool y
>  
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 2983e0fc1786..82e0b08f2e31 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -291,6 +291,14 @@ int pfn_valid(unsigned long pfn)
>  
>  	if ((addr >> PAGE_SHIFT) != pfn)
>  		return 0;
> +
> +#ifdef CONFIG_SPARSEMEM
> +	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> +		return 0;
> +
> +	if (!valid_section(__nr_to_section(pfn_to_section_nr(pfn))))
> +		return 0;
> +#endif
>  	return memblock_is_map_memory(addr);
>  }
>  EXPORT_SYMBOL(pfn_valid);
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index e1b2d58a311a..22379a74d289 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1044,3 +1044,15 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long addr)
>  	pmd_free(NULL, table);
>  	return 1;
>  }
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
> +		    bool want_memblock)
> +{
> +	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> +			     size, PAGE_KERNEL, pgd_pgtable_alloc, 0);
> +
> +	return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
> +			   altmap, want_memblock);
> +}
> +#endif
> diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
> index 27a31efd9e8e..ae34e3a1cef1 100644
> --- a/arch/arm64/mm/numa.c
> +++ b/arch/arm64/mm/numa.c
> @@ -466,3 +466,13 @@ void __init arm64_numa_init(void)
>  
>  	numa_init(dummy_numa_init);
>  }
> +
> +/*
> + * We hope that we will be hotplugging memory on nodes we already know about,
> + * such that acpi_get_node() succeeds and we never fall back to this...
> + */
> +int memory_add_physaddr_to_nid(u64 addr)
> +{
> +	pr_warn("Unknown node for memory at 0x%llx, assuming node 0\n", addr);
> +	return 0;
> +}
Will Deacon Dec. 11, 2018, 4:36 p.m. UTC | #2
On Mon, Dec 10, 2018 at 03:29:01PM +0000, Robin Murphy wrote:
> Wire up the basic support for hot-adding memory. Since memory hotplug
> is fairly tightly coupled to sparsemem, we tweak pfn_valid() to also
> cross-check the presence of a section in the manner of the generic
> implementation, before falling back to memblock to check for no-map
> regions within a present section as before. By having arch_add_memory()
> create the linear mapping first, this then makes everything work in the
> way that __add_section() expects.
> 
> We expect hotplug to be ACPI-driven, so the swapper_pg_dir updates
> should be safe from races by virtue of the global device hotplug lock.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
> 
> Looks like I'm not going to have the whole pte_devmap story figured out
> in time to land any ZONE_DEVICE support this cycle, but since this patch
> also stands alone as a complete feature (and has ended up remarkably
> simple and self-contained), I hope we might consider getting it merged
> on its own merit.
> 
> Robin.
> 
>  arch/arm64/Kconfig   |  3 +++
>  arch/arm64/mm/init.c |  8 ++++++++
>  arch/arm64/mm/mmu.c  | 12 ++++++++++++
>  arch/arm64/mm/numa.c | 10 ++++++++++
>  4 files changed, 33 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 6d2b25f51bb3..7b855ae45747 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -261,6 +261,9 @@ config ZONE_DMA32
>  config HAVE_GENERIC_GUP
>  	def_bool y
>  
> +config ARCH_ENABLE_MEMORY_HOTPLUG
> +	def_bool y
> +
>  config SMP
>  	def_bool y
>  
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 2983e0fc1786..82e0b08f2e31 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -291,6 +291,14 @@ int pfn_valid(unsigned long pfn)
>  
>  	if ((addr >> PAGE_SHIFT) != pfn)
>  		return 0;
> +
> +#ifdef CONFIG_SPARSEMEM
> +	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> +		return 0;
> +
> +	if (!valid_section(__nr_to_section(pfn_to_section_nr(pfn))))
> +		return 0;

I'm a bit nervous about the call to __nr_to_section() here. How do we
ensure that the section number we're passing stays within the bounds of
the mem_section array?

> +#endif
>  	return memblock_is_map_memory(addr);
>  }
>  EXPORT_SYMBOL(pfn_valid);
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index e1b2d58a311a..22379a74d289 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1044,3 +1044,15 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long addr)
>  	pmd_free(NULL, table);
>  	return 1;
>  }
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
> +		    bool want_memblock)
> +{
> +	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> +			     size, PAGE_KERNEL, pgd_pgtable_alloc, 0);
> +
> +	return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
> +			   altmap, want_memblock);
> +}

If we're mapping the new memory into the linear map, shouldn't we be
respecting rodata_full and debug page alloc by forcing page granularity
and tweaking the permissions?

Will
Robin Murphy Dec. 11, 2018, 5:21 p.m. UTC | #3
On 11/12/2018 16:36, Will Deacon wrote:
> On Mon, Dec 10, 2018 at 03:29:01PM +0000, Robin Murphy wrote:
>> Wire up the basic support for hot-adding memory. Since memory hotplug
>> is fairly tightly coupled to sparsemem, we tweak pfn_valid() to also
>> cross-check the presence of a section in the manner of the generic
>> implementation, before falling back to memblock to check for no-map
>> regions within a present section as before. By having arch_add_memory()
>> create the linear mapping first, this then makes everything work in the
>> way that __add_section() expects.
>>
>> We expect hotplug to be ACPI-driven, so the swapper_pg_dir updates
>> should be safe from races by virtue of the global device hotplug lock.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>
>> Looks like I'm not going to have the whole pte_devmap story figured out
>> in time to land any ZONE_DEVICE support this cycle, but since this patch
>> also stands alone as a complete feature (and has ended up remarkably
>> simple and self-contained), I hope we might consider getting it merged
>> on its own merit.
>>
>> Robin.
>>
>>   arch/arm64/Kconfig   |  3 +++
>>   arch/arm64/mm/init.c |  8 ++++++++
>>   arch/arm64/mm/mmu.c  | 12 ++++++++++++
>>   arch/arm64/mm/numa.c | 10 ++++++++++
>>   4 files changed, 33 insertions(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 6d2b25f51bb3..7b855ae45747 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -261,6 +261,9 @@ config ZONE_DMA32
>>   config HAVE_GENERIC_GUP
>>   	def_bool y
>>   
>> +config ARCH_ENABLE_MEMORY_HOTPLUG
>> +	def_bool y
>> +
>>   config SMP
>>   	def_bool y
>>   
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 2983e0fc1786..82e0b08f2e31 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -291,6 +291,14 @@ int pfn_valid(unsigned long pfn)
>>   
>>   	if ((addr >> PAGE_SHIFT) != pfn)
>>   		return 0;
>> +
>> +#ifdef CONFIG_SPARSEMEM
>> +	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>> +		return 0;
>> +
>> +	if (!valid_section(__nr_to_section(pfn_to_section_nr(pfn))))
>> +		return 0;
> 
> I'm a bit nervous about the call to __nr_to_section() here. How do we
> ensure that the section number we're passing stays within the bounds of
> the mem_section array?

The same way every other sparsemem user (apart from arch/arm) does, I 
guess - this is literally a copy-paste of the generic pfn_valid() 
implementation :/

Given the implementation of __nr_to_section() in light of how 
memory_present() and sparse_index_init() set up mem_section in the first 
place, I can't see how there can be a problem. You did see the bit 4 
lines above, right?

>> +#endif
>>   	return memblock_is_map_memory(addr);
>>   }
>>   EXPORT_SYMBOL(pfn_valid);
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index e1b2d58a311a..22379a74d289 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1044,3 +1044,15 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long addr)
>>   	pmd_free(NULL, table);
>>   	return 1;
>>   }
>> +
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
>> +		    bool want_memblock)
>> +{
>> +	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
>> +			     size, PAGE_KERNEL, pgd_pgtable_alloc, 0);
>> +
>> +	return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
>> +			   altmap, want_memblock);
>> +}
> 
> If we're mapping the new memory into the linear map, shouldn't we be
> respecting rodata_full and debug page alloc by forcing page granularity
> and tweaking the permissions?

Bah, James mentioned debug_pagealloc long ago, and I did have a slight 
nagging feeling that I was still missing something - yes, I need to fix 
the flags for that case. I'm not sure about rodata_full (do you mean 
STRICT_KERNEL_RWX?) since a section being added here won't contain 
kernel text nor data, and I can't seem to find anywhere that rodata 
options affect the linear mapping of plain free RAM.

Robin.
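
(For reference, the generic SPARSEMEM pfn_valid() under discussion,
quoted from include/linux/mmzone.h as of the v4.20-era tree; the
NR_MEM_SECTIONS test is the bounds check "4 lines above" that keeps
the __nr_to_section() lookup in range:)

	/* Generic version, used when the arch doesn't define its own pfn_valid() */
	static inline int pfn_valid(unsigned long pfn)
	{
		if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
			return 0;
		return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
	}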
Will Deacon Dec. 11, 2018, 5:24 p.m. UTC | #4
On Tue, Dec 11, 2018 at 05:21:24PM +0000, Robin Murphy wrote:
> On 11/12/2018 16:36, Will Deacon wrote:
> > On Mon, Dec 10, 2018 at 03:29:01PM +0000, Robin Murphy wrote:
> > > Wire up the basic support for hot-adding memory. Since memory hotplug
> > > is fairly tightly coupled to sparsemem, we tweak pfn_valid() to also
> > > cross-check the presence of a section in the manner of the generic
> > > implementation, before falling back to memblock to check for no-map
> > > regions within a present section as before. By having arch_add_memory()
> > > create the linear mapping first, this then makes everything work in the
> > > way that __add_section() expects.
> > > 
> > > We expect hotplug to be ACPI-driven, so the swapper_pg_dir updates
> > > should be safe from races by virtue of the global device hotplug lock.
> > > 
> > > Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> > > ---
> > > 
> > > Looks like I'm not going to have the whole pte_devmap story figured out
> > > in time to land any ZONE_DEVICE support this cycle, but since this patch
> > > also stands alone as a complete feature (and has ended up remarkably
> > > simple and self-contained), I hope we might consider getting it merged
> > > on its own merit.
> > > 
> > > Robin.
> > > 
> > >   arch/arm64/Kconfig   |  3 +++
> > >   arch/arm64/mm/init.c |  8 ++++++++
> > >   arch/arm64/mm/mmu.c  | 12 ++++++++++++
> > >   arch/arm64/mm/numa.c | 10 ++++++++++
> > >   4 files changed, 33 insertions(+)
> > > 
> > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > index 6d2b25f51bb3..7b855ae45747 100644
> > > --- a/arch/arm64/Kconfig
> > > +++ b/arch/arm64/Kconfig
> > > @@ -261,6 +261,9 @@ config ZONE_DMA32
> > >   config HAVE_GENERIC_GUP
> > >   	def_bool y
> > > +config ARCH_ENABLE_MEMORY_HOTPLUG
> > > +	def_bool y
> > > +
> > >   config SMP
> > >   	def_bool y
> > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > > index 2983e0fc1786..82e0b08f2e31 100644
> > > --- a/arch/arm64/mm/init.c
> > > +++ b/arch/arm64/mm/init.c
> > > @@ -291,6 +291,14 @@ int pfn_valid(unsigned long pfn)
> > >   	if ((addr >> PAGE_SHIFT) != pfn)
> > >   		return 0;
> > > +
> > > +#ifdef CONFIG_SPARSEMEM
> > > +	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> > > +		return 0;
> > > +
> > > +	if (!valid_section(__nr_to_section(pfn_to_section_nr(pfn))))
> > > +		return 0;
> > 
> > I'm a bit nervous about the call to __nr_to_section() here. How do we
> > ensure that the section number we're passing stays within the bounds of
> > the mem_section array?
> 
> The same way every other sparsemem user (apart from arch/arm) does, I guess
> - this is literally a copy-paste of the generic pfn_valid() implementation
> :/

I don't trust the generic pfn_valid() at all :)

> Given the implementation of __nr_to_section() in light of how
> memory_present() and sparse_index_init() set up mem_section in the first
> place, I can't see how there can be a problem. You did see the bit 4 lines
> above, right?

D'oh, yes, I read that and then instantly forgot it. Ok, so that should be
fine.

> > > +#endif
> > >   	return memblock_is_map_memory(addr);
> > >   }
> > >   EXPORT_SYMBOL(pfn_valid);
> > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> > > index e1b2d58a311a..22379a74d289 100644
> > > --- a/arch/arm64/mm/mmu.c
> > > +++ b/arch/arm64/mm/mmu.c
> > > @@ -1044,3 +1044,15 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long addr)
> > >   	pmd_free(NULL, table);
> > >   	return 1;
> > >   }
> > > +
> > > +#ifdef CONFIG_MEMORY_HOTPLUG
> > > +int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
> > > +		    bool want_memblock)
> > > +{
> > > +	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> > > +			     size, PAGE_KERNEL, pgd_pgtable_alloc, 0);
> > > +
> > > +	return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
> > > +			   altmap, want_memblock);
> > > +}
> > 
> > If we're mapping the new memory into the linear map, shouldn't we be
> > respecting rodata_full and debug page alloc by forcing page granularity
> > and tweaking the permissions?
> 
> Bah, James mentioned debug_pagealloc long ago, and I did have a slight
> nagging feeling that I was still missing something - yes, I need to fix the
> flags for that case. I'm not sure about rodata_full (do you mean
> STRICT_KERNEL_RWX?) since a section being added here won't contain kernel
> text or data, and I can't seem to find anywhere that rodata options affect
> the linear mapping of plain free RAM.

Ah, we've got code queued on for-next/core so that changing vmalloc()
permissions makes the same changes to the linear map.

Will
Robin Murphy Dec. 11, 2018, 5:37 p.m. UTC | #5
On 11/12/2018 17:24, Will Deacon wrote:
> On Tue, Dec 11, 2018 at 05:21:24PM +0000, Robin Murphy wrote:
>> On 11/12/2018 16:36, Will Deacon wrote:
>>> On Mon, Dec 10, 2018 at 03:29:01PM +0000, Robin Murphy wrote:
>>>> Wire up the basic support for hot-adding memory. Since memory hotplug
>>>> is fairly tightly coupled to sparsemem, we tweak pfn_valid() to also
>>>> cross-check the presence of a section in the manner of the generic
>>>> implementation, before falling back to memblock to check for no-map
>>>> regions within a present section as before. By having arch_add_memory()
>>>> create the linear mapping first, this then makes everything work in the
>>>> way that __add_section() expects.
>>>>
>>>> We expect hotplug to be ACPI-driven, so the swapper_pg_dir updates
>>>> should be safe from races by virtue of the global device hotplug lock.
>>>>
>>>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>>>> ---
>>>>
>>>> Looks like I'm not going to have the whole pte_devmap story figured out
>>>> in time to land any ZONE_DEVICE support this cycle, but since this patch
>>>> also stands alone as a complete feature (and has ended up remarkably
>>>> simple and self-contained), I hope we might consider getting it merged
>>>> on its own merit.
>>>>
>>>> Robin.
>>>>
>>>>    arch/arm64/Kconfig   |  3 +++
>>>>    arch/arm64/mm/init.c |  8 ++++++++
>>>>    arch/arm64/mm/mmu.c  | 12 ++++++++++++
>>>>    arch/arm64/mm/numa.c | 10 ++++++++++
>>>>    4 files changed, 33 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>> index 6d2b25f51bb3..7b855ae45747 100644
>>>> --- a/arch/arm64/Kconfig
>>>> +++ b/arch/arm64/Kconfig
>>>> @@ -261,6 +261,9 @@ config ZONE_DMA32
>>>>    config HAVE_GENERIC_GUP
>>>>    	def_bool y
>>>> +config ARCH_ENABLE_MEMORY_HOTPLUG
>>>> +	def_bool y
>>>> +
>>>>    config SMP
>>>>    	def_bool y
>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>> index 2983e0fc1786..82e0b08f2e31 100644
>>>> --- a/arch/arm64/mm/init.c
>>>> +++ b/arch/arm64/mm/init.c
>>>> @@ -291,6 +291,14 @@ int pfn_valid(unsigned long pfn)
>>>>    	if ((addr >> PAGE_SHIFT) != pfn)
>>>>    		return 0;
>>>> +
>>>> +#ifdef CONFIG_SPARSEMEM
>>>> +	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>>> +		return 0;
>>>> +
>>>> +	if (!valid_section(__nr_to_section(pfn_to_section_nr(pfn))))
>>>> +		return 0;
>>>
>>> I'm a bit nervous about the call to __nr_to_section() here. How do we
>>> ensure that the section number we're passing stays within the bounds of
>>> the mem_section array?
>>
>> The same way every other sparsemem user (apart from arch/arm) does, I guess
>> - this is literally a copy-paste of the generic pfn_valid() implementation
>> :/
> 
> I don't trust the generic pfn_valid() at all :)
> 
>> Given the implementation of __nr_to_section() in light of how
>> memory_present() and sparse_index_init() set up mem_section in the first
>> place, I can't see how there can be a problem. You did see the bit 4 lines
>> above, right?
> 
> D'oh, yes, I read that and then instantly forgot it. Ok, so that should be
> fine.
> 
>>>> +#endif
>>>>    	return memblock_is_map_memory(addr);
>>>>    }
>>>>    EXPORT_SYMBOL(pfn_valid);
>>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>>> index e1b2d58a311a..22379a74d289 100644
>>>> --- a/arch/arm64/mm/mmu.c
>>>> +++ b/arch/arm64/mm/mmu.c
>>>> @@ -1044,3 +1044,15 @@ int pud_free_pmd_page(pud_t *pudp, unsigned long addr)
>>>>    	pmd_free(NULL, table);
>>>>    	return 1;
>>>>    }
>>>> +
>>>> +#ifdef CONFIG_MEMORY_HOTPLUG
>>>> +int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
>>>> +		    bool want_memblock)
>>>> +{
>>>> +	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
>>>> +			     size, PAGE_KERNEL, pgd_pgtable_alloc, 0);
>>>> +
>>>> +	return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
>>>> +			   altmap, want_memblock);
>>>> +}
>>>
>>> If we're mapping the new memory into the linear map, shouldn't we be
>>> respecting rodata_full and debug page alloc by forcing page granularity
>>> and tweaking the permissions?
>>
>> Bah, James mentioned debug_pagealloc long ago, and I did have a slight
>> nagging feeling that I was still missing something - yes, I need to fix the
>> flags for that case. I'm not sure about rodata_full (do you mean
>> STRICT_KERNEL_RWX?) since a section being added here won't contain kernel
>> text or data, and I can't seem to find anywhere that rodata options affect
>> the linear mapping of plain free RAM.
> 
> Ah, we've got code queued on for-next/core so that changing vmalloc()
> permissions makes the same changes to the linear map.

Gotcha, I see "arm64: mm: apply r/o permissions of VM areas to its 
linear alias as well" now, no wonder I couldn't find anything relevant 
in my rc3-based development branch. I'll rebase and add that case too.

Thanks,
Robin.
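
(A sketch of how the rebased arch_add_memory() might look with both
review points folded in — rodata_full comes from the for-next/core
commit mentioned above, and NO_BLOCK_MAPPINGS/NO_CONT_MAPPINGS are the
existing mmu.c flags that force page-granular mappings so permissions
can be changed later. This is a guess at the v2 shape, not the posted
patch:)

	#ifdef CONFIG_MEMORY_HOTPLUG
	int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
			    bool want_memblock)
	{
		int flags = 0;

		/* Assumed fix: page granularity for debug_pagealloc/rodata_full */
		if (rodata_full || debug_pagealloc_enabled())
			flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;

		__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
				     size, PAGE_KERNEL, pgd_pgtable_alloc, flags);

		return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
				   altmap, want_memblock);
	}
	#endif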

Patch

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6d2b25f51bb3..7b855ae45747 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -261,6 +261,9 @@  config ZONE_DMA32
 config HAVE_GENERIC_GUP
 	def_bool y
 
+config ARCH_ENABLE_MEMORY_HOTPLUG
+	def_bool y
+
 config SMP
 	def_bool y
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 2983e0fc1786..82e0b08f2e31 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -291,6 +291,14 @@  int pfn_valid(unsigned long pfn)
 
 	if ((addr >> PAGE_SHIFT) != pfn)
 		return 0;
+
+#ifdef CONFIG_SPARSEMEM
+	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
+		return 0;
+
+	if (!valid_section(__nr_to_section(pfn_to_section_nr(pfn))))
+		return 0;
+#endif
 	return memblock_is_map_memory(addr);
 }
 EXPORT_SYMBOL(pfn_valid);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index e1b2d58a311a..22379a74d289 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1044,3 +1044,15 @@  int pud_free_pmd_page(pud_t *pudp, unsigned long addr)
 	pmd_free(NULL, table);
 	return 1;
 }
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
+		    bool want_memblock)
+{
+	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
+			     size, PAGE_KERNEL, pgd_pgtable_alloc, 0);
+
+	return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
+			   altmap, want_memblock);
+}
+#endif
diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
index 27a31efd9e8e..ae34e3a1cef1 100644
--- a/arch/arm64/mm/numa.c
+++ b/arch/arm64/mm/numa.c
@@ -466,3 +466,13 @@  void __init arm64_numa_init(void)
 
 	numa_init(dummy_numa_init);
 }
+
+/*
+ * We hope that we will be hotplugging memory on nodes we already know about,
+ * such that acpi_get_node() succeeds and we never fall back to this...
+ */
+int memory_add_physaddr_to_nid(u64 addr)
+{
+	pr_warn("Unknown node for memory at 0x%llx, assuming node 0\n", addr);
+	return 0;
+}