Message ID | 1619077823-3819-1-git-send-email-anshuman.khandual@arm.com (mailing list archive)
---|---
State | New, archived
Series | [RFC,V2] mm: Enable generic pfn_valid() to handle early sections with memmap holes
On Thu, Apr 22, 2021 at 01:20:23PM +0530, Anshuman Khandual wrote:
> Platforms like arm and arm64 have redefined pfn_valid() because their early
> memory sections might have contained memmap holes after freeing parts of it
> during boot, which should be skipped while validating a pfn for struct page
> backing. This scenario on certain platforms where memmap is not continuous,
> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
> Then the generic pfn_valid() can be improved to accommodate such platforms.
> This reduces overall code footprint and also improves maintainability.
>
> free_unused_memmap() and pfn_to_online_page() have been updated to include
> such cases. This also exports memblock_is_memory() for all drivers that use
> pfn_valid() but lack required visibility. After the new config is in place,
> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms.
>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-mm@kvack.org
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
> This patch applies on the latest mainline kernel after Mike's series
> regarding arm64 based pfn_valid().
>
> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t
>
> Changes in RFC V2:
>
> - Dropped support for arm (32 bit)
> - Replaced memblock_is_map_memory() check with memblock_is_memory()
> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid()
> - Updated pfn_to_online_page() per David
> - Updated free_unused_memmap() to preserve existing semantics per Mike
> - Exported memblock_is_memory() instead of memblock_is_map_memory()
>
> Changes in RFC V1:
>
> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/
>
>  arch/arm64/Kconfig            |  2 +-
>  arch/arm64/include/asm/page.h |  1 -
>  arch/arm64/mm/init.c          | 41 -----------------------------------
>  include/linux/mmzone.h        | 18 ++++++++++++++-
>  mm/Kconfig                    |  9 ++++++++
>  mm/memblock.c                 |  8 +++++--
>  mm/memory_hotplug.c           |  5 +++++
>  7 files changed, 38 insertions(+), 46 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b4a9b493ce72..4cdc3570ffa9 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -144,7 +144,6 @@ config ARM64
>  	select HAVE_ARCH_KGDB
>  	select HAVE_ARCH_MMAP_RND_BITS
>  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
> -	select HAVE_ARCH_PFN_VALID
>  	select HAVE_ARCH_PREL32_RELOCATIONS
>  	select HAVE_ARCH_SECCOMP_FILTER
>  	select HAVE_ARCH_STACKLEAK
> @@ -167,6 +166,7 @@ config ARM64
>  		if $(cc-option,-fpatchable-function-entry=2)
>  	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
>  		if DYNAMIC_FTRACE_WITH_REGS
> +	select HAVE_EARLY_SECTION_MEMMAP_HOLES
>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>  	select HAVE_FAST_GUP
>  	select HAVE_FTRACE_MCOUNT_RECORD
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 75ddfe671393..fcbef3eec4b2 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);
>
>  typedef struct page *pgtable_t;
>
> -int pfn_valid(unsigned long pfn);
>  int pfn_is_map_memory(unsigned long pfn);
>
>  #include <asm/memory.h>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index f431b38d0837..5731a11550d8 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
>  	free_area_init(max_zone_pfns);
>  }
>
> -int pfn_valid(unsigned long pfn)
> -{
> -	phys_addr_t addr = PFN_PHYS(pfn);
> -
> -	/*
> -	 * Ensure the upper PAGE_SHIFT bits are clear in the
> -	 * pfn. Else it might lead to false positives when
> -	 * some of the upper bits are set, but the lower bits
> -	 * match a valid pfn.
> -	 */
> -	if (PHYS_PFN(addr) != pfn)
> -		return 0;
> -
> -#ifdef CONFIG_SPARSEMEM
> -{
> -	struct mem_section *ms;
> -
> -	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> -		return 0;
> -
> -	ms = __pfn_to_section(pfn);
> -	if (!valid_section(ms))
> -		return 0;
> -
> -	/*
> -	 * ZONE_DEVICE memory does not have the memblock entries.
> -	 * memblock_is_memory() check for ZONE_DEVICE based
> -	 * addresses will always fail. Even the normal hotplugged
> -	 * memory will never have MEMBLOCK_NOMAP flag set in their
> -	 * memblock entries. Skip memblock search for all non early
> -	 * memory sections covering all of hotplug memory including
> -	 * both normal and ZONE_DEVICE based.
> -	 */
> -	if (!early_section(ms))
> -		return pfn_section_valid(ms, pfn);
> -}
> -#endif
> -	return memblock_is_memory(addr);
> -}
> -EXPORT_SYMBOL(pfn_valid);
> -
>  int pfn_is_map_memory(unsigned long pfn)
>  {
>  	phys_addr_t addr = PFN_PHYS(pfn);
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 961f0eeefb62..18bf71665211 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>   *
>   * Return: 1 for PFNs that have memory map entries and 0 otherwise
>   */
> +bool memblock_is_memory(phys_addr_t addr);
> +
>  static inline int pfn_valid(unsigned long pfn)
>  {
> +	phys_addr_t addr = PFN_PHYS(pfn);
>  	struct mem_section *ms;
>
> +	/*
> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
> +	 * pfn. Else it might lead to false positives when
> +	 * some of the upper bits are set, but the lower bits
> +	 * match a valid pfn.
> +	 */
> +	if (PHYS_PFN(addr) != pfn)
> +		return 0;
> +
>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>  		return 0;
>  	ms = __nr_to_section(pfn_to_section_nr(pfn));
> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
>  	 * Traditionally early sections always returned pfn_valid() for
>  	 * the entire section-sized span.
>  	 */
> -	return early_section(ms) || pfn_section_valid(ms, pfn);
> +	if (early_section(ms))
> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;

Nit: we already did

	addr = PFN_PHYS(pfn);

a few lines above :)

> +
> +	return pfn_section_valid(ms, pfn);
>  }
>  #endif
>
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 24c045b24b95..db7128111874 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -135,6 +135,15 @@ config HAVE_FAST_GUP
>  config ARCH_KEEP_MEMBLOCK
>  	bool
>
> +config HAVE_EARLY_SECTION_MEMMAP_HOLES
> +	depends on ARCH_KEEP_MEMBLOCK && SPARSEMEM_VMEMMAP
> +	def_bool n
> +	help
> +	  Early sections on certain platforms might have some memory ranges that
> +	  are not backed with struct page mappings. When subscribed, this option
> +	  enables special handling for those memory ranges in certain situations
> +	  such as pfn_valid().
> +
>  # Keep arch NUMA mapping infrastructure post-init.
>  config NUMA_KEEP_MEMINFO
>  	bool
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 3abf2c3fea7f..93f8a9c8428d 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1740,6 +1740,7 @@ bool __init_memblock memblock_is_memory(phys_addr_t addr)
>  {
>  	return memblock_search(&memblock.memory, addr) != -1;
>  }
> +EXPORT_SYMBOL(memblock_is_memory);

Please make it inside #ifdef CONFIG_ARCH_MEMBLOCK

>  bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
>  {
> @@ -1931,8 +1932,11 @@ static void __init free_unused_memmap(void)
>  	unsigned long start, end, prev_end = 0;
>  	int i;
>
> -	if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
> -	    IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
> +	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
> +		return;
> +
> +	if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) &&
> +	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
>  		return;

Can you please add a comment that says that architecture should provide a
way to detect holes in the memory map to be able to free its part?

>  	/*
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 0cdbbfbc5757..8c78b6a3d888 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -309,6 +309,11 @@ struct page *pfn_to_online_page(unsigned long pfn)
>  	 * Save some code text when online_section() +
>  	 * pfn_section_valid() are sufficient.
>  	 */
> +	if (IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES)) {
> +		if (early_section(ms) && !memblock_is_memory(PFN_PHYS(pfn)))
> +			return NULL;
> +	}
> +
>  	if (IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) && !pfn_valid(pfn))
>  		return NULL;
>
> --
> 2.20.1
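For reference, with the nit above folded in, the generic pfn_valid() from
the mmzone.h hunk would look roughly like this. This is a sketch only,
assuming the elided context lines keep the upstream valid_section() check:

static inline int pfn_valid(unsigned long pfn)
{
	phys_addr_t addr = PFN_PHYS(pfn);
	struct mem_section *ms;

	/* Reject pfns whose upper bits were truncated by PFN_PHYS(). */
	if (PHYS_PFN(addr) != pfn)
		return 0;

	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
		return 0;
	ms = __nr_to_section(pfn_to_section_nr(pfn));
	if (!valid_section(ms))
		return 0;

	/*
	 * Early sections may contain memmap holes on platforms that freed
	 * parts of their memory map during boot; consult memblock there
	 * instead of trusting the section-sized span. Sketch: reuses the
	 * addr computed above, per the nit.
	 */
	if (early_section(ms))
		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
			memblock_is_memory(addr) : 1;

	return pfn_section_valid(ms, pfn);
}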
On 22.04.21 09:50, Anshuman Khandual wrote:
> Platforms like arm and arm64 have redefined pfn_valid() because their early
> memory sections might have contained memmap holes after freeing parts of it
> during boot, which should be skipped while validating a pfn for struct page
> backing.

[...]

What about doing one step at a time and switching only over to generic
pfn_valid() in case of CONFIG_SPARSEMEM? (meaning: freeing the memmap only
with !CONFIG_SPARSEMEM)

IOW, avoiding having to adjust generic pfn_valid()/pfn_to_online_page() at
all. Am I missing something, or should that be possible?
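One hypothetical way to spell that intermediate step, as an arch/arm64/Kconfig
fragment (a sketch of the proposal, not something from the patch):

	# Sketch: keep the arm64-specific pfn_valid() only for !SPARSEMEM,
	# where free_unused_memmap() may still punch holes into the memmap.
	select HAVE_ARCH_PFN_VALID if !SPARSEMEM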
On Thu, Apr 22, 2021 at 11:03:50AM +0200, David Hildenbrand wrote:
> On 22.04.21 09:50, Anshuman Khandual wrote:

[...]

> What about doing one step at a time and switching only over to generic
> pfn_valid() in case of CONFIG_SPARSEMEM? (meaning: freeing the memmap only
> with !CONFIG_SPARSEMEM)

The "generic" pfn_valid() is only available with CONFIG_SPARSEMEM.
With FLATMEM it's wild west:

$ git grep -w "define pfn_valid" arch/*/include/asm/ | wc -l
22

This would actually mean that we still need arm64::pfn_valid() for the
FLATMEM case.

> IOW, avoiding having to adjust generic pfn_valid()/pfn_to_online_page() at
> all. Am I missing something, or should that be possible?

We are back again to the question "should arm64 free its memmap".

If the answer is no, we don't need arm64::pfn_valid() for SPARSEMEM at all.

If the answer is yes, Anshuman's patch is way better than a custom
pfn_valid().
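For context, the FLATMEM definitions are typically simple range checks
against a single contiguous memory map, along the lines of the following
sketch; the 22 per-arch variants differ in detail:

/* Sketch of a typical FLATMEM pfn_valid(): one contiguous memmap. */
static inline int pfn_valid(unsigned long pfn)
{
	extern unsigned long max_mapnr;

	return pfn >= ARCH_PFN_OFFSET && (pfn - ARCH_PFN_OFFSET) < max_mapnr;
}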
On 22.04.21 11:42, Mike Rapoport wrote:
> On Thu, Apr 22, 2021 at 11:03:50AM +0200, David Hildenbrand wrote:
>> On 22.04.21 09:50, Anshuman Khandual wrote:

[...]

>> What about doing one step at a time and switching only over to generic
>> pfn_valid() in case of CONFIG_SPARSEMEM? (meaning: freeing the memmap only
>> with !CONFIG_SPARSEMEM)
>
> The "generic" pfn_valid() is only available with CONFIG_SPARSEMEM.
> With FLATMEM it's wild west:
>
> $ git grep -w "define pfn_valid" arch/*/include/asm/ | wc -l
> 22
>
> This would actually mean that we still need arm64::pfn_valid() for the
> FLATMEM case.

Right, which is the case on x86 etc. as well. (I was assuming that this
patch was missing something.)

>> IOW, avoiding having to adjust generic pfn_valid()/pfn_to_online_page() at
>> all. Am I missing something, or should that be possible?
>
> We are back again to the question "should arm64 free its memmap".
> If the answer is no, we don't need arm64::pfn_valid() for SPARSEMEM at all.
> If the answer is yes, Anshuman's patch is way better than a custom
> pfn_valid().

Well, I propose something in between: stop freeing with SPARSEMEM, continue
freeing with FLATMEM.
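A minimal sketch of how that split could look in free_unused_memmap(),
assuming this reading of the proposal (illustration only, not code from the
thread):

static void __init free_unused_memmap(void)
{
	unsigned long start, end, prev_end = 0;
	int i;

	/*
	 * Sketch: never free the memory map with SPARSEMEM, so the generic
	 * pfn_valid() needs no hole handling; with FLATMEM keep freeing,
	 * relying on the arch-provided pfn_valid() to detect the holes.
	 */
	if (IS_ENABLED(CONFIG_SPARSEMEM) ||
	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
		return;

	/* ... walk memblock.memory and free the gaps, as today ... */
}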
On 4/22/21 3:12 PM, Mike Rapoport wrote:
> The "generic" pfn_valid() is only available with CONFIG_SPARSEMEM.
> With FLATMEM it's wild west:
>
> $ git grep -w "define pfn_valid" arch/*/include/asm/ | wc -l
> 22
>
> This would actually mean that we still need arm64::pfn_valid() for the
> FLATMEM case.

SPARSEMEM would be the only memory model going forward.

https://patchwork.kernel.org/project/linux-arm-kernel/patch/20210420093559.23168-1-catalin.marinas@arm.com/
On Thu, Apr 22, 2021 at 11:48:58AM +0200, David Hildenbrand wrote:
> > > IOW, avoiding having to adjust generic pfn_valid()/pfn_to_online_page() at
> > > all. Am I missing something, or should that be possible?
> >
> > We are back again to the question "should arm64 free its memmap".
> > If the answer is no, we don't need arm64::pfn_valid() for SPARSEMEM at all.
> > If the answer is yes, Anshuman's patch is way better than a custom
> > pfn_valid().
>
> Well, I propose something in between: stop freeing with SPARSEMEM, continue
> freeing with FLATMEM.

I'm all for it.
On 4/22/21 1:57 PM, Mike Rapoport wrote:
> On Thu, Apr 22, 2021 at 01:20:23PM +0530, Anshuman Khandual wrote:

[...]

>> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
>>  	 * Traditionally early sections always returned pfn_valid() for
>>  	 * the entire section-sized span.
>>  	 */
>> -	return early_section(ms) || pfn_section_valid(ms, pfn);
>> +	if (early_section(ms))
>> +		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
>> +			memblock_is_memory(pfn << PAGE_SHIFT) : 1;
>
> Nit: we already did
>
> 	addr = PFN_PHYS(pfn);
>
> a few lines above :)

Yeah, will use the addr directly here.

[...]

>> @@ -1740,6 +1740,7 @@ bool __init_memblock memblock_is_memory(phys_addr_t addr)
>>  {
>>  	return memblock_search(&memblock.memory, addr) != -1;
>>  }
>> +EXPORT_SYMBOL(memblock_is_memory);
>
> Please make it inside #ifdef CONFIG_ARCH_MEMBLOCK

CONFIG_ARCH_KEEP_MEMBLOCK ? Wrap it around the EXPORT_SYMBOL() or the
entire function memblock_is_memory().

>> @@ -1931,8 +1932,11 @@ static void __init free_unused_memmap(void)
>>  	unsigned long start, end, prev_end = 0;
>>  	int i;
>>
>> -	if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
>> -	    IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
>> +	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
>> +		return;
>> +
>> +	if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) &&
>> +	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
>>  		return;
>
> Can you please add a comment that says that architecture should provide a
> way to detect holes in the memory map to be able to free its part?

I did not get that completely. If CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES is
not subscribed, then the platform must provide a method that identifies
these memmap holes and processes them accordingly in pfn_valid(), just to
avoid a NULL pointer dereference? Also, the comment should be placed after
the second return statement, when it is going to free unused memmap
segments.
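Presumably the requested comment would read something like this, above the
second check (illustrative wording only, not Mike's):

	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
		return;

	/*
	 * Illustration: freeing parts of the memory map is only safe when
	 * the architecture provides a way to detect the holes it leaves
	 * behind, either via its own pfn_valid() or via
	 * CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
	 */
	if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) &&
	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
		return;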
On Thu, Apr 22, 2021 at 04:53:36PM +0530, Anshuman Khandual wrote:
> On 4/22/21 1:57 PM, Mike Rapoport wrote:

...

> >> +EXPORT_SYMBOL(memblock_is_memory);
> >
> > Please make it inside #ifdef CONFIG_ARCH_MEMBLOCK
>
> CONFIG_ARCH_KEEP_MEMBLOCK ?

Yeah, _KEEP went away somehow :)

> Wrap it around the EXPORT_SYMBOL() or the entire function
> memblock_is_memory().

EXPORT_SYMBOL(). Otherwise we'll have an exported __init function.
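That is, roughly the following placement (a sketch; the rationale, per the
reply above, being that without ARCH_KEEP_MEMBLOCK the function lives in an
init section and the export would point at discarded code):

bool __init_memblock memblock_is_memory(phys_addr_t addr)
{
	return memblock_search(&memblock.memory, addr) != -1;
}
#ifdef CONFIG_ARCH_KEEP_MEMBLOCK
/* Only export when memblock (and this function) survives past boot. */
EXPORT_SYMBOL(memblock_is_memory);
#endif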
On 4/22/21 1:20 PM, Anshuman Khandual wrote:
> Platforms like arm and arm64 have redefined pfn_valid() because their early
> memory sections might have contained memmap holes after freeing parts of it
> during boot, which should be skipped while validating a pfn for struct page
> backing.

[...]

Hello David/Mike,

Now that pfn_is_map_memory() usage has been decoupled from pfn_valid() and
SPARSEMEM_VMEMMAP is the only available memory model on arm64, wondering if
we still need this HAVE_EARLY_SECTION_MEMMAP_HOLES proposal? Please do
kindly suggest. Thank you.

- Anshuman
Hello Anshuman,

On Mon, May 24, 2021 at 10:28:32AM +0530, Anshuman Khandual wrote:
> On 4/22/21 1:20 PM, Anshuman Khandual wrote:

[...]

> Now that pfn_is_map_memory() usage has been decoupled from pfn_valid() and
> SPARSEMEM_VMEMMAP is the only available memory model on arm64, wondering if
> we still need this HAVE_EARLY_SECTION_MEMMAP_HOLES proposal? Please do
> kindly suggest. Thank you.
Even now arm64 still frees parts of the memory map, and pfn_valid() should
be able to tell if a part of a section is freed or not.

For instance, for the following memory configuration

|<----section---->|<----hole---->|<----section---->|
+--------+--------+--------------+--------+--------+
| bank 0 | unused |              | bank 1 | unused |
+--------+--------+--------------+--------+--------+

the memory map corresponding to the "unused" areas is freed, but the
generic pfn_valid() will still return 1 there.

So we should either stop freeing the unused memory map on arm64, keep
arm64::pfn_valid(), or implement something along the lines of this patch.

I personally don't think that the memory savings from freeing the unused
memory map are worth the pain of maintenance and bugs happening here and
there.
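[ For reference, the generic SPARSEMEM pfn_valid() under discussion,
sketched here from the mmzone.h hunks quoted in this thread rather than
copied verbatim, short-circuits on early sections, which is exactly why it
cannot see such freed holes: ]

	static inline int pfn_valid(unsigned long pfn)
	{
		struct mem_section *ms;

		if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
			return 0;
		ms = __nr_to_section(pfn_to_section_nr(pfn));
		if (!valid_section(ms))
			return 0;
		/*
		 * An early section reports its entire section-sized span
		 * as valid, including any "unused" bank tail whose memmap
		 * was freed by free_unused_memmap().
		 */
		return early_section(ms) || pfn_section_valid(ms, pfn);
	}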
On 5/24/21 12:22 PM, Mike Rapoport wrote:
> Hello Anshuman,
>
> On Mon, May 24, 2021 at 10:28:32AM +0530, Anshuman Khandual wrote:
>> On 4/22/21 1:20 PM, Anshuman Khandual wrote:

[...]

>> Hello David/Mike,
>>
>> Now that pfn_is_map_memory() usage has been decoupled from pfn_valid() and
>> SPARSEMEM_VMEMMAP is the only available memory model on arm64, wondering if
>> we still need this HAVE_EARLY_SECTION_MEMMAP_HOLES proposal? Please do
>> kindly suggest. Thank you.
>
> Even now arm64 still frees parts of the memory map, and pfn_valid() should
> be able to tell if a part of a section is freed or not.
>
> For instance, for the following memory configuration
>
> |<----section---->|<----hole---->|<----section---->|
> +--------+--------+--------------+--------+--------+
> | bank 0 | unused |              | bank 1 | unused |
> +--------+--------+--------------+--------+--------+
>
> the memory map corresponding to the "unused" areas is freed, but the
> generic pfn_valid() will still return 1 there.

But does not free_unused_memmap() return early when CONFIG_SPARSEMEM_VMEMMAP
is enabled, which is the only option now on arm64? Then how can the memmap
have holes (from unused areas) anymore? Am I missing something here.
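[ The early return being referred to is the guard at the top of
free_unused_memmap(); roughly, from mm/memblock.c of that era — a
simplified sketch with the body elided: ]

	static void __init free_unused_memmap(void)
	{
		unsigned long start, end, prev_end = 0;
		int i;

		/*
		 * With SPARSEMEM_VMEMMAP the memmap is never trimmed, so
		 * no holes can appear in it in the first place.
		 */
		if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
		    IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
			return;

		/* ... otherwise walk memblock.memory and free the gaps ... */
	}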
On Tue, May 25, 2021 at 11:30:15AM +0530, Anshuman Khandual wrote:
> On 5/24/21 12:22 PM, Mike Rapoport wrote:

[...]

> > the memory map corresponding to the "unused" areas is freed, but the
> > generic pfn_valid() will still return 1 there.
>
> But does not free_unused_memmap() return early when CONFIG_SPARSEMEM_VMEMMAP
> is enabled, which is the only option now on arm64? Then how can the memmap
> have holes (from unused areas) anymore? Am I missing something here.

Ah, you are right, I missed this detail myself :)

With CONFIG_SPARSEMEM_VMEMMAP as the only memory model for arm64, we can
simply get rid of arm64::pfn_valid() without any changes to the generic
version.
On 5/25/21 12:02 PM, Mike Rapoport wrote:
> On Tue, May 25, 2021 at 11:30:15AM +0530, Anshuman Khandual wrote:

[...]

>> But does not free_unused_memmap() return early when CONFIG_SPARSEMEM_VMEMMAP
>> is enabled, which is the only option now on arm64? Then how can the memmap
>> have holes (from unused areas) anymore? Am I missing something here.
>
> Ah, you are right, I missed this detail myself :)
>
> With CONFIG_SPARSEMEM_VMEMMAP as the only memory model for arm64, we can
> simply get rid of arm64::pfn_valid() without any changes to the generic
> version.

Though I just moved the pfn upper bits sanity check into the generic
pfn_valid(). I hope this looks okay.
From 7a63f460bcb6ae171c2081bfad81edd9e8f3b7a0 Mon Sep 17 00:00:00 2001
From: Anshuman Khandual <anshuman.khandual@arm.com>
Date: Tue, 25 May 2021 10:27:09 +0100
Subject: [PATCH] arm64/mm: Drop HAVE_ARCH_PFN_VALID

CONFIG_SPARSEMEM_VMEMMAP is now the only available memory model on arm64
platforms and free_unused_memmap() would just return without creating any
holes in the memmap mapping. There is no need for any special handling in
pfn_valid() and HAVE_ARCH_PFN_VALID can just be dropped. This also moves
the pfn upper bits sanity check into generic pfn_valid().

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/Kconfig            |  1 -
 arch/arm64/include/asm/page.h |  1 -
 arch/arm64/mm/init.c          | 37 -----------------------------------
 include/linux/mmzone.h        |  9 +++++++++
 4 files changed, 9 insertions(+), 39 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index d7dc8698cf8e..7904728befcc 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -154,7 +154,6 @@ config ARM64
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_MMAP_RND_BITS
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
-	select HAVE_ARCH_PFN_VALID
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
 	select HAVE_ARCH_SECCOMP_FILTER
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 75ddfe671393..fcbef3eec4b2 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);

 typedef struct page *pgtable_t;

-int pfn_valid(unsigned long pfn);
 int pfn_is_map_memory(unsigned long pfn);

 #include <asm/memory.h>
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 725aa84f2faa..49019ea0c8a8 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -219,43 +219,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 	free_area_init(max_zone_pfns);
 }

-int pfn_valid(unsigned long pfn)
-{
-	phys_addr_t addr = PFN_PHYS(pfn);
-	struct mem_section *ms;
-
-	/*
-	 * Ensure the upper PAGE_SHIFT bits are clear in the
-	 * pfn. Else it might lead to false positives when
-	 * some of the upper bits are set, but the lower bits
-	 * match a valid pfn.
-	 */
-	if (PHYS_PFN(addr) != pfn)
-		return 0;
-
-	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
-		return 0;
-
-	ms = __pfn_to_section(pfn);
-	if (!valid_section(ms))
-		return 0;
-
-	/*
-	 * ZONE_DEVICE memory does not have the memblock entries.
-	 * memblock_is_map_memory() check for ZONE_DEVICE based
-	 * addresses will always fail. Even the normal hotplugged
-	 * memory will never have MEMBLOCK_NOMAP flag set in their
-	 * memblock entries. Skip memblock search for all non early
-	 * memory sections covering all of hotplug memory including
-	 * both normal and ZONE_DEVICE based.
-	 */
-	if (!early_section(ms))
-		return pfn_section_valid(ms, pfn);
-
-	return memblock_is_memory(addr);
-}
-EXPORT_SYMBOL(pfn_valid);
-
 int pfn_is_map_memory(unsigned long pfn)
 {
 	phys_addr_t addr = PFN_PHYS(pfn);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index a9b263d4cf9d..d0c4fc506fa3 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1443,6 +1443,15 @@ static inline int pfn_valid(unsigned long pfn)
 {
 	struct mem_section *ms;

+	/*
+	 * Ensure the upper PAGE_SHIFT bits are clear in the
+	 * pfn. Else it might lead to false positives when
+	 * some of the upper bits are set, but the lower bits
+	 * match a valid pfn.
+	 */
+	if (PHYS_PFN(PFN_PHYS(pfn)) != pfn)
+		return 0;
+
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
 	ms = __nr_to_section(pfn_to_section_nr(pfn));
--
2.20.1
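[ One note on the sanity check the patch moves into the generic
pfn_valid(): the PHYS_PFN(PFN_PHYS(pfn)) round trip rejects a pfn whose
upper PAGE_SHIFT bits are set, because those bits are shifted out of the
physical address. A userspace toy demonstrating the effect — not kernel
code; it assumes PAGE_SHIFT == 12 and 64-bit phys_addr_t/unsigned long: ]

	#include <stdio.h>
	#include <stdint.h>

	#define PAGE_SHIFT 12
	typedef uint64_t phys_addr_t;

	#define PFN_PHYS(pfn)	((phys_addr_t)(pfn) << PAGE_SHIFT)
	#define PHYS_PFN(addr)	((unsigned long)((addr) >> PAGE_SHIFT))

	int main(void)
	{
		/* pfn 0x5 with a stray bit among the upper PAGE_SHIFT bits */
		unsigned long bogus = (1UL << 52) | 0x5;

		/* the 1UL << 52 bit falls off the top of the 64-bit address */
		printf("round trip: %#lx -> %#lx\n",
		       bogus, PHYS_PFN(PFN_PHYS(bogus)));

		/* prints 0: the round trip differs, so the pfn is rejected */
		printf("matches: %d\n", PHYS_PFN(PFN_PHYS(bogus)) == bogus);
		return 0;
	}

[ Without the check, the truncated value (0x5 here) may well map to a
perfectly valid struct page, so the bogus pfn would be reported as valid. ]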
On Tue, May 25, 2021 at 03:22:53PM +0530, Anshuman Khandual wrote:
> On 5/25/21 12:02 PM, Mike Rapoport wrote:

[...]

> Though I just moved the pfn upper bits sanity check into the generic
> pfn_valid(). I hope this looks okay.
>
> From 7a63f460bcb6ae171c2081bfad81edd9e8f3b7a0 Mon Sep 17 00:00:00 2001
> From: Anshuman Khandual <anshuman.khandual@arm.com>
> Date: Tue, 25 May 2021 10:27:09 +0100
> Subject: [PATCH] arm64/mm: Drop HAVE_ARCH_PFN_VALID
>
> CONFIG_SPARSEMEM_VMEMMAP is now the only available memory model on arm64
> platforms and free_unused_memmap() would just return without creating any
> holes in the memmap mapping. There is no need for any special handling in
> pfn_valid() and HAVE_ARCH_PFN_VALID can just be dropped. This also moves
> the pfn upper bits sanity check into generic pfn_valid().
>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>

Acked-by: Mike Rapoport <rppt@linux.ibm.com>

[...]
On 25.05.21 12:03, Mike Rapoport wrote: > On Tue, May 25, 2021 at 03:22:53PM +0530, Anshuman Khandual wrote: >> >> >> On 5/25/21 12:02 PM, Mike Rapoport wrote: >>> On Tue, May 25, 2021 at 11:30:15AM +0530, Anshuman Khandual wrote: >>>> >>>> On 5/24/21 12:22 PM, Mike Rapoport wrote: >>>>> Hello Anshuman, >>>>> >>>>> On Mon, May 24, 2021 at 10:28:32AM +0530, Anshuman Khandual wrote: >>>>>> >>>>>> On 4/22/21 1:20 PM, Anshuman Khandual wrote: >>>>>>> Platforms like arm and arm64 have redefined pfn_valid() because their early >>>>>>> memory sections might have contained memmap holes after freeing parts of it >>>>>>> during boot, which should be skipped while validating a pfn for struct page >>>>>>> backing. This scenario on certain platforms where memmap is not continuous, >>>>>>> could be captured with a new option CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES. >>>>>>> Then the generic pfn_valid() can be improved to accommodate such platforms. >>>>>>> This reduces overall code footprint and also improves maintainability. >>>>>>> >>>>>>> free_unused_memmap() and pfn_to_online_page() have been updated to include >>>>>>> such cases. This also exports memblock_is_memory() for all drivers that use >>>>>>> pfn_valid() but lack required visibility. After the new config is in place, >>>>>>> drop CONFIG_HAVE_ARCH_PFN_VALID from arm64 platforms. >>>>>>> >>>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com> >>>>>>> Cc: Will Deacon <will@kernel.org> >>>>>>> Cc: Andrew Morton <akpm@linux-foundation.org> >>>>>>> Cc: Mike Rapoport <rppt@kernel.org> >>>>>>> Cc: David Hildenbrand <david@redhat.com> >>>>>>> Cc: linux-arm-kernel@lists.infradead.org >>>>>>> Cc: linux-kernel@vger.kernel.org >>>>>>> Cc: linux-mm@kvack.org >>>>>>> Suggested-by: David Hildenbrand <david@redhat.com> >>>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >>>>>>> --- >>>>>>> This patch applies on the latest mainline kernel after Mike's series >>>>>>> regarding arm64 based pfn_valid(). 
>>>>>>> >>>>>>> https://lore.kernel.org/linux-mm/20210422061902.21614-1-rppt@kernel.org/T/#t >>>>>>> >>>>>>> Changes in RFC V2: >>>>>>> >>>>>>> - Dropped support for arm (32 bit) >>>>>>> - Replaced memblock_is_map_memory() check with memblock_is_memory() >>>>>>> - MEMBLOCK_NOMAP memory are no longer skipped for pfn_valid() >>>>>>> - Updated pfn_to_online_page() per David >>>>>>> - Updated free_unused_memmap() to preserve existing semantics per Mike >>>>>>> - Exported memblock_is_memory() instead of memblock_is_map_memory() >>>>>>> >>>>>>> Changes in RFC V1: >>>>>>> >>>>>>> - https://patchwork.kernel.org/project/linux-mm/patch/1615174073-10520-1-git-send-email-anshuman.khandual@arm.com/ >>>>>>> >>>>>>> arch/arm64/Kconfig | 2 +- >>>>>>> arch/arm64/include/asm/page.h | 1 - >>>>>>> arch/arm64/mm/init.c | 41 ----------------------------------- >>>>>>> include/linux/mmzone.h | 18 ++++++++++++++- >>>>>>> mm/Kconfig | 9 ++++++++ >>>>>>> mm/memblock.c | 8 +++++-- >>>>>>> mm/memory_hotplug.c | 5 +++++ >>>>>>> 7 files changed, 38 insertions(+), 46 deletions(-) >>>>>>> >>>>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig >>>>>>> index b4a9b493ce72..4cdc3570ffa9 100644 >>>>>>> --- a/arch/arm64/Kconfig >>>>>>> +++ b/arch/arm64/Kconfig >>>>>>> @@ -144,7 +144,6 @@ config ARM64 >>>>>>> select HAVE_ARCH_KGDB >>>>>>> select HAVE_ARCH_MMAP_RND_BITS >>>>>>> select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT >>>>>>> - select HAVE_ARCH_PFN_VALID >>>>>>> select HAVE_ARCH_PREL32_RELOCATIONS >>>>>>> select HAVE_ARCH_SECCOMP_FILTER >>>>>>> select HAVE_ARCH_STACKLEAK >>>>>>> @@ -167,6 +166,7 @@ config ARM64 >>>>>>> if $(cc-option,-fpatchable-function-entry=2) >>>>>>> select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \ >>>>>>> if DYNAMIC_FTRACE_WITH_REGS >>>>>>> + select HAVE_EARLY_SECTION_MEMMAP_HOLES >>>>>>> select HAVE_EFFICIENT_UNALIGNED_ACCESS >>>>>>> select HAVE_FAST_GUP >>>>>>> select HAVE_FTRACE_MCOUNT_RECORD >>>>>>> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h >>>>>>> index 75ddfe671393..fcbef3eec4b2 100644 >>>>>>> --- a/arch/arm64/include/asm/page.h >>>>>>> +++ b/arch/arm64/include/asm/page.h >>>>>>> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from); >>>>>>> >>>>>>> typedef struct page *pgtable_t; >>>>>>> >>>>>>> -int pfn_valid(unsigned long pfn); >>>>>>> int pfn_is_map_memory(unsigned long pfn); >>>>>>> >>>>>>> #include <asm/memory.h> >>>>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c >>>>>>> index f431b38d0837..5731a11550d8 100644 >>>>>>> --- a/arch/arm64/mm/init.c >>>>>>> +++ b/arch/arm64/mm/init.c >>>>>>> @@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max) >>>>>>> free_area_init(max_zone_pfns); >>>>>>> } >>>>>>> >>>>>>> -int pfn_valid(unsigned long pfn) >>>>>>> -{ >>>>>>> - phys_addr_t addr = PFN_PHYS(pfn); >>>>>>> - >>>>>>> - /* >>>>>>> - * Ensure the upper PAGE_SHIFT bits are clear in the >>>>>>> - * pfn. Else it might lead to false positives when >>>>>>> - * some of the upper bits are set, but the lower bits >>>>>>> - * match a valid pfn. >>>>>>> - */ >>>>>>> - if (PHYS_PFN(addr) != pfn) >>>>>>> - return 0; >>>>>>> - >>>>>>> -#ifdef CONFIG_SPARSEMEM >>>>>>> -{ >>>>>>> - struct mem_section *ms; >>>>>>> - >>>>>>> - if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS) >>>>>>> - return 0; >>>>>>> - >>>>>>> - ms = __pfn_to_section(pfn); >>>>>>> - if (!valid_section(ms)) >>>>>>> - return 0; >>>>>>> - >>>>>>> - /* >>>>>>> - * ZONE_DEVICE memory does not have the memblock entries. 
>>>>>>> - * memblock_is_memory() check for ZONE_DEVICE based >>>>>>> - * addresses will always fail. Even the normal hotplugged >>>>>>> - * memory will never have MEMBLOCK_NOMAP flag set in their >>>>>>> - * memblock entries. Skip memblock search for all non early >>>>>>> - * memory sections covering all of hotplug memory including >>>>>>> - * both normal and ZONE_DEVICE based. >>>>>>> - */ >>>>>>> - if (!early_section(ms)) >>>>>>> - return pfn_section_valid(ms, pfn); >>>>>>> -} >>>>>>> -#endif >>>>>>> - return memblock_is_memory(addr); >>>>>>> -} >>>>>>> -EXPORT_SYMBOL(pfn_valid); >>>>>>> - >>>>>>> int pfn_is_map_memory(unsigned long pfn) >>>>>>> { >>>>>>> phys_addr_t addr = PFN_PHYS(pfn); >>>>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h >>>>>>> index 961f0eeefb62..18bf71665211 100644 >>>>>>> --- a/include/linux/mmzone.h >>>>>>> +++ b/include/linux/mmzone.h >>>>>>> @@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn) >>>>>>> * >>>>>>> * Return: 1 for PFNs that have memory map entries and 0 otherwise >>>>>>> */ >>>>>>> +bool memblock_is_memory(phys_addr_t addr); >>>>>>> + >>>>>>> static inline int pfn_valid(unsigned long pfn) >>>>>>> { >>>>>>> + phys_addr_t addr = PFN_PHYS(pfn); >>>>>>> struct mem_section *ms; >>>>>>> >>>>>>> + /* >>>>>>> + * Ensure the upper PAGE_SHIFT bits are clear in the >>>>>>> + * pfn. Else it might lead to false positives when >>>>>>> + * some of the upper bits are set, but the lower bits >>>>>>> + * match a valid pfn. >>>>>>> + */ >>>>>>> + if (PHYS_PFN(addr) != pfn) >>>>>>> + return 0; >>>>>>> + >>>>>>> if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS) >>>>>>> return 0; >>>>>>> ms = __nr_to_section(pfn_to_section_nr(pfn)); >>>>>>> @@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn) >>>>>>> * Traditionally early sections always returned pfn_valid() for >>>>>>> * the entire section-sized span. >>>>>>> */ >>>>>>> - return early_section(ms) || pfn_section_valid(ms, pfn); >>>>>>> + if (early_section(ms)) >>>>>>> + return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ? >>>>>>> + memblock_is_memory(pfn << PAGE_SHIFT) : 1; >>>>>>> + >>>>>>> + return pfn_section_valid(ms, pfn); >>>>>>> } >>>>>>> #endif >>>>>> >>>>>> Hello David/Mike, >>>>>> >>>>>> Now that pfn_is_map_memory() usage has been decoupled from pfn_valid() and >>>>>> SPARSEMEM_VMEMMAP is only available memory model on arm64, wondering if we >>>>>> still need this HAVE_EARLY_SECTION_MEMMAP_HOLES proposal ? Please do kindly >>>>>> suggest. Thank you. >>>>> >>>>> Even now arm64 still frees parts of the memory map and pfn_valid() should >>>>> be able to tell if a part of a section is freed or not. >>>>> >>>>> For instance for the following memory configuration >>>>> >>>>> |<----section---->|<----hole---->|<----section---->| >>>>> +--------+--------+--------------+--------+--------+ >>>>> | bank 0 | unused | | bank 1 | unused | >>>>> +--------+--------+--------------+--------+--------+ >>>>> >>>>> the memory map corresponding to the "unused" areas is freed, but the generic >>>>> pfn_valid() will still return 1 there. >>>> >>>> But is not free_unused_memmap() return early when CONFIG_SPARSEMEM_VMEMMAP >>>> is enabled, which is the only option now on arm64. Then how can memmap have >>>> holes (from unused areas) anymore ? Am I missing something here. 
>>> >>> Ah, you are right, I missed this detail myself :) >>> >>> With CONFIG_SPARSEMEM_VMEMMAP as the only memory model for arm64, we can >>> simply rid of arm64::pfn_valid() without any changes to the generic >>> version. >> >> Though just moved the pfn bits sanity check into generic pfn_valid(). >> I hope this looks okay. >> >> From 7a63f460bcb6ae171c2081bfad81edd9e8f3b7a0 Mon Sep 17 00:00:00 2001 >> From: Anshuman Khandual <anshuman.khandual@arm.com> >> Date: Tue, 25 May 2021 10:27:09 +0100 >> Subject: [PATCH] arm64/mm: Drop HAVE_ARCH_PFN_VALID >> >> CONFIG_SPARSEMEM_VMEMMAP is now the only available memory model on arm64 >> platforms and free_unused_memmap() would just return without creating any >> holes in the memmap mapping. There is no need for any special handling in >> pfn_valid() and HAVE_ARCH_PFN_VALID can just be dropped. This also moves >> the pfn upper bits sanity check into generic pfn_valid(). >> >> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> > > Acked-by: Mike Rapoport <rppt@linux.ibm.com> > Indeed, looks good Acked-by: David Hildenbrand <david@redhat.com> >> --- >> arch/arm64/Kconfig | 1 - >> arch/arm64/include/asm/page.h | 1 - >> arch/arm64/mm/init.c | 37 ----------------------------------- >> include/linux/mmzone.h | 9 +++++++++ >> 4 files changed, 9 insertions(+), 39 deletions(-) >> >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig >> index d7dc8698cf8e..7904728befcc 100644 >> --- a/arch/arm64/Kconfig >> +++ b/arch/arm64/Kconfig >> @@ -154,7 +154,6 @@ config ARM64 >> select HAVE_ARCH_KGDB >> select HAVE_ARCH_MMAP_RND_BITS >> select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT >> - select HAVE_ARCH_PFN_VALID >> select HAVE_ARCH_PREL32_RELOCATIONS >> select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET >> select HAVE_ARCH_SECCOMP_FILTER >> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h >> index 75ddfe671393..fcbef3eec4b2 100644 >> --- a/arch/arm64/include/asm/page.h >> +++ b/arch/arm64/include/asm/page.h >> @@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from); >> >> typedef struct page *pgtable_t; >> >> -int pfn_valid(unsigned long pfn); >> int pfn_is_map_memory(unsigned long pfn); >> >> #include <asm/memory.h> >> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c >> index 725aa84f2faa..49019ea0c8a8 100644 >> --- a/arch/arm64/mm/init.c >> +++ b/arch/arm64/mm/init.c >> @@ -219,43 +219,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max) >> free_area_init(max_zone_pfns); >> } >> >> -int pfn_valid(unsigned long pfn) >> -{ >> - phys_addr_t addr = PFN_PHYS(pfn); >> - struct mem_section *ms; >> - >> - /* >> - * Ensure the upper PAGE_SHIFT bits are clear in the >> - * pfn. Else it might lead to false positives when >> - * some of the upper bits are set, but the lower bits >> - * match a valid pfn. >> - */ >> - if (PHYS_PFN(addr) != pfn) >> - return 0; >> - >> - if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS) >> - return 0; >> - >> - ms = __pfn_to_section(pfn); >> - if (!valid_section(ms)) >> - return 0; >> - >> - /* >> - * ZONE_DEVICE memory does not have the memblock entries. >> - * memblock_is_map_memory() check for ZONE_DEVICE based >> - * addresses will always fail. Even the normal hotplugged >> - * memory will never have MEMBLOCK_NOMAP flag set in their >> - * memblock entries. Skip memblock search for all non early >> - * memory sections covering all of hotplug memory including >> - * both normal and ZONE_DEVICE based. 
>> -	 */
>> -	if (!early_section(ms))
>> -		return pfn_section_valid(ms, pfn);
>> -
>> -	return memblock_is_memory(addr);
>> -}
>> -EXPORT_SYMBOL(pfn_valid);
>> -
>>  int pfn_is_map_memory(unsigned long pfn)
>>  {
>>  	phys_addr_t addr = PFN_PHYS(pfn);
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index a9b263d4cf9d..d0c4fc506fa3 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -1443,6 +1443,15 @@ static inline int pfn_valid(unsigned long pfn)
>>  {
>>  	struct mem_section *ms;
>>
>> +	/*
>> +	 * Ensure the upper PAGE_SHIFT bits are clear in the
>> +	 * pfn. Else it might lead to false positives when
>> +	 * some of the upper bits are set, but the lower bits
>> +	 * match a valid pfn.
>> +	 */
>> +	if (PHYS_PFN(PFN_PHYS(pfn)) != pfn)
>> +		return 0;
>> +
>>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>  		return 0;
>>  	ms = __nr_to_section(pfn_to_section_nr(pfn));
>> --
>> 2.20.1
>
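The pfn upper-bits sanity check that the patch above moves into the generic pfn_valid() can be reproduced with a short standalone sketch. This is plain userspace C rather than kernel code: PAGE_SHIFT, phys_addr_t and the PFN_PHYS()/PHYS_PFN() helpers are modelled locally with illustrative values, and an LP64 target (64-bit unsigned long) is assumed.

#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT	12			/* illustrative 4K pages */
typedef uint64_t phys_addr_t;			/* local model, not the kernel type */

#define PFN_PHYS(x)	((phys_addr_t)(x) << PAGE_SHIFT)
#define PHYS_PFN(x)	((unsigned long)((x) >> PAGE_SHIFT))

/*
 * Mirrors the check in the hunk above: shifting the pfn up into a
 * physical address drops its top PAGE_SHIFT bits, so a pfn carrying
 * such bits does not survive the round trip.
 */
static int pfn_upper_bits_clear(unsigned long pfn)
{
	return PHYS_PFN(PFN_PHYS(pfn)) == pfn;
}

int main(void)
{
	unsigned long pfn = 0x80000UL;			/* plausible pfn */
	unsigned long bogus = pfn | (1UL << 60);	/* upper bits set */

	printf("%d\n", pfn_upper_bits_clear(pfn));	/* prints 1 */
	printf("%d\n", pfn_upper_bits_clear(bogus));	/* prints 0 */
	return 0;
}

The round trip silently discards the top PAGE_SHIFT bits, so the comparison rejects any pfn whose set upper bits would otherwise alias the physical address of a genuinely valid pfn, which is exactly the false-positive case the comment in the patch describes.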
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b4a9b493ce72..4cdc3570ffa9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -144,7 +144,6 @@ config ARM64
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_MMAP_RND_BITS
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
-	select HAVE_ARCH_PFN_VALID
 	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_STACKLEAK
@@ -167,6 +166,7 @@ config ARM64
 		if $(cc-option,-fpatchable-function-entry=2)
 	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
 		if DYNAMIC_FTRACE_WITH_REGS
+	select HAVE_EARLY_SECTION_MEMMAP_HOLES
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_FAST_GUP
 	select HAVE_FTRACE_MCOUNT_RECORD
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 75ddfe671393..fcbef3eec4b2 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -37,7 +37,6 @@ void copy_highpage(struct page *to, struct page *from);

 typedef struct page *pgtable_t;

-int pfn_valid(unsigned long pfn);
 int pfn_is_map_memory(unsigned long pfn);

 #include <asm/memory.h>
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f431b38d0837..5731a11550d8 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -217,47 +217,6 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
 	free_area_init(max_zone_pfns);
 }

-int pfn_valid(unsigned long pfn)
-{
-	phys_addr_t addr = PFN_PHYS(pfn);
-
-	/*
-	 * Ensure the upper PAGE_SHIFT bits are clear in the
-	 * pfn. Else it might lead to false positives when
-	 * some of the upper bits are set, but the lower bits
-	 * match a valid pfn.
-	 */
-	if (PHYS_PFN(addr) != pfn)
-		return 0;
-
-#ifdef CONFIG_SPARSEMEM
-{
-	struct mem_section *ms;
-
-	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
-		return 0;
-
-	ms = __pfn_to_section(pfn);
-	if (!valid_section(ms))
-		return 0;
-
-	/*
-	 * ZONE_DEVICE memory does not have the memblock entries.
-	 * memblock_is_memory() check for ZONE_DEVICE based
-	 * addresses will always fail. Even the normal hotplugged
-	 * memory will never have MEMBLOCK_NOMAP flag set in their
-	 * memblock entries. Skip memblock search for all non early
-	 * memory sections covering all of hotplug memory including
-	 * both normal and ZONE_DEVICE based.
-	 */
-	if (!early_section(ms))
-		return pfn_section_valid(ms, pfn);
-}
-#endif
-	return memblock_is_memory(addr);
-}
-EXPORT_SYMBOL(pfn_valid);
-
 int pfn_is_map_memory(unsigned long pfn)
 {
 	phys_addr_t addr = PFN_PHYS(pfn);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 961f0eeefb62..18bf71665211 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1421,10 +1421,22 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
  *
  * Return: 1 for PFNs that have memory map entries and 0 otherwise
  */
+bool memblock_is_memory(phys_addr_t addr);
+
 static inline int pfn_valid(unsigned long pfn)
 {
+	phys_addr_t addr = PFN_PHYS(pfn);
 	struct mem_section *ms;

+	/*
+	 * Ensure the upper PAGE_SHIFT bits are clear in the
+	 * pfn. Else it might lead to false positives when
+	 * some of the upper bits are set, but the lower bits
+	 * match a valid pfn.
+	 */
+	if (PHYS_PFN(addr) != pfn)
+		return 0;
+
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
 	ms = __nr_to_section(pfn_to_section_nr(pfn));
@@ -1434,7 +1446,11 @@ static inline int pfn_valid(unsigned long pfn)
 	 * Traditionally early sections always returned pfn_valid() for
 	 * the entire section-sized span.
 	 */
-	return early_section(ms) || pfn_section_valid(ms, pfn);
+	if (early_section(ms))
+		return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
+				memblock_is_memory(pfn << PAGE_SHIFT) : 1;
+
+	return pfn_section_valid(ms, pfn);
 }
 #endif

diff --git a/mm/Kconfig b/mm/Kconfig
index 24c045b24b95..db7128111874 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -135,6 +135,15 @@ config HAVE_FAST_GUP
 config ARCH_KEEP_MEMBLOCK
 	bool

+config HAVE_EARLY_SECTION_MEMMAP_HOLES
+	depends on ARCH_KEEP_MEMBLOCK && SPARSEMEM_VMEMMAP
+	def_bool n
+	help
+	  Early sections on certain platforms might have some memory ranges that
+	  are not backed with struct page mappings. When subscribed, this option
+	  enables special handling for those memory ranges in certain situations
+	  such as pfn_valid().
+
 # Keep arch NUMA mapping infrastructure post-init.
 config NUMA_KEEP_MEMINFO
 	bool
diff --git a/mm/memblock.c b/mm/memblock.c
index 3abf2c3fea7f..93f8a9c8428d 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1740,6 +1740,7 @@ bool __init_memblock memblock_is_memory(phys_addr_t addr)
 {
 	return memblock_search(&memblock.memory, addr) != -1;
 }
+EXPORT_SYMBOL(memblock_is_memory);

 bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
 {
@@ -1931,8 +1932,11 @@ static void __init free_unused_memmap(void)
 	unsigned long start, end, prev_end = 0;
 	int i;

-	if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
-	    IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
+	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
+		return;
+
+	if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) &&
+	    !IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID))
 		return;

 	/*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0cdbbfbc5757..8c78b6a3d888 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -309,6 +309,11 @@ struct page *pfn_to_online_page(unsigned long pfn)
 	 * Save some code text when online_section() +
 	 * pfn_section_valid() are sufficient.
 	 */
+	if (IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES)) {
+		if (early_section(ms) && !memblock_is_memory(PFN_PHYS(pfn)))
+			return NULL;
+	}
+
 	if (IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) &&
 	    !pfn_valid(pfn))
 		return NULL;
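To make the decision flow that this RFC proposes for the generic pfn_valid() easier to follow, here is a minimal standalone sketch. It is plain C under stated assumptions: the section test, the memblock search and the subsection bitmap are replaced with stand-in functions over made-up pfn ranges, and only the control flow mirrors the mmzone.h hunk above.

#include <stdbool.h>
#include <stdio.h>

/*
 * Stand-ins for kernel state -- purely illustrative. One "early
 * section" covers pfns [0x0, 0x8000); memblock says only [0x0, 0x6000)
 * is actual memory, so [0x6000, 0x8000) models a freed memmap hole.
 */
static bool early_section_stub(unsigned long pfn)      { return pfn < 0x8000; }
static bool memblock_is_memory_stub(unsigned long pfn) { return pfn < 0x6000; }
static bool pfn_section_valid_stub(unsigned long pfn)  { (void)pfn; return true; }

#define HAVE_EARLY_SECTION_MEMMAP_HOLES 1	/* models IS_ENABLED(...) */

static int pfn_valid_sketch(unsigned long pfn)
{
	/*
	 * Early sections may have had parts of their memmap freed, so
	 * consult memblock rather than trusting the section-sized span.
	 */
	if (early_section_stub(pfn))
		return HAVE_EARLY_SECTION_MEMMAP_HOLES ?
			memblock_is_memory_stub(pfn) : 1;

	/* Hotplugged memory (including ZONE_DEVICE): subsection bitmap. */
	return pfn_valid_sketch == NULL ? 0 : pfn_section_valid_stub(pfn);
}

int main(void)
{
	printf("%d\n", pfn_valid_sketch(0x1000));	/* 1: backed early pfn */
	printf("%d\n", pfn_valid_sketch(0x7000));	/* 0: freed memmap hole */
	printf("%d\n", pfn_valid_sketch(0x9000));	/* 1: non-early section */
	return 0;
}

The design point visible in the thread above is that once parts of an early section's memmap have been freed, the section can no longer be trusted wholesale; memblock, which the new Kconfig option keeps around post-boot via its ARCH_KEEP_MEMBLOCK dependency, remains the authority on which ranges are actually backed.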