mm: sparse: Skip no-map regions in memblocks_present

Message ID 1562921491-23899-1-git-send-email-karahmed@amazon.de
State New
Series
  • mm: sparse: Skip no-map regions in memblocks_present

Commit Message

Raslan, KarimAllah July 12, 2019, 8:51 a.m. UTC
Do not mark regions that are marked nomap as present; otherwise these
memblock regions cause unnecessary allocation of metadata.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
---
 mm/sparse.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Wei Yang July 12, 2019, 11:09 p.m. UTC | #1
On Fri, Jul 12, 2019 at 10:51:31AM +0200, KarimAllah Ahmed wrote:
>Do not mark regions that are marked with nomap to be present, otherwise
>these memblock cause unnecessarily allocation of metadata.
>
>Cc: Andrew Morton <akpm@linux-foundation.org>
>Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
>Cc: Oscar Salvador <osalvador@suse.de>
>Cc: Michal Hocko <mhocko@suse.com>
>Cc: Mike Rapoport <rppt@linux.ibm.com>
>Cc: Baoquan He <bhe@redhat.com>
>Cc: Qian Cai <cai@lca.pw>
>Cc: Wei Yang <richard.weiyang@gmail.com>
>Cc: Logan Gunthorpe <logang@deltatee.com>
>Cc: linux-mm@kvack.org
>Cc: linux-kernel@vger.kernel.org
>Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
>---
> mm/sparse.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
>diff --git a/mm/sparse.c b/mm/sparse.c
>index fd13166..33810b6 100644
>--- a/mm/sparse.c
>+++ b/mm/sparse.c
>@@ -256,6 +256,10 @@ void __init memblocks_present(void)
> 	struct memblock_region *reg;
> 
> 	for_each_memblock(memory, reg) {
>+
>+		if (memblock_is_nomap(reg))
>+			continue;
>+
> 		memory_present(memblock_get_region_node(reg),
> 			       memblock_region_memory_base_pfn(reg),
> 			       memblock_region_memory_end_pfn(reg));


The logic looks good, but I am not sure it would have any effect, since the
metadata is allocated at SECTION-size granularity while memblock regions are
not section aligned.

If I am correct, on arm64, we mark nomap memblock in map_mem()

    memblock_mark_nomap(kernel_start, kernel_end - kernel_start);

And the kernel text area is less than 40M, if I am right. This means
memblocks_present() would still mark the section present.

Would you mind showing how much memory is marked nomap?
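
To illustrate the alignment concern above, here is a minimal userspace sketch
(not kernel code). It assumes 4 KiB pages, 1 GiB sections and page-aligned
regions; struct model_region, model_memory_present() and
model_count_present() are hypothetical stand-ins for the kernel's memblock
region and memory_present() logic, and the model only covers the first
MAX_SECTIONS sections.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values for illustration only: 4 KiB pages, 1 GiB sections. */
#define PAGE_SHIFT        12
#define SECTION_SIZE_BITS 30
#define PFN_SECTION_SHIFT (SECTION_SIZE_BITS - PAGE_SHIFT)
#define MAX_SECTIONS      64

/* Hypothetical stand-in for struct memblock_region. */
struct model_region {
	uint64_t base;	/* physical start address in bytes (page aligned) */
	uint64_t size;	/* length in bytes (page aligned) */
	bool nomap;	/* models the nomap flag */
};

/* Mark every section touched by the PFN range [start_pfn, end_pfn). */
static void model_memory_present(bool *present, uint64_t start_pfn,
				 uint64_t end_pfn)
{
	uint64_t first, last, s;

	if (start_pfn >= end_pfn)
		return;

	first = start_pfn >> PFN_SECTION_SHIFT;
	last = (end_pfn - 1) >> PFN_SECTION_SHIFT;
	assert(last < MAX_SECTIONS);	/* the model covers 64 sections only */

	for (s = first; s <= last; s++)
		present[s] = true;
}

/* Count the sections marked present, optionally skipping nomap regions. */
static size_t model_count_present(const struct model_region *regs, size_t n,
				  bool skip_nomap)
{
	bool present[MAX_SECTIONS];
	size_t i, count = 0;

	memset(present, 0, sizeof(present));

	for (i = 0; i < n; i++) {
		if (skip_nomap && regs[i].nomap)
			continue;
		model_memory_present(present, regs[i].base >> PAGE_SHIFT,
				     (regs[i].base + regs[i].size) >> PAGE_SHIFT);
	}

	for (i = 0; i < MAX_SECTIONS; i++)
		count += present[i];

	return count;
}
```

In this model, a 40 MiB nomap region that shares a 1 GiB section with ordinary
memory leaves the count of present sections unchanged whether or not nomap
regions are skipped. The count only drops when nomap regions are the sole
occupants of a section.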

>-- 
>2.7.4
Raslan, KarimAllah July 13, 2019, 1:53 p.m. UTC | #2
On Fri, 2019-07-12 at 23:09 +0000, Wei Yang wrote:
> On Fri, Jul 12, 2019 at 10:51:31AM +0200, KarimAllah Ahmed wrote:
> > 
> > Do not mark regions that are marked with nomap to be present, otherwise
> > these memblock cause unnecessarily allocation of metadata.
> > 
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
> > Cc: Oscar Salvador <osalvador@suse.de>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Mike Rapoport <rppt@linux.ibm.com>
> > Cc: Baoquan He <bhe@redhat.com>
> > Cc: Qian Cai <cai@lca.pw>
> > Cc: Wei Yang <richard.weiyang@gmail.com>
> > Cc: Logan Gunthorpe <logang@deltatee.com>
> > Cc: linux-mm@kvack.org
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
> > ---
> > mm/sparse.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> > 
> > diff --git a/mm/sparse.c b/mm/sparse.c
> > index fd13166..33810b6 100644
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
> > @@ -256,6 +256,10 @@ void __init memblocks_present(void)
> > 	struct memblock_region *reg;
> > 
> > 	for_each_memblock(memory, reg) {
> > +
> > +		if (memblock_is_nomap(reg))
> > +			continue;
> > +
> > 		memory_present(memblock_get_region_node(reg),
> > 			       memblock_region_memory_base_pfn(reg),
> > 			       memblock_region_memory_end_pfn(reg));
> 
> 
> The logic looks good, while I am not sure this would take effect. Since the
> metadata is SECTION size aligned while memblock is not.
> 
> If I am correct, on arm64, we mark nomap memblock in map_mem()
> 
>     memblock_mark_nomap(kernel_start, kernel_end - kernel_start);

The nomap marking is also done by the EFI code in
${src}/drivers/firmware/efi/arm-init.c

.. and hopefully in the future by this:
https://lkml.org/lkml/2019/7/12/126

So it is not strictly associated with map_mem().

How much memory ends up marked as nomap is therefore extremely platform
dependent.

> 
> And kernel text area is less than 40M, if I am right. This means
> memblocks_present would still mark the section present. 
> 
> Would you mind showing how much memory range it is marked nomap?

We actually have some downstream patches that use this nomap flag for more
than the use-cases I described above, which would inflate the nomap regions
a bit :)
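
For the question of how much memory is marked nomap, the total could be
obtained by summing the sizes of the flagged regions. The following is a
userspace sketch only: struct mem_range and model_nomap_bytes() are
hypothetical names, and in the kernel the equivalent walk would presumably use
for_each_memblock(memory, reg) together with memblock_is_nomap(reg) and
reg->size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct memblock_region. */
struct mem_range {
	uint64_t base;	/* physical start address in bytes */
	uint64_t size;	/* length in bytes */
	bool nomap;	/* models the nomap flag */
};

/* Sum the sizes of all ranges that carry the nomap flag. */
static uint64_t model_nomap_bytes(const struct mem_range *ranges, size_t n)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ranges[i].nomap)
			total += ranges[i].size;
	}

	return total;
}
```

Comparing this total against the section size gives a rough idea of whether
skipping nomap regions can leave any whole section unpopulated on a given
platform.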

> 
> > 
> > -- 
> > 2.7.4
> 



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879
Wei Yang July 13, 2019, 4:52 p.m. UTC | #3
On Sat, Jul 13, 2019 at 01:53:25PM +0000, Raslan, KarimAllah wrote:
>On Fri, 2019-07-12 at 23:09 +0000, Wei Yang wrote:
>> On Fri, Jul 12, 2019 at 10:51:31AM +0200, KarimAllah Ahmed wrote:
>> > 
>> > Do not mark regions that are marked with nomap to be present, otherwise
>> > these memblock cause unnecessarily allocation of metadata.
>> > 
>> > Cc: Andrew Morton <akpm@linux-foundation.org>
>> > Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
>> > Cc: Oscar Salvador <osalvador@suse.de>
>> > Cc: Michal Hocko <mhocko@suse.com>
>> > Cc: Mike Rapoport <rppt@linux.ibm.com>
>> > Cc: Baoquan He <bhe@redhat.com>
>> > Cc: Qian Cai <cai@lca.pw>
>> > Cc: Wei Yang <richard.weiyang@gmail.com>
>> > Cc: Logan Gunthorpe <logang@deltatee.com>
>> > Cc: linux-mm@kvack.org
>> > Cc: linux-kernel@vger.kernel.org
>> > Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
>> > ---
>> > mm/sparse.c | 4 ++++
>> > 1 file changed, 4 insertions(+)
>> > 
>> > diff --git a/mm/sparse.c b/mm/sparse.c
>> > index fd13166..33810b6 100644
>> > --- a/mm/sparse.c
>> > +++ b/mm/sparse.c
>> > @@ -256,6 +256,10 @@ void __init memblocks_present(void)
>> > 	struct memblock_region *reg;
>> > 
>> > 	for_each_memblock(memory, reg) {
>> > +
>> > +		if (memblock_is_nomap(reg))
>> > +			continue;
>> > +
>> > 		memory_present(memblock_get_region_node(reg),
>> > 			       memblock_region_memory_base_pfn(reg),
>> > 			       memblock_region_memory_end_pfn(reg));
>> 
>> 
>> The logic looks good, while I am not sure this would take effect. Since the
>> metadata is SECTION size aligned while memblock is not.
>> 
>> If I am correct, on arm64, we mark nomap memblock in map_mem()
>> 
>>     memblock_mark_nomap(kernel_start, kernel_end - kernel_start);
>
>The nomap is also done by EFI code in ${src}/drivers/firmware/efi/arm-init.c
>
>.. and hopefully in the future by this:
>https://lkml.org/lkml/2019/7/12/126
>
>So it is not really striclty associated with the map_mem().
>
>So it is extremely dependent on the platform how much memory will end up mapped
>as nomap.
>
>> 
>> And kernel text area is less than 40M, if I am right. This means
>> memblocks_present would still mark the section present. 
>> 
>> Would you mind showing how much memory range it is marked nomap?
>
>We actually have some downstream patches that are using this nomap flag for
>more than the use-cases I described above which would enflate the nomap regions
>a bit :)
>

Thanks for your explanation.

If my understanding is correct, the ranges you mark nomap cannot be used by
the system, so they look useless to it. I am just curious: how can Linux use
this memory after it has been marked nomap?

>> 
>> > 
>> > -- 
>> > 2.7.4
>> 
>
Michal Hocko July 23, 2019, 7:06 a.m. UTC | #4
On Fri 12-07-19 10:51:31, KarimAllah Ahmed wrote:
> Do not mark regions that are marked with nomap to be present, otherwise
> these memblock cause unnecessarily allocation of metadata.

This begs for much more information. How come nomap regions are in
usable memblocks? What if the memblock allocator used that memory?
In other words, shouldn't nomap (unusable memory, iirc) be in reserved
memblocks or removed altogether?

> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Mike Rapoport <rppt@linux.ibm.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Cc: Logan Gunthorpe <logang@deltatee.com>
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
> ---
>  mm/sparse.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index fd13166..33810b6 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -256,6 +256,10 @@ void __init memblocks_present(void)
>  	struct memblock_region *reg;
>  
>  	for_each_memblock(memory, reg) {
> +
> +		if (memblock_is_nomap(reg))
> +			continue;
> +
>  		memory_present(memblock_get_region_node(reg),
>  			       memblock_region_memory_base_pfn(reg),
>  			       memblock_region_memory_end_pfn(reg));
> -- 
> 2.7.4

Patch

diff --git a/mm/sparse.c b/mm/sparse.c
index fd13166..33810b6 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -256,6 +256,10 @@  void __init memblocks_present(void)
 	struct memblock_region *reg;
 
 	for_each_memblock(memory, reg) {
+
+		if (memblock_is_nomap(reg))
+			continue;
+
 		memory_present(memblock_get_region_node(reg),
 			       memblock_region_memory_base_pfn(reg),
 			       memblock_region_memory_end_pfn(reg));