vmemmap, memory_hotplug: fallback to base pages for vmemmap

Submitter Johannes Weiner
Date July 11, 2017, 9:45 p.m.
Message ID <20170711214541.GA11141@cmpxchg.org>
Permalink /patch/9835509/
State New

Comments

Johannes Weiner - July 11, 2017, 9:45 p.m.
On Tue, Jul 11, 2017 at 11:25:45PM +0200, Michal Hocko wrote:
> On Tue 11-07-17 13:26:23, Johannes Weiner wrote:
> > Hi Michal,
> > 
> > On Tue, Jul 11, 2017 at 04:25:58PM +0200, Michal Hocko wrote:
> > > Ohh, scratch that. The patch is bogus. I have completely missed that
> > > vmemmap_populate_hugepages already falls back to
> > > vmemmap_populate_basepages. I have to revisit the bug report I have
> > > received to see what happened apart from the allocation warning. Maybe
> > > we just want to silent that warning.
> > 
> > Yep, this should be fixed in 8e2cdbcb86b0 ("x86-64: fall back to
> > regular page vmemmap on allocation failure").
> > 
> > I figure it's good to keep some sort of warning there, though, as it
> > could have performance implications when we fall back to base pages.
> 
> Yeah, but I am not really sure the allocation warning is the right thing
> here because it is just too verbose. If you consider that we will get
> this warning for each memory section (128MB or 2GB)... I guess the
> existing
> pr_warn_once("vmemmap: falling back to regular page backing\n");
> 
> or maybe making it pr_warn should be enough. What do you think?

It could be useful to dump the memory context at least once, to 1) let
the user know we're falling back, but also 2) get the default report
we spit out any time we fail in a low-memory situation, in case there
is a problem with the MM subsystem.

Maybe something along the lines of this? (totally untested)
Michal Hocko - July 12, 2017, 9:03 a.m.
On Tue 11-07-17 17:45:41, Johannes Weiner wrote:
[...]
> diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
> index a56c3989f773..efd3f48c667c 100644
> --- a/mm/sparse-vmemmap.c
> +++ b/mm/sparse-vmemmap.c
> @@ -52,18 +52,24 @@ void * __meminit vmemmap_alloc_block(unsigned long size, int node)
>  {
>  	/* If the main allocator is up use that, fallback to bootmem. */
>  	if (slab_is_available()) {
> +		unsigned int order;
> +		static int warned;
>  		struct page *page;
> +		gfp_t gfp_mask;
>  
> +		order = get_order(size);
> +		gfp_mask = GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|__GFP_NOWARN;

why not do
		gfp_mask = GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT;
		if (warned)
			gfp_mask |= __GFP_NOWARN;

and get the actual allocation warning from the allocation context. Then
we can keep the warning in vmemmap_populate_hugepages because that would
be more descriptive of what is going on.
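
Something like the following, maybe (just a sketch, not even
compile-tested, and collapsing the N_HIGH_MEMORY branch for brevity):

	if (slab_is_available()) {
		static bool warned;	/* only the first failure reports in full */
		gfp_t gfp_mask = GFP_KERNEL | __GFP_ZERO | __GFP_REPEAT;
		struct page *page;

		/* let the page allocator dump the memory context once */
		if (warned)
			gfp_mask |= __GFP_NOWARN;

		page = alloc_pages_node(node, gfp_mask, get_order(size));
		if (page)
			return page_address(page);

		warned = true;
		return NULL;
	}

The pr_warn_once() in vmemmap_populate_hugepages() would then stay as
the short, descriptive notice about the fallback.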

Btw. __GFP_REPEAT has been replaced by __GFP_RETRY_MAYFAIL in mmotm
tree.
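
So on top of mmotm the mask line here would presumably become

	gfp_mask = GFP_KERNEL | __GFP_ZERO | __GFP_RETRY_MAYFAIL | __GFP_NOWARN;

assuming the rename is purely mechanical on this path.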

>  		if (node_state(node, N_HIGH_MEMORY))
> -			page = alloc_pages_node(
> -				node, GFP_KERNEL | __GFP_ZERO | __GFP_REPEAT,
> -				get_order(size));
> +			page = alloc_pages_node(node, gfp_mask, order);
>  		else
> -			page = alloc_pages(
> -				GFP_KERNEL | __GFP_ZERO | __GFP_REPEAT,
> -				get_order(size));
> +			page = alloc_pages(gfp_mask, order);
>  		if (page)
>  			return page_address(page);
> +		if (!warned) {
> +			warn_alloc(gfp_mask, NULL,
> +				   "vmemmap alloc failure: order:%u", order);
> +			warned = 1;
> +		}
>  		return NULL;
>  	} else
>  		return __earlyonly_bootmem_alloc(node, size, size,

Patch

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 95651dc58e09..d03c8f244e5b 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1302,7 +1302,6 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
 			vmemmap_verify((pte_t *)pmd, node, addr, next);
 			continue;
 		}
-		pr_warn_once("vmemmap: falling back to regular page backing\n");
 		if (vmemmap_populate_basepages(addr, next, node))
 			return -ENOMEM;
 	}
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index a56c3989f773..efd3f48c667c 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -52,18 +52,24 @@ void * __meminit vmemmap_alloc_block(unsigned long size, int node)
 {
 	/* If the main allocator is up use that, fallback to bootmem. */
 	if (slab_is_available()) {
+		unsigned int order;
+		static int warned;
 		struct page *page;
+		gfp_t gfp_mask;
 
+		order = get_order(size);
+		gfp_mask = GFP_KERNEL|__GFP_ZERO|__GFP_REPEAT|__GFP_NOWARN;
 		if (node_state(node, N_HIGH_MEMORY))
-			page = alloc_pages_node(
-				node, GFP_KERNEL | __GFP_ZERO | __GFP_REPEAT,
-				get_order(size));
+			page = alloc_pages_node(node, gfp_mask, order);
 		else
-			page = alloc_pages(
-				GFP_KERNEL | __GFP_ZERO | __GFP_REPEAT,
-				get_order(size));
+			page = alloc_pages(gfp_mask, order);
 		if (page)
 			return page_address(page);
+		if (!warned) {
+			warn_alloc(gfp_mask, NULL,
+				   "vmemmap alloc failure: order:%u", order);
+			warned = 1;
+		}
 		return NULL;
 	} else
 		return __earlyonly_bootmem_alloc(node, size, size,