[v2,2/3] mm: calculate deferred pages after skipping mirrored memory

Message ID 20180726193509.3326-3-pasha.tatashin@oracle.com (mailing list archive)
State New, archived
Series memmap_init_zone improvements

Commit Message

Pavel Tatashin July 26, 2018, 7:35 p.m. UTC
update_defer_init() should be called only when a struct page is about to be
initialized, because it counts the number of initialized struct pages.
However, struct pages may be skipped afterwards if some memory is mirrored,
so skipped pages are currently counted as well.

So move update_defer_init() after the check for mirrored memory.

Also, rename update_defer_init() to defer_init() and reverse the returned
boolean to emphasize that this is a boolean function that tells whether the
rest of the memmap initialization should be deferred.

Make this function self-contained: instead of passing in the number of
already initialized pages in this zone, keep it in static counters.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 mm/page_alloc.c | 45 +++++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 20 deletions(-)
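
For readers who want to experiment with the counting logic outside the kernel, the following is a minimal userspace sketch of the behaviour described above. It is not the kernel code: struct demo_node, DEMO_PAGES_PER_SECTION, demo_defer_init() and demo_init_zone() are hypothetical stand-ins, and the single skipped pfn range is only an assumed model of the mirrored-memory check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PAGES_PER_SECTION; the value is illustrative. */
#define DEMO_PAGES_PER_SECTION 8UL

/* Hypothetical stand-in for the pg_data_t fields the patch touches. */
struct demo_node {
	unsigned long node_end_pfn;       /* models pgdat_end_pfn() */
	unsigned long static_init_pgcnt;  /* pages to initialise eagerly */
	unsigned long first_deferred_pfn; /* written when deferral starts */
};

/*
 * Userspace model of defer_init(): returns true when the rest of the
 * zone's memmap initialisation should be deferred.
 */
static bool demo_defer_init(struct demo_node *node, unsigned long pfn,
			    unsigned long end_pfn)
{
	static unsigned long prev_end_pfn, nr_initialised;

	/* The alignment mask below only works for a power of two. */
	assert((DEMO_PAGES_PER_SECTION & (DEMO_PAGES_PER_SECTION - 1)) == 0);

	/* A different end_pfn is treated as a new zone: restart the count. */
	if (prev_end_pfn != end_pfn) {
		prev_end_pfn = end_pfn;
		nr_initialised = 0;
	}

	/* Never defer zones that end below the end of the node. */
	if (end_pfn < node->node_end_pfn)
		return false;

	nr_initialised++;
	if (nr_initialised > node->static_init_pgcnt &&
	    (pfn & (DEMO_PAGES_PER_SECTION - 1)) == 0) {
		node->first_deferred_pfn = pfn;
		return true;
	}
	return false;
}

/*
 * Walks [start_pfn, end_pfn), skips pfns in [skip_start, skip_end) BEFORE
 * calling demo_defer_init(), and returns the number of pages it "initialised".
 */
static unsigned long demo_init_zone(struct demo_node *node,
				    unsigned long start_pfn,
				    unsigned long end_pfn,
				    unsigned long skip_start,
				    unsigned long skip_end)
{
	unsigned long pfn, done = 0;

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		if (pfn >= skip_start && pfn < skip_end)
			continue;	/* assumed model of the mirrored-memory skip */
		if (demo_defer_init(node, pfn, end_pfn))
			break;
		done++;
	}
	return done;
}
```

Because the skip happens before demo_defer_init() is called, skipped pfns never consume part of the static_init_pgcnt budget. Note also that the static counters are reset only when end_pfn changes, so in this sketch two consecutive walks with the same end_pfn would share one count.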

Comments

Oscar Salvador July 27, 2018, 11:56 a.m. UTC | #1
On Thu, Jul 26, 2018 at 03:35:08PM -0400, Pavel Tatashin wrote:
> update_defer_init() should be called only when struct page is about to be
> initialized. Because it counts number of initialized struct pages, but
> there we may skip struct pages if there is some mirrored memory.
> 
> So move, update_defer_init() after checking for mirrored memory.
> 
> Also, rename update_defer_init() to defer_init() and reverse the return
> boolean to emphasize that this is a boolean function, that tells that the
> reset of memmap initialization should be deferred.
> 
> Make this function self-contained: do not pass number of already
> initialized pages in this zone by using static counters.
> 
> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> ---
>  mm/page_alloc.c | 45 +++++++++++++++++++++++++--------------------
>  1 file changed, 25 insertions(+), 20 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6796dacd46ac..4946c73e549b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -306,24 +306,33 @@ static inline bool __meminit early_page_uninitialised(unsigned long pfn)
>  }
>  
>  /*
> - * Returns false when the remaining initialisation should be deferred until
> + * Returns true when the remaining initialisation should be deferred until
>   * later in the boot cycle when it can be parallelised.
>   */
> -static inline bool update_defer_init(pg_data_t *pgdat,
> -				unsigned long pfn, unsigned long zone_end,
> -				unsigned long *nr_initialised)
> +static bool __meminit
> +defer_init(int nid, unsigned long pfn, unsigned long end_pfn)

Hi Pavel,

maybe I do not understand properly the __init/__meminit macros, but should not
"defer_init" be __init instead of __meminit?
I think that functions marked as __meminit are not freed up, right?

Besides that, this looks good to me:

Reviewed-by: Oscar Salvador <osalvador@suse.de>
Pavel Tatashin July 27, 2018, 2:53 p.m. UTC | #2
> > -				unsigned long *nr_initialised)
> > +static bool __meminit
> > +defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
>
> Hi Pavel,
>
> maybe I do not understand properly the __init/__meminit macros, but should not
> "defer_init" be __init instead of __meminit?
> I think that functions marked as __meminit are not freed up, right?

Not exactly. As I understand: __meminit is the same as __init when
CONFIG_MEMORY_HOTPLUG=n. But, when memory hotplug is configured,
__meminit is not freed, because code that adds memory is shared
between boot and hotplug. In this case defer_init() is called only
during boot, and could be __init, but it is called from
memmap_init_zone() which is __meminit and thus section mismatch would
happen.

We could split memmap_init_zone() into two functions: boot and hotplug
variants, or we could use __ref, but I do not think any of that is
really needed. Keeping defer_init() in __meminit is OK, it does not
take that much memory.

>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>

Thank you,
Pavel
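
The lifetime argument in this reply can be sketched as a small userspace model. This is only an illustration of the reasoning as stated above, not of how the kernel or its build-time section checker is implemented: the enum and both helper names are hypothetical, and the real checker may warn in cases this model treats as harmless.

```c
#include <stdbool.h>

/* Hypothetical labels for where a function's code is placed. */
enum demo_section { DEMO_TEXT, DEMO_INIT, DEMO_MEMINIT };

/* True if code in this section is discarded once boot has finished. */
static bool demo_freed_after_boot(enum demo_section s, bool memory_hotplug)
{
	switch (s) {
	case DEMO_INIT:
		return true;		/* __init is always freed */
	case DEMO_MEMINIT:
		return !memory_hotplug;	/* kept only when hotplug needs it */
	default:
		return false;		/* ordinary text is never freed */
	}
}

/* True if the caller can outlive the callee, i.e. the call may dangle. */
static bool demo_dangling_call(enum demo_section caller,
			       enum demo_section callee,
			       bool memory_hotplug)
{
	return demo_freed_after_boot(callee, memory_hotplug) &&
	       !demo_freed_after_boot(caller, memory_hotplug);
}
```

In this model, a __meminit caller such as memmap_init_zone() calling an __init defer_init() is the problematic combination when memory hotplug is enabled, which is why keeping defer_init() as __meminit avoids the issue.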
Oscar Salvador July 27, 2018, 3:07 p.m. UTC | #3
On Fri, Jul 27, 2018 at 10:53:24AM -0400, Pavel Tatashin wrote:
> > > -				unsigned long *nr_initialised)
> > > +static bool __meminit
> > > +defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
> >
> > Hi Pavel,
> >
> > maybe I do not understand properly the __init/__meminit macros, but should not
> > "defer_init" be __init instead of __meminit?
> > I think that functions marked as __meminit are not freed up, right?
> 
> Not exactly. As I understand: __meminit is the same as __init when
> CONFIG_MEMORY_HOTPLUG=n. But, when memory hotplug is configured,
> __meminit is not freed, because code that adds memory is shared
> between boot and hotplug. In this case defer_init() is called only
> during boot, and could be __init, but it is called from
> memmap_init_zone() which is __meminit and thus section mismatch would
> happen.

Oh yes, I did not think about memmap_init_zone(), you are right.
Then, nothing to argue about ;-).

Thanks

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6796dacd46ac..4946c73e549b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -306,24 +306,33 @@  static inline bool __meminit early_page_uninitialised(unsigned long pfn)
 }
 
 /*
- * Returns false when the remaining initialisation should be deferred until
+ * Returns true when the remaining initialisation should be deferred until
  * later in the boot cycle when it can be parallelised.
  */
-static inline bool update_defer_init(pg_data_t *pgdat,
-				unsigned long pfn, unsigned long zone_end,
-				unsigned long *nr_initialised)
+static bool __meminit
+defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
 {
+	static unsigned long prev_end_pfn, nr_initialised;
+
+	/*
+	 * prev_end_pfn static that contains the end of previous zone
+	 * No need to protect because called very early in boot before smp_init.
+	 */
+	if (prev_end_pfn != end_pfn) {
+		prev_end_pfn = end_pfn;
+		nr_initialised = 0;
+	}
+
 	/* Always populate low zones for address-constrained allocations */
-	if (zone_end < pgdat_end_pfn(pgdat))
-		return true;
-	(*nr_initialised)++;
-	if ((*nr_initialised > pgdat->static_init_pgcnt) &&
-	    (pfn & (PAGES_PER_SECTION - 1)) == 0) {
-		pgdat->first_deferred_pfn = pfn;
+	if (end_pfn < pgdat_end_pfn(NODE_DATA(nid)))
 		return false;
+	nr_initialised++;
+	if ((nr_initialised > NODE_DATA(nid)->static_init_pgcnt) &&
+	    (pfn & (PAGES_PER_SECTION - 1)) == 0) {
+		NODE_DATA(nid)->first_deferred_pfn = pfn;
+		return true;
 	}
-
-	return true;
+	return false;
 }
 #else
 static inline bool early_page_uninitialised(unsigned long pfn)
@@ -331,11 +340,9 @@  static inline bool early_page_uninitialised(unsigned long pfn)
 	return false;
 }
 
-static inline bool update_defer_init(pg_data_t *pgdat,
-				unsigned long pfn, unsigned long zone_end,
-				unsigned long *nr_initialised)
+static inline bool defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
 {
-	return true;
+	return false;
 }
 #endif
 
@@ -5459,9 +5466,7 @@  void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		struct vmem_altmap *altmap)
 {
 	unsigned long end_pfn = start_pfn + size;
-	pg_data_t *pgdat = NODE_DATA(nid);
 	unsigned long pfn;
-	unsigned long nr_initialised = 0;
 	struct page *page;
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 	struct memblock_region *r = NULL, *tmp;
@@ -5492,8 +5497,6 @@  void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 
 		if (!early_pfn_in_nid(pfn, nid))
 			continue;
-		if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
-			break;
 
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 		/*
@@ -5516,6 +5519,8 @@  void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			}
 		}
 #endif
+		if (defer_init(nid, pfn, end_pfn))
+			break;
 
 not_early:
 		page = pfn_to_page(pfn);