
[v3,4/5] mm/page_alloc: Move initialization of node and zones to an own function

Message ID 20180725220144.11531-5-osalvador@techadventures.net (mailing list archive)
State New, archived
Series Refactor free_area_init_core and add free_area_init_core_hotplug

Commit Message

Oscar Salvador July 25, 2018, 10:01 p.m. UTC
From: Oscar Salvador <osalvador@suse.de>

Currently, whenever a new node is created/re-used from the memory hotplug path,
we call free_area_init_node()->free_area_init_core().
But there is some code that we do not really need to run when we are coming
from such a path.

free_area_init_core() performs the following actions:

1) Initializes pgdat internals, such as spinlock, waitqueues and more.
2) Accounts nr_all_pages and nr_kernel_pages. These values are used later on
   when creating hash tables.
3) Accounts the number of managed_pages per zone, subtracting dma_reserved and memmap pages.
4) Initializes some fields of the zone structure data.
5) Calls init_currently_empty_zone() to initialize all the freelists.
6) Calls memmap_init() to initialize all pages belonging to a certain zone.

When called from the memory hotplug path, free_area_init_core() only needs to perform actions #1 and #4.

Action #2 is pointless, as the zones do not have any pages: either the node was freed,
or we are re-using it; either way, all zones belonging to this node should have 0 pages.
For the same reason, action #3 always results in managed_pages being 0.

Actions #5 and #6 are performed later on when onlining the pages:
 online_pages()->move_pfn_range_to_zone()->init_currently_empty_zone()
 online_pages()->move_pfn_range_to_zone()->memmap_init_zone()

This patch moves the node/zone initialization into their own functions, which allows us
to create a small version of free_area_init_core() (next patch), where we only perform:

1) Initialization of pgdat internals, such as spinlock, waitqueues and more.
4) Initialization of some fields of the zone structure data.

This patch only introduces these two functions.
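
As an illustration, the small hotplug variant could look roughly like the sketch
below. The function name is taken from the series title; the exact body and
annotations are assumptions until the next patch lands:

/*
 * Sketch only: a reduced init path for hotplugged nodes that performs
 * just actions #1 and #4. Every zone starts with 0 managed pages, per
 * the reasoning above.
 */
void __ref free_area_init_core_hotplug(int nid)
{
	enum zone_type z;
	pg_data_t *pgdat = NODE_DATA(nid);

	pgdat_init_internals(pgdat);				/* action #1 */
	for (z = 0; z < MAX_NR_ZONES; z++)
		zone_init_internals(&pgdat->node_zones[z], z, nid, 0);	/* action #4 */
}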

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 mm/page_alloc.c | 50 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 30 insertions(+), 20 deletions(-)

Comments

Michal Hocko July 26, 2018, 8:12 a.m. UTC | #1
On Thu 26-07-18 00:01:43, osalvador@techadventures.net wrote:
> From: Oscar Salvador <osalvador@suse.de>
> 
> Currently, whenever a new node is created/re-used from the memory hotplug path,
> we call free_area_init_node()->free_area_init_core().
> But there is some code that we do not really need to run when we are coming
> from such a path.
> 
> free_area_init_core() performs the following actions:
> 
> 1) Initializes pgdat internals, such as spinlock, waitqueues and more.
> 2) Accounts nr_all_pages and nr_kernel_pages. These values are used later on
>    when creating hash tables.
> 3) Accounts the number of managed_pages per zone, subtracting dma_reserved and memmap pages.
> 4) Initializes some fields of the zone structure data.
> 5) Calls init_currently_empty_zone() to initialize all the freelists.
> 6) Calls memmap_init() to initialize all pages belonging to a certain zone.
> 
> When called from the memory hotplug path, free_area_init_core() only needs to perform actions #1 and #4.
> 
> Action #2 is pointless, as the zones do not have any pages: either the node was freed,
> or we are re-using it; either way, all zones belonging to this node should have 0 pages.
> For the same reason, action #3 always results in managed_pages being 0.
> 
> Actions #5 and #6 are performed later on when onlining the pages:
>  online_pages()->move_pfn_range_to_zone()->init_currently_empty_zone()
>  online_pages()->move_pfn_range_to_zone()->memmap_init_zone()
> 
> This patch moves the node/zone initialization into their own functions, which allows us
> to create a small version of free_area_init_core() (next patch), where we only perform:
> 
> 1) Initialization of pgdat internals, such as spinlock, waitqueues and more.
> 4) Initialization of some fields of the zone structure data.
> 
> This patch only introduces these two functions.

OK, this looks definitely better. I will have to check that all the
required state is initialized properly. Considering the above
explanation, I would simply fold the follow-up patch into this one. It is
not so large that it would get hard to review, and you would make it
clear why the work is done.

> +/*
> + * Set up the zone data structures:
> + *   - mark all pages reserved
> + *   - mark all memory queues empty
> + *   - clear the memory bitmaps
> + *
> + * NOTE: pgdat should get zeroed by caller.
> + * NOTE: this function is only called during early init.
> + */
> +static void __paginginit free_area_init_core(struct pglist_data *pgdat)

Now that this function is called only from the early init code, we can
make it s@__paginginit@__init@ AFAICS.
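
That is, the change would look like this (a mechanical sketch of the
substitution, applied to the line quoted above):

-static void __paginginit free_area_init_core(struct pglist_data *pgdat)
+static void __init free_area_init_core(struct pglist_data *pgdat)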

> +{
> +	enum zone_type j;
> +	int nid = pgdat->node_id;
>  
> +	pgdat_init_internals(pgdat);
>  	pgdat->per_cpu_nodestats = &boot_nodestats;
>  
>  	for (j = 0; j < MAX_NR_ZONES; j++) {
> @@ -6310,13 +6326,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
>  		 * when the bootmem allocator frees pages into the buddy system.
>  		 * And all highmem pages will be managed by the buddy system.
>  		 */
> -		zone->managed_pages = freesize;
> -		zone_set_nid(zone, nid);
> -		zone->name = zone_names[j];
> -		zone->zone_pgdat = pgdat;
> -		spin_lock_init(&zone->lock);
> -		zone_seqlock_init(zone);
> -		zone_pcp_init(zone);
> +		zone_init_internals(zone, j, nid, freesize);
>  
>  		if (!size)
>  			continue;
> -- 
> 2.13.6
>
Pavel Tatashin July 26, 2018, 3:35 p.m. UTC | #2
> OK, this looks definitely better. I will have to check that all the
> required state is initialized properly. Considering the above
> explanation, I would simply fold the follow-up patch into this one. It is
> not so large that it would get hard to review, and you would make it
> clear why the work is done.

I will review this work once Oscar combines patches 4 & 5, as Michal suggested.


>
> > +/*
> > + * Set up the zone data structures:
> > + *   - mark all pages reserved
> > + *   - mark all memory queues empty
> > + *   - clear the memory bitmaps
> > + *
> > + * NOTE: pgdat should get zeroed by caller.
> > + * NOTE: this function is only called during early init.
> > + */
> > +static void __paginginit free_area_init_core(struct pglist_data *pgdat)
>
> Now that this function is called only from the early init code, we can
> make it s@__paginginit@__init@ AFAICS.

True, in patch 5. Also, zone_init_internals() should be marked as __paginginit.
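
That is, something like (sketch of the suggested annotation):

-static void zone_init_internals(struct zone *zone, enum zone_type idx, int nid,
+static void __paginginit zone_init_internals(struct zone *zone, enum zone_type idx, int nid,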

Thank you,
Pavel
Oscar Salvador July 26, 2018, 7:15 p.m. UTC | #3
On Thu, Jul 26, 2018 at 11:35:35AM -0400, Pavel Tatashin wrote:
> > OK, this looks definitely better. I will have to check that all the
> > required state is initialized properly. Considering the above
> > explanation, I would simply fold the follow-up patch into this one. It is
> > not so large that it would get hard to review, and you would make it
> > clear why the work is done.
> 
> I will review this work once Oscar combines patches 4 & 5, as Michal suggested.

I will send a new version tomorrow with some fixups, and with patches 4 and 5 joined.

Thanks

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4e84a17a5030..a455dc85da19 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6237,21 +6237,9 @@  static void pgdat_init_kcompactd(struct pglist_data *pgdat)
 static void pgdat_init_kcompactd(struct pglist_data *pgdat) {}
 #endif
 
-/*
- * Set up the zone data structures:
- *   - mark all pages reserved
- *   - mark all memory queues empty
- *   - clear the memory bitmaps
- *
- * NOTE: pgdat should get zeroed by caller.
- */
-static void __paginginit free_area_init_core(struct pglist_data *pgdat)
+static void __paginginit pgdat_init_internals(struct pglist_data *pgdat)
 {
-	enum zone_type j;
-	int nid = pgdat->node_id;
-
 	pgdat_resize_init(pgdat);
-
 	pgdat_init_numabalancing(pgdat);
 	pgdat_init_split_queue(pgdat);
 	pgdat_init_kcompactd(pgdat);
@@ -6262,7 +6250,35 @@  static void __paginginit free_area_init_core(struct pglist_data *pgdat)
 	pgdat_page_ext_init(pgdat);
 	spin_lock_init(&pgdat->lru_lock);
 	lruvec_init(node_lruvec(pgdat));
+}
+
+static void zone_init_internals(struct zone *zone, enum zone_type idx, int nid,
+							unsigned long remaining_pages)
+{
+	zone->managed_pages = remaining_pages;
+	zone_set_nid(zone, nid);
+	zone->name = zone_names[idx];
+	zone->zone_pgdat = NODE_DATA(nid);
+	spin_lock_init(&zone->lock);
+	zone_seqlock_init(zone);
+	zone_pcp_init(zone);
+}
+
+/*
+ * Set up the zone data structures:
+ *   - mark all pages reserved
+ *   - mark all memory queues empty
+ *   - clear the memory bitmaps
+ *
+ * NOTE: pgdat should get zeroed by caller.
+ * NOTE: this function is only called during early init.
+ */
+static void __paginginit free_area_init_core(struct pglist_data *pgdat)
+{
+	enum zone_type j;
+	int nid = pgdat->node_id;
 
+	pgdat_init_internals(pgdat);
 	pgdat->per_cpu_nodestats = &boot_nodestats;
 
 	for (j = 0; j < MAX_NR_ZONES; j++) {
@@ -6310,13 +6326,7 @@  static void __paginginit free_area_init_core(struct pglist_data *pgdat)
 		 * when the bootmem allocator frees pages into the buddy system.
 		 * And all highmem pages will be managed by the buddy system.
 		 */
-		zone->managed_pages = freesize;
-		zone_set_nid(zone, nid);
-		zone->name = zone_names[j];
-		zone->zone_pgdat = pgdat;
-		spin_lock_init(&zone->lock);
-		zone_seqlock_init(zone);
-		zone_pcp_init(zone);
+		zone_init_internals(zone, j, nid, freesize);
 
 		if (!size)
 			continue;