[v9,3/8] mm,memory_hotplug: Factor out adjusting present pages into adjust_present_page_count()

Message ID 20210416112411.9826-4-osalvador@suse.de (mailing list archive)
State New
Series [v9,1/8] drivers/base/memory: Introduce memory_block_{online,offline}

Commit Message

Oscar Salvador April 16, 2021, 11:24 a.m. UTC
From: David Hildenbrand <david@redhat.com>

Let's have a single place (inspired by adjust_managed_page_count()) where
we adjust present pages.
In contrast to adjust_managed_page_count(), only memory onlining/offlining
is allowed to modify the number of present pages.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
---
 mm/memory_hotplug.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

Comments

Michal Hocko April 20, 2021, 9:45 a.m. UTC | #1
On Fri 16-04-21 13:24:06, Oscar Salvador wrote:
> From: David Hildenbrand <david@redhat.com>
> 
> Let's have a single place (inspired by adjust_managed_page_count()) where
> we adjust present pages.
> In contrast to adjust_managed_page_count(), only memory onlining/offlining
> is allowed to modify the number of present pages.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>

Not sure self review counts ;)

Acked-by: Michal Hocko <mhocko@suse.com>

Btw. I strongly suspect the resize lock is quite pointless here.
Something for a follow up patch.

> ---
>  mm/memory_hotplug.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 25e59d5dc13c..d05056b3c173 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -829,6 +829,16 @@ struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
>  	return default_zone_for_pfn(nid, start_pfn, nr_pages);
>  }
>  
> +static void adjust_present_page_count(struct zone *zone, long nr_pages)
> +{
> +	unsigned long flags;
> +
> +	zone->present_pages += nr_pages;
> +	pgdat_resize_lock(zone->zone_pgdat, &flags);
> +	zone->zone_pgdat->node_present_pages += nr_pages;
> +	pgdat_resize_unlock(zone->zone_pgdat, &flags);
> +}
> +
>  int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
>  		       int online_type, int nid)
>  {
> @@ -882,11 +892,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
>  	}
>  
>  	online_pages_range(pfn, nr_pages);
> -	zone->present_pages += nr_pages;
> -
> -	pgdat_resize_lock(zone->zone_pgdat, &flags);
> -	zone->zone_pgdat->node_present_pages += nr_pages;
> -	pgdat_resize_unlock(zone->zone_pgdat, &flags);
> +	adjust_present_page_count(zone, nr_pages);
>  
>  	node_states_set_node(nid, &arg);
>  	if (need_zonelists_rebuild)
> @@ -1701,11 +1707,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
>  
>  	/* removal success */
>  	adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
> -	zone->present_pages -= nr_pages;
> -
> -	pgdat_resize_lock(zone->zone_pgdat, &flags);
> -	zone->zone_pgdat->node_present_pages -= nr_pages;
> -	pgdat_resize_unlock(zone->zone_pgdat, &flags);
> +	adjust_present_page_count(zone, -nr_pages);
>  
>  	init_per_zone_wmark_min();
>  
> -- 
> 2.16.3
Oscar Salvador April 21, 2021, 8 a.m. UTC | #2
On Tue, Apr 20, 2021 at 11:45:55AM +0200, Michal Hocko wrote:
> On Fri 16-04-21 13:24:06, Oscar Salvador wrote:
> > From: David Hildenbrand <david@redhat.com>
> > 
> > Let's have a single place (inspired by adjust_managed_page_count()) where
> > we adjust present pages.
> > In contrast to adjust_managed_page_count(), only memory onlining/offlining
> > is allowed to modify the number of present pages.
> > 
> > Signed-off-by: David Hildenbrand <david@redhat.com>
> > Signed-off-by: Oscar Salvador <osalvador@suse.de>
> > Reviewed-by: Oscar Salvador <osalvador@suse.de>
> 
> Not sure self review counts ;)

Uhm, the original author is David, I just added my signed-off-by as a deliverer.
I thought that in that case it was ok to stick my Reviewed-by.
Or maybe my signed-off-by carries that implicitly.

> Acked-by: Michal Hocko <mhocko@suse.com>
> 
> Btw. I strongly suspect the resize lock is quite pointless here.
> Something for a follow up patch.

What makes you think that?
I have been thinking about this; let us ignore this patch for a moment.

If I poked around the code correctly, node_size_lock is taken in:

remove_pfn_range_from_zone()
move_pfn_range_to_zone()

both of them handling {zone,node}->spanned_pages

Then we take it in {offline,online}_pages() for {zone,node}->present_pages.

The other places where we take it are __init functions, so not of interest.

Given that {offline,online}_pages() is serialized by the memory_hotplug lock,
I would say that {node,zone}->{spanned,present}_pages is, at any time, stable?
So, no need for the lock even without considering this patch?

Now, getting back to this patch.
adjust_present_page_count() will be called from memory_block_online(), which
is not holding the memory_hotplug lock yet.
But, we only fiddle with present pages outside of {online,offline}_pages() if
we have vmemmap pages, and since that operates on the same memory block,
its lock should serialize that.

I think I went down a rabbit hole; I am slightly confused now.
David Hildenbrand April 21, 2021, 8:06 a.m. UTC | #3
On 21.04.21 10:00, Oscar Salvador wrote:
> On Tue, Apr 20, 2021 at 11:45:55AM +0200, Michal Hocko wrote:
>> On Fri 16-04-21 13:24:06, Oscar Salvador wrote:
>>> From: David Hildenbrand <david@redhat.com>
>>>
>>> Let's have a single place (inspired by adjust_managed_page_count()) where
>>> we adjust present pages.
>>> In contrast to adjust_managed_page_count(), only memory onlining/offlining
>>> is allowed to modify the number of present pages.
>>>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>> Signed-off-by: Oscar Salvador <osalvador@suse.de>
>>> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>>
>> Not sure self review counts ;)
> 
> Uhm, the original author is David, I just added my signed-off-by as a deliverer.
> I thought that in that case it was ok to stick my Reviewed-by.
> Or maybe my signed-off-by carries that implicitly.
> 
>> Acked-by: Michal Hocko <mhocko@suse.com>
>>
>> Btw. I strongly suspect the resize lock is quite pointless here.
>> Something for a follow up patch.
> 
> What makes you think that?
> I have been thinking about this; let us ignore this patch for a moment.
> 
> If I poked around the code correctly, node_size_lock is taken in:
> 
> remove_pfn_range_from_zone()
> move_pfn_range_to_zone()
> 
> both of them handling {zone,node}->spanned_pages
> 
> Then we take it in {offline,online}_pages() for {zone,node}->present_pages.
> 
> The other places where we take it are __init functions, so not of interest.
> 
> Given that {offline,online}_pages() is serialized by the memory_hotplug lock,
> I would say that {node,zone}->{spanned,present}_pages is, at any time, stable?
> So, no need for the lock even without considering this patch?
> 
> Now, getting back to this patch.
> adjust_present_page_count() will be called from memory_block_online(), which
> is not holding the memory_hotplug lock yet.
> But, we only fiddle with present pages outside of {online,offline}_pages() if
> we have vmemmap pages, and since that operates on the same memory block,
> its lock should serialize that.
> 
> I think I went down a rabbit hole; I am slightly confused now.

We always hold the device hotplug lock when onlining/offlining memory.

I agree that the lock might be unnecessary (I had the same thoughts a 
while ago); we can look into that in the future.
Michal Hocko April 21, 2021, 8:31 a.m. UTC | #4
On Wed 21-04-21 10:00:36, Oscar Salvador wrote:
> On Tue, Apr 20, 2021 at 11:45:55AM +0200, Michal Hocko wrote:
> > On Fri 16-04-21 13:24:06, Oscar Salvador wrote:
> > > From: David Hildenbrand <david@redhat.com>
> > > 
> > > Let's have a single place (inspired by adjust_managed_page_count()) where
> > > we adjust present pages.
> > > In contrast to adjust_managed_page_count(), only memory onlining/offlining
> > > is allowed to modify the number of present pages.
> > > 
> > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > Signed-off-by: Oscar Salvador <osalvador@suse.de>
> > > Reviewed-by: Oscar Salvador <osalvador@suse.de>
> > 
> > Not sure self review counts ;)
> 
> Uhm, the original author is David, I just added my signed-off-by as a deliverer.
> > I thought that in that case it was ok to stick my Reviewed-by.
> Or maybe my signed-off-by carries that implicitly.

Yeah, I do expect that one should review one's own changes, but this is not
really anything to lose sleep over.

> > Acked-by: Michal Hocko <mhocko@suse.com>
> > 
> > Btw. I strongly suspect the resize lock is quite pointless here.
> > Something for a follow up patch.
> 
> What makes you think that?

         * Write access to present_pages at runtime should be protected by
         * mem_hotplug_begin/end(). Any reader who can't tolerant drift of
         * present_pages should get_online_mems() to get a stable value.

> I have been thinking about this; let us ignore this patch for a moment.
> 
> If I poked around the code correctly, node_size_lock is taken in:
> 
> remove_pfn_range_from_zone()
> move_pfn_range_to_zone()
> 
> both of them handling {zone,node}->spanned_pages
> 
> Then we take it in {offline,online}_pages() for {zone,node}->present_pages.
> 
> The other places where we take it are __init functions, so not of interest.
> 
> Given that {offline,online}_pages() is serialized by the memory_hotplug lock,
> I would say that {node,zone}->{spanned,present}_pages is, at any time, stable?
> So, no need for the lock even without considering this patch?

Yes. The resize lock is really only relevant to the parallel struct page
initialization during early boot. The hotplug usage seems to be just a
leftover from the past, or maybe it has never been really relevant in that
context.

> Now, getting back to this patch.
> adjust_present_page_count() will be called from memory_block_online(), which
> is not holding the memory_hotplug lock yet.
> But, we only fiddle with present pages outside of {online,offline}_pages() if
> we have vmemmap pages, and since that operates on the same memory block,
> its lock should serialize that.

Memory hotplug is always synchronized on the device level.
Oscar Salvador April 21, 2021, 8:35 a.m. UTC | #5
On Wed, Apr 21, 2021 at 10:31:30AM +0200, Michal Hocko wrote:
> > Given that {offline,online}_pages() is serialized by the memory_hotplug lock,
> > I would say that {node,zone}->{spanned,present}_pages is, at any time, stable?
> > So, no need for the lock even without considering this patch?
> 
> Yes. The resize lock is really only relevant to the parallel struct page
> initialization during early boot. The hotplug usage seems just a left
> over from the past or maybe it has never been really relevant in that
> context.

Ok, I will prepare a follow-up patch to remove the lock in such situations
when this work goes in.

Patch

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 25e59d5dc13c..d05056b3c173 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -829,6 +829,16 @@  struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
 	return default_zone_for_pfn(nid, start_pfn, nr_pages);
 }
 
+static void adjust_present_page_count(struct zone *zone, long nr_pages)
+{
+	unsigned long flags;
+
+	zone->present_pages += nr_pages;
+	pgdat_resize_lock(zone->zone_pgdat, &flags);
+	zone->zone_pgdat->node_present_pages += nr_pages;
+	pgdat_resize_unlock(zone->zone_pgdat, &flags);
+}
+
 int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 		       int online_type, int nid)
 {
@@ -882,11 +892,7 @@  int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 	}
 
 	online_pages_range(pfn, nr_pages);
-	zone->present_pages += nr_pages;
-
-	pgdat_resize_lock(zone->zone_pgdat, &flags);
-	zone->zone_pgdat->node_present_pages += nr_pages;
-	pgdat_resize_unlock(zone->zone_pgdat, &flags);
+	adjust_present_page_count(zone, nr_pages);
 
 	node_states_set_node(nid, &arg);
 	if (need_zonelists_rebuild)
@@ -1701,11 +1707,7 @@  int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 
 	/* removal success */
 	adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
-	zone->present_pages -= nr_pages;
-
-	pgdat_resize_lock(zone->zone_pgdat, &flags);
-	zone->zone_pgdat->node_present_pages -= nr_pages;
-	pgdat_resize_unlock(zone->zone_pgdat, &flags);
+	adjust_present_page_count(zone, -nr_pages);
 
 	init_per_zone_wmark_min();