
[RFC,1/3] mm, memory_hotplug: try to migrate full section worth of pages

Message ID: 20181120134323.13007-2-mhocko@kernel.org
State: New, archived
Series: few memory offlining enhancements

Commit Message

Michal Hocko Nov. 20, 2018, 1:43 p.m. UTC
From: Michal Hocko <mhocko@suse.com>

do_migrate_range has been limiting the number of pages to migrate to 256
for some undocumented reason. Even if the limit made some sense back when
it was introduced, it doesn't really serve a good purpose these days. If
the range contains huge pages, we break out of the loop too early and the
caller then has to drain the LRU and pcp caches and go through the quite
suboptimal scan_movable_pages again for each batch.

The only reason to limit the number of pages I can think of is to reduce
the potential time to react to a fatal signal. But even then the number
of pages is a questionable metric, because even a single page migration
might block in a non-killable state (e.g. __unmap_and_move).

Remove the limit and offline the full requested range (this is one
memory block worth of pages with the current code). Should we ever get a
report that offlining takes too long to react to a fatal signal, we
should rather fix the core migration code to use killable waits and bail
out on a signal.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory_hotplug.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

Comments

David Hildenbrand Nov. 20, 2018, 2:18 p.m. UTC | #1
On 20.11.18 14:43, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> do_migrate_range has been limiting the number of pages to migrate to 256
> for some undocumented reason. Even if the limit made some sense back when
> it was introduced, it doesn't really serve a good purpose these days. If
> the range contains huge pages, we break out of the loop too early and the
> caller then has to drain the LRU and pcp caches and go through the quite
> suboptimal scan_movable_pages again for each batch.
> 
> The only reason to limit the number of pages I can think of is to reduce
> the potential time to react to a fatal signal. But even then the number
> of pages is a questionable metric, because even a single page migration
> might block in a non-killable state (e.g. __unmap_and_move).
> 
> Remove the limit and offline the full requested range (this is one
> memory block worth of pages with the current code). Should we ever get a
> report that offlining takes too long to react to a fatal signal, we
> should rather fix the core migration code to use killable waits and bail
> out on a signal.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/memory_hotplug.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index c82193db4be6..6263c8cd4491 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1339,18 +1339,16 @@ static struct page *new_node_page(struct page *page, unsigned long private)
>  	return new_page_nodemask(page, nid, &nmask);
>  }
>  
> -#define NR_OFFLINE_AT_ONCE_PAGES	(256)
>  static int
>  do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  {
>  	unsigned long pfn;
>  	struct page *page;
> -	int move_pages = NR_OFFLINE_AT_ONCE_PAGES;
>  	int not_managed = 0;
>  	int ret = 0;
>  	LIST_HEAD(source);
>  
> -	for (pfn = start_pfn; pfn < end_pfn && move_pages > 0; pfn++) {
> +	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
>  		if (!pfn_valid(pfn))
>  			continue;
>  		page = pfn_to_page(pfn);
> @@ -1362,8 +1360,7 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  				ret = -EBUSY;
>  				break;
>  			}
> -			if (isolate_huge_page(page, &source))
> -				move_pages -= 1 << compound_order(head);
> +			isolate_huge_page(page, &source);
>  			continue;
>  		} else if (PageTransHuge(page))
>  			pfn = page_to_pfn(compound_head(page))
> @@ -1382,7 +1379,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>  		if (!ret) { /* Success */
>  			put_page(page);
>  			list_add_tail(&page->lru, &source);
> -			move_pages--;
>  			if (!__PageMovable(page))
>  				inc_node_page_state(page, NR_ISOLATED_ANON +
>  						    page_is_file_cache(page));
> 

Yes, there is basically no explanation of why it was done that way. If it
is important, there should be one.

(we could also check for pending signals inside that function if really
required)
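
For illustration, a minimal sketch of such a check in the isolation loop
of do_migrate_range (hypothetical, not part of this patch):

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		/* Hypothetical: bail out if the caller got SIGKILL. */
		if (fatal_signal_pending(current)) {
			ret = -EINTR;
			break;
		}
		if (!pfn_valid(pfn))
			continue;
		/* ... existing isolation logic ... */
	}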

Reviewed-by: David Hildenbrand <david@redhat.com>
Michal Hocko Nov. 20, 2018, 2:25 p.m. UTC | #2
On Tue 20-11-18 15:18:41, David Hildenbrand wrote:
[...]
> (we could also check for pending signals inside that function if really
> required)

do_migrate_range is not the proper layer to check signals, because the
loop there only isolates pages and that is not expensive. The most
expensive part is deeper down in the migration core. We wait for the page
lock or for writeback, and that can take a long time. None of those waits
is killable; changing that is a larger surgery, but something we should
consider should there be any need to address this.
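
Just to illustrate what such a conversion could look like for the page
lock (a simplified sketch; the real change would have to unwind the
migration properly on failure):

	/* Today: sleeps uninterruptibly until the page lock is free. */
	lock_page(page);

	/* Killable variant: returns -EINTR if a fatal signal arrives
	 * while waiting, so the caller can bail out.
	 */
	if (lock_page_killable(page))
		return -EINTR;

The writeback wait has no killable variant at the moment, which is part
of why this would be a larger surgery.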

> Reviewed-by: David Hildenbrand <david@redhat.com>

Thanks!
David Hildenbrand Nov. 20, 2018, 2:27 p.m. UTC | #3
On 20.11.18 15:25, Michal Hocko wrote:
> On Tue 20-11-18 15:18:41, David Hildenbrand wrote:
> [...]
>> (we could also check for pending signals inside that function if really
>> required)
> 
> do_migrate_range is not the proper layer to check signals, because the
> loop there only isolates pages and that is not expensive. The most
> expensive part is deeper down in the migration core. We wait for the page
> lock or for writeback, and that can take a long time. None of those waits
> is killable; changing that is a larger surgery, but something we should
> consider should there be any need to address this.

Indeed, that makes sense.

> 
>> Reviewed-by: David Hildenbrand <david@redhat.com>
> 
> Thanks!
>
Pasha Tatashin Nov. 20, 2018, 2:33 p.m. UTC | #4
On 18-11-20 14:43:21, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> do_migrate_range has been limiting the number of pages to migrate to 256
> for some undocumented reason. Even if the limit made some sense back when
> it was introduced, it doesn't really serve a good purpose these days. If
> the range contains huge pages, we break out of the loop too early and the
> caller then has to drain the LRU and pcp caches and go through the quite
> suboptimal scan_movable_pages again for each batch.
> 
> The only reason to limit the number of pages I can think of is to reduce
> the potential time to react to a fatal signal. But even then the number
> of pages is a questionable metric, because even a single page migration
> might block in a non-killable state (e.g. __unmap_and_move).
> 
> Remove the limit and offline the full requested range (this is one
> memory block worth of pages with the current code). Should we ever get a
> report that offlining takes too long to react to a fatal signal, we
> should rather fix the core migration code to use killable waits and bail
> out on a signal.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Looks good to me; I also do not see a reason for the 256-page limit.

Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>

I have added Kame, who introduced page offlining and this limit, to CC,
but as far as I can tell he was last active on LKML in 2016.

Pasha
Oscar Salvador Nov. 20, 2018, 2:51 p.m. UTC | #5
On Tue, 2018-11-20 at 14:43 +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> do_migrate_range has been limiting the number of pages to migrate to 256
> for some undocumented reason.

When looking back at old memory-hotplug commits one feels pretty sad
about the brevity of the changelogs.

> Signed-off-by: Michal Hocko <mhocko@suse.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>
Michal Hocko Nov. 20, 2018, 3 p.m. UTC | #6
On Tue 20-11-18 15:51:32, Oscar Salvador wrote:
> On Tue, 2018-11-20 at 14:43 +0100, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > do_migrate_range has been limiting the number of pages to migrate to 256
> > for some undocumented reason.
> 
> When looking back at old memory-hotplug commits one feels pretty sad
> about the brevity of the changelogs.

Well, things evolve and we've become much more careful about changelogs
over time. It still takes quite a lot of time to push back on changelogs
even these days, though. People keep forgetting that "what" is not as
important as "why", because the former is usually quite easy to
understand from reading the diff. The intention behind a change is what
usually gets forgotten after years. I guess people realize this much more
after a few git blame excavation tours.

> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> 
> Reviewed-by: Oscar Salvador <osalvador@suse.de>

Thanks!

Patch

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c82193db4be6..6263c8cd4491 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1339,18 +1339,16 @@  static struct page *new_node_page(struct page *page, unsigned long private)
 	return new_page_nodemask(page, nid, &nmask);
 }
 
-#define NR_OFFLINE_AT_ONCE_PAGES	(256)
 static int
 do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 {
 	unsigned long pfn;
 	struct page *page;
-	int move_pages = NR_OFFLINE_AT_ONCE_PAGES;
 	int not_managed = 0;
 	int ret = 0;
 	LIST_HEAD(source);
 
-	for (pfn = start_pfn; pfn < end_pfn && move_pages > 0; pfn++) {
+	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
 		if (!pfn_valid(pfn))
 			continue;
 		page = pfn_to_page(pfn);
@@ -1362,8 +1360,7 @@  do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 				ret = -EBUSY;
 				break;
 			}
-			if (isolate_huge_page(page, &source))
-				move_pages -= 1 << compound_order(head);
+			isolate_huge_page(page, &source);
 			continue;
 		} else if (PageTransHuge(page))
 			pfn = page_to_pfn(compound_head(page))
@@ -1382,7 +1379,6 @@  do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		if (!ret) { /* Success */
 			put_page(page);
 			list_add_tail(&page->lru, &source);
-			move_pages--;
 			if (!__PageMovable(page))
 				inc_node_page_state(page, NR_ISOLATED_ANON +
 						    page_is_file_cache(page));