diff mbox series

[v3,2/2] mm: Make alloc_contig_range handle in-use hugetlb pages

Message ID 20210222135137.25717-3-osalvador@suse.de (mailing list archive)
State New, archived
Headers show
Series Make alloc_contig_range handle Hugetlb pages | expand

Commit Message

Oscar Salvador Feb. 22, 2021, 1:51 p.m. UTC
alloc_contig_range() will fail if it finds a HugeTLB page within the range,
without a chance to handle them. Since HugeTLB pages can be migrated as any
LRU or Movable page, it does not make sense to bail out without trying.
Enable the interface to recognize in-use HugeTLB pages so we can migrate
them, and have much better chances to succeed the call.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
 include/linux/hugetlb.h |  5 +++--
 mm/compaction.c         | 12 +++++++++++-
 mm/hugetlb.c            | 21 +++++++++++++++++----
 mm/vmscan.c             |  5 +++--
 4 files changed, 34 insertions(+), 9 deletions(-)

Comments

Mike Kravetz Feb. 25, 2021, 11:05 p.m. UTC | #1
On 2/22/21 5:51 AM, Oscar Salvador wrote:
> alloc_contig_range() will fail if it finds a HugeTLB page within the range,
> without a chance to handle them. Since HugeTLB pages can be migrated as any
> LRU or Movable page, it does not make sense to bail out without trying.
> Enable the interface to recognize in-use HugeTLB pages so we can migrate
> them, and have much better chances to succeed the call.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> ---
>  include/linux/hugetlb.h |  5 +++--
>  mm/compaction.c         | 12 +++++++++++-
>  mm/hugetlb.c            | 21 +++++++++++++++++----
>  mm/vmscan.c             |  5 +++--
>  4 files changed, 34 insertions(+), 9 deletions(-)

Thanks,

Changes look good.  I like the simple retry one time for pages which may
go from free to in use.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

BTW,
This series will need to be rebased on latest which has hugetlb page flag
changes.  Should be as simple as:
s/PageHugeFreed/HPageFreed/
Michal Hocko Feb. 26, 2021, 8:46 a.m. UTC | #2
On Mon 22-02-21 14:51:37, Oscar Salvador wrote:
[...]
> @@ -2394,9 +2397,19 @@ bool isolate_or_dissolve_huge_page(struct page *page)
>  	 */
>  	if (hstate_is_gigantic(h))
>  		return ret;
> -
> -	if (!page_count(head) && alloc_and_dissolve_huge_page(h, head))
> +retry:
> +	if (page_count(head) && isolate_huge_page(head, list)) {
>  		ret = true;
> +	} else if (!page_count(head)) {

This is rather head spinning. Do we need to test page_count in the else
branch? Do you want to optimize for a case where the page cannot be
isolated because of page_huge_active?

> +		int err = alloc_and_dissolve_huge_page(h, head);
> +
> +		if (!err) {
> +			ret = true;
> +		} else if (err == -EBUSY && try_again) {
> +			try_again = false;
> +			goto retry;
> +		}

Is this retry once logic really needed? Does it really give us any real
benefit? alloc_and_dissolve_huge_page already retries when the page is
being freed.
Oscar Salvador Feb. 26, 2021, 10:24 a.m. UTC | #3
On Fri, Feb 26, 2021 at 09:46:57AM +0100, Michal Hocko wrote:
> On Mon 22-02-21 14:51:37, Oscar Salvador wrote:
> [...]
> > @@ -2394,9 +2397,19 @@ bool isolate_or_dissolve_huge_page(struct page *page)
> >  	 */
> >  	if (hstate_is_gigantic(h))
> >  		return ret;
> > -
> > -	if (!page_count(head) && alloc_and_dissolve_huge_page(h, head))
> > +retry:
> > +	if (page_count(head) && isolate_huge_page(head, list)) {
> >  		ret = true;
> > +	} else if (!page_count(head)) {
> 
> This is rather head spinning. Do we need to test page_count in the else
> branch? Do you want to optimize for a case where the page cannot be
> isolated because of page_huge_active?

Well, I wanted to explictly call out both cases.
We either 1) have an in-use page and we try to issolate it or 2) we have a free
page (count == 0).

If the page could not be dissolved due to page_huge_active, this would either
mean that page is about to be freed, or that someone has already issolated the
page.
Being the former case, one could say that falling-through alloc_and_dissolve is
ok.

But no, I did not really want to optimize anything here, just wanted to be explicit
about what we are checking and why.

> > +
> > +		if (!err) {
> > +			ret = true;
> > +		} else if (err == -EBUSY && try_again) {
> > +			try_again = false;
> > +			goto retry;
> > +		}
> 
> Is this retry once logic really needed? Does it really give us any real
> benefit? alloc_and_dissolve_huge_page already retries when the page is
> being freed.

alloc_and_dissolve_huge_page retries when the page is about to be freed.
Here we retry in case alloc_and_dissolve_huge_page() noticed that someone grabbed
the page (refcount 0 -> 1), which would possibly mean userspace started using this
page. If that is the case, we give isolate_huge_page another chance to see if we
can succeed and we can migrate the page.

Chances this to happen? Honestly, as any race, quite low.
So, the answer to your question would be, no, it is not really needed, I just felt
we could play "clever" here.
Oscar Salvador Feb. 26, 2021, 10:27 a.m. UTC | #4
On Fri, Feb 26, 2021 at 11:24:29AM +0100, Oscar Salvador wrote:
> On Fri, Feb 26, 2021 at 09:46:57AM +0100, Michal Hocko wrote:
> > On Mon 22-02-21 14:51:37, Oscar Salvador wrote:
> > [...]
> > > @@ -2394,9 +2397,19 @@ bool isolate_or_dissolve_huge_page(struct page *page)
> > >  	 */
> > >  	if (hstate_is_gigantic(h))
> > >  		return ret;
> > > -
> > > -	if (!page_count(head) && alloc_and_dissolve_huge_page(h, head))
> > > +retry:
> > > +	if (page_count(head) && isolate_huge_page(head, list)) {
> > >  		ret = true;
> > > +	} else if (!page_count(head)) {
> > 
> > This is rather head spinning. Do we need to test page_count in the else
> > branch? Do you want to optimize for a case where the page cannot be
> > isolated because of page_huge_active?
> 
> Well, I wanted to explictly call out both cases.
> We either 1) have an in-use page and we try to issolate it or 2) we have a free
> page (count == 0).
> 
> If the page could not be dissolved due to page_huge_active, this would either
> mean that page is about to be freed, or that someone has already issolated the
> page.
> Being the former case, one could say that falling-through alloc_and_dissolve is
> ok.
> 
> But no, I did not really want to optimize anything here, just wanted to be explicit
> about what we are checking and why.

Maybe I could add a comment to make it more explicit.
Michal Hocko Feb. 26, 2021, 12:46 p.m. UTC | #5
On Fri 26-02-21 11:24:29, Oscar Salvador wrote:
> On Fri, Feb 26, 2021 at 09:46:57AM +0100, Michal Hocko wrote:
> > On Mon 22-02-21 14:51:37, Oscar Salvador wrote:
> > [...]
> > > @@ -2394,9 +2397,19 @@ bool isolate_or_dissolve_huge_page(struct page *page)
> > >  	 */
> > >  	if (hstate_is_gigantic(h))
> > >  		return ret;
> > > -
> > > -	if (!page_count(head) && alloc_and_dissolve_huge_page(h, head))
> > > +retry:
> > > +	if (page_count(head) && isolate_huge_page(head, list)) {
> > >  		ret = true;
> > > +	} else if (!page_count(head)) {
> > 
> > This is rather head spinning. Do we need to test page_count in the else
> > branch? Do you want to optimize for a case where the page cannot be
> > isolated because of page_huge_active?
> 
> Well, I wanted to explictly call out both cases.
> We either 1) have an in-use page and we try to issolate it or 2) we have a free
> page (count == 0).
> 
> If the page could not be dissolved due to page_huge_active, this would either
> mean that page is about to be freed, or that someone has already issolated the
> page.
> Being the former case, one could say that falling-through alloc_and_dissolve is
> ok.
> 
> But no, I did not really want to optimize anything here, just wanted to be explicit
> about what we are checking and why.

Well, I will leave it to others. I do not feel strongly about this but
to me it makes the code harder to think about because the situation is
unstable and any of those condition can change as they are evaluated. So
an explicit checks makes the code harder in the end. I would simply got
with 
	if (isolate_huge_page(head, list) || !alloc_and_dissolve_huge_page())
		ret = true;

if either of the conditional needs a retry then it should be done
internally. Like alloc_and_dissolve_huge_page already does to stabilize
the PageFreed flag. An early bail out on non-free hugetlb page would
also better be done inside alloc_and_dissolve_huge_page.
Oscar Salvador Feb. 28, 2021, 1:43 p.m. UTC | #6
On Fri, Feb 26, 2021 at 01:46:21PM +0100, Michal Hocko wrote:
> Well, I will leave it to others. I do not feel strongly about this but
> to me it makes the code harder to think about because the situation is
> unstable and any of those condition can change as they are evaluated. So
> an explicit checks makes the code harder in the end. I would simply got
> with 
> 	if (isolate_huge_page(head, list) || !alloc_and_dissolve_huge_page())
> 		ret = true;
> 
> if either of the conditional needs a retry then it should be done
> internally. Like alloc_and_dissolve_huge_page already does to stabilize
> the PageFreed flag. An early bail out on non-free hugetlb page would
> also better be done inside alloc_and_dissolve_huge_page.

The retry could be done internally in alloc_and_dissolve_huge_page in
case someoen grabbed the page from under us, but calling
isolate_huge_page from there seemed a bit odd to me, that is why I
placed the logic in the outter function.
It looks more logic to me, but of course, that is just my taste.

I do not think it makes the code that hard to follow, but I will leave
it to the others.
If there is a consensus that a simplistic version is prefered, I do not
have a problem to go with that.

Mike, what is your take on this?

Thanks
David Hildenbrand March 5, 2021, 5:30 p.m. UTC | #7
On 22.02.21 14:51, Oscar Salvador wrote:
> alloc_contig_range() will fail if it finds a HugeTLB page within the range,
> without a chance to handle them. Since HugeTLB pages can be migrated as any
> LRU or Movable page, it does not make sense to bail out without trying.
> Enable the interface to recognize in-use HugeTLB pages so we can migrate
> them, and have much better chances to succeed the call.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>

LGTM as far as I can tell, I am only wondering if it wouldn't even be 
cleaner to squash both patches.
diff mbox series

Patch

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 72352d718829..8c17d0dbc87c 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -505,7 +505,7 @@  struct huge_bootmem_page {
 	struct hstate *hstate;
 };
 
-bool isolate_or_dissolve_huge_page(struct page *page);
+bool isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 struct page *alloc_huge_page(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
 struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
@@ -776,7 +776,8 @@  void set_page_huge_active(struct page *page);
 #else	/* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 
-static inline bool isolate_or_dissolve_huge_page(struct page *page)
+static inline bool isolate_or_dissolve_huge_page(struct page *page,
+						 struct list_head *list)
 {
 	return false;
 }
diff --git a/mm/compaction.c b/mm/compaction.c
index d52506ed9db7..6d9169e71d61 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -906,9 +906,18 @@  isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		}
 
 		if (PageHuge(page) && cc->alloc_contig) {
-			if (!isolate_or_dissolve_huge_page(page))
+			if (!isolate_or_dissolve_huge_page(page, &cc->migratepages))
 				goto isolate_fail;
 
+			if (PageHuge(page)) {
+				/*
+				 * Hugepage was successfully isolated and placed
+				 * on the cc->migratepages list.
+				 */
+				low_pfn += compound_nr(page) - 1;
+				goto isolate_success_no_list;
+			}
+
 			/*
 			 * Ok, the hugepage was dissolved. Now these pages are
 			 * Buddy and cannot be re-allocated because they are
@@ -1053,6 +1062,7 @@  isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 
 isolate_success:
 		list_add(&page->lru, &cc->migratepages);
+isolate_success_no_list:
 		cc->nr_migratepages += compound_nr(page);
 		nr_isolated += compound_nr(page);
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 56eba64a1d33..95dd54cd53c0 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2336,7 +2336,9 @@  static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page)
 		goto unlock;
 	} else if (page_count(old_page)) {
 		/*
-		 * Someone has grabbed the page, fail for now.
+		 * Someone has grabbed the page, return -EBUSY so we give
+		 * isolate_or_dissolve_huge_page a chance to handle an in-use
+		 * page.
 		 */
 		ret = -EBUSY;
 		update_and_free_page(h, new_page);
@@ -2368,11 +2370,12 @@  static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page)
 	return ret;
 }
 
-bool isolate_or_dissolve_huge_page(struct page *page)
+bool isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
 {
 	struct hstate *h = NULL;
 	struct page *head;
 	bool ret = false;
+	bool try_again = true;
 
 	spin_lock(&hugetlb_lock);
 	if (PageHuge(page)) {
@@ -2394,9 +2397,19 @@  bool isolate_or_dissolve_huge_page(struct page *page)
 	 */
 	if (hstate_is_gigantic(h))
 		return ret;
-
-	if (!page_count(head) && alloc_and_dissolve_huge_page(h, head))
+retry:
+	if (page_count(head) && isolate_huge_page(head, list)) {
 		ret = true;
+	} else if (!page_count(head)) {
+		int err = alloc_and_dissolve_huge_page(h, head);
+
+		if (!err) {
+			ret = true;
+		} else if (err == -EBUSY && try_again) {
+			try_again = false;
+			goto retry;
+		}
+	}
 
 	return ret;
 }
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b1b574ad199d..0803adca4469 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1506,8 +1506,9 @@  unsigned int reclaim_clean_pages_from_list(struct zone *zone,
 	LIST_HEAD(clean_pages);
 
 	list_for_each_entry_safe(page, next, page_list, lru) {
-		if (page_is_file_lru(page) && !PageDirty(page) &&
-		    !__PageMovable(page) && !PageUnevictable(page)) {
+		if (!PageHuge(page) && page_is_file_lru(page) &&
+		    !PageDirty(page) && !__PageMovable(page) &&
+		    !PageUnevictable(page)) {
 			ClearPageActive(page);
 			list_move(&page->lru, &clean_pages);
 		}