
[v4,05/10] mm/gup: migrate pinned pages out of movable zone

Message ID 20201217185243.3288048-6-pasha.tatashin@soleen.com
State New, archived
Series prohibit pinning pages in ZONE_MOVABLE

Commit Message

Pasha Tatashin Dec. 17, 2020, 6:52 p.m. UTC
We should not pin pages in ZONE_MOVABLE. Currently, the only movable
pages we avoid pinning are CMA pages. Generalize the function that
migrates CMA pages to migrate all movable pages. Use is_pinnable_page()
to check which pages need to be migrated.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
---
 include/linux/migrate.h        |  1 +
 include/linux/mmzone.h         | 11 ++++--
 include/trace/events/migrate.h |  3 +-
 mm/gup.c                       | 68 ++++++++++++++--------------------
 4 files changed, 39 insertions(+), 44 deletions(-)
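
For context, is_pinnable_page() is introduced earlier in this series. The
snippet below is only a rough sketch of the semantics this patch relies on,
not the series' actual helper: a page is treated as not long-term pinnable
when it sits in ZONE_MOVABLE or on a CMA pageblock, because a pinned page
can no longer be migrated or offlined.

/*
 * Rough sketch only -- the real helper is added earlier in the series and
 * may differ in details. The assumption is that long-term pins must stay
 * out of ZONE_MOVABLE and CMA pageblocks, since pinned pages cannot be
 * migrated away.
 */
static inline bool is_pinnable_page_sketch(struct page *page)
{
	return !(zone_idx(page_zone(page)) == ZONE_MOVABLE ||
		 is_migrate_cma_page(page));
}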

Comments

Michal Hocko Dec. 18, 2020, 9:43 a.m. UTC | #1
On Thu 17-12-20 13:52:38, Pavel Tatashin wrote:
> +	 * 1. Pinned pages: (long-term) pinning of movable pages is avoided
> +	 *    when pages are pinned and faulted, but it is still possible that
> +	 *    address space already has pages in ZONE_MOVABLE at the time when
> +	 *    pages are pinned (i.e. user has touched that memory before
> +	 *    pinning). In such case we try to migrate them to a different zone,
> +	 *    but if migration fails the pages can still end up pinned in
> +	 *    ZONE_MOVABLE. In such case, memory offlining might retry a long
> +	 *    time and will only succeed once user application unpins pages.

I still dislike this. Pinning can fail so there shouldn't be any reasons
to break the MOVABLE constraint for something that can be handled. If
anything there should be a very good reasoning behind this decision
documented.
Pasha Tatashin Dec. 18, 2020, 12:24 p.m. UTC | #2
On Fri, Dec 18, 2020 at 4:43 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Thu 17-12-20 13:52:38, Pavel Tatashin wrote:
> > +      * 1. Pinned pages: (long-term) pinning of movable pages is avoided
> > +      *    when pages are pinned and faulted, but it is still possible that
> > +      *    address space already has pages in ZONE_MOVABLE at the time when
> > +      *    pages are pinned (i.e. user has touched that memory before
> > +      *    pinning). In such case we try to migrate them to a different zone,
> > +      *    but if migration fails the pages can still end up pinned in
> > +      *    ZONE_MOVABLE. In such case, memory offlining might retry a long
> > +      *    time and will only succeed once user application unpins pages.
>
> I still dislike this. Pinning can fail so there shouldn't be any reasons
> to break the MOVABLE constraint for something that can be handled. If
> anything there should be a very good reasoning behind this decision
> documented.

This is basically the current behaviour; after patch 8, we can never pin
pages in the movable zone, so I will update this comment in that
patch.

Thank you,
Pasha

> --
> Michal Hocko
> SUSE Labs
Michal Hocko Dec. 18, 2020, 1:08 p.m. UTC | #3
On Fri 18-12-20 07:24:53, Pavel Tatashin wrote:
> On Fri, Dec 18, 2020 at 4:43 AM Michal Hocko <mhocko@suse.com> wrote:
> >
> > On Thu 17-12-20 13:52:38, Pavel Tatashin wrote:
> > > +      * 1. Pinned pages: (long-term) pinning of movable pages is avoided
> > > +      *    when pages are pinned and faulted, but it is still possible that
> > > +      *    address space already has pages in ZONE_MOVABLE at the time when
> > > +      *    pages are pinned (i.e. user has touched that memory before
> > > +      *    pinning). In such case we try to migrate them to a different zone,
> > > +      *    but if migration fails the pages can still end up pinned in
> > > +      *    ZONE_MOVABLE. In such case, memory offlining might retry a long
> > > +      *    time and will only succeed once user application unpins pages.
> >
> > I still dislike this. Pinning can fail so there shouldn't be any reasons
> > to break the MOVABLE constraint for something that can be handled. If
> > anything there should be a very good reasoning behind this decision
> > documented.
> 
> > This is basically the current behaviour; after patch 8, we can never pin
> pages in the movable zone, so I will update this comment in that
> patch.

Then it would be much easier for review to state that the existing
behavior is unchanged and not update this comment just to remove it
in a later patch. Because this patch should be a straightforward change of
the condition for which pages to migrate (+some renaming which should be
reasonably easy to follow). Maybe it would be even better to do the
renaming separately without any functional changes and make only the
change in the condition here.
Pasha Tatashin Jan. 13, 2021, 7:14 p.m. UTC | #4
On Fri, Dec 18, 2020 at 8:08 AM Michal Hocko <mhocko@suse.com> wrote:
>
> On Fri 18-12-20 07:24:53, Pavel Tatashin wrote:
> > On Fri, Dec 18, 2020 at 4:43 AM Michal Hocko <mhocko@suse.com> wrote:
> > >
> > > On Thu 17-12-20 13:52:38, Pavel Tatashin wrote:
> > > > +      * 1. Pinned pages: (long-term) pinning of movable pages is avoided
> > > > +      *    when pages are pinned and faulted, but it is still possible that
> > > > +      *    address space already has pages in ZONE_MOVABLE at the time when
> > > > +      *    pages are pinned (i.e. user has touched that memory before
> > > > +      *    pinning). In such case we try to migrate them to a different zone,
> > > > +      *    but if migration fails the pages can still end up pinned in
> > > > +      *    ZONE_MOVABLE. In such case, memory offlining might retry a long
> > > > +      *    time and will only succeed once user application unpins pages.
> > >
> > > I still dislike this. Pinning can fail so there shouldn't be any reasons
> > > to break the MOVABLE constraint for something that can be handled. If
> > > anything there should be a very good reasoning behind this decision
> > > documented.
> >
> > > This is basically the current behaviour; after patch 8, we can never pin
> > pages in the movable zone, so I will update this comment in that
> > patch.
>
> Then it would be much easier for review to state that the existing
> behavior is unchanged and not update this comment just to remove it
> in a later patch. Because this patch should be a straightforward change of
> the condition for which pages to migrate (+some renaming which should be
> reasonably easy to follow).

Makes sense; I will update this comment when the behaviour actually
changes.

Patch

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 4594838a0f7c..aae5ef0b3ba1 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -27,6 +27,7 @@  enum migrate_reason {
 	MR_MEMPOLICY_MBIND,
 	MR_NUMA_MISPLACED,
 	MR_CONTIG_RANGE,
+	MR_LONGTERM_PIN,
 	MR_TYPES
 };
 
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index b593316bff3d..25c0c13ba4b1 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -386,9 +386,14 @@  enum zone_type {
 	 * likely to succeed, and to locally limit unmovable allocations - e.g.,
 	 * to increase the number of THP/huge pages. Notable special cases are:
 	 *
-	 * 1. Pinned pages: (long-term) pinning of movable pages might
-	 *    essentially turn such pages unmovable. Memory offlining might
-	 *    retry a long time.
+	 * 1. Pinned pages: (long-term) pinning of movable pages is avoided
+	 *    when pages are pinned and faulted, but it is still possible that
+	 *    address space already has pages in ZONE_MOVABLE at the time when
+	 *    pages are pinned (i.e. user has touched that memory before
+	 *    pinning). In such case we try to migrate them to a different zone,
+	 *    but if migration fails the pages can still end up pinned in
+	 *    ZONE_MOVABLE. In such case, memory offlining might retry a long
+	 *    time and will only succeed once user application unpins pages.
 	 * 2. memblock allocations: kernelcore/movablecore setups might create
 	 *    situations where ZONE_MOVABLE contains unmovable allocations
 	 *    after boot. Memory offlining and allocations fail early.
diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h
index 4d434398d64d..363b54ce104c 100644
--- a/include/trace/events/migrate.h
+++ b/include/trace/events/migrate.h
@@ -20,7 +20,8 @@ 
 	EM( MR_SYSCALL,		"syscall_or_cpuset")		\
 	EM( MR_MEMPOLICY_MBIND,	"mempolicy_mbind")		\
 	EM( MR_NUMA_MISPLACED,	"numa_misplaced")		\
-	EMe(MR_CONTIG_RANGE,	"contig_range")
+	EM( MR_CONTIG_RANGE,	"contig_range")			\
+	EMe(MR_LONGTERM_PIN,	"longterm_pin")
 
 /*
  * First define the enums in the above macros to be exported to userspace
diff --git a/mm/gup.c b/mm/gup.c
index 04602e94856b..591d8e2dfc70 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -89,11 +89,12 @@  static __maybe_unused struct page *try_grab_compound_head(struct page *page,
 		int orig_refs = refs;
 
 		/*
-		 * Can't do FOLL_LONGTERM + FOLL_PIN with CMA in the gup fast
-		 * path, so fail and let the caller fall back to the slow path.
+		 * Can't do FOLL_LONGTERM + FOLL_PIN in the gup fast path if the
+		 * page is not pinnable, so fail and let the caller fall back to
+		 * the slow path.
 		 */
-		if (unlikely(flags & FOLL_LONGTERM) &&
-				is_migrate_cma_page(page))
+		if (unlikely((flags & FOLL_LONGTERM) &&
+			     !is_pinnable_page(page)))
 			return NULL;
 
 		/*
@@ -1549,19 +1550,18 @@  struct page *get_dump_page(unsigned long addr)
 }
 #endif /* CONFIG_ELF_CORE */
 
-#ifdef CONFIG_CMA
-static long check_and_migrate_cma_pages(struct mm_struct *mm,
-					unsigned long start,
-					unsigned long nr_pages,
-					struct page **pages,
-					struct vm_area_struct **vmas,
-					unsigned int gup_flags)
+static long check_and_migrate_movable_pages(struct mm_struct *mm,
+					    unsigned long start,
+					    unsigned long nr_pages,
+					    struct page **pages,
+					    struct vm_area_struct **vmas,
+					    unsigned int gup_flags)
 {
 	unsigned long i;
 	unsigned long step;
 	bool drain_allow = true;
 	bool migrate_allow = true;
-	LIST_HEAD(cma_page_list);
+	LIST_HEAD(movable_page_list);
 	long ret = nr_pages;
 	struct migration_target_control mtc = {
 		.nid = NUMA_NO_NODE,
@@ -1579,13 +1579,12 @@  static long check_and_migrate_cma_pages(struct mm_struct *mm,
 		 */
 		step = compound_nr(head) - (pages[i] - head);
 		/*
-		 * If we get a page from the CMA zone, since we are going to
-		 * be pinning these entries, we might as well move them out
-		 * of the CMA zone if possible.
+		 * If we get a movable page, since we are going to be pinning
+		 * these entries, try to move them out if possible.
 		 */
-		if (is_migrate_cma_page(head)) {
+		if (!is_pinnable_page(head)) {
 			if (PageHuge(head))
-				isolate_huge_page(head, &cma_page_list);
+				isolate_huge_page(head, &movable_page_list);
 			else {
 				if (!PageLRU(head) && drain_allow) {
 					lru_add_drain_all();
@@ -1593,7 +1592,7 @@  static long check_and_migrate_cma_pages(struct mm_struct *mm,
 				}
 
 				if (!isolate_lru_page(head)) {
-					list_add_tail(&head->lru, &cma_page_list);
+					list_add_tail(&head->lru, &movable_page_list);
 					mod_node_page_state(page_pgdat(head),
 							    NR_ISOLATED_ANON +
 							    page_is_file_lru(head),
@@ -1605,7 +1604,7 @@  static long check_and_migrate_cma_pages(struct mm_struct *mm,
 		i += step;
 	}
 
-	if (!list_empty(&cma_page_list)) {
+	if (!list_empty(&movable_page_list)) {
 		/*
 		 * drop the above get_user_pages reference.
 		 */
@@ -1615,25 +1614,24 @@  static long check_and_migrate_cma_pages(struct mm_struct *mm,
 			for (i = 0; i < nr_pages; i++)
 				put_page(pages[i]);
 
-		if (migrate_pages(&cma_page_list, alloc_migration_target, NULL,
-			(unsigned long)&mtc, MIGRATE_SYNC, MR_CONTIG_RANGE)) {
+		if (migrate_pages(&movable_page_list, alloc_migration_target, NULL,
+			(unsigned long)&mtc, MIGRATE_SYNC, MR_LONGTERM_PIN)) {
 			/*
 			 * some of the pages failed migration. Do get_user_pages
 			 * without migration.
 			 */
 			migrate_allow = false;
 
-			if (!list_empty(&cma_page_list))
-				putback_movable_pages(&cma_page_list);
+			if (!list_empty(&movable_page_list))
+				putback_movable_pages(&movable_page_list);
 		}
 		/*
 		 * We did migrate all the pages, Try to get the page references
-		 * again migrating any new CMA pages which we failed to isolate
-		 * earlier.
+		 * again migrating any pages which we failed to isolate earlier.
 		 */
 		ret = __get_user_pages_locked(mm, start, nr_pages,
-						   pages, vmas, NULL,
-						   gup_flags);
+					      pages, vmas, NULL,
+					      gup_flags);
 
 		if ((ret > 0) && migrate_allow) {
 			nr_pages = ret;
@@ -1644,17 +1642,6 @@  static long check_and_migrate_cma_pages(struct mm_struct *mm,
 
 	return ret;
 }
-#else
-static long check_and_migrate_cma_pages(struct mm_struct *mm,
-					unsigned long start,
-					unsigned long nr_pages,
-					struct page **pages,
-					struct vm_area_struct **vmas,
-					unsigned int gup_flags)
-{
-	return nr_pages;
-}
-#endif /* CONFIG_CMA */
 
 /*
  * __gup_longterm_locked() is a wrapper for __get_user_pages_locked which
@@ -1678,8 +1665,9 @@  static long __gup_longterm_locked(struct mm_struct *mm,
 
 	if (gup_flags & FOLL_LONGTERM) {
 		if (rc > 0)
-			rc = check_and_migrate_cma_pages(mm, start, rc, pages,
-							 vmas, gup_flags);
+			rc = check_and_migrate_movable_pages(mm, start, rc,
+							     pages, vmas,
+							     gup_flags);
 		memalloc_pin_restore(flags);
 	}
 	return rc;
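
For illustration only (this is not part of the patch): a typical FOLL_LONGTERM
user such as a driver pinning a userspace buffer. With this change, any of the
requested pages that currently sit in ZONE_MOVABLE (or on CMA) are first
migrated to a pinnable zone by check_and_migrate_movable_pages() before the
pin is returned; the helper name longterm_pin_user_buffer() below is made up
for the example.

#include <linux/mm.h>
#include <linux/sched/mm.h>

/* Hypothetical caller: pin nr_pages of user memory for long-term use. */
static int longterm_pin_user_buffer(unsigned long uaddr, unsigned long nr_pages,
				    struct page **pages)
{
	long pinned;

	mmap_read_lock(current->mm);
	pinned = pin_user_pages(uaddr, nr_pages,
				FOLL_WRITE | FOLL_LONGTERM, pages, NULL);
	mmap_read_unlock(current->mm);

	if (pinned < 0)
		return pinned;
	if (pinned != nr_pages) {
		/* Partial pin: release what was pinned and report failure. */
		unpin_user_pages(pages, pinned);
		return -EFAULT;
	}
	return 0;
}

Migrations triggered on this path are visible in the mm_migrate_pages
tracepoint with the new reason string "longterm_pin" (the MR_LONGTERM_PIN
entry added above), which makes it easy to confirm that long-term pinned
pages are being moved out of ZONE_MOVABLE.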