Message ID | 20240225080008.1019653-3-vivek.kasireddy@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/gup: Introduce memfd_pin_folios() for pinning memfd folios | expand |
On 25.02.24 08:56, Vivek Kasireddy wrote: > This helper is the folio equivalent of check_and_migrate_movable_pages(). > Therefore, all the rules that apply to check_and_migrate_movable_pages() > also apply to this one as well. Currently, this helper is only used by > memfd_pin_folios(). > > This patch also includes changes to rename and convert the internal > functions collect_longterm_unpinnable_pages() and > migrate_longterm_unpinnable_pages() to work on folios. Since they > are also used by check_and_migrate_movable_pages(), a temporary > array is used to collect and share the folios with these functions. > > Cc: David Hildenbrand <david@redhat.com> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: Christoph Hellwig <hch@infradead.org> > Cc: Jason Gunthorpe <jgg@nvidia.com> > Cc: Peter Xu <peterx@redhat.com> > Suggested-by: David Hildenbrand <david@redhat.com> > Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> > --- > mm/gup.c | 129 +++++++++++++++++++++++++++++++++++++++---------------- > 1 file changed, 92 insertions(+), 37 deletions(-) > > diff --git a/mm/gup.c b/mm/gup.c > index 0a45eda6aaeb..1410af954a4e 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -2099,20 +2099,24 @@ struct page *get_dump_page(unsigned long addr) > > #ifdef CONFIG_MIGRATION > /* > - * Returns the number of collected pages. Return value is always >= 0. > + * Returns the number of collected folios. Return value is always >= 0. > */ > -static unsigned long collect_longterm_unpinnable_pages( > - struct list_head *movable_page_list, > - unsigned long nr_pages, > +static unsigned long collect_longterm_unpinnable_folios( > + struct list_head *movable_folio_list, > + unsigned long nr_folios, > + struct folio **folios, > struct page **pages) This function really shouldn't consume both folios and pages. Either use "folios" and handle the conversion from pages->folios in the caller, or handle it similar to release_pages() where we can pass either and simply always do page_folio() on the given pointer, using essentially an abstracted pointer type and always calling page_folio() on that thing. The easiest is likely to just do the page->folio conversion in the caller by looping over the arrays once more. See below. Temporary memory allocation can be avoided by using an abstracted pointer type. [...] > > + folio = folios[i]; > if (folio == prev_folio) > continue; > prev_folio = folio; > @@ -2126,7 +2130,7 @@ static unsigned long collect_longterm_unpinnable_pages( > continue; > > if (folio_test_hugetlb(folio)) { > - isolate_hugetlb(folio, movable_page_list); > + isolate_hugetlb(folio, movable_folio_list); > continue; > } > > @@ -2138,7 +2142,7 @@ static unsigned long collect_longterm_unpinnable_pages( > if (!folio_isolate_lru(folio)) > continue; > > - list_add_tail(&folio->lru, movable_page_list); > + list_add_tail(&folio->lru, movable_folio_list); > node_stat_mod_folio(folio, > NR_ISOLATED_ANON + folio_is_file_lru(folio), > folio_nr_pages(folio)); > @@ -2148,27 +2152,28 @@ static unsigned long collect_longterm_unpinnable_pages( > } > > /* > - * Unpins all pages and migrates device coherent pages and movable_page_list. > - * Returns -EAGAIN if all pages were successfully migrated or -errno for failure > - * (or partial success). > + * Unpins all folios and migrates device coherent folios and movable_folio_list. > + * Returns -EAGAIN if all folios were successfully migrated or -errno for > + * failure (or partial success). > */ > -static int migrate_longterm_unpinnable_pages( > - struct list_head *movable_page_list, > - unsigned long nr_pages, > - struct page **pages) > +static int migrate_longterm_unpinnable_folios( > + struct list_head *movable_folio_list, > + unsigned long nr_folios, > + struct folio **folios) > { > int ret; > unsigned long i; > > - for (i = 0; i < nr_pages; i++) { > - struct folio *folio = page_folio(pages[i]); > + for (i = 0; i < nr_folios; i++) { > + struct folio *folio = folios[i]; > > if (folio_is_device_coherent(folio)) { > /* > - * Migration will fail if the page is pinned, so convert > - * the pin on the source page to a normal reference. > + * Migration will fail if the folio is pinned, so > + * convert the pin on the source folio to a normal > + * reference. > */ > - pages[i] = NULL; > + folios[i] = NULL; > folio_get(folio); > gup_put_folio(folio, 1, FOLL_PIN); > > @@ -2181,23 +2186,23 @@ static int migrate_longterm_unpinnable_pages( > } > > /* > - * We can't migrate pages with unexpected references, so drop > + * We can't migrate folios with unexpected references, so drop > * the reference obtained by __get_user_pages_locked(). > - * Migrating pages have been added to movable_page_list after > + * Migrating folios have been added to movable_folio_list after > * calling folio_isolate_lru() which takes a reference so the > - * page won't be freed if it's migrating. > + * folio won't be freed if it's migrating. > */ > - unpin_user_page(pages[i]); > - pages[i] = NULL; > + unpin_folio(folios[i]); Aha, that's where you call unpin_folio() on an anon folio. Then simply drop the sanity check from inside unpin_folio() in patch #1. > + folios[i] = NULL; > } > > - if (!list_empty(movable_page_list)) { > + if (!list_empty(movable_folio_list)) { > struct migration_target_control mtc = { > .nid = NUMA_NO_NODE, > .gfp_mask = GFP_USER | __GFP_NOWARN, > }; > > - if (migrate_pages(movable_page_list, alloc_migration_target, > + if (migrate_pages(movable_folio_list, alloc_migration_target, > NULL, (unsigned long)&mtc, MIGRATE_SYNC, > MR_LONGTERM_PIN, NULL)) { > ret = -ENOMEM; > @@ -2205,15 +2210,15 @@ static int migrate_longterm_unpinnable_pages( > } > } > > - putback_movable_pages(movable_page_list); > + putback_movable_pages(movable_folio_list); This really needs a cleanup (independent of your work). We should rename it to putback_movable_folios: it only operates on folios. > > return -EAGAIN; > > err: > - for (i = 0; i < nr_pages; i++) > - if (pages[i]) > - unpin_user_page(pages[i]); > - putback_movable_pages(movable_page_list); > + for (i = 0; i < nr_folios; i++) > + if (folios[i]) > + unpin_folio(folios[i]); Can unpin_folios() be used? > + putback_movable_pages(movable_folio_list); > > return ret; > } > @@ -2237,16 +2242,60 @@ static int migrate_longterm_unpinnable_pages( > static long check_and_migrate_movable_pages(unsigned long nr_pages, > struct page **pages) > { > + unsigned long nr_folios = nr_pages; > unsigned long collected; > - LIST_HEAD(movable_page_list); > + LIST_HEAD(movable_folio_list); > + struct folio **folios; > + long ret; > > - collected = collect_longterm_unpinnable_pages(&movable_page_list, > - nr_pages, pages); > + folios = kmalloc_array(nr_folios, sizeof(*folios), GFP_KERNEL); > + if (!folios) > + return -ENOMEM; > + > + collected = collect_longterm_unpinnable_folios(&movable_folio_list, > + nr_folios, folios, > + pages); > + if (!collected) { > + kfree(folios); > + return 0; > + } > + > + ret = migrate_longterm_unpinnable_folios(&movable_folio_list, > + nr_folios, folios); > + kfree(folios); > + return ret; This function should likely be a pure rapper around check_and_migrate_movable_folios(). For example: static long check_and_migrate_movable_pages(unsigned long nr_pages, struct page **pages) { struct folio **folios; long ret; folios = kmalloc_array(nr_folios, sizeof(*folios), GFP_KERNEL); if (!folios) return -ENOMEM; /* TODO, convert all pages to folios. */ ret = check_and_migrate_movable_folios(nr_folios, folios); kfree(folios); return ret; } > +} > + > +/* > + * Check whether all folios are *allowed* to be pinned. Rather confusingly, all ... "to be pinned possibly forever ("longterm")". > + * folios in the range are required to be pinned via FOLL_PIN, before calling > + * this routine. > + * > + * If any folios in the range are not allowed to be pinned, then this routine > + * will migrate those folios away, unpin all the folios in the range and return > + * -EAGAIN. The caller should re-pin the entire range with FOLL_PIN and then > + * call this routine again. > + * [...]
diff --git a/mm/gup.c b/mm/gup.c index 0a45eda6aaeb..1410af954a4e 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -2099,20 +2099,24 @@ struct page *get_dump_page(unsigned long addr) #ifdef CONFIG_MIGRATION /* - * Returns the number of collected pages. Return value is always >= 0. + * Returns the number of collected folios. Return value is always >= 0. */ -static unsigned long collect_longterm_unpinnable_pages( - struct list_head *movable_page_list, - unsigned long nr_pages, +static unsigned long collect_longterm_unpinnable_folios( + struct list_head *movable_folio_list, + unsigned long nr_folios, + struct folio **folios, struct page **pages) { unsigned long i, collected = 0; struct folio *prev_folio = NULL; bool drain_allow = true; + struct folio *folio; - for (i = 0; i < nr_pages; i++) { - struct folio *folio = page_folio(pages[i]); + for (i = 0; i < nr_folios; i++) { + if (pages) + folios[i] = page_folio(pages[i]); + folio = folios[i]; if (folio == prev_folio) continue; prev_folio = folio; @@ -2126,7 +2130,7 @@ static unsigned long collect_longterm_unpinnable_pages( continue; if (folio_test_hugetlb(folio)) { - isolate_hugetlb(folio, movable_page_list); + isolate_hugetlb(folio, movable_folio_list); continue; } @@ -2138,7 +2142,7 @@ static unsigned long collect_longterm_unpinnable_pages( if (!folio_isolate_lru(folio)) continue; - list_add_tail(&folio->lru, movable_page_list); + list_add_tail(&folio->lru, movable_folio_list); node_stat_mod_folio(folio, NR_ISOLATED_ANON + folio_is_file_lru(folio), folio_nr_pages(folio)); @@ -2148,27 +2152,28 @@ static unsigned long collect_longterm_unpinnable_pages( } /* - * Unpins all pages and migrates device coherent pages and movable_page_list. - * Returns -EAGAIN if all pages were successfully migrated or -errno for failure - * (or partial success). + * Unpins all folios and migrates device coherent folios and movable_folio_list. + * Returns -EAGAIN if all folios were successfully migrated or -errno for + * failure (or partial success). */ -static int migrate_longterm_unpinnable_pages( - struct list_head *movable_page_list, - unsigned long nr_pages, - struct page **pages) +static int migrate_longterm_unpinnable_folios( + struct list_head *movable_folio_list, + unsigned long nr_folios, + struct folio **folios) { int ret; unsigned long i; - for (i = 0; i < nr_pages; i++) { - struct folio *folio = page_folio(pages[i]); + for (i = 0; i < nr_folios; i++) { + struct folio *folio = folios[i]; if (folio_is_device_coherent(folio)) { /* - * Migration will fail if the page is pinned, so convert - * the pin on the source page to a normal reference. + * Migration will fail if the folio is pinned, so + * convert the pin on the source folio to a normal + * reference. */ - pages[i] = NULL; + folios[i] = NULL; folio_get(folio); gup_put_folio(folio, 1, FOLL_PIN); @@ -2181,23 +2186,23 @@ static int migrate_longterm_unpinnable_pages( } /* - * We can't migrate pages with unexpected references, so drop + * We can't migrate folios with unexpected references, so drop * the reference obtained by __get_user_pages_locked(). - * Migrating pages have been added to movable_page_list after + * Migrating folios have been added to movable_folio_list after * calling folio_isolate_lru() which takes a reference so the - * page won't be freed if it's migrating. + * folio won't be freed if it's migrating. */ - unpin_user_page(pages[i]); - pages[i] = NULL; + unpin_folio(folios[i]); + folios[i] = NULL; } - if (!list_empty(movable_page_list)) { + if (!list_empty(movable_folio_list)) { struct migration_target_control mtc = { .nid = NUMA_NO_NODE, .gfp_mask = GFP_USER | __GFP_NOWARN, }; - if (migrate_pages(movable_page_list, alloc_migration_target, + if (migrate_pages(movable_folio_list, alloc_migration_target, NULL, (unsigned long)&mtc, MIGRATE_SYNC, MR_LONGTERM_PIN, NULL)) { ret = -ENOMEM; @@ -2205,15 +2210,15 @@ static int migrate_longterm_unpinnable_pages( } } - putback_movable_pages(movable_page_list); + putback_movable_pages(movable_folio_list); return -EAGAIN; err: - for (i = 0; i < nr_pages; i++) - if (pages[i]) - unpin_user_page(pages[i]); - putback_movable_pages(movable_page_list); + for (i = 0; i < nr_folios; i++) + if (folios[i]) + unpin_folio(folios[i]); + putback_movable_pages(movable_folio_list); return ret; } @@ -2237,16 +2242,60 @@ static int migrate_longterm_unpinnable_pages( static long check_and_migrate_movable_pages(unsigned long nr_pages, struct page **pages) { + unsigned long nr_folios = nr_pages; unsigned long collected; - LIST_HEAD(movable_page_list); + LIST_HEAD(movable_folio_list); + struct folio **folios; + long ret; - collected = collect_longterm_unpinnable_pages(&movable_page_list, - nr_pages, pages); + folios = kmalloc_array(nr_folios, sizeof(*folios), GFP_KERNEL); + if (!folios) + return -ENOMEM; + + collected = collect_longterm_unpinnable_folios(&movable_folio_list, + nr_folios, folios, + pages); + if (!collected) { + kfree(folios); + return 0; + } + + ret = migrate_longterm_unpinnable_folios(&movable_folio_list, + nr_folios, folios); + kfree(folios); + return ret; +} + +/* + * Check whether all folios are *allowed* to be pinned. Rather confusingly, all + * folios in the range are required to be pinned via FOLL_PIN, before calling + * this routine. + * + * If any folios in the range are not allowed to be pinned, then this routine + * will migrate those folios away, unpin all the folios in the range and return + * -EAGAIN. The caller should re-pin the entire range with FOLL_PIN and then + * call this routine again. + * + * If an error other than -EAGAIN occurs, this indicates a migration failure. + * The caller should give up, and propagate the error back up the call stack. + * + * If everything is OK and all folios in the range are allowed to be pinned, + * then this routine leaves all folios pinned and returns zero for success. + */ +static long check_and_migrate_movable_folios(unsigned long nr_folios, + struct folio **folios) +{ + unsigned long collected; + LIST_HEAD(movable_folio_list); + + collected = collect_longterm_unpinnable_folios(&movable_folio_list, + nr_folios, folios, + NULL); if (!collected) return 0; - return migrate_longterm_unpinnable_pages(&movable_page_list, nr_pages, - pages); + return migrate_longterm_unpinnable_folios(&movable_folio_list, + nr_folios, folios); } #else static long check_and_migrate_movable_pages(unsigned long nr_pages, @@ -2254,6 +2303,12 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages, { return 0; } + +static long check_and_migrate_movable_folios(unsigned long nr_folios, + struct folio **folios) +{ + return 0; +} #endif /* CONFIG_MIGRATION */ /*
This helper is the folio equivalent of check_and_migrate_movable_pages(). Therefore, all the rules that apply to check_and_migrate_movable_pages() also apply to this one as well. Currently, this helper is only used by memfd_pin_folios(). This patch also includes changes to rename and convert the internal functions collect_longterm_unpinnable_pages() and migrate_longterm_unpinnable_pages() to work on folios. Since they are also used by check_and_migrate_movable_pages(), a temporary array is used to collect and share the folios with these functions. Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Peter Xu <peterx@redhat.com> Suggested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> --- mm/gup.c | 129 +++++++++++++++++++++++++++++++++++++++---------------- 1 file changed, 92 insertions(+), 37 deletions(-)