Message ID | 814dee5d3aadd38c3370eaaf438ba7eee9bf9d2b.1659399696.git-series.apopple@nvidia.com (mailing list archive) |
---|---|
State | New |
Series | [v2] mm/gup.c: Simplify and fix check_and_migrate_movable_pages() return codes |
On Tue, Aug 02, 2022 at 10:30:12AM +1000, Alistair Popple wrote:
> When pinning pages with FOLL_LONGTERM check_and_migrate_movable_pages()
> is called to migrate pages out of zones which should not contain any
> longterm pinned pages.
>
> When migration succeeds all pages will have been unpinned so pinning
> needs to be retried. This is indicated by returning zero. When all pages
> are in the correct zone the number of pinned pages is returned.
>
> However migration can also fail, in which case pages are unpinned and
> -ENOMEM is returned. However if the failure was due to being unable
> to isolate a page zero is returned. This leads to indefinite looping in
> __gup_longterm_locked().
>
> Fix this by simplifying the return codes such that zero indicates all
> pages were successfully pinned in the correct zone while errors indicate
> either pages were migrated and pinning should be retried or that
> migration has failed and therefore the pinning operation should fail.
>
> This fixes the indefinite looping on page isolation failure by failing
> the pin operation instead of retrying indefinitely.
>
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>
> ---
>
> Changes for v2:
>  - Changed error handling to be more conventional using goto as
>    suggested by Jason.
>  - Removed coherent_pages check as it isn't necessary.
> ---
>  mm/gup.c | 81 ++++++++++++++++++++++++++++-----------------------------
>  1 file changed, 41 insertions(+), 40 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 364b274..5707c56 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1901,20 +1901,24 @@ struct page *get_dump_page(unsigned long addr)
>
>  #ifdef CONFIG_MIGRATION
>  /*
> - * Check whether all pages are pinnable, if so return number of pages. If some
> - * pages are not pinnable, migrate them, and unpin all pages. Return zero if
> - * pages were migrated, or if some pages were not successfully isolated.
> - * Return negative error if migration fails.
> + * Check whether all pages are pinnable. If some pages are not pinnable migrate
> + * them and unpin all the pages. Returns -EAGAIN if pages were unpinned or zero
> + * if all pages are pinnable and in the right zone. Other errors indicate
> + * migration failure.
>   */
>  static long check_and_migrate_movable_pages(unsigned long nr_pages,
>  					    struct page **pages,
>  					    unsigned int gup_flags)
>  {
> -	unsigned long isolation_error_count = 0, i;
> +	unsigned long i;
>  	struct folio *prev_folio = NULL;
>  	LIST_HEAD(movable_page_list);
> -	bool drain_allow = true, coherent_pages = false;
> -	int ret = 0;
> +	bool drain_allow = true;
> +	int ret = -EAGAIN;

It looked like every goto error set this? Why initialize it?

It looks OK to me, a lot clearer

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Thanks,
Jason
On Mon, Aug 1, 2022 at 8:32 PM Alistair Popple <apopple@nvidia.com> wrote:
>
> When pinning pages with FOLL_LONGTERM check_and_migrate_movable_pages()
> is called to migrate pages out of zones which should not contain any
> longterm pinned pages.
>
> When migration succeeds all pages will have been unpinned so pinning
> needs to be retried. This is indicated by returning zero. When all pages
> are in the correct zone the number of pinned pages is returned.
>
> However migration can also fail, in which case pages are unpinned and
> -ENOMEM is returned. However if the failure was due to being unable
> to isolate a page zero is returned. This leads to indefinite looping in
> __gup_longterm_locked().

Hi Alistair,

During development of the change prohibiting pinning in the movable
zone, there was a discussion where we figured that isolation errors
should be transient [1]. What isolation errors are you seeing that lead
to an infinite loop? Why do they happen?

Pasha

[1] https://lore.kernel.org/linux-mm/20201218104655.GW32193@dhcp22.suse.cz
On Tue, 2 Aug 2022 10:30:12 +1000 Alistair Popple <apopple@nvidia.com> wrote:

> When pinning pages with FOLL_LONGTERM check_and_migrate_movable_pages()
> is called to migrate pages out of zones which should not contain any
> longterm pinned pages.
>
> When migration succeeds all pages will have been unpinned so pinning
> needs to be retried. This is indicated by returning zero. When all pages
> are in the correct zone the number of pinned pages is returned.
>
> However migration can also fail, in which case pages are unpinned and
> -ENOMEM is returned. However if the failure was due to being unable
> to isolate a page zero is returned. This leads to indefinite looping in
> __gup_longterm_locked().
>
> Fix this by simplifying the return codes such that zero indicates all
> pages were successfully pinned in the correct zone while errors indicate
> either pages were migrated and pinning should be retried or that
> migration has failed and therefore the pinning operation should fail.
>
> This fixes the indefinite looping on page isolation failure by failing
> the pin operation instead of retrying indefinitely.
>

Are we able to identify a Fixes: for this? Presumably something in the
series "Add MEMORY_DEVICE_COHERENT for coherent device memory mapping"?
Pasha Tatashin <pasha.tatashin@soleen.com> writes:

> On Mon, Aug 1, 2022 at 8:32 PM Alistair Popple <apopple@nvidia.com> wrote:
>>
>> When pinning pages with FOLL_LONGTERM check_and_migrate_movable_pages()
>> is called to migrate pages out of zones which should not contain any
>> longterm pinned pages.
>>
>> When migration succeeds all pages will have been unpinned so pinning
>> needs to be retried. This is indicated by returning zero. When all pages
>> are in the correct zone the number of pinned pages is returned.
>>
>> However migration can also fail, in which case pages are unpinned and
>> -ENOMEM is returned. However if the failure was due to being unable
>> to isolate a page zero is returned. This leads to indefinite looping in
>> __gup_longterm_locked().
>
> Hi Alistair,
>
> During development of the change prohibiting pinning in the movable
> zone, there was a discussion where we figured that isolation errors
> should be transient [1]. What isolation errors are you seeing that lead
> to an infinite loop? Why do they happen?

Thanks for the pointer, Pasha. There were reports of qemu running into
the same zero page problem you reported there, see
https://lore.kernel.org/linux-mm/165490039431.944052.12458624139225785964.stgit@omen/

This doesn't directly fix that problem as we need to allow pinning of
the zero page, but it does prevent the infinite loop. I was going to
re-spin this patch to retry instead of failing immediately; however,
reading that thread, it seems the infinite loop is desired behaviour.
So I will re-spin this to leave that in place.

 - Alistair

> Pasha
>
> [1] https://lore.kernel.org/linux-mm/20201218104655.GW32193@dhcp22.suse.cz
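
To make the looping under discussion concrete, here is a small user-space model of the pre-patch convention, where a zero return means "pin again" both after a successful migration and after an isolation failure. Everything in it (`struct old_sim`, `old_check()`, `old_gup_longterm()` and the `max_attempts` cap) is hypothetical and only approximates the shape of the loop in `__gup_longterm_locked()`; the real loop has no cap.

```c
#include <errno.h>

/* Hypothetical model of the pre-patch convention: zero means "pin again". */
struct old_sim {
	int isolation_failures_left;	/* failures remaining before isolation works */
	int attempts;			/* how many times the loop body ran */
};

/* Returns nr_pages once everything is in place, 0 to request a retry. */
static long old_check(struct old_sim *s, long nr_pages)
{
	if (s->isolation_failures_left > 0) {
		s->isolation_failures_left--;
		return 0;	/* isolation failed: the caller retries */
	}
	return nr_pages;
}

/* max_attempts exists only so this sketch terminates. */
static long old_gup_longterm(struct old_sim *s, long nr_pages, int max_attempts)
{
	long rc;

	do {
		if (s->attempts == max_attempts)
			return -EBUSY;	/* sketch-only bail-out */
		s->attempts++;
		rc = old_check(s, nr_pages);
	} while (!rc);

	return rc;
}
```

With a transient failure (a small `isolation_failures_left`) the loop ends after a few extra attempts. With a failure that never clears, only the sketch's cap stops it, which is the behaviour the thread is debating.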
Andrew Morton <akpm@linux-foundation.org> writes:

> On Tue, 2 Aug 2022 10:30:12 +1000 Alistair Popple <apopple@nvidia.com> wrote:
>
>> When pinning pages with FOLL_LONGTERM check_and_migrate_movable_pages()
>> is called to migrate pages out of zones which should not contain any
>> longterm pinned pages.
>>
>> When migration succeeds all pages will have been unpinned so pinning
>> needs to be retried. This is indicated by returning zero. When all pages
>> are in the correct zone the number of pinned pages is returned.
>>
>> However migration can also fail, in which case pages are unpinned and
>> -ENOMEM is returned. However if the failure was due to being unable
>> to isolate a page zero is returned. This leads to indefinite looping in
>> __gup_longterm_locked().
>>
>> Fix this by simplifying the return codes such that zero indicates all
>> pages were successfully pinned in the correct zone while errors indicate
>> either pages were migrated and pinning should be retried or that
>> migration has failed and therefore the pinning operation should fail.
>>
>> This fixes the indefinite looping on page isolation failure by failing
>> the pin operation instead of retrying indefinitely.
>>
>
> Are we able to identify a Fixes: for this? Presumably something in the
> series "Add MEMORY_DEVICE_COHERENT for coherent device memory mapping"?

It seems the infinite loop was desired behaviour so I will re-spin this
as a pure clean-up.
On 04.08.22 02:12, Alistair Popple wrote:
>
> Andrew Morton <akpm@linux-foundation.org> writes:
>
>> On Tue, 2 Aug 2022 10:30:12 +1000 Alistair Popple <apopple@nvidia.com> wrote:
>>
>>> When pinning pages with FOLL_LONGTERM check_and_migrate_movable_pages()
>>> is called to migrate pages out of zones which should not contain any
>>> longterm pinned pages.
>>>
>>> When migration succeeds all pages will have been unpinned so pinning
>>> needs to be retried. This is indicated by returning zero. When all pages
>>> are in the correct zone the number of pinned pages is returned.
>>>
>>> However migration can also fail, in which case pages are unpinned and
>>> -ENOMEM is returned. However if the failure was due to being unable
>>> to isolate a page zero is returned. This leads to indefinite looping in
>>> __gup_longterm_locked().
>>>
>>> Fix this by simplifying the return codes such that zero indicates all
>>> pages were successfully pinned in the correct zone while errors indicate
>>> either pages were migrated and pinning should be retried or that
>>> migration has failed and therefore the pinning operation should fail.
>>>
>>> This fixes the indefinite looping on page isolation failure by failing
>>> the pin operation instead of retrying indefinitely.
>>>
>>
>> Are we able to identify a Fixes: for this? Presumably something in the
>> series "Add MEMORY_DEVICE_COHERENT for coherent device memory mapping"?
>
> It seems the infinite loop was desired behaviour so I will re-spin this
> as a pure clean-up.
>

How can the infinite loop trigger when we allow longterm-pinning the
shared zeropage? (note: disallowing that for now was a bug)
David Hildenbrand <david@redhat.com> writes:

> On 04.08.22 02:12, Alistair Popple wrote:
>>
>> Andrew Morton <akpm@linux-foundation.org> writes:
>>
>>> On Tue, 2 Aug 2022 10:30:12 +1000 Alistair Popple <apopple@nvidia.com> wrote:
>>>
>>>> When pinning pages with FOLL_LONGTERM check_and_migrate_movable_pages()
>>>> is called to migrate pages out of zones which should not contain any
>>>> longterm pinned pages.
>>>>
>>>> When migration succeeds all pages will have been unpinned so pinning
>>>> needs to be retried. This is indicated by returning zero. When all pages
>>>> are in the correct zone the number of pinned pages is returned.
>>>>
>>>> However migration can also fail, in which case pages are unpinned and
>>>> -ENOMEM is returned. However if the failure was due to being unable
>>>> to isolate a page zero is returned. This leads to indefinite looping in
>>>> __gup_longterm_locked().
>>>>
>>>> Fix this by simplifying the return codes such that zero indicates all
>>>> pages were successfully pinned in the correct zone while errors indicate
>>>> either pages were migrated and pinning should be retried or that
>>>> migration has failed and therefore the pinning operation should fail.
>>>>
>>>> This fixes the indefinite looping on page isolation failure by failing
>>>> the pin operation instead of retrying indefinitely.
>>>>
>>>
>>> Are we able to identify a Fixes: for this? Presumably something in the
>>> series "Add MEMORY_DEVICE_COHERENT for coherent device memory mapping"?
>>
>> It seems the infinite loop was desired behaviour so I will re-spin this
>> as a pure clean-up.
>>
>
> How can the infinite loop trigger when we allow longterm-pinning the
> shared zeropage? (note: disallowing that for now was a bug)

Right, I don't know of any other triggers, so based on the discussion
Pasha pointed me at I think the infinite loop is probably fine unless
there are other bugs.

Apologies, I should have copied you on the new version, which is just a
clean-up now -
https://lore.kernel.org/linux-mm/20220804032241.859891-1-apopple@nvidia.com/
diff --git a/mm/gup.c b/mm/gup.c
index 364b274..5707c56 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1901,20 +1901,24 @@ struct page *get_dump_page(unsigned long addr)
 
 #ifdef CONFIG_MIGRATION
 /*
- * Check whether all pages are pinnable, if so return number of pages. If some
- * pages are not pinnable, migrate them, and unpin all pages. Return zero if
- * pages were migrated, or if some pages were not successfully isolated.
- * Return negative error if migration fails.
+ * Check whether all pages are pinnable. If some pages are not pinnable migrate
+ * them and unpin all the pages. Returns -EAGAIN if pages were unpinned or zero
+ * if all pages are pinnable and in the right zone. Other errors indicate
+ * migration failure.
  */
 static long check_and_migrate_movable_pages(unsigned long nr_pages,
 					    struct page **pages,
 					    unsigned int gup_flags)
 {
-	unsigned long isolation_error_count = 0, i;
+	unsigned long i;
 	struct folio *prev_folio = NULL;
 	LIST_HEAD(movable_page_list);
-	bool drain_allow = true, coherent_pages = false;
-	int ret = 0;
+	bool drain_allow = true;
+	int ret = -EAGAIN;
+	struct migration_target_control mtc = {
+		.nid = NUMA_NO_NODE,
+		.gfp_mask = GFP_USER | __GFP_NOWARN,
+	};
 
 	for (i = 0; i < nr_pages; i++) {
 		struct folio *folio = page_folio(pages[i]);
@@ -1935,7 +1939,6 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 			 * pages.
 			 */
 			pages[i] = 0;
-			coherent_pages = true;
 
 			/*
 			 * Migration will fail if the page is pinned, so convert
@@ -1946,10 +1949,10 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 				unpin_user_page(&folio->page);
 			}
 
-			ret = migrate_device_coherent_page(&folio->page);
-			if (ret)
-				goto unpin_pages;
-
+			if (migrate_device_coherent_page(&folio->page)) {
+				ret = -EBUSY;
+				goto error;
+			}
 			continue;
 		}
 
@@ -1960,8 +1963,10 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 		 */
 		if (folio_test_hugetlb(folio)) {
 			if (isolate_hugetlb(&folio->page,
-						&movable_page_list))
-				isolation_error_count++;
+						&movable_page_list)) {
+				ret = -EBUSY;
+				goto error;
+			}
 			continue;
 		}
 
@@ -1971,28 +1976,26 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 		}
 
 		if (folio_isolate_lru(folio)) {
-			isolation_error_count++;
-			continue;
+			ret = -EBUSY;
+			goto error;
 		}
+
 		list_add_tail(&folio->lru, &movable_page_list);
 		node_stat_mod_folio(folio,
 				    NR_ISOLATED_ANON + folio_is_file_lru(folio),
 				    folio_nr_pages(folio));
 	}
 
-	if (!list_empty(&movable_page_list) || isolation_error_count
-	    || coherent_pages)
-		goto unpin_pages;
-
 	/*
-	 * If list is empty, and no isolation errors, means that all pages are
-	 * in the correct zone.
+	 * All pages are in the correct zone.
 	 */
-	return nr_pages;
+	if (list_empty(&movable_page_list))
+		return 0;
 
-unpin_pages:
 	/*
-	 * pages[i] might be NULL if any device coherent pages were found.
+	 * Unpin all pages. If device coherent pages were found
+	 * migrate_device_coherent_page() will have already dropped the pin and
+	 * set pages[i] == NULL.
 	 */
 	for (i = 0; i < nr_pages; i++) {
 		if (!pages[i])
@@ -2004,21 +2007,19 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 		put_page(pages[i]);
 	}
 
-	if (!list_empty(&movable_page_list)) {
-		struct migration_target_control mtc = {
-			.nid = NUMA_NO_NODE,
-			.gfp_mask = GFP_USER | __GFP_NOWARN,
-		};
-
-		ret = migrate_pages(&movable_page_list, alloc_migration_target,
-				    NULL, (unsigned long)&mtc, MIGRATE_SYNC,
-				    MR_LONGTERM_PIN, NULL);
-		if (ret > 0) /* number of pages not migrated */
-			ret = -ENOMEM;
+	if (migrate_pages(&movable_page_list, alloc_migration_target,
+			  NULL, (unsigned long)&mtc, MIGRATE_SYNC,
+			  MR_LONGTERM_PIN, NULL)) {
+		ret = -ENOMEM;
+		goto error;
 	}
 
-	if (ret && !list_empty(&movable_page_list))
+	return -EAGAIN;
+
+error:
+	if (!list_empty(&movable_page_list))
 		putback_movable_pages(&movable_page_list);
+
 	return ret;
 }
 #else
@@ -2026,7 +2027,7 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
 					    struct page **pages,
 					    unsigned int gup_flags)
 {
-	return nr_pages;
+	return 0;
 }
 #endif /* CONFIG_MIGRATION */
 
@@ -2054,10 +2055,10 @@ static long __gup_longterm_locked(struct mm_struct *mm,
 		if (rc <= 0)
 			break;
 		rc = check_and_migrate_movable_pages(rc, pages, gup_flags);
-	} while (!rc);
+	} while (rc == -EAGAIN);
 	memalloc_pin_restore(flags);
 
-	return rc;
+	return rc ? rc : nr_pages;
 }
 
 static bool is_valid_gup_flags(unsigned int gup_flags)
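
The review question about `int ret = -EAGAIN;` can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the kernel code: `check_and_migrate()`, `struct step_results`, `putback()` and `putback_calls` are hypothetical stand-ins. It assumes, as in the diff above, that every `goto error` is preceded by an assignment to `ret`, so the variable needs no initial value.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcome of the isolation and migration steps. */
struct step_results {
	bool isolate_ok;
	bool migrate_ok;
};

/* Counts cleanup calls so the error path is observable. */
static int putback_calls;

static void putback(void)
{
	putback_calls++;
}

/*
 * Returns 0 when nothing needed migrating, -EAGAIN after a successful
 * migration, or another negative errno on failure. 'ret' is left
 * uninitialized on purpose: every path reaching 'error' assigns it first.
 */
static int check_and_migrate(bool needs_migration, struct step_results r)
{
	int ret;

	if (!needs_migration)
		return 0;

	if (!r.isolate_ok) {
		ret = -EBUSY;
		goto error;
	}

	if (!r.migrate_ok) {
		ret = -ENOMEM;
		goto error;
	}

	return -EAGAIN;

error:
	putback();	/* single cleanup point shared by all failures */
	return ret;
}
```

Leaving `ret` uninitialized also lets the compiler warn if a later change adds a `goto error` without setting an error code, which an initializer would hide.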
When pinning pages with FOLL_LONGTERM check_and_migrate_movable_pages()
is called to migrate pages out of zones which should not contain any
longterm pinned pages.

When migration succeeds all pages will have been unpinned so pinning
needs to be retried. This is indicated by returning zero. When all pages
are in the correct zone the number of pinned pages is returned.

However migration can also fail, in which case pages are unpinned and
-ENOMEM is returned. However if the failure was due to being unable
to isolate a page zero is returned. This leads to indefinite looping in
__gup_longterm_locked().

Fix this by simplifying the return codes such that zero indicates all
pages were successfully pinned in the correct zone while errors indicate
either pages were migrated and pinning should be retried or that
migration has failed and therefore the pinning operation should fail.

This fixes the indefinite looping on page isolation failure by failing
the pin operation instead of retrying indefinitely.

Signed-off-by: Alistair Popple <apopple@nvidia.com>

---

Changes for v2:
 - Changed error handling to be more conventional using goto as
   suggested by Jason.
 - Removed coherent_pages check as it isn't necessary.
---
 mm/gup.c | 81 ++++++++++++++++++++++++++++-----------------------------
 1 file changed, 41 insertions(+), 40 deletions(-)

base-commit: 187e7c41445a0f202bb551f08ca7f8158fea1cd7
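
The return-code convention described in the commit message can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: `struct sim_state`, `sim_check_and_migrate()` and `sim_gup_longterm()` are hypothetical names, pinning is assumed to always succeed, and migration is assumed to move every misplaced page in one go. It shows the caller retrying only on -EAGAIN and turning a zero result into the page count.

```c
#include <errno.h>

/* Hypothetical model state for one pin request. */
struct sim_state {
	long misplaced;		/* pages still in a zone that must not be pinned */
	int migration_works;	/* non-zero if migrating them succeeds */
	int pin_attempts;	/* how many times pinning was (re)tried */
};

/* 0: all pinned in the right zone; -EAGAIN: migrated, pin again; else: fail. */
static long sim_check_and_migrate(struct sim_state *s)
{
	if (s->misplaced == 0)
		return 0;
	if (!s->migration_works)
		return -ENOMEM;
	s->misplaced = 0;	/* assume migration moved every misplaced page */
	return -EAGAIN;
}

/* Same loop shape as the patched caller: retry only on -EAGAIN. */
static long sim_gup_longterm(struct sim_state *s, long nr_pages)
{
	long rc;

	do {
		s->pin_attempts++;	/* stands in for the pinning step */
		rc = sim_check_and_migrate(s);
	} while (rc == -EAGAIN);

	return rc ? rc : nr_pages;
}
```

With no misplaced pages the model pins once; with migratable pages it pins twice; when migration fails it returns the error after a single attempt instead of looping.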