Message ID: 3289dc5e6c4c3174999598d8293adf8ed3e93b57.1582321645.git.riel@surriel.com (mailing list archive)
State:      New, archived
Series:     fix THP migration for CMA allocations
On 21 Feb 2020, at 16:53, Rik van Riel wrote:

> The code to implement THP migrations already exists, and the code
> for CMA to clear out a region of memory already exists.
>
> Only a few small tweaks are needed to allow CMA to move THP memory
> when attempting an allocation from alloc_contig_range.
>
> With these changes, migrating THPs from a CMA area works when
> allocating a 1GB hugepage from CMA memory.
>
> Signed-off-by: Rik van Riel <riel@surriel.com>
> ---
>  mm/compaction.c | 16 +++++++++-------
>  mm/page_alloc.c |  6 ++++--
>  2 files changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 672d3c78c6ab..f3e05c91df62 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -894,12 +894,12 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>
>  		/*
>  		 * Regardless of being on LRU, compound pages such as THP and
> -		 * hugetlbfs are not to be compacted. We can potentially save
> -		 * a lot of iterations if we skip them at once. The check is
> -		 * racy, but we can consider only valid values and the only
> -		 * danger is skipping too much.
> +		 * hugetlbfs are not to be compacted most of the time. We can
> +		 * potentially save a lot of iterations if we skip them at
> +		 * once. The check is racy, but we can consider only valid
> +		 * values and the only danger is skipping too much.
>  		 */

Maybe add “we do want to move them when allocating contiguous memory using CMA”
to help people understand the context of using cc->alloc_contig?

> -		if (PageCompound(page)) {
> +		if (PageCompound(page) && !cc->alloc_contig) {
> 			const unsigned int order = compound_order(page);
>
> 			if (likely(order < MAX_ORDER))
> @@ -969,7 +969,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> 		 * and it's on LRU. It can only be a THP so the order
> 		 * is safe to read and it's 0 for tail pages.
> 		 */
> -		if (unlikely(PageCompound(page))) {
> +		if (unlikely(PageCompound(page) && !cc->alloc_contig)) {
> 			low_pfn += compound_nr(page) - 1;
> 			goto isolate_fail;
> 		}
> @@ -981,7 +981,9 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> 		if (__isolate_lru_page(page, isolate_mode) != 0)
> 			goto isolate_fail;
>
> -		VM_BUG_ON_PAGE(PageCompound(page), page);
> +		/* The whole page is taken off the LRU; skip the tail pages. */
> +		if (PageCompound(page))
> +			low_pfn += compound_nr(page) - 1;
>
> 		/* Successfully isolated */
> 		del_page_from_lru_list(page, lruvec, page_lru(page));
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a36736812596..38c8ddfcecc8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8253,14 +8253,16 @@ struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>
> 		/*
> 		 * Hugepages are not in LRU lists, but they're movable.
> +		 * THPs are on the LRU, but need to be counted as #small pages.
> 		 * We need not scan over tail pages because we don't
> 		 * handle each tail page individually in migration.
> 		 */
> -		if (PageHuge(page)) {
> +		if (PageTransHuge(page)) {
> 			struct page *head = compound_head(page);
> 			unsigned int skip_pages;
>
> -			if (!hugepage_migration_supported(page_hstate(head)))
> +			if (PageHuge(page) &&
> +			    !hugepage_migration_supported(page_hstate(head)))
> 				return page;
>
> 			skip_pages = compound_nr(head) - (page - head);
> --
> 2.24.1

Everything else looks good to me.

Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan Zi
On Fri, 2020-02-21 at 17:31 -0500, Zi Yan wrote:
> On 21 Feb 2020, at 16:53, Rik van Riel wrote:
>
> > +++ b/mm/compaction.c
> > @@ -894,12 +894,12 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >
> >  		/*
> >  		 * Regardless of being on LRU, compound pages such as THP and
> > -		 * hugetlbfs are not to be compacted. We can potentially save
> > -		 * a lot of iterations if we skip them at once. The check is
> > -		 * racy, but we can consider only valid values and the only
> > -		 * danger is skipping too much.
> > +		 * hugetlbfs are not to be compacted most of the time. We can
> > +		 * potentially save a lot of iterations if we skip them at
> > +		 * once. The check is racy, but we can consider only valid
> > +		 * values and the only danger is skipping too much.
> >  		 */
>
> Maybe add “we do want to move them when allocating contiguous memory
> using CMA” to help people understand the context of using cc->alloc_contig?

I can certainly do that. I'll wait for feedback from other people to
see if more changes are wanted, and plan to post v2 by Tuesday or so :)
On 2/21/20 10:53 PM, Rik van Riel wrote:
> @@ -981,7 +981,9 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> 		if (__isolate_lru_page(page, isolate_mode) != 0)
> 			goto isolate_fail;
>
> -		VM_BUG_ON_PAGE(PageCompound(page), page);
> +		/* The whole page is taken off the LRU; skip the tail pages. */
> +		if (PageCompound(page))
> +			low_pfn += compound_nr(page) - 1;
>
> 		/* Successfully isolated */
> 		del_page_from_lru_list(page, lruvec, page_lru(page));

This continues by:

	inc_node_page_state(page, NR_ISOLATED_ANON + page_is_file_cache(page));

I think it now needs to use mod_node_page_state() with hpage_nr_pages(page),
otherwise the counter will underflow after the migration?

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a36736812596..38c8ddfcecc8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8253,14 +8253,16 @@ struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>
> 		/*
> 		 * Hugepages are not in LRU lists, but they're movable.
> +		 * THPs are on the LRU, but need to be counted as #small pages.
> 		 * We need not scan over tail pages because we don't
> 		 * handle each tail page individually in migration.
> 		 */
> -		if (PageHuge(page)) {
> +		if (PageTransHuge(page)) {

Hmm, PageTransHuge() has VM_BUG_ON() for tail pages, while this code is
written so that it can encounter a tail page and skip the rest of the
compound page properly. So I would be worried about this.

Also PageTransHuge() is basically just a PageHead(), so for each
non-hugetlbfs compound page this will assume it's a THP, while correctly
it should reach the __PageMovable() || PageLRU(page) tests below.

So probably this should do something like:

	if (PageHuge(page) || PageTransCompound(page)) {
		...
		if (PageHuge(page) && !hpage_migration_supported)) return page.

		if (!PageLRU(head) && !__PageMovable(head)) return page
		...

> 		struct page *head = compound_head(page);
> 		unsigned int skip_pages;
>
> -		if (!hugepage_migration_supported(page_hstate(head)))
> +		if (PageHuge(page) &&
> +		    !hugepage_migration_supported(page_hstate(head)))
> 			return page;
>
> 		skip_pages = compound_nr(head) - (page - head);
On Mon, 2020-02-24 at 16:29 +0100, Vlastimil Babka wrote:
> On 2/21/20 10:53 PM, Rik van Riel wrote:
> > @@ -981,7 +981,9 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> > 		if (__isolate_lru_page(page, isolate_mode) != 0)
> > 			goto isolate_fail;
> >
> > -		VM_BUG_ON_PAGE(PageCompound(page), page);
> > +		/* The whole page is taken off the LRU; skip the tail pages. */
> > +		if (PageCompound(page))
> > +			low_pfn += compound_nr(page) - 1;
> >
> > 		/* Successfully isolated */
> > 		del_page_from_lru_list(page, lruvec, page_lru(page));
>
> This continues by:
>
> 	inc_node_page_state(page, NR_ISOLATED_ANON + page_is_file_cache(page));
>
> I think it now needs to use mod_node_page_state() with
> hpage_nr_pages(page) otherwise the counter will underflow after the
> migration?

You are absolutely right. I have not observed the underflow, but the
functions doing the decrementing use hpage_nr_pages, and I need to do
that as well on the incrementing side. Change made.

> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index a36736812596..38c8ddfcecc8 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -8253,14 +8253,16 @@ struct page *has_unmovable_pages(struct zone *zone, struct page *page,
> >
> > 		/*
> > 		 * Hugepages are not in LRU lists, but they're movable.
> > +		 * THPs are on the LRU, but need to be counted as #small pages.
> > 		 * We need not scan over tail pages because we don't
> > 		 * handle each tail page individually in migration.
> > 		 */
> > -		if (PageHuge(page)) {
> > +		if (PageTransHuge(page)) {
>
> Hmm, PageTransHuge() has VM_BUG_ON() for tail pages, while this code is
> written so that it can encounter a tail page and skip the rest of the
> compound page properly. So I would be worried about this.

Good point, a CMA allocation could start partway into a compound page.

> Also PageTransHuge() is basically just a PageHead() so for each
> non-hugetlbfs compound page this will assume it's a THP, while
> correctly it should reach the __PageMovable() || PageLRU(page) tests below.
>
> So probably this should do something like.
>
> 	if (PageHuge(page) || PageTransCompound(page)) {
> 		...
> 		if (PageHuge(page) && !hpage_migration_supported)) return page.

So far so good.

> 		if (!PageLRU(head) && !__PageMovable(head)) return page

I don't get this one, though. What about a THP that has not made it
onto the LRU list yet for some reason?

I don't think anonymous pages are marked __PageMovable, are they? It
looks like they only have the PAGE_MAPPING_ANON flag set, not the
PAGE_MAPPING_MOVABLE one.

What am I missing?

> 	...
>
> > 		struct page *head = compound_head(page);
> > 		unsigned int skip_pages;
> >
> > -		if (!hugepage_migration_supported(page_hstate(head)))
> > +		if (PageHuge(page) &&
> > +		    !hugepage_migration_supported(page_hstate(head)))
> > 			return page;
> >
> > 		skip_pages = compound_nr(head) - (page - head);
On 2/25/20 7:44 PM, Rik van Riel wrote:
> > Also PageTransHuge() is basically just a PageHead() so for each
> > non-hugetlbfs compound page this will assume it's a THP, while
> > correctly it should reach the __PageMovable() || PageLRU(page) tests below.
> >
> > So probably this should do something like.
> >
> > 	if (PageHuge(page) || PageTransCompound(page)) {
> > 		...
> > 		if (PageHuge(page) && !hpage_migration_supported)) return page.
>
> So far so good.
>
> > 		if (!PageLRU(head) && !__PageMovable(head)) return page
>
> I don't get this one, though. What about a THP that has
> not made it onto the LRU list yet for some reason?

Uh, is it any different from base pages which have to pass the same
check? I guess the caller could do e.g. lru_add_drain_all() first.

> I don't think anonymous pages are marked __PageMovable,
> are they? It looks like they only have the PAGE_MAPPING_ANON
> flag set, not the PAGE_MAPPING_MOVABLE one.
>
> What am I missing?

My point is that we should not accept compound pages that are neither a
migratable hugetlbfs page nor a THP, as movable. And your
PageTransHuge() test and my PageTransCompound() is really just a test
for all compound pages, not "hugetlbfs or THP only". I should have
perhaps suggested PageCompound() instead of the PageTransCompound()
wrapper, to make it more obvious.

So we should test non-hugetlbfs pages first whether they are the kind
of compound pages that are migratable. THPs should pass this test by
PageLRU(), other compound movable pages by __PageMovable(head).
On Wed, 2020-02-26 at 10:48 +0100, Vlastimil Babka wrote:
> On 2/25/20 7:44 PM, Rik van Riel wrote:
> > > Also PageTransHuge() is basically just a PageHead() so for each
> > > non-hugetlbfs compound page this will assume it's a THP, while
> > > correctly it should reach the __PageMovable() || PageLRU(page) tests below.
> > >
> > > So probably this should do something like.
> > >
> > > 	if (PageHuge(page) || PageTransCompound(page)) {
> > > 		...
> > > 		if (PageHuge(page) && !hpage_migration_supported)) return page.
> >
> > So far so good.
> >
> > > 	if (!PageLRU(head) && !__PageMovable(head)) return page
> >
> > I don't get this one, though. What about a THP that has
> > not made it onto the LRU list yet for some reason?
>
> Uh, is it any different from base pages which have to pass the same
> check? I guess the caller could do e.g. lru_add_drain_all() first.

You are right, it is not different.

As for lru_add_drain_all(), I wonder at what point that should happen?

It appears that the order in which things are done does not really
provide a good moment:
1) decide to attempt allocating a range of memory
2) scan each page block for unmovable pages
3) if no unmovable pages are found, mark the page block MIGRATE_ISOLATE

I wonder if we should do things the opposite way, first marking the
page block MIGRATE_ISOLATE (to prevent new allocations), then scanning
it, and calling lru_add_drain_all if we encounter a page that looks
like it could benefit from that.

If we still see unmovable pages after that, it is cheap enough to set
the page block back to its previous state.

> > I don't think anonymous pages are marked __PageMovable,
> > are they? It looks like they only have the PAGE_MAPPING_ANON
> > flag set, not the PAGE_MAPPING_MOVABLE one.
> >
> > What am I missing?
>
> My point is that we should not accept compound pages that are neither
> a migratable hugetlbfs page nor a THP, as movable.

I have merged your suggestions into my code base. Thank you for
pointing out that 4kB pages have the exact same restrictions as THPs,
and why.

I'll run some tests and will post v2 of the series soon.
On 2/26/20 6:53 PM, Rik van Riel wrote:
> On Wed, 2020-02-26 at 10:48 +0100, Vlastimil Babka wrote:
> > On 2/25/20 7:44 PM, Rik van Riel wrote:
> >
> > Uh, is it any different from base pages which have to pass the same
> > check? I guess the caller could do e.g. lru_add_drain_all() first.
>
> You are right, it is not different.
>
> As for lru_add_drain_all(), I wonder at what point that
> should happen?

Right now it seems to be done in alloc_contig_range(), but rather late.

> It appears that the order in which things are done does
> not really provide a good moment:
> 1) decide to attempt allocating a range of memory
> 2) scan each page block for unmovable pages
> 3) if no unmovable pages are found, mark the page block MIGRATE_ISOLATE
>
> I wonder if we should do things the opposite way, first
> marking the page block MIGRATE_ISOLATE (to prevent new
> allocations), then scanning it, and calling lru_add_drain_all
> if we encounter a page that looks like it could benefit from that.
>
> If we still see unmovable pages after that, it is cheap
> enough to set the page block back to its previous state.

Yeah, seems like the whole has_unmovable_pages() thing isn't much
useful here. It might prevent some unnecessary action, like isolating
something, then finding a non-movable page and rolling back the
isolation. But maybe it's not worth the savings, and also
has_unmovable_pages() being false doesn't guarantee success in the
actual isolate+migrate attempt. And if it can cause a false negative
due to lru pages not being drained, then it's actually worse than if
it wasn't called at all.
On Fri, 2020-02-28 at 16:17 +0100, Vlastimil Babka wrote:
> On 2/26/20 6:53 PM, Rik van Riel wrote:
> >
> > It appears that the order in which things are done does
> > not really provide a good moment:
> > 1) decide to attempt allocating a range of memory
> > 2) scan each page block for unmovable pages
> > 3) if no unmovable pages are found, mark the page block MIGRATE_ISOLATE
> >
> > I wonder if we should do things the opposite way, first
> > marking the page block MIGRATE_ISOLATE (to prevent new
> > allocations), then scanning it, and calling lru_add_drain_all
> > if we encounter a page that looks like it could benefit from that.
> >
> > If we still see unmovable pages after that, it is cheap
> > enough to set the page block back to its previous state.
>
> Yeah seems like the whole has_unmovable_pages() thing isn't much useful
> here. It might prevent some unnecessary action like isolating something,
> then finding non-movable page and rolling back the isolation. But maybe
> it's not worth the savings, and also has_unmovable_pages() being false
> doesn't guarantee succeed in the actual isolate+migrate attempt. And if
> it can cause a false negative due to lru pages not drained, then it's
> actually worse than if it wasn't called at all.

We'll experiment with that, and see how often it is an issue in
practice. If this aspect of the code needs improving, I suspect Roman
and I will find it soon enough.