Message ID | 20240820032630.1894770-1-wangkefeng.wang@huawei.com (mailing list archive) |
---|---|
State | New |
Series | [v2] mm: remove migration for HugePage in isolate_single_pageblock() |
On 20.08.24 05:26, Kefeng Wang wrote:
> The gigantic page size may be larger than memory block size, so memory
> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
> alloc_contig_range work at pageblock granularity"),
>
> offline_pages
>   start_isolate_page_range
>     start_isolate_page_range(isolate_before=true)
>       isolate [isolate_start, isolate_start + pageblock_nr_pages)
>     start_isolate_page_range(isolate_before=false)
>       isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>         __alloc_contig_migrate_range
>           isolate_migratepages_range
>             isolate_migratepages_block
>               isolate_or_dissolve_huge_page
>                 if (hstate_is_gigantic(h))
>                     return -ENOMEM;
>
> [ 15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range
>
> Gigantic PageHuge is bigger than a pageblock, but since it is freed as
> order-0 pages, its pageblocks after being freed will get to the right
> free list. There is no need to have special handling code for them in
> start_isolate_page_range(). For both alloc_contig_range() and memory
> offline cases, the migration code after start_isolate_page_range() will
> be able to migrate gigantic PageHuge when possible.
>
> Let's clean up start_isolate_page_range() and fix the aforementioned
> memory offline failure issue all together.
>
> Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> v2:
> - update changelog, thanks Zi, David
>
>  mm/page_isolation.c | 28 +++-------------------------
>  1 file changed, 3 insertions(+), 25 deletions(-)
>
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 39fb8c07aeb7..7e04047977cf 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -403,30 +403,8 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>  			unsigned long head_pfn = page_to_pfn(head);
>  			unsigned long nr_pages = compound_nr(head);
>
> -			if (head_pfn + nr_pages <= boundary_pfn) {
> -				pfn = head_pfn + nr_pages;
> -				continue;
> -			}
> -
> -#if defined CONFIG_COMPACTION || defined CONFIG_CMA
> -			if (PageHuge(page)) {
> -				int page_mt = get_pageblock_migratetype(page);
> -				struct compact_control cc = {
> -					.nr_migratepages = 0,
> -					.order = -1,
> -					.zone = page_zone(pfn_to_page(head_pfn)),
> -					.mode = MIGRATE_SYNC,
> -					.ignore_skip_hint = true,
> -					.no_set_skip_hint = true,
> -					.gfp_mask = gfp_flags,
> -					.alloc_contig = true,
> -				};
> -				INIT_LIST_HEAD(&cc.migratepages);
> -
> -				ret = __alloc_contig_migrate_range(&cc, head_pfn,
> -							head_pfn + nr_pages, page_mt);
> -				if (ret)
> -					goto failed;
> +			if (head_pfn + nr_pages <= boundary_pfn ||
> +			    PageHuge(page)) {

I'm wondering if we should have here some kind of WARN_ON_ONCE if
PageLRU + "spans more than a single pageblock" check.

Then we could catch whenever we would have !hugetlb LRU folios that span
more than a single pageblock.

/*
 * We cannot currently handle movable (LRU) folios that span more than
 * a single pageblock. hugetlb folios are fine, though.
 */
WARN_ON_ONCE(PageLRU(page) && nr_pages > pageblock_nr_pages);

But now I realized something I previously missed: We are only modifying
behavior of hugetlb folios ... stupid misleading "PageHuge" check :)

So that would be independent of this change.

Acked-by: David Hildenbrand <david@redhat.com>
On 19 Aug 2024, at 23:26, Kefeng Wang wrote:

> The gigantic page size may be larger than memory block size, so memory
> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
> alloc_contig_range work at pageblock granularity"),
>
> offline_pages
>   start_isolate_page_range
>     start_isolate_page_range(isolate_before=true)
>       isolate [isolate_start, isolate_start + pageblock_nr_pages)
>     start_isolate_page_range(isolate_before=false)
>       isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>         __alloc_contig_migrate_range
>           isolate_migratepages_range
>             isolate_migratepages_block
>               isolate_or_dissolve_huge_page
>                 if (hstate_is_gigantic(h))
>                     return -ENOMEM;
>
> [ 15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range
>
> Gigantic PageHuge is bigger than a pageblock, but since it is freed as
> order-0 pages, its pageblocks after being freed will get to the right
> free list. There is no need to have special handling code for them in
> start_isolate_page_range(). For both alloc_contig_range() and memory
> offline cases, the migration code after start_isolate_page_range() will
> be able to migrate gigantic PageHuge when possible.
>
> Let's clean up start_isolate_page_range() and fix the aforementioned
> memory offline failure issue all together.
>
> Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> v2:
> - update changelog, thanks Zi, David
>
>  mm/page_isolation.c | 28 +++-------------------------
>  1 file changed, 3 insertions(+), 25 deletions(-)
>

LGTM. Thanks.

Acked-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan, Zi

On 2024/8/20 16:42, David Hildenbrand wrote:
> On 20.08.24 05:26, Kefeng Wang wrote:
>> The gigantic page size may be larger than memory block size, so memory
>> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
>> alloc_contig_range work at pageblock granularity"),
>>
>> offline_pages
>>   start_isolate_page_range
>>     start_isolate_page_range(isolate_before=true)
>>       isolate [isolate_start, isolate_start + pageblock_nr_pages)
>>     start_isolate_page_range(isolate_before=false)
>>       isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>>         __alloc_contig_migrate_range
>>           isolate_migratepages_range
>>             isolate_migratepages_block
>>               isolate_or_dissolve_huge_page
>>                 if (hstate_is_gigantic(h))
>>                     return -ENOMEM;
>>
>> [ 15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed
>> due to failure to isolate range
>>
>> Gigantic PageHuge is bigger than a pageblock, but since it is freed as
>> order-0 pages, its pageblocks after being freed will get to the right
>> free list. There is no need to have special handling code for them in
>> start_isolate_page_range(). For both alloc_contig_range() and memory
>> offline cases, the migration code after start_isolate_page_range() will
>> be able to migrate gigantic PageHuge when possible.
>>
>> Let's clean up start_isolate_page_range() and fix the aforementioned
>> memory offline failure issue all together.
>>
>> Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock
>> granularity")
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---

...

>> +			if (head_pfn + nr_pages <= boundary_pfn ||
>> +			    PageHuge(page)) {
>
> I'm wondering if we should have here some kind of WARN_ON_ONCE if
> PageLRU + "spans more than a single pageblock" check.
>
> Then we could catch whenever we would have !hugetlb LRU folios that span
> more than a single pageblock.
>
> /*
>  * We cannot currently handle movable (LRU) folios that span more than
>  * a single pageblock. hugetlb folios are fine, though.
>  */
> WARN_ON_ONCE(PageLRU(page) && nr_pages > pageblock_nr_pages);

This should already be covered by the following VM_WARN checks:

VM_WARN_ON_ONCE_PAGE(PageLRU(page), page); // only hit when head_pfn + nr_pages > boundary_pfn (boundary_pfn is pageblock aligned)
VM_WARN_ON_ONCE_PAGE(__PageMovable(page), page);

>
> But now I realized something I previously missed: We are only modifying
> behavior of hugetlb folios ... stupid misleading "PageHuge" check :)
>
> So that would be independent of this change.
>
> Acked-by: David Hildenbrand <david@redhat.com>
>

On 20.08.24 16:00, Kefeng Wang wrote:
>
>
> On 2024/8/20 16:42, David Hildenbrand wrote:
>> On 20.08.24 05:26, Kefeng Wang wrote:
>>> The gigantic page size may be larger than memory block size, so memory
>>> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
>>> alloc_contig_range work at pageblock granularity"),
>>>
>>> offline_pages
>>>   start_isolate_page_range
>>>     start_isolate_page_range(isolate_before=true)
>>>       isolate [isolate_start, isolate_start + pageblock_nr_pages)
>>>     start_isolate_page_range(isolate_before=false)
>>>       isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>>>         __alloc_contig_migrate_range
>>>           isolate_migratepages_range
>>>             isolate_migratepages_block
>>>               isolate_or_dissolve_huge_page
>>>                 if (hstate_is_gigantic(h))
>>>                     return -ENOMEM;
>>>
>>> [ 15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed
>>> due to failure to isolate range
>>>
>>> Gigantic PageHuge is bigger than a pageblock, but since it is freed as
>>> order-0 pages, its pageblocks after being freed will get to the right
>>> free list. There is no need to have special handling code for them in
>>> start_isolate_page_range(). For both alloc_contig_range() and memory
>>> offline cases, the migration code after start_isolate_page_range() will
>>> be able to migrate gigantic PageHuge when possible.
>>>
>>> Let's clean up start_isolate_page_range() and fix the aforementioned
>>> memory offline failure issue all together.
>>>
>>> Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock
>>> granularity")
>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> ---
> ...
>
>>> +			if (head_pfn + nr_pages <= boundary_pfn ||
>>> +			    PageHuge(page)) {
>>
>> I'm wondering if we should have here some kind of WARN_ON_ONCE if
>> PageLRU + "spans more than a single pageblock" check.
>>
>> Then we could catch whenever we would have !hugetlb LRU folios that span
>> more than a single pageblock.
>>
>> /*
>>  * We cannot currently handle movable (LRU) folios that span more than
>>  * a single pageblock. hugetlb folios are fine, though.
>>  */
>> WARN_ON_ONCE(PageLRU(page) && nr_pages > pageblock_nr_pages);
>
> This should already be covered by the following VM_WARN checks:
>
> VM_WARN_ON_ONCE_PAGE(PageLRU(page), page); // only hit when head_pfn
> + nr_pages > boundary_pfn (boundary_pfn is pageblock aligned)

Ahh, good!

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 39fb8c07aeb7..7e04047977cf 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -403,30 +403,8 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 			unsigned long head_pfn = page_to_pfn(head);
 			unsigned long nr_pages = compound_nr(head);
 
-			if (head_pfn + nr_pages <= boundary_pfn) {
-				pfn = head_pfn + nr_pages;
-				continue;
-			}
-
-#if defined CONFIG_COMPACTION || defined CONFIG_CMA
-			if (PageHuge(page)) {
-				int page_mt = get_pageblock_migratetype(page);
-				struct compact_control cc = {
-					.nr_migratepages = 0,
-					.order = -1,
-					.zone = page_zone(pfn_to_page(head_pfn)),
-					.mode = MIGRATE_SYNC,
-					.ignore_skip_hint = true,
-					.no_set_skip_hint = true,
-					.gfp_mask = gfp_flags,
-					.alloc_contig = true,
-				};
-				INIT_LIST_HEAD(&cc.migratepages);
-
-				ret = __alloc_contig_migrate_range(&cc, head_pfn,
-							head_pfn + nr_pages, page_mt);
-				if (ret)
-					goto failed;
+			if (head_pfn + nr_pages <= boundary_pfn ||
+			    PageHuge(page)) {
 				pfn = head_pfn + nr_pages;
 				continue;
 			}
@@ -440,7 +418,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 			 */
 			VM_WARN_ON_ONCE_PAGE(PageLRU(page), page);
 			VM_WARN_ON_ONCE_PAGE(__PageMovable(page), page);
-#endif
+
 			goto failed;
 		}

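The branch changed by this diff can be modeled outside the kernel. What follows is a minimal userspace sketch, not kernel code: model_compound_step(), struct model_result and MODEL_PAGEBLOCK_NR_PAGES are hypothetical names introduced only for illustration, and the value 512 assumes 4 KiB base pages with 2 MiB pageblocks. It mirrors only the compound-page decision shown in the hunk above (skip past the compound page, or fail) and ignores everything else isolate_single_pageblock() does.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed stand-in for pageblock_nr_pages (4 KiB pages, 2 MiB pageblocks). */
#define MODEL_PAGEBLOCK_NR_PAGES 512UL

enum model_action {
	MODEL_SKIP,	/* corresponds to "pfn = head_pfn + nr_pages; continue;" */
	MODEL_FAIL,	/* corresponds to "goto failed;" */
};

struct model_result {
	enum model_action action;
	unsigned long next_pfn;	/* only meaningful for MODEL_SKIP */
};

/*
 * Model of the patched condition: a compound page is skipped when it ends
 * at or before the boundary, or when it is a hugetlb page (which may
 * straddle the boundary). Any other straddling compound page fails.
 */
static struct model_result model_compound_step(unsigned long head_pfn,
					       unsigned long nr_pages,
					       unsigned long boundary_pfn,
					       bool is_hugetlb)
{
	struct model_result res = { MODEL_FAIL, head_pfn };

	/* The thread notes that boundary_pfn is pageblock aligned. */
	assert(boundary_pfn % MODEL_PAGEBLOCK_NR_PAGES == 0);

	if (head_pfn + nr_pages <= boundary_pfn || is_hugetlb) {
		res.action = MODEL_SKIP;
		res.next_pfn = head_pfn + nr_pages;
	}
	return res;
}
```

In this model, a gigantic hugetlb page of 262144 base pages (1 GiB with 4 KiB pages, an assumed size) starting at pfn 0 is skipped even though it crosses a boundary at pfn 512, while a non-hugetlb compound page crossing the same boundary still takes the failure path.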
The gigantic page size may be larger than memory block size, so memory
offline always fails in this case after commit b2c9e2fbba32 ("mm: make
alloc_contig_range work at pageblock granularity"),

offline_pages
  start_isolate_page_range
    start_isolate_page_range(isolate_before=true)
      isolate [isolate_start, isolate_start + pageblock_nr_pages)
    start_isolate_page_range(isolate_before=false)
      isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
        __alloc_contig_migrate_range
          isolate_migratepages_range
            isolate_migratepages_block
              isolate_or_dissolve_huge_page
                if (hstate_is_gigantic(h))
                    return -ENOMEM;

[ 15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range

Gigantic PageHuge is bigger than a pageblock, but since it is freed as
order-0 pages, its pageblocks after being freed will get to the right
free list. There is no need to have special handling code for them in
start_isolate_page_range(). For both alloc_contig_range() and memory
offline cases, the migration code after start_isolate_page_range() will
be able to migrate gigantic PageHuge when possible.

Let's clean up start_isolate_page_range() and fix the aforementioned
memory offline failure issue all together.

Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
v2:
- update changelog, thanks Zi, David

 mm/page_isolation.c | 28 +++-------------------------
 1 file changed, 3 insertions(+), 25 deletions(-)
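
The size relationship described in the changelog can be checked with simple arithmetic. The sketch below is illustrative only: range_bytes(), page_exceeds_block() and blocks_spanned() are hypothetical helpers, the 1 GiB gigantic page size is an assumption (the changelog does not state it), and treating the logged range as exactly one memory block is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of an inclusive physical range such as [mem start-end]. */
static uint64_t range_bytes(uint64_t start, uint64_t end_inclusive)
{
	assert(end_inclusive >= start);
	return end_inclusive - start + 1;
}

/* True when a single huge page cannot fit inside one memory block. */
static bool page_exceeds_block(uint64_t page_bytes, uint64_t block_bytes)
{
	return page_bytes > block_bytes;
}

/*
 * Number of memory blocks covered by one huge page, assuming the page
 * starts on a block boundary (rounds up otherwise).
 */
static uint64_t blocks_spanned(uint64_t page_bytes, uint64_t block_bytes)
{
	assert(block_bytes != 0);
	return (page_bytes + block_bytes - 1) / block_bytes;
}
```

For the range in the log line, range_bytes(0x3c0000000, 0x3c7ffffff) gives 0x8000000 bytes, i.e. 128 MiB. Under the assumptions above, a 1 GiB gigantic page is larger than that range and would cover 8 such blocks, which is the situation in which offlining a single block runs into the gigantic page.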