Message ID | 20250401021025.637333-1-vishal.moola@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/compaction: Fix bug in hugetlb handling pathway | expand |
On Mon, Mar 31, 2025 at 07:10:24PM -0700, Vishal Moola (Oracle) wrote: > The compaction code doesn't take references on pages until we're certain > we should attempt to handle it. > > In the hugetlb case, isolate_or_dissolve_huge_page() may return -EBUSY > without taking a reference to the folio associated with our pfn. If our > folio's refcount drops to 0, compound_nr() becomes unpredictable, making > low_pfn and nr_scanned unreliable. > The user-visible effect is minimal - this should rarely happen (if ever). So, with compound_order() we either return the real order of the compound page or '0', right? > Fix this by storing the folio statistics earlier on the stack (just like > the THP and Buddy cases). > > Also revert commit 66fe1cf7f581 ("mm: compaction: use helper compound_nr > in isolate_migratepages_block") > to make backporting easier. > > Fixes: 369fa227c219 ("mm: make alloc_contig_range handle free hugetlb pages") > Cc: Miaohe Lin <linmiaohe@huawei.com> > Cc: Oscar Salvador <osalvador@suse.de> > Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Oscar Salvador <osalvador@suse.de>
On 31 Mar 2025, at 22:10, Vishal Moola (Oracle) wrote: > The compaction code doesn't take references on pages until we're certain > we should attempt to handle it. > > In the hugetlb case, isolate_or_dissolve_huge_page() may return -EBUSY > without taking a reference to the folio associated with our pfn. If our > folio's refcount drops to 0, compound_nr() becomes unpredictable, making > low_pfn and nr_scanned unreliable. > The user-visible effect is minimal - this should rarely happen (if ever). > > Fix this by storing the folio statistics earlier on the stack (just like > the THP and Buddy cases). > > Also revert commit 66fe1cf7f581 ("mm: compaction: use helper compound_nr > in isolate_migratepages_block") > to make backporting easier. > > Fixes: 369fa227c219 ("mm: make alloc_contig_range handle free hugetlb pages") > Cc: Miaohe Lin <linmiaohe@huawei.com> > Cc: Oscar Salvador <osalvador@suse.de> > Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> > --- > mm/compaction.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > <snip> > @@ -1011,8 +1011,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, > /* Do not report -EBUSY down the chain */ > if (ret == -EBUSY) > ret = 0; > - low_pfn += compound_nr(page) - 1; > - nr_scanned += compound_nr(page) - 1; > + low_pfn += (1UL << order) - 1; > + nr_scanned += (1UL << order) - 1; > goto isolate_fail; > } > Right after this, there is “low_pfn += folio_nr_pages(folio) - 1” for isolated hugetlb. I wonder if that can use order as well. Maybe not, since the order is obtained without taking a reference, but folio_nr_pages() is called with a reference. They might be different. Anyway, Reviewed-by: Zi Yan <ziy@nvidia.com> Best Regards, Yan, Zi
On Tue, Apr 01, 2025 at 04:59:46PM +0200, Oscar Salvador wrote: > On Mon, Mar 31, 2025 at 07:10:24PM -0700, Vishal Moola (Oracle) wrote: > > The compaction code doesn't take references on pages until we're certain > > we should attempt to handle it. > > > > In the hugetlb case, isolate_or_dissolve_huge_page() may return -EBUSY > > without taking a reference to the folio associated with our pfn. If our > > folio's refcount drops to 0, compound_nr() becomes unpredictable, making > > low_pfn and nr_scanned unreliable. > > The user-visible effect is minimal - this should rarely happen (if ever). > > So, with compound_order() we either return the real order of the > compound page or '0', right? Yup. There's a world in which that folio could be freed and reallocated as part of another large order page as well (where it would return the order of that folio). > > Fix this by storing the folio statistics earlier on the stack (just like > > the THP and Buddy cases). > > > > Also revert commit 66fe1cf7f581 ("mm: compaction: use helper compound_nr > > in isolate_migratepages_block") > > to make backporting easier. > > > > Fixes: 369fa227c219 ("mm: make alloc_contig_range handle free hugetlb pages") > > Cc: Miaohe Lin <linmiaohe@huawei.com> > > Cc: Oscar Salvador <osalvador@suse.de> > > Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> > > Acked-by: Oscar Salvador <osalvador@suse.de> > > > -- > Oscar Salvador > SUSE Labs
On Tue, Apr 01, 2025 at 11:20:48AM -0400, Zi Yan wrote: > On 31 Mar 2025, at 22:10, Vishal Moola (Oracle) wrote: > > > The compaction code doesn't take references on pages until we're certain > > we should attempt to handle it. > > > > In the hugetlb case, isolate_or_dissolve_huge_page() may return -EBUSY > > without taking a reference to the folio associated with our pfn. If our > > folio's refcount drops to 0, compound_nr() becomes unpredictable, making > > low_pfn and nr_scanned unreliable. > > The user-visible effect is minimal - this should rarely happen (if ever). > > > > Fix this by storing the folio statistics earlier on the stack (just like > > the THP and Buddy cases). > > > > Also revert commit 66fe1cf7f581 ("mm: compaction: use helper compound_nr > > in isolate_migratepages_block") > > to make backporting easier. > > > > Fixes: 369fa227c219 ("mm: make alloc_contig_range handle free hugetlb pages") > > Cc: Miaohe Lin <linmiaohe@huawei.com> > > Cc: Oscar Salvador <osalvador@suse.de> > > Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> > > --- > > mm/compaction.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > <snip> > > > @@ -1011,8 +1011,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, > > /* Do not report -EBUSY down the chain */ > > if (ret == -EBUSY) > > ret = 0; > > - low_pfn += compound_nr(page) - 1; > > - nr_scanned += compound_nr(page) - 1; > > + low_pfn += (1UL << order) - 1; > > + nr_scanned += (1UL << order) - 1; > > goto isolate_fail; > > } > > > > Right after this, there is “low_pfn += folio_nr_pages(folio) - 1” for > isolated hugetlb. I wonder if that can use order as well. Maybe not, > since the order is obtained without taking a reference, but folio_nr_pages() > is called with a reference. They might be different. I thought about that as well. There's a very unlikely case where they could be different: We had a hugetlb page, free'd it, then allocated it to another hugetlb page. At this point (the success path) we would want the rest of the compaction counters all the sync up to whichever folio we ended up isolating anyways. > Anyway, Reviewed-by: Zi Yan <ziy@nvidia.com> Thanks! > > Best Regards, > Yan, Zi
diff --git a/mm/compaction.c b/mm/compaction.c index 139f00c0308a..ca71fd3c3181 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -981,13 +981,13 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, } if (PageHuge(page)) { + const unsigned int order = compound_order(page); /* * skip hugetlbfs if we are not compacting for pages * bigger than its order. THPs and other compound pages * are handled below. */ if (!cc->alloc_contig) { - const unsigned int order = compound_order(page); if (order <= MAX_PAGE_ORDER) { low_pfn += (1UL << order) - 1; @@ -1011,8 +1011,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, /* Do not report -EBUSY down the chain */ if (ret == -EBUSY) ret = 0; - low_pfn += compound_nr(page) - 1; - nr_scanned += compound_nr(page) - 1; + low_pfn += (1UL << order) - 1; + nr_scanned += (1UL << order) - 1; goto isolate_fail; }
The compaction code doesn't take references on pages until we're certain we should attempt to handle it. In the hugetlb case, isolate_or_dissolve_huge_page() may return -EBUSY without taking a reference to the folio associated with our pfn. If our folio's refcount drops to 0, compound_nr() becomes unpredictable, making low_pfn and nr_scanned unreliable. The user-visible effect is minimal - this should rarely happen (if ever). Fix this by storing the folio statistics earlier on the stack (just like the THP and Buddy cases). Also revert commit 66fe1cf7f581 ("mm: compaction: use helper compound_nr in isolate_migratepages_block") to make backporting easier. Fixes: 369fa227c219 ("mm: make alloc_contig_range handle free hugetlb pages") Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Oscar Salvador <osalvador@suse.de> Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> --- mm/compaction.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)