Message ID | 20210208103812.32056-2-osalvador@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Make alloc_contig_range handle Hugetlb pages | expand |
On 08.02.21 11:38, Oscar Salvador wrote: > alloc_contig_range is not prepared to handle hugetlb pages and will > fail if it ever sees one, but since they can be migrated as any other > page (LRU and Movable), it makes sense to also handle them. > > For now, do it only when coming from alloc_contig_range. > > Signed-off-by: Oscar Salvador <osalvador@suse.de> > --- > mm/compaction.c | 17 +++++++++++++++++ > mm/vmscan.c | 5 +++-- > 2 files changed, 20 insertions(+), 2 deletions(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index e5acb9714436..89cd2e60da29 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -940,6 +940,22 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, > goto isolate_fail; > } > > + /* > + * Handle hugetlb pages only when coming from alloc_contig > + */ > + if (PageHuge(page) && cc->alloc_contig) { > + if (page_count(page)) { I wonder if we should care about races here. What if someone concurrently allocates/frees? Note that PageHuge() succeeds on tail pages, isolate_huge_page() not, i assume we'll have to handle that as well. I wonder if it would make sense to move some of the magic to hugetlb code and handle it there with less chances for races (isolate if used, alloc-and-dissolve if not). > + /* > + * Hugetlb page in-use. Isolate and migrate. > + */ > + if (isolate_huge_page(page, &cc->migratepages)) { > + low_pfn += compound_nr(page) - 1; > + goto isolate_success_no_list; > + } > + }
On Wed, Feb 10, 2021 at 09:56:37AM +0100, David Hildenbrand wrote: > On 08.02.21 11:38, Oscar Salvador wrote: > > alloc_contig_range is not prepared to handle hugetlb pages and will > > fail if it ever sees one, but since they can be migrated as any other > > page (LRU and Movable), it makes sense to also handle them. > > > > For now, do it only when coming from alloc_contig_range. > > > > Signed-off-by: Oscar Salvador <osalvador@suse.de> > > --- > > mm/compaction.c | 17 +++++++++++++++++ > > mm/vmscan.c | 5 +++-- > > 2 files changed, 20 insertions(+), 2 deletions(-) > > > > diff --git a/mm/compaction.c b/mm/compaction.c > > index e5acb9714436..89cd2e60da29 100644 > > --- a/mm/compaction.c > > +++ b/mm/compaction.c > > @@ -940,6 +940,22 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, > > goto isolate_fail; > > } > > + /* > > + * Handle hugetlb pages only when coming from alloc_contig > > + */ > > + if (PageHuge(page) && cc->alloc_contig) { > > + if (page_count(page)) { > > I wonder if we should care about races here. What if someone concurrently > allocates/frees? > > Note that PageHuge() succeeds on tail pages, isolate_huge_page() not, i > assume we'll have to handle that as well. > > I wonder if it would make sense to move some of the magic to hugetlb code > and handle it there with less chances for races (isolate if used, > alloc-and-dissolve if not). Yes, it makes sense to keep the magic in hugetlb code. Note, though, that removing all races might be tricky. isolate_huge_page() checks for PageHuge under hugetlb_lock, so there is a race between a call to PageHuge(x) and a subsequent call to isolate_huge_page(). But we should be fine as isolate_huge_page will fail in case the page is no longer HugeTLB. Also, since isolate_migratepages_block() gets called with ranges pageblock aligned, we should never be handling tail pages in the core of the function. E.g: the same way we handle THP: /* The whole page is taken off the LRU; skip the tail pages. */ if (PageCompound(page)) low_pfn += compound_nr(page) - 1; But all in all, the code has to be more bullet-proof. This RFC was more like a PoC to see whether something crazy was done. And as I said, moving the handling of hugetlb pages to hugetlb.c might help towards a better error-race-handling. Thanks for having a look ;-)
On 10.02.21 15:09, Oscar Salvador wrote: > On Wed, Feb 10, 2021 at 09:56:37AM +0100, David Hildenbrand wrote: >> On 08.02.21 11:38, Oscar Salvador wrote: >>> alloc_contig_range is not prepared to handle hugetlb pages and will >>> fail if it ever sees one, but since they can be migrated as any other >>> page (LRU and Movable), it makes sense to also handle them. >>> >>> For now, do it only when coming from alloc_contig_range. >>> >>> Signed-off-by: Oscar Salvador <osalvador@suse.de> >>> --- >>> mm/compaction.c | 17 +++++++++++++++++ >>> mm/vmscan.c | 5 +++-- >>> 2 files changed, 20 insertions(+), 2 deletions(-) >>> >>> diff --git a/mm/compaction.c b/mm/compaction.c >>> index e5acb9714436..89cd2e60da29 100644 >>> --- a/mm/compaction.c >>> +++ b/mm/compaction.c >>> @@ -940,6 +940,22 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, >>> goto isolate_fail; >>> } >>> + /* >>> + * Handle hugetlb pages only when coming from alloc_contig >>> + */ >>> + if (PageHuge(page) && cc->alloc_contig) { >>> + if (page_count(page)) { >> >> I wonder if we should care about races here. What if someone concurrently >> allocates/frees? >> >> Note that PageHuge() succeeds on tail pages, isolate_huge_page() not, i >> assume we'll have to handle that as well. >> >> I wonder if it would make sense to move some of the magic to hugetlb code >> and handle it there with less chances for races (isolate if used, >> alloc-and-dissolve if not). > > Yes, it makes sense to keep the magic in hugetlb code. > Note, though, that removing all races might be tricky. > > isolate_huge_page() checks for PageHuge under hugetlb_lock, > so there is a race between a call to PageHuge(x) and a subsequent > call to isolate_huge_page(). > But we should be fine as isolate_huge_page will fail in case the page is > no longer HugeTLB. > > Also, since isolate_migratepages_block() gets called with ranges > pageblock aligned, we should never be handling tail pages in the core > of the function. E.g: the same way we handle THP: Gigantic pages? (spoiler: see my comments to next patch :) )
On Wed, Feb 10, 2021 at 03:11:05PM +0100, David Hildenbrand wrote: > On 10.02.21 15:09, Oscar Salvador wrote: > > On Wed, Feb 10, 2021 at 09:56:37AM +0100, David Hildenbrand wrote: > > > On 08.02.21 11:38, Oscar Salvador wrote: > > > > alloc_contig_range is not prepared to handle hugetlb pages and will > > > > fail if it ever sees one, but since they can be migrated as any other > > > > page (LRU and Movable), it makes sense to also handle them. > > > > > > > > For now, do it only when coming from alloc_contig_range. > > > > > > > > Signed-off-by: Oscar Salvador <osalvador@suse.de> > > > > --- > > > > mm/compaction.c | 17 +++++++++++++++++ > > > > mm/vmscan.c | 5 +++-- > > > > 2 files changed, 20 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/mm/compaction.c b/mm/compaction.c > > > > index e5acb9714436..89cd2e60da29 100644 > > > > --- a/mm/compaction.c > > > > +++ b/mm/compaction.c > > > > @@ -940,6 +940,22 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, > > > > goto isolate_fail; > > > > } > > > > + /* > > > > + * Handle hugetlb pages only when coming from alloc_contig > > > > + */ > > > > + if (PageHuge(page) && cc->alloc_contig) { > > > > + if (page_count(page)) { > > > > > > I wonder if we should care about races here. What if someone concurrently > > > allocates/frees? > > > > > > Note that PageHuge() succeeds on tail pages, isolate_huge_page() not, i > > > assume we'll have to handle that as well. > > > > > > I wonder if it would make sense to move some of the magic to hugetlb code > > > and handle it there with less chances for races (isolate if used, > > > alloc-and-dissolve if not). > > > > Yes, it makes sense to keep the magic in hugetlb code. > > Note, though, that removing all races might be tricky. > > > > isolate_huge_page() checks for PageHuge under hugetlb_lock, > > so there is a race between a call to PageHuge(x) and a subsequent > > call to isolate_huge_page(). > > But we should be fine as isolate_huge_page will fail in case the page is > > no longer HugeTLB. > > > > Also, since isolate_migratepages_block() gets called with ranges > > pageblock aligned, we should never be handling tail pages in the core > > of the function. E.g: the same way we handle THP: > > Gigantic pages? (spoiler: see my comments to next patch :) ) Oh, yeah, that sucks. We had the same problem in scan_movable_pages/has_unmovable_pages with such pages. Uhm, I will try to be more careful :-)
On 10.02.21 15:14, Oscar Salvador wrote: > On Wed, Feb 10, 2021 at 03:11:05PM +0100, David Hildenbrand wrote: >> On 10.02.21 15:09, Oscar Salvador wrote: >>> On Wed, Feb 10, 2021 at 09:56:37AM +0100, David Hildenbrand wrote: >>>> On 08.02.21 11:38, Oscar Salvador wrote: >>>>> alloc_contig_range is not prepared to handle hugetlb pages and will >>>>> fail if it ever sees one, but since they can be migrated as any other >>>>> page (LRU and Movable), it makes sense to also handle them. >>>>> >>>>> For now, do it only when coming from alloc_contig_range. >>>>> >>>>> Signed-off-by: Oscar Salvador <osalvador@suse.de> >>>>> --- >>>>> mm/compaction.c | 17 +++++++++++++++++ >>>>> mm/vmscan.c | 5 +++-- >>>>> 2 files changed, 20 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/mm/compaction.c b/mm/compaction.c >>>>> index e5acb9714436..89cd2e60da29 100644 >>>>> --- a/mm/compaction.c >>>>> +++ b/mm/compaction.c >>>>> @@ -940,6 +940,22 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, >>>>> goto isolate_fail; >>>>> } >>>>> + /* >>>>> + * Handle hugetlb pages only when coming from alloc_contig >>>>> + */ >>>>> + if (PageHuge(page) && cc->alloc_contig) { >>>>> + if (page_count(page)) { >>>> >>>> I wonder if we should care about races here. What if someone concurrently >>>> allocates/frees? >>>> >>>> Note that PageHuge() succeeds on tail pages, isolate_huge_page() not, i >>>> assume we'll have to handle that as well. >>>> >>>> I wonder if it would make sense to move some of the magic to hugetlb code >>>> and handle it there with less chances for races (isolate if used, >>>> alloc-and-dissolve if not). >>> >>> Yes, it makes sense to keep the magic in hugetlb code. >>> Note, though, that removing all races might be tricky. >>> >>> isolate_huge_page() checks for PageHuge under hugetlb_lock, >>> so there is a race between a call to PageHuge(x) and a subsequent >>> call to isolate_huge_page(). >>> But we should be fine as isolate_huge_page will fail in case the page is >>> no longer HugeTLB. >>> >>> Also, since isolate_migratepages_block() gets called with ranges >>> pageblock aligned, we should never be handling tail pages in the core >>> of the function. E.g: the same way we handle THP: >> >> Gigantic pages? (spoiler: see my comments to next patch :) ) > > Oh, yeah, that sucks. > We had the same problem in scan_movable_pages/has_unmovable_pages > with such pages. > > Uhm, I will try to be more careful :-) > Gigantic pages are a minefield. Not your fault :)
On 2/8/21 2:38 AM, Oscar Salvador wrote: > alloc_contig_range is not prepared to handle hugetlb pages and will > fail if it ever sees one, but since they can be migrated as any other > page (LRU and Movable), it makes sense to also handle them. > > For now, do it only when coming from alloc_contig_range. > > Signed-off-by: Oscar Salvador <osalvador@suse.de> > --- > mm/compaction.c | 17 +++++++++++++++++ > mm/vmscan.c | 5 +++-- > 2 files changed, 20 insertions(+), 2 deletions(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index e5acb9714436..89cd2e60da29 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -940,6 +940,22 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, > goto isolate_fail; > } > > + /* > + * Handle hugetlb pages only when coming from alloc_contig > + */ > + if (PageHuge(page) && cc->alloc_contig) { > + if (page_count(page)) { Thanks for doing this! I agree with everything in the discussion you and David had. This code is racy, but since we are scanning lockless there is no way to eliminate them all. Best to just minimize the windows and document.
On 11.02.21 01:56, Mike Kravetz wrote: > On 2/8/21 2:38 AM, Oscar Salvador wrote: >> alloc_contig_range is not prepared to handle hugetlb pages and will >> fail if it ever sees one, but since they can be migrated as any other >> page (LRU and Movable), it makes sense to also handle them. >> >> For now, do it only when coming from alloc_contig_range. >> >> Signed-off-by: Oscar Salvador <osalvador@suse.de> >> --- >> mm/compaction.c | 17 +++++++++++++++++ >> mm/vmscan.c | 5 +++-- >> 2 files changed, 20 insertions(+), 2 deletions(-) >> >> diff --git a/mm/compaction.c b/mm/compaction.c >> index e5acb9714436..89cd2e60da29 100644 >> --- a/mm/compaction.c >> +++ b/mm/compaction.c >> @@ -940,6 +940,22 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, >> goto isolate_fail; >> } >> >> + /* >> + * Handle hugetlb pages only when coming from alloc_contig >> + */ >> + if (PageHuge(page) && cc->alloc_contig) { >> + if (page_count(page)) { > > Thanks for doing this! > > I agree with everything in the discussion you and David had. This code > is racy, but since we are scanning lockless there is no way to eliminate > them all. Best to just minimize the windows and document. > Agreed - and make sure that we don't have strange side. (e.g., in the next patch, allocate a new page, try to dissolve. Dissolving fails, what happens to the just-allocated page?)
diff --git a/mm/compaction.c b/mm/compaction.c index e5acb9714436..89cd2e60da29 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -940,6 +940,22 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, goto isolate_fail; } + /* + * Handle hugetlb pages only when coming from alloc_contig + */ + if (PageHuge(page) && cc->alloc_contig) { + if (page_count(page)) { + /* + * Hugetlb page in-use. Isolate and migrate. + */ + if (isolate_huge_page(page, &cc->migratepages)) { + low_pfn += compound_nr(page) - 1; + goto isolate_success_no_list; + } + } + goto isolate_fail; + } + /* * Check may be lockless but that's ok as we recheck later. * It's possible to migrate LRU and non-lru movable pages. @@ -1041,6 +1057,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, isolate_success: list_add(&page->lru, &cc->migratepages); +isolate_success_no_list: cc->nr_migratepages += compound_nr(page); nr_isolated += compound_nr(page); diff --git a/mm/vmscan.c b/mm/vmscan.c index b1b574ad199d..0803adca4469 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1506,8 +1506,9 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone, LIST_HEAD(clean_pages); list_for_each_entry_safe(page, next, page_list, lru) { - if (page_is_file_lru(page) && !PageDirty(page) && - !__PageMovable(page) && !PageUnevictable(page)) { + if (!PageHuge(page) && page_is_file_lru(page) && + !PageDirty(page) && !__PageMovable(page) && + !PageUnevictable(page)) { ClearPageActive(page); list_move(&page->lru, &clean_pages); }
alloc_contig_range is not prepared to handle hugetlb pages and will fail if it ever sees one, but since they can be migrated as any other page (LRU and Movable), it makes sense to also handle them. For now, do it only when coming from alloc_contig_range. Signed-off-by: Oscar Salvador <osalvador@suse.de> --- mm/compaction.c | 17 +++++++++++++++++ mm/vmscan.c | 5 +++-- 2 files changed, 20 insertions(+), 2 deletions(-)