Message ID | 20210110124017.86750-2-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fix some bugs about HugeTLB code | expand |
On Sun 10-01-21 20:40:12, Muchun Song wrote: > If the refcount is one when it is migrated, it means that the page > was freed from under us. So we are done and do not need to migrate. I would consider the following easier to understand. Feel free to reuse. " All pages isolated for the migration have an elevated reference count and therefore seeing a reference count equal to 1 means that the last user of the page has dropped the reference and the page has became unused and there doesn't make much sense to migrate it anymore. This has been done for regular pages and this patch does the same for hugetlb pages. Although the likelyhood of the race is rather small for hugetlb pages it makes sense the two code paths in sync. " > > This optimization is consistent with the regular pages, just like > unmap_and_move() does. > > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> > Acked-by: Yang Shi <shy828301@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> > --- > mm/migrate.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/mm/migrate.c b/mm/migrate.c > index 4385f2fb5d18..a6631c4eb6a6 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, > return -ENOSYS; > } > > + if (page_count(hpage) == 1) { > + /* page was freed from under us. So we are done. */ > + putback_active_hugepage(hpage); > + return MIGRATEPAGE_SUCCESS; > + } > + > new_hpage = get_new_page(hpage, private); > if (!new_hpage) > return -ENOMEM; > -- > 2.11.0
On Tue, Jan 12, 2021 at 5:42 PM Michal Hocko <mhocko@suse.com> wrote: > > On Sun 10-01-21 20:40:12, Muchun Song wrote: > > If the refcount is one when it is migrated, it means that the page > > was freed from under us. So we are done and do not need to migrate. > > I would consider the following easier to understand. Feel free to reuse. > " > All pages isolated for the migration have an elevated reference count > and therefore seeing a reference count equal to 1 means that the last > user of the page has dropped the reference and the page has became > unused and there doesn't make much sense to migrate it anymore. This has > been done for regular pages and this patch does the same for hugetlb > pages. Although the likelyhood of the race is rather small for hugetlb > pages it makes sense the two code paths in sync. > " Thanks. > > > > > This optimization is consistent with the regular pages, just like > > unmap_and_move() does. > > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > > Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> > > Acked-by: Yang Shi <shy828301@gmail.com> > > Acked-by: Michal Hocko <mhocko@suse.com> Thanks. > > > --- > > mm/migrate.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/mm/migrate.c b/mm/migrate.c > > index 4385f2fb5d18..a6631c4eb6a6 100644 > > --- a/mm/migrate.c > > +++ b/mm/migrate.c > > @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, > > return -ENOSYS; > > } > > > > + if (page_count(hpage) == 1) { > > + /* page was freed from under us. So we are done. */ > > + putback_active_hugepage(hpage); > > + return MIGRATEPAGE_SUCCESS; > > + } > > + > > new_hpage = get_new_page(hpage, private); > > if (!new_hpage) > > return -ENOMEM; > > -- > > 2.11.0 > > -- > Michal Hocko > SUSE Labs
On 10.01.21 13:40, Muchun Song wrote: > If the refcount is one when it is migrated, it means that the page > was freed from under us. So we are done and do not need to migrate. > > This optimization is consistent with the regular pages, just like > unmap_and_move() does. > > Signed-off-by: Muchun Song <songmuchun@bytedance.com> > Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> > Acked-by: Yang Shi <shy828301@gmail.com> > --- > mm/migrate.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/mm/migrate.c b/mm/migrate.c > index 4385f2fb5d18..a6631c4eb6a6 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, > return -ENOSYS; > } > > + if (page_count(hpage) == 1) { > + /* page was freed from under us. So we are done. */ > + putback_active_hugepage(hpage); > + return MIGRATEPAGE_SUCCESS; > + } > + > new_hpage = get_new_page(hpage, private); > if (!new_hpage) > return -ENOMEM; > Question: What if called via alloc_contig_range() where we even want to "migrate" free pages, meaning, relocate it?
On 12.01.21 12:00, David Hildenbrand wrote: > On 10.01.21 13:40, Muchun Song wrote: >> If the refcount is one when it is migrated, it means that the page >> was freed from under us. So we are done and do not need to migrate. >> >> This optimization is consistent with the regular pages, just like >> unmap_and_move() does. >> >> Signed-off-by: Muchun Song <songmuchun@bytedance.com> >> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> >> Acked-by: Yang Shi <shy828301@gmail.com> >> --- >> mm/migrate.c | 6 ++++++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/mm/migrate.c b/mm/migrate.c >> index 4385f2fb5d18..a6631c4eb6a6 100644 >> --- a/mm/migrate.c >> +++ b/mm/migrate.c >> @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, >> return -ENOSYS; >> } >> >> + if (page_count(hpage) == 1) { >> + /* page was freed from under us. So we are done. */ >> + putback_active_hugepage(hpage); >> + return MIGRATEPAGE_SUCCESS; >> + } >> + >> new_hpage = get_new_page(hpage, private); >> if (!new_hpage) >> return -ENOMEM; >> > > Question: What if called via alloc_contig_range() where we even want to > "migrate" free pages, meaning, relocate it? > To be more precise: a) We don't have dissolve_free_huge_pages() calls on the alloc_contig_range() path. So we *need* migration IIUC. b) dissolve_free_huge_pages() will fail if going below the reservation. In that case we really want to migrate free pages. This even applies to memory offlining. Either I am missing something important or this patch is more dangerous than it looks like.
On Tue 12-01-21 12:11:21, David Hildenbrand wrote: > On 12.01.21 12:00, David Hildenbrand wrote: > > On 10.01.21 13:40, Muchun Song wrote: > >> If the refcount is one when it is migrated, it means that the page > >> was freed from under us. So we are done and do not need to migrate. > >> > >> This optimization is consistent with the regular pages, just like > >> unmap_and_move() does. > >> > >> Signed-off-by: Muchun Song <songmuchun@bytedance.com> > >> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> > >> Acked-by: Yang Shi <shy828301@gmail.com> > >> --- > >> mm/migrate.c | 6 ++++++ > >> 1 file changed, 6 insertions(+) > >> > >> diff --git a/mm/migrate.c b/mm/migrate.c > >> index 4385f2fb5d18..a6631c4eb6a6 100644 > >> --- a/mm/migrate.c > >> +++ b/mm/migrate.c > >> @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, > >> return -ENOSYS; > >> } > >> > >> + if (page_count(hpage) == 1) { > >> + /* page was freed from under us. So we are done. */ > >> + putback_active_hugepage(hpage); > >> + return MIGRATEPAGE_SUCCESS; > >> + } > >> + > >> new_hpage = get_new_page(hpage, private); > >> if (!new_hpage) > >> return -ENOMEM; > >> > > > > Question: What if called via alloc_contig_range() where we even want to > > "migrate" free pages, meaning, relocate it? > > > > To be more precise: > > a) We don't have dissolve_free_huge_pages() calls on the > alloc_contig_range() path. So we *need* migration IIUC. > > b) dissolve_free_huge_pages() will fail if going below the reservation. > In that case we really want to migrate free pages. This even applies to > memory offlining. > > Either I am missing something important or this patch is more dangerous > than it looks like. This is an interesting point. But do we try to migrate hugetlb pages in alloc_contig_range? isolate_migratepages_block !PageLRU need to be marked as PageMovable AFAICS. This would be quite easy to implement but a more fundamental question is whether we really want to mess with existing pools for alloc_contig_range. Anyway you are quite right that this change has more side effects than it is easy to see while it doesn't really bring any major advantage other than the code consistency.
On 12.01.21 12:27, Michal Hocko wrote: > On Tue 12-01-21 12:11:21, David Hildenbrand wrote: >> On 12.01.21 12:00, David Hildenbrand wrote: >>> On 10.01.21 13:40, Muchun Song wrote: >>>> If the refcount is one when it is migrated, it means that the page >>>> was freed from under us. So we are done and do not need to migrate. >>>> >>>> This optimization is consistent with the regular pages, just like >>>> unmap_and_move() does. >>>> >>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> >>>> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> >>>> Acked-by: Yang Shi <shy828301@gmail.com> >>>> --- >>>> mm/migrate.c | 6 ++++++ >>>> 1 file changed, 6 insertions(+) >>>> >>>> diff --git a/mm/migrate.c b/mm/migrate.c >>>> index 4385f2fb5d18..a6631c4eb6a6 100644 >>>> --- a/mm/migrate.c >>>> +++ b/mm/migrate.c >>>> @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, >>>> return -ENOSYS; >>>> } >>>> >>>> + if (page_count(hpage) == 1) { >>>> + /* page was freed from under us. So we are done. */ >>>> + putback_active_hugepage(hpage); >>>> + return MIGRATEPAGE_SUCCESS; >>>> + } >>>> + >>>> new_hpage = get_new_page(hpage, private); >>>> if (!new_hpage) >>>> return -ENOMEM; >>>> >>> >>> Question: What if called via alloc_contig_range() where we even want to >>> "migrate" free pages, meaning, relocate it? >>> >> >> To be more precise: >> >> a) We don't have dissolve_free_huge_pages() calls on the >> alloc_contig_range() path. So we *need* migration IIUC. >> >> b) dissolve_free_huge_pages() will fail if going below the reservation. >> In that case we really want to migrate free pages. This even applies to >> memory offlining. >> >> Either I am missing something important or this patch is more dangerous >> than it looks like. > > This is an interesting point. But do we try to migrate hugetlb pages in > alloc_contig_range? isolate_migratepages_block !PageLRU need to be I didn't test it so far (especially in the context of virtio-mem or CMA), but have a TODO item on my long list of things to look at in the future. > marked as PageMovable AFAICS. This would be quite easy to implement but > a more fundamental question is whether we really want to mess with > existing pools for alloc_contig_range. Can these pages fall onto ZONE_MOVABLE or even MIGRATE_CMA? If yes, we really want to. And I think both is the case for "ordinary" huge pages allocated via the buddy. > > Anyway you are quite right that this change has more side effects than > it is easy to see while it doesn't really bring any major advantage > other than the consistency. Free hugetlbfs pages are special. E.g., they cannot simply be skipped when offlining. So I don't think consistency actually really applies.
On Tue 12-01-21 12:34:01, David Hildenbrand wrote: > On 12.01.21 12:27, Michal Hocko wrote: > > On Tue 12-01-21 12:11:21, David Hildenbrand wrote: > >> On 12.01.21 12:00, David Hildenbrand wrote: > >>> On 10.01.21 13:40, Muchun Song wrote: > >>>> If the refcount is one when it is migrated, it means that the page > >>>> was freed from under us. So we are done and do not need to migrate. > >>>> > >>>> This optimization is consistent with the regular pages, just like > >>>> unmap_and_move() does. > >>>> > >>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> > >>>> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> > >>>> Acked-by: Yang Shi <shy828301@gmail.com> > >>>> --- > >>>> mm/migrate.c | 6 ++++++ > >>>> 1 file changed, 6 insertions(+) > >>>> > >>>> diff --git a/mm/migrate.c b/mm/migrate.c > >>>> index 4385f2fb5d18..a6631c4eb6a6 100644 > >>>> --- a/mm/migrate.c > >>>> +++ b/mm/migrate.c > >>>> @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, > >>>> return -ENOSYS; > >>>> } > >>>> > >>>> + if (page_count(hpage) == 1) { > >>>> + /* page was freed from under us. So we are done. */ > >>>> + putback_active_hugepage(hpage); > >>>> + return MIGRATEPAGE_SUCCESS; > >>>> + } > >>>> + > >>>> new_hpage = get_new_page(hpage, private); > >>>> if (!new_hpage) > >>>> return -ENOMEM; > >>>> > >>> > >>> Question: What if called via alloc_contig_range() where we even want to > >>> "migrate" free pages, meaning, relocate it? > >>> > >> > >> To be more precise: > >> > >> a) We don't have dissolve_free_huge_pages() calls on the > >> alloc_contig_range() path. So we *need* migration IIUC. > >> > >> b) dissolve_free_huge_pages() will fail if going below the reservation. > >> In that case we really want to migrate free pages. This even applies to > >> memory offlining. > >> > >> Either I am missing something important or this patch is more dangerous > >> than it looks like. > > > > This is an interesting point. But do we try to migrate hugetlb pages in > > alloc_contig_range? isolate_migratepages_block !PageLRU need to be > > I didn't test it so far (especially in the context of virtio-mem or > CMA), but have a TODO item on my long list of things to look at in the > future. I have looked around and it seems this would more work than just to allow the migration in a-c-r. migrate_huge_page_move_mapping resp. hugetlbfs_migrate_page would need to special case pool pages. Likely a non trivial work and potentially another land mine territory. > > > marked as PageMovable AFAICS. This would be quite easy to implement but > > a more fundamental question is whether we really want to mess with > > existing pools for alloc_contig_range. > > Can these pages fall onto ZONE_MOVABLE or even MIGRATE_CMA? If yes, we > really want to. And I think both is the case for "ordinary" huge pages > allocated via the buddy. Yes movable hugetlb pages can sit in movable zone and CMA as well (see htlb_modify_alloc_mask). > > Anyway you are quite right that this change has more side effects than > > it is easy to see while it doesn't really bring any major advantage > > other than the consistency. > > Free hugetlbfs pages are special. E.g., they cannot simply be skipped > when offlining. So I don't think consistency actually really applies. Well, currently pool pages are not migrateable but you are right that this is likely something that we will need to look into in the future and this optimization would stand in the way.
On Tue, Jan 12, 2021 at 7:11 PM David Hildenbrand <david@redhat.com> wrote: > > On 12.01.21 12:00, David Hildenbrand wrote: > > On 10.01.21 13:40, Muchun Song wrote: > >> If the refcount is one when it is migrated, it means that the page > >> was freed from under us. So we are done and do not need to migrate. > >> > >> This optimization is consistent with the regular pages, just like > >> unmap_and_move() does. > >> > >> Signed-off-by: Muchun Song <songmuchun@bytedance.com> > >> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> > >> Acked-by: Yang Shi <shy828301@gmail.com> > >> --- > >> mm/migrate.c | 6 ++++++ > >> 1 file changed, 6 insertions(+) > >> > >> diff --git a/mm/migrate.c b/mm/migrate.c > >> index 4385f2fb5d18..a6631c4eb6a6 100644 > >> --- a/mm/migrate.c > >> +++ b/mm/migrate.c > >> @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, > >> return -ENOSYS; > >> } > >> > >> + if (page_count(hpage) == 1) { > >> + /* page was freed from under us. So we are done. */ > >> + putback_active_hugepage(hpage); > >> + return MIGRATEPAGE_SUCCESS; > >> + } > >> + > >> new_hpage = get_new_page(hpage, private); > >> if (!new_hpage) > >> return -ENOMEM; > >> > > > > Question: What if called via alloc_contig_range() where we even want to > > "migrate" free pages, meaning, relocate it? > > > > To be more precise: > > a) We don't have dissolve_free_huge_pages() calls on the > alloc_contig_range() path. So we *need* migration IIUC. Without this patch, if you want to migrate a HUgeTLB page, the page is freed to the hugepage pool. With this patch, the page is also freed to the hugepage pool. I didn't see any different. I am missing something? > > b) dissolve_free_huge_pages() will fail if going below the reservation. > In that case we really want to migrate free pages. This even applies to > memory offlining. > > Either I am missing something important or this patch is more dangerous > than it looks like. > > -- > Thanks, > > David / dhildenb >
On 12.01.21 14:40, Muchun Song wrote: > On Tue, Jan 12, 2021 at 7:11 PM David Hildenbrand <david@redhat.com> wrote: >> >> On 12.01.21 12:00, David Hildenbrand wrote: >>> On 10.01.21 13:40, Muchun Song wrote: >>>> If the refcount is one when it is migrated, it means that the page >>>> was freed from under us. So we are done and do not need to migrate. >>>> >>>> This optimization is consistent with the regular pages, just like >>>> unmap_and_move() does. >>>> >>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> >>>> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> >>>> Acked-by: Yang Shi <shy828301@gmail.com> >>>> --- >>>> mm/migrate.c | 6 ++++++ >>>> 1 file changed, 6 insertions(+) >>>> >>>> diff --git a/mm/migrate.c b/mm/migrate.c >>>> index 4385f2fb5d18..a6631c4eb6a6 100644 >>>> --- a/mm/migrate.c >>>> +++ b/mm/migrate.c >>>> @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, >>>> return -ENOSYS; >>>> } >>>> >>>> + if (page_count(hpage) == 1) { >>>> + /* page was freed from under us. So we are done. */ >>>> + putback_active_hugepage(hpage); >>>> + return MIGRATEPAGE_SUCCESS; >>>> + } >>>> + >>>> new_hpage = get_new_page(hpage, private); >>>> if (!new_hpage) >>>> return -ENOMEM; >>>> >>> >>> Question: What if called via alloc_contig_range() where we even want to >>> "migrate" free pages, meaning, relocate it? >>> >> >> To be more precise: >> >> a) We don't have dissolve_free_huge_pages() calls on the >> alloc_contig_range() path. So we *need* migration IIUC. > > Without this patch, if you want to migrate a HUgeTLB page, > the page is freed to the hugepage pool. With this patch, > the page is also freed to the hugepage pool. > I didn't see any different. I am missing something? I am definitely not an expert on hugetlb pools, that's why I am asking. Isn't it, that with your code, no new page is allocated - so dissolve_free_huge_pages() might just refuse to dissolve due to reservations, bailing out, no? (as discussed, looks like alloc_contig_range() needs to be fixed to handle this correctly)
On Tue, Jan 12, 2021 at 9:51 PM David Hildenbrand <david@redhat.com> wrote: > > On 12.01.21 14:40, Muchun Song wrote: > > On Tue, Jan 12, 2021 at 7:11 PM David Hildenbrand <david@redhat.com> wrote: > >> > >> On 12.01.21 12:00, David Hildenbrand wrote: > >>> On 10.01.21 13:40, Muchun Song wrote: > >>>> If the refcount is one when it is migrated, it means that the page > >>>> was freed from under us. So we are done and do not need to migrate. > >>>> > >>>> This optimization is consistent with the regular pages, just like > >>>> unmap_and_move() does. > >>>> > >>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> > >>>> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> > >>>> Acked-by: Yang Shi <shy828301@gmail.com> > >>>> --- > >>>> mm/migrate.c | 6 ++++++ > >>>> 1 file changed, 6 insertions(+) > >>>> > >>>> diff --git a/mm/migrate.c b/mm/migrate.c > >>>> index 4385f2fb5d18..a6631c4eb6a6 100644 > >>>> --- a/mm/migrate.c > >>>> +++ b/mm/migrate.c > >>>> @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, > >>>> return -ENOSYS; > >>>> } > >>>> > >>>> + if (page_count(hpage) == 1) { > >>>> + /* page was freed from under us. So we are done. */ > >>>> + putback_active_hugepage(hpage); > >>>> + return MIGRATEPAGE_SUCCESS; > >>>> + } > >>>> + > >>>> new_hpage = get_new_page(hpage, private); > >>>> if (!new_hpage) > >>>> return -ENOMEM; > >>>> > >>> > >>> Question: What if called via alloc_contig_range() where we even want to > >>> "migrate" free pages, meaning, relocate it? > >>> > >> > >> To be more precise: > >> > >> a) We don't have dissolve_free_huge_pages() calls on the > >> alloc_contig_range() path. So we *need* migration IIUC. > > > > Without this patch, if you want to migrate a HUgeTLB page, > > the page is freed to the hugepage pool. With this patch, > > the page is also freed to the hugepage pool. > > I didn't see any different. I am missing something? > > I am definitely not an expert on hugetlb pools, that's why I am asking. > > Isn't it, that with your code, no new page is allocated - so > dissolve_free_huge_pages() might just refuse to dissolve due to > reservations, bailing out, no? Without this patch, the new page can be allocated from the hugepage pool. The dissolve_free_huge_pages() also can refuse to dissolve due to reservations. Right? > > (as discussed, looks like alloc_contig_range() needs to be fixed to > handle this correctly) > > -- > Thanks, > > David / dhildenb >
On Tue 12-01-21 13:16:45, Michal Hocko wrote: [...] > Well, currently pool pages are not migrateable but you are right that > this is likely something that we will need to look into in the future > and this optimization would stand in the way. After some more thinking I believe I was wrong in my last statement. This optimization shouldn't have any effect on pages on the pool as those stay at reference count 0 and they cannot be isolated either (clear_page_huge_active before it is enqueued). That being said, the migration code would still have to learn about about this pages but that is out of scope of this discussion. Sorry about the confusion from my side.
On 12.01.21 15:17, Muchun Song wrote: > On Tue, Jan 12, 2021 at 9:51 PM David Hildenbrand <david@redhat.com> wrote: >> >> On 12.01.21 14:40, Muchun Song wrote: >>> On Tue, Jan 12, 2021 at 7:11 PM David Hildenbrand <david@redhat.com> wrote: >>>> >>>> On 12.01.21 12:00, David Hildenbrand wrote: >>>>> On 10.01.21 13:40, Muchun Song wrote: >>>>>> If the refcount is one when it is migrated, it means that the page >>>>>> was freed from under us. So we are done and do not need to migrate. >>>>>> >>>>>> This optimization is consistent with the regular pages, just like >>>>>> unmap_and_move() does. >>>>>> >>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> >>>>>> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> >>>>>> Acked-by: Yang Shi <shy828301@gmail.com> >>>>>> --- >>>>>> mm/migrate.c | 6 ++++++ >>>>>> 1 file changed, 6 insertions(+) >>>>>> >>>>>> diff --git a/mm/migrate.c b/mm/migrate.c >>>>>> index 4385f2fb5d18..a6631c4eb6a6 100644 >>>>>> --- a/mm/migrate.c >>>>>> +++ b/mm/migrate.c >>>>>> @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, >>>>>> return -ENOSYS; >>>>>> } >>>>>> >>>>>> + if (page_count(hpage) == 1) { >>>>>> + /* page was freed from under us. So we are done. */ >>>>>> + putback_active_hugepage(hpage); >>>>>> + return MIGRATEPAGE_SUCCESS; >>>>>> + } >>>>>> + >>>>>> new_hpage = get_new_page(hpage, private); >>>>>> if (!new_hpage) >>>>>> return -ENOMEM; >>>>>> >>>>> >>>>> Question: What if called via alloc_contig_range() where we even want to >>>>> "migrate" free pages, meaning, relocate it? >>>>> >>>> >>>> To be more precise: >>>> >>>> a) We don't have dissolve_free_huge_pages() calls on the >>>> alloc_contig_range() path. So we *need* migration IIUC. >>> >>> Without this patch, if you want to migrate a HUgeTLB page, >>> the page is freed to the hugepage pool. With this patch, >>> the page is also freed to the hugepage pool. >>> I didn't see any different. I am missing something? >> >> I am definitely not an expert on hugetlb pools, that's why I am asking. >> >> Isn't it, that with your code, no new page is allocated - so >> dissolve_free_huge_pages() might just refuse to dissolve due to >> reservations, bailing out, no? > > Without this patch, the new page can be allocated from the > hugepage pool. The dissolve_free_huge_pages() also > can refuse to dissolve due to reservations. Right? Oh, you mean the migration target might be coming from the pool? I guess yes, looking at alloc_migration_target()->alloc_huge_page_nodemask(). In that case, yes, I think we run into a similar issue already. Instead of trying to allocate new huge pages in dissolve_free_huge_pages() to "relocate free pages", we bail out. This all feels kind of wrong. After we migrated a huge page we should free it back to the buddy, so most of our machinery just keeps working without caring about free huge pages. I can see how your patch will not change the current (IMHO broken) behavior.
On 12.01.21 15:23, Michal Hocko wrote: > On Tue 12-01-21 13:16:45, Michal Hocko wrote: > [...] >> Well, currently pool pages are not migrateable but you are right that >> this is likely something that we will need to look into in the future >> and this optimization would stand in the way. > > After some more thinking I believe I was wrong in my last statement. > This optimization shouldn't have any effect on pages on the pool as > those stay at reference count 0 and they cannot be isolated either > (clear_page_huge_active before it is enqueued). > > That being said, the migration code would still have to learn about > about this pages but that is out of scope of this discussion. > > Sorry about the confusion from my side. > At this point I am fairly confused what's working at what's not :D I think this will require more thought, on how to teach alloc_contig_range() (and eventually in some cases offline_pages()?) to do the right thing.
On Tue 12-01-21 15:41:02, David Hildenbrand wrote: > On 12.01.21 15:23, Michal Hocko wrote: > > On Tue 12-01-21 13:16:45, Michal Hocko wrote: > > [...] > >> Well, currently pool pages are not migrateable but you are right that > >> this is likely something that we will need to look into in the future > >> and this optimization would stand in the way. > > > > After some more thinking I believe I was wrong in my last statement. > > This optimization shouldn't have any effect on pages on the pool as > > those stay at reference count 0 and they cannot be isolated either > > (clear_page_huge_active before it is enqueued). > > > > That being said, the migration code would still have to learn about > > about this pages but that is out of scope of this discussion. > > > > Sorry about the confusion from my side. > > > > At this point I am fairly confused what's working at what's not :D heh, tell me something about that. Hugetlb is a maze full of land mines. > I think this will require more thought, on how to teach > alloc_contig_range() (and eventually in some cases offline_pages()?) to > do the right thing. Well, offlining sort of works because it retries both migrates and dissolves. It can fail with the later due to reservations but that can be expected. We can still try harder to rellocate/rebalance per numa pools to keep the reservation but I strongly suspect nobody has noticed this to be a problem so there we are.
On Tue, Jan 12, 2021 at 10:28 PM David Hildenbrand <david@redhat.com> wrote: > > On 12.01.21 15:17, Muchun Song wrote: > > On Tue, Jan 12, 2021 at 9:51 PM David Hildenbrand <david@redhat.com> wrote: > >> > >> On 12.01.21 14:40, Muchun Song wrote: > >>> On Tue, Jan 12, 2021 at 7:11 PM David Hildenbrand <david@redhat.com> wrote: > >>>> > >>>> On 12.01.21 12:00, David Hildenbrand wrote: > >>>>> On 10.01.21 13:40, Muchun Song wrote: > >>>>>> If the refcount is one when it is migrated, it means that the page > >>>>>> was freed from under us. So we are done and do not need to migrate. > >>>>>> > >>>>>> This optimization is consistent with the regular pages, just like > >>>>>> unmap_and_move() does. > >>>>>> > >>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com> > >>>>>> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> > >>>>>> Acked-by: Yang Shi <shy828301@gmail.com> > >>>>>> --- > >>>>>> mm/migrate.c | 6 ++++++ > >>>>>> 1 file changed, 6 insertions(+) > >>>>>> > >>>>>> diff --git a/mm/migrate.c b/mm/migrate.c > >>>>>> index 4385f2fb5d18..a6631c4eb6a6 100644 > >>>>>> --- a/mm/migrate.c > >>>>>> +++ b/mm/migrate.c > >>>>>> @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, > >>>>>> return -ENOSYS; > >>>>>> } > >>>>>> > >>>>>> + if (page_count(hpage) == 1) { > >>>>>> + /* page was freed from under us. So we are done. */ > >>>>>> + putback_active_hugepage(hpage); > >>>>>> + return MIGRATEPAGE_SUCCESS; > >>>>>> + } > >>>>>> + > >>>>>> new_hpage = get_new_page(hpage, private); > >>>>>> if (!new_hpage) > >>>>>> return -ENOMEM; > >>>>>> > >>>>> > >>>>> Question: What if called via alloc_contig_range() where we even want to > >>>>> "migrate" free pages, meaning, relocate it? > >>>>> > >>>> > >>>> To be more precise: > >>>> > >>>> a) We don't have dissolve_free_huge_pages() calls on the > >>>> alloc_contig_range() path. So we *need* migration IIUC. > >>> > >>> Without this patch, if you want to migrate a HUgeTLB page, > >>> the page is freed to the hugepage pool. With this patch, > >>> the page is also freed to the hugepage pool. > >>> I didn't see any different. I am missing something? > >> > >> I am definitely not an expert on hugetlb pools, that's why I am asking. > >> > >> Isn't it, that with your code, no new page is allocated - so > >> dissolve_free_huge_pages() might just refuse to dissolve due to > >> reservations, bailing out, no? > > > > Without this patch, the new page can be allocated from the > > hugepage pool. The dissolve_free_huge_pages() also > > can refuse to dissolve due to reservations. Right? > > Oh, you mean the migration target might be coming from the pool? I guess > yes, looking at alloc_migration_target()->alloc_huge_page_nodemask(). Yeah, you are right. If we want to free a HugeTLB page to the buddy allocator, we should dissolve_free_huge_page() to do that. Migrating cannot guarantee this at least now. > > In that case, yes, I think we run into a similar issue already. > > Instead of trying to allocate new huge pages in > dissolve_free_huge_pages() to "relocate free pages", we bail out. > > This all feels kind of wrong. After we migrated a huge page we should > free it back to the buddy, so most of our machinery just keeps working > without caring about free huge pages. > > > I can see how your patch will not change the current (IMHO broken) behavior. > > -- > Thanks, > > David / dhildenb >
On 1/12/21 6:53 AM, Michal Hocko wrote: > On Tue 12-01-21 15:41:02, David Hildenbrand wrote: >> On 12.01.21 15:23, Michal Hocko wrote: >>> On Tue 12-01-21 13:16:45, Michal Hocko wrote: >>> [...] >>>> Well, currently pool pages are not migrateable but you are right that >>>> this is likely something that we will need to look into in the future >>>> and this optimization would stand in the way. >>> >>> After some more thinking I believe I was wrong in my last statement. >>> This optimization shouldn't have any effect on pages on the pool as >>> those stay at reference count 0 and they cannot be isolated either >>> (clear_page_huge_active before it is enqueued). >>> >>> That being said, the migration code would still have to learn about >>> about this pages but that is out of scope of this discussion. >>> >>> Sorry about the confusion from my side. >>> >> >> At this point I am fairly confused what's working at what's not :D > > heh, tell me something about that. Hugetlb is a maze full of land mines. > >> I think this will require more thought, on how to teach >> alloc_contig_range() (and eventually in some cases offline_pages()?) to >> do the right thing. > > Well, offlining sort of works because it retries both migrates and > dissolves. It can fail with the later due to reservations but that can > be expected. We can still try harder to rellocate/rebalance per numa > pools to keep the reservation but I strongly suspect nobody has noticed > this to be a problem so there we are. Due to my time zone, I get to read all the previous comments before commenting myself. :) To be clear, this patch is handling a very specific case where a hugetlb page was isolated for migration and after being isolated the last reference to the page was dropped. Normally, dropping the last reference would send the page to free_huge_page processing. free_huge_page processing would either dissolve the page to the buddy allocator or more likely place the page on the free list of the pool. However, since isolation also holds a reference to the page, processing is continued down the migration path. Today there is no code to migrate free huge pages in the pool. Only active in use pages are migrated. Without this patch, 'free pages' in the very specific state above would be migrated. But to be clear, that was never the intention of any hugetlb migration code. In that respect, I believe this patch helps the current code work as designed. David brings up the valid point that alloc_contig_range needs to deal with free hugetlb pool pages. However, that is code which needs to be written as it does not exist today. I suspect nobody thought about free hugetlb pages when alloc_contig_range was written. This patch should in no way hinder development of that code. Free huge pages have a ref count of 0, and this code is checking for ref count of 1. That is a long way of saying that I still agree with this patch. The only modification/suggestion would be enhancing the commit message as suggested by Michal.
diff --git a/mm/migrate.c b/mm/migrate.c index 4385f2fb5d18..a6631c4eb6a6 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1279,6 +1279,12 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, return -ENOSYS; } + if (page_count(hpage) == 1) { + /* page was freed from under us. So we are done. */ + putback_active_hugepage(hpage); + return MIGRATEPAGE_SUCCESS; + } + new_hpage = get_new_page(hpage, private); if (!new_hpage) return -ENOMEM;