diff mbox series

[RFC] mm: skip gigantic pages in isolate_single_pageblock() when mem offline

Message ID 20240813125226.1478800-1-wangkefeng.wang@huawei.com (mailing list archive)
State New
Headers show
Series [RFC] mm: skip gigantic pages in isolate_single_pageblock() when mem offline | expand

Commit Message

Kefeng Wang Aug. 13, 2024, 12:52 p.m. UTC
The gigantic page size may larger than memory block size, so memory
offline always fails in this case after commit b2c9e2fbba32 ("mm: make
alloc_contig_range work at pageblock granularity"),

offline_pages
  start_isolate_page_range
    start_isolate_page_range(isolate_before=true)
      isolate [isolate_start, isolate_start + pageblock_nr_pages)
    start_isolate_page_range(isolate_before=false)
      isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
       	__alloc_contig_migrate_range
          isolate_migratepages_range
            isolate_migratepages_block
              isolate_or_dissolve_huge_page
                if (hstate_is_gigantic(h))
                    return -ENOMEM;

[   15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range

Fix it by skipping the __alloc_contig_migrate_range() if met gigantic
pages when memory offline, which return back to the original logic to
handle the gigantic pages.

Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/page_isolation.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Matthew Wilcox Aug. 13, 2024, 2:03 p.m. UTC | #1
On Tue, Aug 13, 2024 at 08:52:26PM +0800, Kefeng Wang wrote:
> The gigantic page size may larger than memory block size, so memory
> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
> alloc_contig_range work at pageblock granularity"),
> 
> offline_pages
>   start_isolate_page_range
>     start_isolate_page_range(isolate_before=true)
>       isolate [isolate_start, isolate_start + pageblock_nr_pages)
>     start_isolate_page_range(isolate_before=false)
>       isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>        	__alloc_contig_migrate_range
>           isolate_migratepages_range
>             isolate_migratepages_block
>               isolate_or_dissolve_huge_page
>                 if (hstate_is_gigantic(h))
>                     return -ENOMEM;
> 
> [   15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range
> 
> Fix it by skipping the __alloc_contig_migrate_range() if met gigantic
> pages when memory offline, which return back to the original logic to
> handle the gigantic pages.

This seems like the wrong way to fix this.  The logic in the next
PageHuge() section seems like it's specifically supposed to handle
gigantic pages.  So you've just made that dead code, but instead of
removing it, you've left it there to confuse everyone?

I admit to not understanding this code terribly well.

> Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  mm/page_isolation.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 042937d5abe4..25db4040e70a 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -400,6 +400,16 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>  				continue;
>  			}
>  
> +			if ((flags & MEMORY_OFFLINE) && PageHuge(page)) {
> +				struct hstate *h;
> +
> +				h = size_to_hstate(nr_pages << PAGE_SHIFT);
> +				if (hstate_is_gigantic(h)) {
> +					pfn = head_pfn + nr_pages;
> +					continue;
> +				}
> +			}
> +
>  #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>  			if (PageHuge(page)) {
>  				int page_mt = get_pageblock_migratetype(page);
> -- 
> 2.27.0
> 
>
Kefeng Wang Aug. 13, 2024, 2:46 p.m. UTC | #2
On 2024/8/13 22:03, Matthew Wilcox wrote:
> On Tue, Aug 13, 2024 at 08:52:26PM +0800, Kefeng Wang wrote:
>> The gigantic page size may larger than memory block size, so memory
>> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
>> alloc_contig_range work at pageblock granularity"),
>>
>> offline_pages
>>    start_isolate_page_range
>>      start_isolate_page_range(isolate_before=true)
>>        isolate [isolate_start, isolate_start + pageblock_nr_pages)
>>      start_isolate_page_range(isolate_before=false)
>>        isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>>         	__alloc_contig_migrate_range
>>            isolate_migratepages_range
>>              isolate_migratepages_block
>>                isolate_or_dissolve_huge_page
>>                  if (hstate_is_gigantic(h))
>>                      return -ENOMEM;
>>
>> [   15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range
>>
>> Fix it by skipping the __alloc_contig_migrate_range() if met gigantic
>> pages when memory offline, which return back to the original logic to
>> handle the gigantic pages.
> 
> This seems like the wrong way to fix this.  The logic in the next
> PageHuge() section seems like it's specifically supposed to handle
> gigantic pages.  So you've just made that dead code, but instead of
> removing it, you've left it there to confuse everyone?

isolate_single_pageblock() in start_isolate_page_range() will be called
from memory offline and contig allocation (alloc_contig_pages()), this
changes only restore the behavior from memory offline code, but we still
fail in contig allocation.

 From memory offline, we has own path to isolate/migrate page or dissolve
free hugetlb folios, so I think we don't depends on the 
__alloc_contig_migrate_range().
> 
> I admit to not understanding this code terribly well.
> 
A quick search from [1], the isolate_single_pageblock() is added for
contig allocation, but it has negative effects on memory hotplug,
Zi Yan, could you give some comments?

[1] 
https://lore.kernel.org/linux-mm/20220425143118.2850746-1-zi.yan@sent.com/

>> Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>   mm/page_isolation.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index 042937d5abe4..25db4040e70a 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -400,6 +400,16 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>>   				continue;
>>   			}
>>   
>> +			if ((flags & MEMORY_OFFLINE) && PageHuge(page)) {
>> +				struct hstate *h;
>> +
>> +				h = size_to_hstate(nr_pages << PAGE_SHIFT);
>> +				if (hstate_is_gigantic(h)) {
>> +					pfn = head_pfn + nr_pages;
>> +					continue;
>> +				}
>> +			}
>> +
>>   #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>>   			if (PageHuge(page)) {
>>   				int page_mt = get_pageblock_migratetype(page);
>> -- 
>> 2.27.0
>>
>>
>
Zi Yan Aug. 13, 2024, 2:59 p.m. UTC | #3
On 13 Aug 2024, at 10:46, Kefeng Wang wrote:

> On 2024/8/13 22:03, Matthew Wilcox wrote:
>> On Tue, Aug 13, 2024 at 08:52:26PM +0800, Kefeng Wang wrote:
>>> The gigantic page size may larger than memory block size, so memory
>>> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
>>> alloc_contig_range work at pageblock granularity"),
>>>
>>> offline_pages
>>>    start_isolate_page_range
>>>      start_isolate_page_range(isolate_before=true)
>>>        isolate [isolate_start, isolate_start + pageblock_nr_pages)
>>>      start_isolate_page_range(isolate_before=false)
>>>        isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>>>         	__alloc_contig_migrate_range
>>>            isolate_migratepages_range
>>>              isolate_migratepages_block
>>>                isolate_or_dissolve_huge_page
>>>                  if (hstate_is_gigantic(h))
>>>                      return -ENOMEM;
>>>
>>> [   15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range
>>>
>>> Fix it by skipping the __alloc_contig_migrate_range() if met gigantic
>>> pages when memory offline, which return back to the original logic to
>>> handle the gigantic pages.
>>
>> This seems like the wrong way to fix this.  The logic in the next
>> PageHuge() section seems like it's specifically supposed to handle
>> gigantic pages.  So you've just made that dead code, but instead of
>> removing it, you've left it there to confuse everyone?
>
> isolate_single_pageblock() in start_isolate_page_range() will be called
> from memory offline and contig allocation (alloc_contig_pages()), this
> changes only restore the behavior from memory offline code, but we still
> fail in contig allocation.
>
> From memory offline, we has own path to isolate/migrate page or dissolve
> free hugetlb folios, so I think we don't depends on the __alloc_contig_migrate_range().
>>
>> I admit to not understanding this code terribly well.
>>
> A quick search from [1], the isolate_single_pageblock() is added for
> contig allocation, but it has negative effects on memory hotplug,
> Zi Yan, could you give some comments?
>
> [1] https://lore.kernel.org/linux-mm/20220425143118.2850746-1-zi.yan@sent.com/

Probably we can isolate the hugetlb page and use migrate_page() instead of
__alloc_contig_migrate_range() in the section below, since we are targeting
only hugetlb pages here. It should solve the issue.

When I sent the original patchset, I over-thought about the situation and
included PageLRU and __PageMovable, so used __alloc_contig_migrate_range().
That was probably not the right approach.

I am aware of that the current page isolation code is very complicated and
planning to clean it up. My current plan is:
1. turn MIGRATE_ISOLATE a standalone bit instead of a migratetype (I have
a local patch)
2. refactor page isolation code, since after 1, migratetype is preserved
across isolations
3. clean up alloc_contig_range().


Best Regards,
Yan, Zi
Kefeng Wang Aug. 14, 2024, 2:01 a.m. UTC | #4
On 2024/8/13 22:59, Zi Yan wrote:
> On 13 Aug 2024, at 10:46, Kefeng Wang wrote:
> 
>> On 2024/8/13 22:03, Matthew Wilcox wrote:
>>> On Tue, Aug 13, 2024 at 08:52:26PM +0800, Kefeng Wang wrote:
>>>> The gigantic page size may larger than memory block size, so memory
>>>> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
>>>> alloc_contig_range work at pageblock granularity"),
>>>>
>>>> offline_pages
>>>>     start_isolate_page_range
>>>>       start_isolate_page_range(isolate_before=true)
>>>>         isolate [isolate_start, isolate_start + pageblock_nr_pages)
>>>>       start_isolate_page_range(isolate_before=false)
>>>>         isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>>>>          	__alloc_contig_migrate_range
>>>>             isolate_migratepages_range
>>>>               isolate_migratepages_block
>>>>                 isolate_or_dissolve_huge_page
>>>>                   if (hstate_is_gigantic(h))
>>>>                       return -ENOMEM;
>>>>
>>>> [   15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range
>>>>
>>>> Fix it by skipping the __alloc_contig_migrate_range() if met gigantic
>>>> pages when memory offline, which return back to the original logic to
>>>> handle the gigantic pages.
>>>
>>> This seems like the wrong way to fix this.  The logic in the next
>>> PageHuge() section seems like it's specifically supposed to handle
>>> gigantic pages.  So you've just made that dead code, but instead of
>>> removing it, you've left it there to confuse everyone?
>>
>> isolate_single_pageblock() in start_isolate_page_range() will be called
>> from memory offline and contig allocation (alloc_contig_pages()), this
>> changes only restore the behavior from memory offline code, but we still
>> fail in contig allocation.
>>
>>  From memory offline, we has own path to isolate/migrate page or dissolve
>> free hugetlb folios, so I think we don't depends on the __alloc_contig_migrate_range().
>>>
>>> I admit to not understanding this code terribly well.
>>>
>> A quick search from [1], the isolate_single_pageblock() is added for
>> contig allocation, but it has negative effects on memory hotplug,
>> Zi Yan, could you give some comments?
>>
>> [1] https://lore.kernel.org/linux-mm/20220425143118.2850746-1-zi.yan@sent.com/
> 
> Probably we can isolate the hugetlb page and use migrate_page() instead of
> __alloc_contig_migrate_range() in the section below, since we are targeting
> only hugetlb pages here. It should solve the issue.

For contig allocation, I think we must isolate/migrate page in
__alloc_contig_migrate_range(), but for memory offline,(especially for
gigantic hugepage)as mentioned above, we already have own path to
isolate/migrate used page and dissolve the free pages,the
start_isolate_page_range() only need to mark page range MIGRATE_ISOLATE,
that is what we did before b2c9e2fbba32,

start_isolate_page_range
scan_movable_pages
   do_migrate_range
dissolve_free_hugetlb_folios

Do we really need isolate/migrate the hugetlb page and for memory
offline path?


> 
> When I sent the original patchset, I over-thought about the situation and
> included PageLRU and __PageMovable, so used __alloc_contig_migrate_range().
> That was probably not the right approach.
> 
> I am aware of that the current page isolation code is very complicated and
> planning to clean it up. My current plan is:
> 1. turn MIGRATE_ISOLATE a standalone bit instead of a migratetype (I have
> a local patch)
> 2. refactor page isolation code, since after 1, migratetype is preserved
> across isolations
> 3. clean up alloc_contig_range().
> 

OK, this is another issue not related about memory hotplug, but we
should fix the current memory offline issue.

Thanks.
> 
> Best Regards,
> Yan, Zi
Zi Yan Aug. 14, 2024, 2:53 p.m. UTC | #5
On 13 Aug 2024, at 22:01, Kefeng Wang wrote:

> On 2024/8/13 22:59, Zi Yan wrote:
>> On 13 Aug 2024, at 10:46, Kefeng Wang wrote:
>>
>>> On 2024/8/13 22:03, Matthew Wilcox wrote:
>>>> On Tue, Aug 13, 2024 at 08:52:26PM +0800, Kefeng Wang wrote:
>>>>> The gigantic page size may larger than memory block size, so memory
>>>>> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
>>>>> alloc_contig_range work at pageblock granularity"),
>>>>>
>>>>> offline_pages
>>>>>     start_isolate_page_range
>>>>>       start_isolate_page_range(isolate_before=true)
>>>>>         isolate [isolate_start, isolate_start + pageblock_nr_pages)
>>>>>       start_isolate_page_range(isolate_before=false)
>>>>>         isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>>>>>          	__alloc_contig_migrate_range
>>>>>             isolate_migratepages_range
>>>>>               isolate_migratepages_block
>>>>>                 isolate_or_dissolve_huge_page
>>>>>                   if (hstate_is_gigantic(h))
>>>>>                       return -ENOMEM;
>>>>>
>>>>> [   15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range
>>>>>
>>>>> Fix it by skipping the __alloc_contig_migrate_range() if met gigantic
>>>>> pages when memory offline, which return back to the original logic to
>>>>> handle the gigantic pages.
>>>>
>>>> This seems like the wrong way to fix this.  The logic in the next
>>>> PageHuge() section seems like it's specifically supposed to handle
>>>> gigantic pages.  So you've just made that dead code, but instead of
>>>> removing it, you've left it there to confuse everyone?
>>>
>>> isolate_single_pageblock() in start_isolate_page_range() will be called
>>> from memory offline and contig allocation (alloc_contig_pages()), this
>>> changes only restore the behavior from memory offline code, but we still
>>> fail in contig allocation.
>>>
>>>  From memory offline, we has own path to isolate/migrate page or dissolve
>>> free hugetlb folios, so I think we don't depends on the __alloc_contig_migrate_range().
>>>>
>>>> I admit to not understanding this code terribly well.
>>>>
>>> A quick search from [1], the isolate_single_pageblock() is added for
>>> contig allocation, but it has negative effects on memory hotplug,
>>> Zi Yan, could you give some comments?
>>>
>>> [1] https://lore.kernel.org/linux-mm/20220425143118.2850746-1-zi.yan@sent.com/
>>
>> Probably we can isolate the hugetlb page and use migrate_page() instead of
>> __alloc_contig_migrate_range() in the section below, since we are targeting
>> only hugetlb pages here. It should solve the issue.
>
> For contig allocation, I think we must isolate/migrate page in
> __alloc_contig_migrate_range(), but for memory offline,(especially for
> gigantic hugepage)as mentioned above, we already have own path to
> isolate/migrate used page and dissolve the free pages,the
> start_isolate_page_range() only need to mark page range MIGRATE_ISOLATE,
> that is what we did before b2c9e2fbba32,
>
> start_isolate_page_range
> scan_movable_pages
>   do_migrate_range
> dissolve_free_hugetlb_folios
>
> Do we really need isolate/migrate the hugetlb page and for memory
> offline path?

For memory offline path, there is do_migrate_range() to move the pages.
For contig allocation, there is __alloc_contig_migrate_range() after
isolation to migrate the pages.

The migration code in isolate_single_pageblock() is not needed.
Something like this would be OK, just skip the page and let either
do_migrate_range() or __alloc_contig_migrate_range() to handle it:

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 042937d5abe4..587d723711c5 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -402,23 +402,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,

 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
                        if (PageHuge(page)) {
-                               int page_mt = get_pageblock_migratetype(page);
-                               struct compact_control cc = {
-                                       .nr_migratepages = 0,
-                                       .order = -1,
-                                       .zone = page_zone(pfn_to_page(head_pfn)),
-                                       .mode = MIGRATE_SYNC,
-                                       .ignore_skip_hint = true,
-                                       .no_set_skip_hint = true,
-                                       .gfp_mask = gfp_flags,
-                                       .alloc_contig = true,
-                               };
-                               INIT_LIST_HEAD(&cc.migratepages);
-
-                               ret = __alloc_contig_migrate_range(&cc, head_pfn,
-                                                       head_pfn + nr_pages, page_mt);
-                               if (ret)
-                                       goto failed;
                                pfn = head_pfn + nr_pages;
                                continue;
                        }

>
>
>>
>> When I sent the original patchset, I over-thought about the situation and
>> included PageLRU and __PageMovable, so used __alloc_contig_migrate_range().
>> That was probably not the right approach.
>>
>> I am aware of that the current page isolation code is very complicated and
>> planning to clean it up. My current plan is:
>> 1. turn MIGRATE_ISOLATE a standalone bit instead of a migratetype (I have
>> a local patch)
>> 2. refactor page isolation code, since after 1, migratetype is preserved
>> across isolations
>> 3. clean up alloc_contig_range().
>>
>
> OK, this is another issue not related about memory hotplug, but we
> should fix the current memory offline issue.
>
> Thanks.
>>
>> Best Regards,
>> Yan, Zi


--
Best Regards,
Yan, Zi
Kefeng Wang Aug. 15, 2024, 2:58 a.m. UTC | #6
On 2024/8/14 22:53, Zi Yan wrote:
> On 13 Aug 2024, at 22:01, Kefeng Wang wrote:
> 
>> On 2024/8/13 22:59, Zi Yan wrote:
>>> On 13 Aug 2024, at 10:46, Kefeng Wang wrote:
>>>
>>>> On 2024/8/13 22:03, Matthew Wilcox wrote:
>>>>> On Tue, Aug 13, 2024 at 08:52:26PM +0800, Kefeng Wang wrote:
>>>>>> The gigantic page size may larger than memory block size, so memory
>>>>>> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
>>>>>> alloc_contig_range work at pageblock granularity"),
>>>>>>
>>>>>> offline_pages
>>>>>>      start_isolate_page_range
>>>>>>        start_isolate_page_range(isolate_before=true)
>>>>>>          isolate [isolate_start, isolate_start + pageblock_nr_pages)
>>>>>>        start_isolate_page_range(isolate_before=false)
>>>>>>          isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>>>>>>           	__alloc_contig_migrate_range
>>>>>>              isolate_migratepages_range
>>>>>>                isolate_migratepages_block
>>>>>>                  isolate_or_dissolve_huge_page
>>>>>>                    if (hstate_is_gigantic(h))
>>>>>>                        return -ENOMEM;
>>>>>>
>>>>>> [   15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range
>>>>>>
>>>>>> Fix it by skipping the __alloc_contig_migrate_range() if met gigantic
>>>>>> pages when memory offline, which return back to the original logic to
>>>>>> handle the gigantic pages.
>>>>>
>>>>> This seems like the wrong way to fix this.  The logic in the next
>>>>> PageHuge() section seems like it's specifically supposed to handle
>>>>> gigantic pages.  So you've just made that dead code, but instead of
>>>>> removing it, you've left it there to confuse everyone?
>>>>
>>>> isolate_single_pageblock() in start_isolate_page_range() will be called
>>>> from memory offline and contig allocation (alloc_contig_pages()), this
>>>> changes only restore the behavior from memory offline code, but we still
>>>> fail in contig allocation.
>>>>
>>>>   From memory offline, we has own path to isolate/migrate page or dissolve
>>>> free hugetlb folios, so I think we don't depends on the __alloc_contig_migrate_range().
>>>>>
>>>>> I admit to not understanding this code terribly well.
>>>>>
>>>> A quick search from [1], the isolate_single_pageblock() is added for
>>>> contig allocation, but it has negative effects on memory hotplug,
>>>> Zi Yan, could you give some comments?
>>>>
>>>> [1] https://lore.kernel.org/linux-mm/20220425143118.2850746-1-zi.yan@sent.com/
>>>
>>> Probably we can isolate the hugetlb page and use migrate_page() instead of
>>> __alloc_contig_migrate_range() in the section below, since we are targeting
>>> only hugetlb pages here. It should solve the issue.
>>
>> For contig allocation, I think we must isolate/migrate page in
>> __alloc_contig_migrate_range(), but for memory offline,(especially for
>> gigantic hugepage)as mentioned above, we already have own path to
>> isolate/migrate used page and dissolve the free pages,the
>> start_isolate_page_range() only need to mark page range MIGRATE_ISOLATE,
>> that is what we did before b2c9e2fbba32,
>>
>> start_isolate_page_range
>> scan_movable_pages
>>    do_migrate_range
>> dissolve_free_hugetlb_folios
>>
>> Do we really need isolate/migrate the hugetlb page and for memory
>> offline path?
> 
> For memory offline path, there is do_migrate_range() to move the pages.
> For contig allocation, there is __alloc_contig_migrate_range() after
> isolation to migrate the pages.
> 
> The migration code in isolate_single_pageblock() is not needed.
> Something like this would be OK, just skip the page and let either
> do_migrate_range() or __alloc_contig_migrate_range() to handle it:

Oh, right, for alloc_contig_range(), we do have another 
__alloc_contig_migrate_range() after start_isolate_page_range(), then we
could drop the following code,

> 
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 042937d5abe4..587d723711c5 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -402,23 +402,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
> 
>   #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>                          if (PageHuge(page)) {
> -                               int page_mt = get_pageblock_migratetype(page);
> -                               struct compact_control cc = {
> -                                       .nr_migratepages = 0,
> -                                       .order = -1,
> -                                       .zone = page_zone(pfn_to_page(head_pfn)),
> -                                       .mode = MIGRATE_SYNC,
> -                                       .ignore_skip_hint = true,
> -                                       .no_set_skip_hint = true,
> -                                       .gfp_mask = gfp_flags,
> -                                       .alloc_contig = true,
> -                               };
> -                               INIT_LIST_HEAD(&cc.migratepages);
> -
> -                               ret = __alloc_contig_migrate_range(&cc, head_pfn,
> -                                                       head_pfn + nr_pages, page_mt);
> -                               if (ret)
> -                                       goto failed;
>                                  pfn = head_pfn + nr_pages;
>                                  continue;
>                          }


But we need to remove the CONFIG_COMPACTION/CMA too, thought?

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 042937d5abe4..785c2d320631 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -395,30 +395,8 @@ static int isolate_single_pageblock(unsigned long 
boundary_pfn, int flags,
                         unsigned long head_pfn = page_to_pfn(head);
                         unsigned long nr_pages = compound_nr(head);

-                       if (head_pfn + nr_pages <= boundary_pfn) {
-                               pfn = head_pfn + nr_pages;
-                               continue;
-                       }
-
-#if defined CONFIG_COMPACTION || defined CONFIG_CMA
-                       if (PageHuge(page)) {
-                               int page_mt = 
get_pageblock_migratetype(page);
-                               struct compact_control cc = {
-                                       .nr_migratepages = 0,
-                                       .order = -1,
-                                       .zone = 
page_zone(pfn_to_page(head_pfn)),
-                                       .mode = MIGRATE_SYNC,
-                                       .ignore_skip_hint = true,
-                                       .no_set_skip_hint = true,
-                                       .gfp_mask = gfp_flags,
-                                       .alloc_contig = true,
-                               };
-                               INIT_LIST_HEAD(&cc.migratepages);
-
-                               ret = __alloc_contig_migrate_range(&cc, 
head_pfn,
-                                                       head_pfn + 
nr_pages, page_mt);
-                               if (ret)
-                                       goto failed;
+                       if (head_pfn + nr_pages <= boundary_pfn ||
+                           PageHuge(page))
                                 pfn = head_pfn + nr_pages;
                                 continue;
                         }
@@ -432,7 +410,6 @@ static int isolate_single_pageblock(unsigned long 
boundary_pfn, int flags,
                          */
                         VM_WARN_ON_ONCE_PAGE(PageLRU(page), page);
                         VM_WARN_ON_ONCE_PAGE(__PageMovable(page), page);
-#endif
                         goto failed;
                 }
Zi Yan Aug. 15, 2024, 4:43 p.m. UTC | #7
On 14 Aug 2024, at 22:58, Kefeng Wang wrote:

> On 2024/8/14 22:53, Zi Yan wrote:
>> On 13 Aug 2024, at 22:01, Kefeng Wang wrote:
>>
>>> On 2024/8/13 22:59, Zi Yan wrote:
>>>> On 13 Aug 2024, at 10:46, Kefeng Wang wrote:
>>>>
>>>>> On 2024/8/13 22:03, Matthew Wilcox wrote:
>>>>>> On Tue, Aug 13, 2024 at 08:52:26PM +0800, Kefeng Wang wrote:
>>>>>>> The gigantic page size may larger than memory block size, so memory
>>>>>>> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
>>>>>>> alloc_contig_range work at pageblock granularity"),
>>>>>>>
>>>>>>> offline_pages
>>>>>>>      start_isolate_page_range
>>>>>>>        start_isolate_page_range(isolate_before=true)
>>>>>>>          isolate [isolate_start, isolate_start + pageblock_nr_pages)
>>>>>>>        start_isolate_page_range(isolate_before=false)
>>>>>>>          isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>>>>>>>           	__alloc_contig_migrate_range
>>>>>>>              isolate_migratepages_range
>>>>>>>                isolate_migratepages_block
>>>>>>>                  isolate_or_dissolve_huge_page
>>>>>>>                    if (hstate_is_gigantic(h))
>>>>>>>                        return -ENOMEM;
>>>>>>>
>>>>>>> [   15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range
>>>>>>>
>>>>>>> Fix it by skipping the __alloc_contig_migrate_range() if met gigantic
>>>>>>> pages when memory offline, which return back to the original logic to
>>>>>>> handle the gigantic pages.
>>>>>>
>>>>>> This seems like the wrong way to fix this.  The logic in the next
>>>>>> PageHuge() section seems like it's specifically supposed to handle
>>>>>> gigantic pages.  So you've just made that dead code, but instead of
>>>>>> removing it, you've left it there to confuse everyone?
>>>>>
>>>>> isolate_single_pageblock() in start_isolate_page_range() will be called
>>>>> from memory offline and contig allocation (alloc_contig_pages()), this
>>>>> changes only restore the behavior from memory offline code, but we still
>>>>> fail in contig allocation.
>>>>>
>>>>>   From memory offline, we has own path to isolate/migrate page or dissolve
>>>>> free hugetlb folios, so I think we don't depends on the __alloc_contig_migrate_range().
>>>>>>
>>>>>> I admit to not understanding this code terribly well.
>>>>>>
>>>>> A quick search from [1], the isolate_single_pageblock() is added for
>>>>> contig allocation, but it has negative effects on memory hotplug,
>>>>> Zi Yan, could you give some comments?
>>>>>
>>>>> [1] https://lore.kernel.org/linux-mm/20220425143118.2850746-1-zi.yan@sent.com/
>>>>
>>>> Probably we can isolate the hugetlb page and use migrate_page() instead of
>>>> __alloc_contig_migrate_range() in the section below, since we are targeting
>>>> only hugetlb pages here. It should solve the issue.
>>>
>>> For contig allocation, I think we must isolate/migrate page in
>>> __alloc_contig_migrate_range(), but for memory offline,(especially for
>>> gigantic hugepage)as mentioned above, we already have own path to
>>> isolate/migrate used page and dissolve the free pages,the
>>> start_isolate_page_range() only need to mark page range MIGRATE_ISOLATE,
>>> that is what we did before b2c9e2fbba32,
>>>
>>> start_isolate_page_range
>>> scan_movable_pages
>>>    do_migrate_range
>>> dissolve_free_hugetlb_folios
>>>
>>> Do we really need isolate/migrate the hugetlb page and for memory
>>> offline path?
>>
>> For memory offline path, there is do_migrate_range() to move the pages.
>> For contig allocation, there is __alloc_contig_migrate_range() after
>> isolation to migrate the pages.
>>
>> The migration code in isolate_single_pageblock() is not needed.
>> Something like this would be OK, just skip the page and let either
>> do_migrate_range() or __alloc_contig_migrate_range() to handle it:
>
> Oh, right, for alloc_contig_range(), we do have another __alloc_contig_migrate_range() after start_isolate_page_range(), then we
> could drop the following code,
>
>>
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index 042937d5abe4..587d723711c5 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -402,23 +402,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>>
>>   #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>>                          if (PageHuge(page)) {
>> -                               int page_mt = get_pageblock_migratetype(page);
>> -                               struct compact_control cc = {
>> -                                       .nr_migratepages = 0,
>> -                                       .order = -1,
>> -                                       .zone = page_zone(pfn_to_page(head_pfn)),
>> -                                       .mode = MIGRATE_SYNC,
>> -                                       .ignore_skip_hint = true,
>> -                                       .no_set_skip_hint = true,
>> -                                       .gfp_mask = gfp_flags,
>> -                                       .alloc_contig = true,
>> -                               };
>> -                               INIT_LIST_HEAD(&cc.migratepages);
>> -
>> -                               ret = __alloc_contig_migrate_range(&cc, head_pfn,
>> -                                                       head_pfn + nr_pages, page_mt);
>> -                               if (ret)
>> -                                       goto failed;
>>                                  pfn = head_pfn + nr_pages;
>>                                  continue;
>>                          }
>
>
> But we need to remove the CONFIG_COMPACTION/CMA too, thought?
>
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 042937d5abe4..785c2d320631 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -395,30 +395,8 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>                         unsigned long head_pfn = page_to_pfn(head);
>                         unsigned long nr_pages = compound_nr(head);
>
> -                       if (head_pfn + nr_pages <= boundary_pfn) {
> -                               pfn = head_pfn + nr_pages;
> -                               continue;
> -                       }
> -
> -#if defined CONFIG_COMPACTION || defined CONFIG_CMA
> -                       if (PageHuge(page)) {
> -                               int page_mt = get_pageblock_migratetype(page);
> -                               struct compact_control cc = {
> -                                       .nr_migratepages = 0,
> -                                       .order = -1,
> -                                       .zone = page_zone(pfn_to_page(head_pfn)),
> -                                       .mode = MIGRATE_SYNC,
> -                                       .ignore_skip_hint = true,
> -                                       .no_set_skip_hint = true,
> -                                       .gfp_mask = gfp_flags,
> -                                       .alloc_contig = true,
> -                               };
> -                               INIT_LIST_HEAD(&cc.migratepages);
> -
> -                               ret = __alloc_contig_migrate_range(&cc, head_pfn,
> -                                                       head_pfn + nr_pages, page_mt);
> -                               if (ret)
> -                                       goto failed;
> +                       if (head_pfn + nr_pages <= boundary_pfn ||
> +                           PageHuge(page))
>                                 pfn = head_pfn + nr_pages;
>                                 continue;
>                         }
> @@ -432,7 +410,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>                          */
>                         VM_WARN_ON_ONCE_PAGE(PageLRU(page), page);
>                         VM_WARN_ON_ONCE_PAGE(__PageMovable(page), page);
> -#endif
>                         goto failed;
>                 }

That looks good to me.

Best Regards,
Yan, Zi
Kefeng Wang Aug. 16, 2024, 12:56 a.m. UTC | #8
On 2024/8/16 0:43, Zi Yan wrote:
> On 14 Aug 2024, at 22:58, Kefeng Wang wrote:
> 
>> On 2024/8/14 22:53, Zi Yan wrote:
>>> On 13 Aug 2024, at 22:01, Kefeng Wang wrote:
>>>
>>>> On 2024/8/13 22:59, Zi Yan wrote:
>>>>> On 13 Aug 2024, at 10:46, Kefeng Wang wrote:
>>>>>
>>>>>> On 2024/8/13 22:03, Matthew Wilcox wrote:
>>>>>>> On Tue, Aug 13, 2024 at 08:52:26PM +0800, Kefeng Wang wrote:
>>>>>>>> The gigantic page size may larger than memory block size, so memory
>>>>>>>> offline always fails in this case after commit b2c9e2fbba32 ("mm: make
>>>>>>>> alloc_contig_range work at pageblock granularity"),
>>>>>>>>
>>>>>>>> offline_pages
>>>>>>>>       start_isolate_page_range
>>>>>>>>         start_isolate_page_range(isolate_before=true)
>>>>>>>>           isolate [isolate_start, isolate_start + pageblock_nr_pages)
>>>>>>>>         start_isolate_page_range(isolate_before=false)
>>>>>>>>           isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock
>>>>>>>>            	__alloc_contig_migrate_range
>>>>>>>>               isolate_migratepages_range
>>>>>>>>                 isolate_migratepages_block
>>>>>>>>                   isolate_or_dissolve_huge_page
>>>>>>>>                     if (hstate_is_gigantic(h))
>>>>>>>>                         return -ENOMEM;
>>>>>>>>
>>>>>>>> [   15.815756] memory offlining [mem 0x3c0000000-0x3c7ffffff] failed due to failure to isolate range
>>>>>>>>
>>>>>>>> Fix it by skipping the __alloc_contig_migrate_range() if met gigantic
>>>>>>>> pages when memory offline, which return back to the original logic to
>>>>>>>> handle the gigantic pages.
>>>>>>>
>>>>>>> This seems like the wrong way to fix this.  The logic in the next
>>>>>>> PageHuge() section seems like it's specifically supposed to handle
>>>>>>> gigantic pages.  So you've just made that dead code, but instead of
>>>>>>> removing it, you've left it there to confuse everyone?
>>>>>>
>>>>>> isolate_single_pageblock() in start_isolate_page_range() will be called
>>>>>> from memory offline and contig allocation (alloc_contig_pages()), this
>>>>>> changes only restore the behavior from memory offline code, but we still
>>>>>> fail in contig allocation.
>>>>>>
>>>>>>    From memory offline, we has own path to isolate/migrate page or dissolve
>>>>>> free hugetlb folios, so I think we don't depends on the __alloc_contig_migrate_range().
>>>>>>>
>>>>>>> I admit to not understanding this code terribly well.
>>>>>>>
>>>>>> A quick search from [1], the isolate_single_pageblock() is added for
>>>>>> contig allocation, but it has negative effects on memory hotplug,
>>>>>> Zi Yan, could you give some comments?
>>>>>>
>>>>>> [1] https://lore.kernel.org/linux-mm/20220425143118.2850746-1-zi.yan@sent.com/
>>>>>
>>>>> Probably we can isolate the hugetlb page and use migrate_page() instead of
>>>>> __alloc_contig_migrate_range() in the section below, since we are targeting
>>>>> only hugetlb pages here. It should solve the issue.
>>>>
>>>> For contig allocation, I think we must isolate/migrate page in
>>>> __alloc_contig_migrate_range(), but for memory offline,(especially for
>>>> gigantic hugepage)as mentioned above, we already have own path to
>>>> isolate/migrate used page and dissolve the free pages,the
>>>> start_isolate_page_range() only need to mark page range MIGRATE_ISOLATE,
>>>> that is what we did before b2c9e2fbba32,
>>>>
>>>> start_isolate_page_range
>>>> scan_movable_pages
>>>>     do_migrate_range
>>>> dissolve_free_hugetlb_folios
>>>>
>>>> Do we really need isolate/migrate the hugetlb page and for memory
>>>> offline path?
>>>
>>> For memory offline path, there is do_migrate_range() to move the pages.
>>> For contig allocation, there is __alloc_contig_migrate_range() after
>>> isolation to migrate the pages.
>>>
>>> The migration code in isolate_single_pageblock() is not needed.
>>> Something like this would be OK, just skip the page and let either
>>> do_migrate_range() or __alloc_contig_migrate_range() to handle it:
>>
>> Oh, right, for alloc_contig_range(), we do have another __alloc_contig_migrate_range() after start_isolate_page_range(), then we
>> could drop the following code,
>>
>>>
>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>>> index 042937d5abe4..587d723711c5 100644
>>> --- a/mm/page_isolation.c
>>> +++ b/mm/page_isolation.c
>>> @@ -402,23 +402,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>>>
>>>    #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>>>                           if (PageHuge(page)) {
>>> -                               int page_mt = get_pageblock_migratetype(page);
>>> -                               struct compact_control cc = {
>>> -                                       .nr_migratepages = 0,
>>> -                                       .order = -1,
>>> -                                       .zone = page_zone(pfn_to_page(head_pfn)),
>>> -                                       .mode = MIGRATE_SYNC,
>>> -                                       .ignore_skip_hint = true,
>>> -                                       .no_set_skip_hint = true,
>>> -                                       .gfp_mask = gfp_flags,
>>> -                                       .alloc_contig = true,
>>> -                               };
>>> -                               INIT_LIST_HEAD(&cc.migratepages);
>>> -
>>> -                               ret = __alloc_contig_migrate_range(&cc, head_pfn,
>>> -                                                       head_pfn + nr_pages, page_mt);
>>> -                               if (ret)
>>> -                                       goto failed;
>>>                                   pfn = head_pfn + nr_pages;
>>>                                   continue;
>>>                           }
>>
>>
>> But we need to remove the CONFIG_COMPACTION/CMA too, thought?
>>
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index 042937d5abe4..785c2d320631 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -395,30 +395,8 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>>                          unsigned long head_pfn = page_to_pfn(head);
>>                          unsigned long nr_pages = compound_nr(head);
>>
>> -                       if (head_pfn + nr_pages <= boundary_pfn) {
>> -                               pfn = head_pfn + nr_pages;
>> -                               continue;
>> -                       }
>> -
>> -#if defined CONFIG_COMPACTION || defined CONFIG_CMA
>> -                       if (PageHuge(page)) {
>> -                               int page_mt = get_pageblock_migratetype(page);
>> -                               struct compact_control cc = {
>> -                                       .nr_migratepages = 0,
>> -                                       .order = -1,
>> -                                       .zone = page_zone(pfn_to_page(head_pfn)),
>> -                                       .mode = MIGRATE_SYNC,
>> -                                       .ignore_skip_hint = true,
>> -                                       .no_set_skip_hint = true,
>> -                                       .gfp_mask = gfp_flags,
>> -                                       .alloc_contig = true,
>> -                               };
>> -                               INIT_LIST_HEAD(&cc.migratepages);
>> -
>> -                               ret = __alloc_contig_migrate_range(&cc, head_pfn,
>> -                                                       head_pfn + nr_pages, page_mt);
>> -                               if (ret)
>> -                                       goto failed;
>> +                       if (head_pfn + nr_pages <= boundary_pfn ||
>> +                           PageHuge(page))
>>                                  pfn = head_pfn + nr_pages;
>>                                  continue;
>>                          }
>> @@ -432,7 +410,6 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>>                           */
>>                          VM_WARN_ON_ONCE_PAGE(PageLRU(page), page);
>>                          VM_WARN_ON_ONCE_PAGE(__PageMovable(page), page);
>> -#endif
>>                          goto failed;
>>                  }
> 
> That looks good to me.

Thanks for your comments, will send a new version soon.
> 
> Best Regards,
> Yan, Zi
diff mbox series

Patch

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 042937d5abe4..25db4040e70a 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -400,6 +400,16 @@  static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 				continue;
 			}
 
+			if ((flags & MEMORY_OFFLINE) && PageHuge(page)) {
+				struct hstate *h;
+
+				h = size_to_hstate(nr_pages << PAGE_SHIFT);
+				if (hstate_is_gigantic(h)) {
+					pfn = head_pfn + nr_pages;
+					continue;
+				}
+			}
+
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
 			if (PageHuge(page)) {
 				int page_mt = get_pageblock_migratetype(page);