
[2/4] mm: memory_hotplug: check hwpoisoned page firstly in do_migrate_range()

Message ID 20240725011647.1306045-3-wangkefeng.wang@huawei.com (mailing list archive)
State New
Series mm: memory_hotplug: improve do_migrate_range()

Commit Message

Kefeng Wang July 25, 2024, 1:16 a.m. UTC
Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
pages to be offlined") does not handle hugetlb pages, so offlining a
hwpoisoned hugetlb page could still end up in an endless loop. Luckily,
since commit e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug:
hotremove memory section with hwpoisoned hugepage"), HPageMigratable
is cleared for such a hugetlb page and it is skipped in
scan_movable_pages(), so the endless loop issue is fixed.

However, the hugetlb page may become hwpoisoned after the
HPageMigratable() check passes (the check is done without a reference
or lock). That won't cause a problem, since the hwpoisoned page is
handled correctly in the next movable-pages scan loop, but it will have
been isolated in do_migrate_range() only to fail migration. To avoid
that unnecessary isolation and to unify the handling of all hwpoisoned
pages, check for hwpoison first, unconditionally, and if it is a
hwpoisoned hugetlb page, try to unmap it as the catch-all safety net,
just like for a normal page.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/memory_hotplug.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

Comments

David Hildenbrand July 30, 2024, 10:26 a.m. UTC | #1
On 25.07.24 03:16, Kefeng Wang wrote:
> The commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
> pages to be offlined") don't handle the hugetlb pages, the dead loop
> still occur if offline a hwpoison hugetlb, luckly, after the commit
> e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug: hotremove memory
> section with hwpoisoned hugepage"), the HPageMigratable of hugetlb
> page will be clear, and the hwpoison hugetlb page will be skipped in
> scan_movable_pages(), so the deed loop issue is fixed.

did you mean "endless loop" ?

> 
> However if the HPageMigratable() check passed(without reference and
> lock), the hugetlb page may be hwpoisoned, it won't cause issue since
> the hwpoisoned page will be handled correctly in the next movable
> pages scan loop, and it will be isolated in do_migrate_range() and
> but fails to migrated. In order to avoid the unnecessary isolation and
> unify all hwpoisoned page handling, let's unconditionally check hwpoison
> firstly, and if it is a hwpoisoned hugetlb page, try to unmap it as
> the catch all safety net like normal page does.

But what's the benefit here besides slightly faster handling in an 
absolute corner case (I strongly suspect that we don't care)?

> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>   mm/memory_hotplug.c | 27 ++++++++++++++++-----------
>   1 file changed, 16 insertions(+), 11 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 66267c26ca1b..ccaf4c480aed 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1788,28 +1788,33 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>   		folio = page_folio(page);
>   		head = &folio->page;
>   
> -		if (PageHuge(page)) {
> -			pfn = page_to_pfn(head) + compound_nr(head) - 1;
> -			isolate_hugetlb(folio, &source);
> -			continue;
> -		} else if (PageTransHuge(page))
> -			pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
> -
>   		/*
>   		 * HWPoison pages have elevated reference counts so the migration would
>   		 * fail on them. It also doesn't make any sense to migrate them in the
>   		 * first place. Still try to unmap such a page in case it is still mapped
> -		 * (e.g. current hwpoison implementation doesn't unmap KSM pages but keep
> -		 * the unmap as the catch all safety net).
> +		 * (keep the unmap as the catch all safety net).
>   		 */
> -		if (PageHWPoison(page)) {
> +		if (unlikely(PageHWPoison(page))) {

We're not checking the head page here, will this work reliably for 
hugetlb? (I recall some difference in per-page hwpoison handling between 
hugetlb and THP due to the vmemmap optimization)
Kefeng Wang July 31, 2024, 5:09 a.m. UTC | #2
On 2024/7/30 18:26, David Hildenbrand wrote:
> On 25.07.24 03:16, Kefeng Wang wrote:
>> The commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
>> pages to be offlined") don't handle the hugetlb pages, the dead loop
>> still occur if offline a hwpoison hugetlb, luckly, after the commit
>> e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug: hotremove memory
>> section with hwpoisoned hugepage"), the HPageMigratable of hugetlb
>> page will be clear, and the hwpoison hugetlb page will be skipped in
>> scan_movable_pages(), so the deed loop issue is fixed.
> 
> did you mean "endless loop" ?

Exactly, will fix the words.

> 
>>
>> However if the HPageMigratable() check passed(without reference and
>> lock), the hugetlb page may be hwpoisoned, it won't cause issue since
>> the hwpoisoned page will be handled correctly in the next movable
>> pages scan loop, and it will be isolated in do_migrate_range() and
>> but fails to migrated. In order to avoid the unnecessary isolation and
>> unify all hwpoisoned page handling, let's unconditionally check hwpoison
>> firstly, and if it is a hwpoisoned hugetlb page, try to unmap it as
>> the catch all safety net like normal page does.
> 
> But what's the benefit here besides slightly faster handling in an 
> absolute corner case (I strongly suspect that we don't care)?

Yes, it is a very rare corner case. The goal is to move
isolate_hugetlb() after the HWPoison check, so that isolation and the
folio conversion can be unified (patch 4). But we must correctly handle
unmapping a hugetlb page when we meet a hwpoisoned one.

> 
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>   mm/memory_hotplug.c | 27 ++++++++++++++++-----------
>>   1 file changed, 16 insertions(+), 11 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 66267c26ca1b..ccaf4c480aed 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1788,28 +1788,33 @@ static void do_migrate_range(unsigned long 
>> start_pfn, unsigned long end_pfn)
>>           folio = page_folio(page);
>>           head = &folio->page;
>> -        if (PageHuge(page)) {
>> -            pfn = page_to_pfn(head) + compound_nr(head) - 1;
>> -            isolate_hugetlb(folio, &source);
>> -            continue;
>> -        } else if (PageTransHuge(page))
>> -            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
>> -
>>           /*
>>            * HWPoison pages have elevated reference counts so the 
>> migration would
>>            * fail on them. It also doesn't make any sense to migrate 
>> them in the
>>            * first place. Still try to unmap such a page in case it is 
>> still mapped
>> -         * (e.g. current hwpoison implementation doesn't unmap KSM 
>> pages but keep
>> -         * the unmap as the catch all safety net).
>> +         * (keep the unmap as the catch all safety net).
>>            */
>> -        if (PageHWPoison(page)) {
>> +        if (unlikely(PageHWPoison(page))) {
> 
> We're not checking the head page here, will this work reliably for 
> hugetlb? (I recall some difference in per-page hwpoison handling between 
> hugetlb and THP due to the vmemmap optimization)

Before this change, do_migrate_range() never tried to unmap a
hwpoisoned hugetlb page; we expect it to have already been unmapped in
memory_failure(). As the comment mentions, that unmap may fail, so this
is a new safeguard that tries to unmap it again here, but we don't have
to guarantee it.

unmap_posioned_folio() is used to correctly handle hugetlb pages in
shared mappings when we meet a hwpoisoned page (which may be the head
page or a subpage).
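For reference, a minimal sketch of the shape such a helper could take,
based only on the description above; the rmap-lock handling for shared
hugetlb mappings is my assumption here, not the actual implementation
from patch 1 of the series:

	/*
	 * Sketch only: file-backed (shared) hugetlb folios need the
	 * i_mmap rwsem held for write across the rmap walk, anonymous
	 * and regular folios do not.  Best effort, no guarantee.
	 */
	static void unmap_posioned_folio(struct folio *folio, enum ttu_flags ttu)
	{
		if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
			struct address_space *mapping;

			mapping = hugetlb_folio_mapping_lock_write(folio);
			if (!mapping)
				return;	/* could not take the lock; give up */

			try_to_unmap(folio, ttu | TTU_RMAP_LOCKED);
			i_mmap_unlock_write(mapping);
		} else {
			try_to_unmap(folio, ttu);
		}
	}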

> 
>
David Hildenbrand Aug. 1, 2024, 8:10 p.m. UTC | #3
>>
>> We're not checking the head page here, will this work reliably for
>> hugetlb? (I recall some difference in per-page hwpoison handling between
>> hugetlb and THP due to the vmemmap optimization)
> 
> Before this changes, the hwposioned hugetlb page won't try to unmap in
> do_migrate_range(), we hope it already unmapped in memory_failure(), as
> mentioned from comments, there maybe fail to unmap, so a new safeguard
> to try to unmap it again here, but we don't need to guarantee it.

Thanks for clarifying!

But I do wonder if the PageHWPoison() is the right thing to do for hugetlb.

IIUC, hugetlb requires folio_test_hwpoison() -- testing the head page 
not the subpage. Reason is that due to the vmemmap optimization we might 
not be able to modify subpages to set hwpoison.
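For readers following along, a rough summary of what these predicates
test, sketched as a tiny hypothetical helper; the descriptions are my
paraphrase and worth double-checking against include/linux/page-flags.h:

	/*
	 *   PageHWPoison(page)               - PG_hwpoison on this exact page
	 *   folio_test_hwpoison(folio)       - PG_hwpoison on the head page
	 *   folio_test_has_hwpoisoned(folio) - "some subpage is poisoned"
	 *                                      flag kept on large (THP) folios
	 *
	 * With the vmemmap optimization the tail struct pages may be
	 * read-only, so hugetlb marks the head page; testing only the
	 * subpage can therefore miss a poisoned hugetlb folio.
	 */
	static inline bool page_or_folio_hwpoisoned(struct page *page,
						    struct folio *folio)
	{
		/* hypothetical helper, only to illustrate the combined check */
		return PageHWPoison(page) || folio_test_hwpoison(folio);
	}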
David Hildenbrand Aug. 1, 2024, 8:14 p.m. UTC | #4
On 25.07.24 03:16, Kefeng Wang wrote:
> The commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
> pages to be offlined") don't handle the hugetlb pages, the dead loop
> still occur if offline a hwpoison hugetlb, luckly, after the commit
> e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug: hotremove memory
> section with hwpoisoned hugepage"), the HPageMigratable of hugetlb
> page will be clear, and the hwpoison hugetlb page will be skipped in
> scan_movable_pages(), so the deed loop issue is fixed.
> 
> However if the HPageMigratable() check passed(without reference and
> lock), the hugetlb page may be hwpoisoned, it won't cause issue since
> the hwpoisoned page will be handled correctly in the next movable
> pages scan loop, and it will be isolated in do_migrate_range() and
> but fails to migrated. In order to avoid the unnecessary isolation and
> unify all hwpoisoned page handling, let's unconditionally check hwpoison
> firstly, and if it is a hwpoisoned hugetlb page, try to unmap it as
> the catch all safety net like normal page does.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>   mm/memory_hotplug.c | 27 ++++++++++++++++-----------
>   1 file changed, 16 insertions(+), 11 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 66267c26ca1b..ccaf4c480aed 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1788,28 +1788,33 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
>   		folio = page_folio(page);
>   		head = &folio->page;
>   
> -		if (PageHuge(page)) {
> -			pfn = page_to_pfn(head) + compound_nr(head) - 1;
> -			isolate_hugetlb(folio, &source);
> -			continue;
> -		} else if (PageTransHuge(page))
> -			pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
> -
>   		/*
>   		 * HWPoison pages have elevated reference counts so the migration would
>   		 * fail on them. It also doesn't make any sense to migrate them in the
>   		 * first place. Still try to unmap such a page in case it is still mapped
> -		 * (e.g. current hwpoison implementation doesn't unmap KSM pages but keep
> -		 * the unmap as the catch all safety net).
> +		 * (keep the unmap as the catch all safety net).
>   		 */
> -		if (PageHWPoison(page)) {
> +		if (unlikely(PageHWPoison(page))) {
> +			folio = page_folio(page);
> +
>   			if (WARN_ON(folio_test_lru(folio)))
>   				folio_isolate_lru(folio);
> +
>   			if (folio_mapped(folio))
> -				try_to_unmap(folio, TTU_IGNORE_MLOCK);
> +				unmap_posioned_folio(folio, TTU_IGNORE_MLOCK);
> +
> +			if (folio_test_large(folio))
> +				pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>   			continue;
>   		}
>   
> +		if (PageHuge(page)) {
> +			pfn = page_to_pfn(head) + compound_nr(head) - 1;
> +			isolate_hugetlb(folio, &source);
> +			continue;
> +		} else if (PageTransHuge(page))
> +			pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;

If we can use a folio in the PageHWPoison() case, can we use one here as 
well? I know that it's all unreliable when not holding a folio 
reference, and we have to be a bit careful.

It feels like using folios here would mostly be fine, because things 
like PageHuge() already use folios internally.

And using it in the PageHWPoison() but not here looks a bit odd.

The important part is that we don't segfault if we'd overshoot our target.
Kefeng Wang Aug. 2, 2024, 7:50 a.m. UTC | #5
On 2024/8/2 4:10, David Hildenbrand wrote:
> 
>>>
>>> We're not checking the head page here, will this work reliably for
>>> hugetlb? (I recall some difference in per-page hwpoison handling between
>>> hugetlb and THP due to the vmemmap optimization)
>>
>> Before this changes, the hwposioned hugetlb page won't try to unmap in
>> do_migrate_range(), we hope it already unmapped in memory_failure(), as
>> mentioned from comments, there maybe fail to unmap, so a new safeguard
>> to try to unmap it again here, but we don't need to guarantee it.
> 
> Thanks for clarifying!
> 
> But I do wonder if the PageHWPoison() is the right thing to do for hugetlb.
> 
> IIUC, hugetlb requires folio_test_hwpoison() -- testing the head page 
> not the subpage. Reason is that due to the vmemmap optimization we might 
> not be able to modify subpages to set hwpoison.

Yes, HVO is special (only the head page carries the hwpoison flag).
Since we always want to check the head page here (the next pfn is
head_pfn + nr), it might be enough to use PageHWPoison() alone, but
just in case, let's add a hwpoison check for the head page:

	if (unlikely(PageHWPoison(page) || folio_test_hwpoison(folio)))
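In the context of the hunk, that would make the hwpoison branch look
roughly like this (a sketch only, combining the patch below with the
extra head-page check; unmap_posioned_folio() is the helper introduced
earlier in this series):

	/* Sketch: hwpoison branch with the head-page check folded in. */
	if (unlikely(PageHWPoison(page) || folio_test_hwpoison(folio))) {
		if (WARN_ON(folio_test_lru(folio)))
			folio_isolate_lru(folio);

		/* Best-effort safety net in case memory_failure() left it mapped. */
		if (folio_mapped(folio))
			unmap_posioned_folio(folio, TTU_IGNORE_MLOCK);

		/* Skip the remaining pages of a large (hugetlb/THP) folio. */
		if (folio_test_large(folio))
			pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
		continue;
	}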
Kefeng Wang Aug. 2, 2024, 8:02 a.m. UTC | #6
On 2024/8/2 4:14, David Hildenbrand wrote:
> On 25.07.24 03:16, Kefeng Wang wrote:
>> The commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
>> pages to be offlined") don't handle the hugetlb pages, the dead loop
>> still occur if offline a hwpoison hugetlb, luckly, after the commit
>> e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug: hotremove memory
>> section with hwpoisoned hugepage"), the HPageMigratable of hugetlb
>> page will be clear, and the hwpoison hugetlb page will be skipped in
>> scan_movable_pages(), so the deed loop issue is fixed.
>>
>> However if the HPageMigratable() check passed(without reference and
>> lock), the hugetlb page may be hwpoisoned, it won't cause issue since
>> the hwpoisoned page will be handled correctly in the next movable
>> pages scan loop, and it will be isolated in do_migrate_range() and
>> but fails to migrated. In order to avoid the unnecessary isolation and
>> unify all hwpoisoned page handling, let's unconditionally check hwpoison
>> firstly, and if it is a hwpoisoned hugetlb page, try to unmap it as
>> the catch all safety net like normal page does.
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>   mm/memory_hotplug.c | 27 ++++++++++++++++-----------
>>   1 file changed, 16 insertions(+), 11 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 66267c26ca1b..ccaf4c480aed 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1788,28 +1788,33 @@ static void do_migrate_range(unsigned long 
>> start_pfn, unsigned long end_pfn)
>>           folio = page_folio(page);
>>           head = &folio->page;
>> -        if (PageHuge(page)) {
>> -            pfn = page_to_pfn(head) + compound_nr(head) - 1;
>> -            isolate_hugetlb(folio, &source);
>> -            continue;
>> -        } else if (PageTransHuge(page))
>> -            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
>> -
>>           /*
>>            * HWPoison pages have elevated reference counts so the 
>> migration would
>>            * fail on them. It also doesn't make any sense to migrate 
>> them in the
>>            * first place. Still try to unmap such a page in case it is 
>> still mapped
>> -         * (e.g. current hwpoison implementation doesn't unmap KSM 
>> pages but keep
>> -         * the unmap as the catch all safety net).
>> +         * (keep the unmap as the catch all safety net).
>>            */
>> -        if (PageHWPoison(page)) {
>> +        if (unlikely(PageHWPoison(page))) {
>> +            folio = page_folio(page);
>> +
>>               if (WARN_ON(folio_test_lru(folio)))
>>                   folio_isolate_lru(folio);
>> +
>>               if (folio_mapped(folio))
>> -                try_to_unmap(folio, TTU_IGNORE_MLOCK);
>> +                unmap_posioned_folio(folio, TTU_IGNORE_MLOCK);
>> +
>> +            if (folio_test_large(folio))
>> +                pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>>               continue;
>>           }
>> +        if (PageHuge(page)) {
>> +            pfn = page_to_pfn(head) + compound_nr(head) - 1;
>> +            isolate_hugetlb(folio, &source);
>> +            continue;
>> +        } else if (PageTransHuge(page))
>> +            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
> 
> If we can use a folio in the PageHWPoison() case, can we use one here as 
> well? I know that it's all unreliable when not holding a folio 
> reference, and we have to be a bit careful.

Using a folio here is part of patch 4; I want to unify hugetlb/THP
(i.e. large folio) handling with "pfn = folio_pfn(folio) +
folio_nr_pages(folio) - 1" once we hold a reference on the large folio.

> 
> It feels like using folios here would mostly be fine, because things 
> like PageHuge() already use folios internally.
> 
> And using it in the PageHWPoison() but not here looks a bit odd.

We will convert to use folio in the following patch.

> 
> The important part is that we don't segfault if we'd overshoot our target.
>
Kefeng Wang Aug. 6, 2024, 3:44 a.m. UTC | #7
Hi David, I have some question,

On 2024/8/2 16:02, Kefeng Wang wrote:
> 
...
>>>            */
>>> -        if (PageHWPoison(page)) {
>>> +        if (unlikely(PageHWPoison(page))) {
>>> +            folio = page_folio(page);
>>> +
>>>               if (WARN_ON(folio_test_lru(folio)))
>>>                   folio_isolate_lru(folio);
>>> +
>>>               if (folio_mapped(folio))
>>> -                try_to_unmap(folio, TTU_IGNORE_MLOCK);
>>> +                unmap_posioned_folio(folio, TTU_IGNORE_MLOCK);
>>> +
>>> +            if (folio_test_large(folio))
>>> +                pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>>>               continue;
>>>           }
>>> +        if (PageHuge(page)) {
>>> +            pfn = page_to_pfn(head) + compound_nr(head) - 1;
>>> +            isolate_hugetlb(folio, &source);
>>> +            continue;
>>> +        } else if (PageTransHuge(page))

If the page is a tail page, we will hit the BUG_ON here (with DEBUG_VM
enabled), but it seems we don't guarantee that the page isn't a tail page.

>>> +            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;

thp_nr_pages() needs a head page; I think it should use head here, so we
can directly use folio_nr_pages().

>>
>> If we can use a folio in the PageHWPoison() case, can we use one here 
>> as well? I know that it's all unreliable when not holding a folio 
>> reference, and we have to be a bit careful.
> 
> Using a folio here is part of patch4, I want to unify hugetlb/thp(or 
> large folio) with "pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1" 
> when large folio after get a ref.

Thinking about it again: even if we don't hold a reference on the folio
(it could be split concurrently, or something else) and folio_nr_pages()
returns an incorrect value, it won't cause a problem, since we will loop,
find movable pages again in scan_movable_pages() and try to isolate
them. So we could directly use:

if (folio_test_large(folio)) {
	pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
	if (folio_test_hugetlb(folio))
		isolate_hugetlb(folio, &source);
}

Correct me if I am wrong, thanks.

> 
>>
>> It feels like using folios here would mostly be fine, because things 
>> like PageHuge() already use folios internally.
>>
>> And using it in the PageHWPoison() but not here looks a bit odd.
> 
> We will convert to use folio in the following patch.
> 
>>
>> The important part is that we don't segfault if we'd overshoot our 
>> target.
>>
>
David Hildenbrand Aug. 6, 2024, 9:15 a.m. UTC | #8
On 02.08.24 10:02, Kefeng Wang wrote:
> 
> 
> On 2024/8/2 4:14, David Hildenbrand wrote:
>> On 25.07.24 03:16, Kefeng Wang wrote:
>>> The commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned
>>> pages to be offlined") don't handle the hugetlb pages, the dead loop
>>> still occur if offline a hwpoison hugetlb, luckly, after the commit
>>> e591ef7d96d6 ("mm,hwpoison,hugetlb,memory_hotplug: hotremove memory
>>> section with hwpoisoned hugepage"), the HPageMigratable of hugetlb
>>> page will be clear, and the hwpoison hugetlb page will be skipped in
>>> scan_movable_pages(), so the deed loop issue is fixed.
>>>
>>> However if the HPageMigratable() check passed(without reference and
>>> lock), the hugetlb page may be hwpoisoned, it won't cause issue since
>>> the hwpoisoned page will be handled correctly in the next movable
>>> pages scan loop, and it will be isolated in do_migrate_range() and
>>> but fails to migrated. In order to avoid the unnecessary isolation and
>>> unify all hwpoisoned page handling, let's unconditionally check hwpoison
>>> firstly, and if it is a hwpoisoned hugetlb page, try to unmap it as
>>> the catch all safety net like normal page does.
>>>
>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> ---
>>>    mm/memory_hotplug.c | 27 ++++++++++++++++-----------
>>>    1 file changed, 16 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>> index 66267c26ca1b..ccaf4c480aed 100644
>>> --- a/mm/memory_hotplug.c
>>> +++ b/mm/memory_hotplug.c
>>> @@ -1788,28 +1788,33 @@ static void do_migrate_range(unsigned long
>>> start_pfn, unsigned long end_pfn)
>>>            folio = page_folio(page);
>>>            head = &folio->page;
>>> -        if (PageHuge(page)) {
>>> -            pfn = page_to_pfn(head) + compound_nr(head) - 1;
>>> -            isolate_hugetlb(folio, &source);
>>> -            continue;
>>> -        } else if (PageTransHuge(page))
>>> -            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
>>> -
>>>            /*
>>>             * HWPoison pages have elevated reference counts so the
>>> migration would
>>>             * fail on them. It also doesn't make any sense to migrate
>>> them in the
>>>             * first place. Still try to unmap such a page in case it is
>>> still mapped
>>> -         * (e.g. current hwpoison implementation doesn't unmap KSM
>>> pages but keep
>>> -         * the unmap as the catch all safety net).
>>> +         * (keep the unmap as the catch all safety net).
>>>             */
>>> -        if (PageHWPoison(page)) {
>>> +        if (unlikely(PageHWPoison(page))) {
>>> +            folio = page_folio(page);
>>> +
>>>                if (WARN_ON(folio_test_lru(folio)))
>>>                    folio_isolate_lru(folio);
>>> +
>>>                if (folio_mapped(folio))
>>> -                try_to_unmap(folio, TTU_IGNORE_MLOCK);
>>> +                unmap_posioned_folio(folio, TTU_IGNORE_MLOCK);
>>> +
>>> +            if (folio_test_large(folio))
>>> +                pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>>>                continue;
>>>            }
>>> +        if (PageHuge(page)) {
>>> +            pfn = page_to_pfn(head) + compound_nr(head) - 1;
>>> +            isolate_hugetlb(folio, &source);
>>> +            continue;
>>> +        } else if (PageTransHuge(page))
>>> +            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
>>
>> If we can use a folio in the PageHWPoison() case, can we use one here as
>> well? I know that it's all unreliable when not holding a folio
>> reference, and we have to be a bit careful.
> 
> Using a folio here is part of patch4, I want to unify hugetlb/thp(or
> large folio) with "pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1"
> when large folio after get a ref.

The thing is that it looks weird in the context of this patch to use a 
folio on path a but not on path b.
David Hildenbrand Aug. 6, 2024, 9:24 a.m. UTC | #9
On 06.08.24 05:44, Kefeng Wang wrote:
> Hi David, I have some question,
> 
> On 2024/8/2 16:02, Kefeng Wang wrote:
>>
> ...
>>>>             */
>>>> -        if (PageHWPoison(page)) {
>>>> +        if (unlikely(PageHWPoison(page))) {
>>>> +            folio = page_folio(page);
>>>> +
>>>>                if (WARN_ON(folio_test_lru(folio)))
>>>>                    folio_isolate_lru(folio);
>>>> +
>>>>                if (folio_mapped(folio))
>>>> -                try_to_unmap(folio, TTU_IGNORE_MLOCK);
>>>> +                unmap_posioned_folio(folio, TTU_IGNORE_MLOCK);
>>>> +
>>>> +            if (folio_test_large(folio))
>>>> +                pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
>>>>                continue;
>>>>            }
>>>> +        if (PageHuge(page)) {
>>>> +            pfn = page_to_pfn(head) + compound_nr(head) - 1;
>>>> +            isolate_hugetlb(folio, &source);
>>>> +            continue;
>>>> +        } else if (PageTransHuge(page))
> 
> If the page is a tail page, we will BUG_ON(DEBUG_VM enabled) here, but
> it seems that we don't guarantee the page won't be a tail page.

Maybe at some point we might want to remove these sanity checks or have 
explicit, expected-to-be-racy folio functions.

Like folio_test_hugetlb_racy(), folio_test_large_racy(), 
folio_nr_pages_racy().

Because the VM_DEBUG checks for folio_test_large() etc. actually make 
sense in other contexts, where we know that concurrent splitting is 
impossible.

But maybe part of the puzzle in the future will be that we want to take 
an RCU read lock here and perform freeing/splitting under RCU, once we'll 
also have to alloc/free the "struct folio".

> 
>>>> +            pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
> 
> thp_nr_pages() need a head page, I think it should use head here, so we
> can directly use folio_nr_pages().
> 
>>>
>>> If we can use a folio in the PageHWPoison() case, can we use one here
>>> as well? I know that it's all unreliable when not holding a folio
>>> reference, and we have to be a bit careful.
>>
>> Using a folio here is part of patch4, I want to unify hugetlb/thp(or
>> large folio) with "pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1"
>> when large folio after get a ref.
> 
> Think it again, even the folio don't hold a ref(splitting concurrently
> or something else), folio_nr_pages return incorrect, it won't cause
> issue since we will loop and find movable pages again in
> scan_movable_pages() and try to isolate pages, so directly use
> 
> if (folio_test_large(folio)) {
> 	pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
> 	if (folio_test_hugetlb(folio))
> 		isolate_hugetlb(folio, &source);
> }

Likely we should add a comment here that a large folio might get split 
concurrently and that folio_nr_pages() might read garbage. But our loop 
should handle that, and we would revisit the split folio later.
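Something along these lines, perhaps (the snippet from above with the
suggested comment added; the wording of the comment is mine):

	/*
	 * Without holding a reference, the folio may be split (or freed)
	 * concurrently, so folio_test_large()/folio_nr_pages() can read
	 * stale values and the computed pfn may be off.  That is tolerable
	 * here: scan_movable_pages() will find anything we skipped
	 * incorrectly and we will retry it on a later iteration.
	 */
	if (folio_test_large(folio)) {
		pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
		if (folio_test_hugetlb(folio))
			isolate_hugetlb(folio, &source);
	}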
David Hildenbrand Aug. 6, 2024, 9:29 a.m. UTC | #10
On 02.08.24 09:50, Kefeng Wang wrote:
> 
> 
> On 2024/8/2 4:10, David Hildenbrand wrote:
>>
>>>>
>>>> We're not checking the head page here, will this work reliably for
>>>> hugetlb? (I recall some difference in per-page hwpoison handling between
>>>> hugetlb and THP due to the vmemmap optimization)
>>>
>>> Before this changes, the hwposioned hugetlb page won't try to unmap in
>>> do_migrate_range(), we hope it already unmapped in memory_failure(), as
>>> mentioned from comments, there maybe fail to unmap, so a new safeguard
>>> to try to unmap it again here, but we don't need to guarantee it.
>>
>> Thanks for clarifying!
>>
>> But I do wonder if the PageHWPoison() is the right thing to do for hugetlb.
>>
>> IIUC, hugetlb requires folio_test_hwpoison() -- testing the head page
>> not the subpage. Reason is that due to the vmemmap optimization we might
>> not be able to modify subpages to set hwpoison.
> 
> Yes, HVO is special(only head page with hwpoison), since we always want
> to check head page here (next pfn = head_pfn + nr), so it might be
> enough to only use PageHWpoison, but just in case, adding hwpoison check
> for the head page,
> 
> 	if (unlikely(PageHWPoison(page) || folio_test_hwpoison(folio)))

I also do wonder whether, for large folios, we have to check 
folio_test_has_hwpoisoned():
whether any subpage is poisoned, not just the current page.

I don't think folio_set_has_hwpoisoned() is used for hugetlb.

What a mess.
Miaohe Lin Aug. 9, 2024, 2:02 a.m. UTC | #11
On 2024/8/7 19:14, David Hildenbrand wrote:
> On 07.08.24 09:39, Miaohe Lin wrote:
>> On 2024/8/6 17:29, David Hildenbrand wrote:
>>> On 02.08.24 09:50, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2024/8/2 4:10, David Hildenbrand wrote:
>>>>>
>>>>>>>
>>>>>>> We're not checking the head page here, will this work reliably for
>>>>>>> hugetlb? (I recall some difference in per-page hwpoison handling between
>>>>>>> hugetlb and THP due to the vmemmap optimization)
>>>>>>
>>>>>> Before this changes, the hwposioned hugetlb page won't try to unmap in
>>>>>> do_migrate_range(), we hope it already unmapped in memory_failure(), as
>>>>>> mentioned from comments, there maybe fail to unmap, so a new safeguard
>>>>>> to try to unmap it again here, but we don't need to guarantee it.
>>>>>
>>>>> Thanks for clarifying!
>>>>>
>>>>> But I do wonder if the PageHWPoison() is the right thing to do for hugetlb.
>>>>>
>>>>> IIUC, hugetlb requires folio_test_hwpoison() -- testing the head page
>>>>> not the subpage. Reason is that due to the vmemmap optimization we might
>>>>> not be able to modify subpages to set hwpoison.
>>>>
>>>> Yes, HVO is special(only head page with hwpoison), since we always want
>>>> to check head page here (next pfn = head_pfn + nr), so it might be
>>>> enough to only use PageHWpoison, but just in case, adding hwpoison check
>>>> for the head page,
>>>>
>>>>      if (unlikely(PageHWPoison(page) || folio_test_hwpoison(folio)))
>>>
>>> I also do wonder if we have to check for large folios folio_test_has_hwpoison():
>>> if any subpage is poisoned, not just the current page.
>>>
>>
>> IMHO, below if condition [1] should be fine to check for any hwpoisoned folio:
>>
>>   if (folio_test_hwpoison(folio) ||
>>        (folio_test_large(folio) && folio_test_has_hwpoisoned(folio))) {
>>
>> 1. For raw pages, folio_test_hwpoison(folio) works fine.
>> 2. For thp (memory_failure fails to split it in first place), folio_test_has_hwpoisoned(folio) works fine.
>> 3. For hugetlb, we always have hwpoison flag set for folio. So folio_test_hwpoison(folio) works fine.

It seems I missed one corner case. When memory_failure() meets an isolated thp, get_hwpoison_page() will return EIO and the
thp won't have the has_hwpoisoned flag set. The above pattern might not work in that case. :(

>>
>> But folio might not be the right hwpoisoned page, i.e. subpages might be hwpoisoned instead.
>> Or am I miss something?
> 
> Yes, but we can only migrate full folios, and if any subpage is poisoned we're in trouble and have to effectively force-unmap it?

Yes, I agree with you.

> 
> At least that's my understanding :)

Thanks.
.

Patch

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 66267c26ca1b..ccaf4c480aed 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1788,28 +1788,33 @@  static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		folio = page_folio(page);
 		head = &folio->page;
 
-		if (PageHuge(page)) {
-			pfn = page_to_pfn(head) + compound_nr(head) - 1;
-			isolate_hugetlb(folio, &source);
-			continue;
-		} else if (PageTransHuge(page))
-			pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
-
 		/*
 		 * HWPoison pages have elevated reference counts so the migration would
 		 * fail on them. It also doesn't make any sense to migrate them in the
 		 * first place. Still try to unmap such a page in case it is still mapped
-		 * (e.g. current hwpoison implementation doesn't unmap KSM pages but keep
-		 * the unmap as the catch all safety net).
+		 * (keep the unmap as the catch all safety net).
 		 */
-		if (PageHWPoison(page)) {
+		if (unlikely(PageHWPoison(page))) {
+			folio = page_folio(page);
+
 			if (WARN_ON(folio_test_lru(folio)))
 				folio_isolate_lru(folio);
+
 			if (folio_mapped(folio))
-				try_to_unmap(folio, TTU_IGNORE_MLOCK);
+				unmap_posioned_folio(folio, TTU_IGNORE_MLOCK);
+
+			if (folio_test_large(folio))
+				pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;
 			continue;
 		}
 
+		if (PageHuge(page)) {
+			pfn = page_to_pfn(head) + compound_nr(head) - 1;
+			isolate_hugetlb(folio, &source);
+			continue;
+		} else if (PageTransHuge(page))
+			pfn = page_to_pfn(head) + thp_nr_pages(page) - 1;
+
 		if (!get_page_unless_zero(page))
 			continue;
 		/*