
[2/2] mm: support poison recovery from copy_present_page()

Message ID 20240906024201.1214712-3-wangkefeng.wang@huawei.com (mailing list archive)
State New
Series mm: hwpoison: two more poison recovery

Commit Message

Kefeng Wang Sept. 6, 2024, 2:42 a.m. UTC
Similar to other poison recovery paths, use copy_mc_user_highpage() to
avoid a potential kernel panic while copying a page in copy_present_page()
during fork. Once the copy fails due to hwpoison in the source page, we
need to break out of the copy loop in copy_pte_range() and release the
prealloc folio, so copy_mc_user_highpage() is called before *prealloc is
set to NULL.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/memory.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
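
[Editor's note] For readers less familiar with mm/memory.c, below is a minimal
userspace sketch of the ownership pattern the patch relies on: the copy step may
fail, and because the preallocated buffer has not yet been handed over, the
caller can simply release it and propagate a distinct error instead of panicking.
The helper names here (copy_may_fail(), copy_one()) are invented for
illustration only; the real code uses copy_mc_user_highpage() and the folio
machinery shown in the diff.

/*
 * Illustrative userspace analogy, not kernel code: a copy step that can
 * fail, with the caller keeping ownership of its preallocated buffer
 * until the copy has succeeded.
 */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifndef EHWPOISON
#define EHWPOISON EIO	/* fallback for libcs that do not define it */
#endif

/* Stand-in for copy_mc_user_highpage(): 0 on success, nonzero if the
 * source could not be read (simulated here by a flag). */
static int copy_may_fail(char *dst, const char *src, size_t len, int poisoned)
{
	if (poisoned)
		return -1;
	memcpy(dst, src, len);
	return 0;
}

/* Stand-in for copy_present_page(): only consumes the prealloc buffer
 * after the copy has succeeded, mirroring how the patch moves the copy
 * ahead of "*prealloc = NULL". */
static int copy_one(char **prealloc, const char *src, size_t len, int poisoned)
{
	char *dst = *prealloc;

	if (copy_may_fail(dst, src, len, poisoned))
		return -EHWPOISON;	/* prealloc is still owned by the caller */

	*prealloc = NULL;		/* ownership transferred */
	printf("copied: %s\n", dst);
	return 0;
}

int main(void)
{
	const char *src = "page contents";
	char *prealloc = malloc(32);
	int ret;

	if (!prealloc)
		return 1;

	/* Simulate hwpoison in the source page. */
	ret = copy_one(&prealloc, src, strlen(src) + 1, 1);
	if (ret == -EHWPOISON) {
		fprintf(stderr, "copy failed, releasing prealloc and bailing out\n");
		free(prealloc);	/* safe: copy_one() did not consume it */
	}
	return 0;
}
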

Comments

Jane Chu Sept. 6, 2024, 11:14 p.m. UTC | #1
On 9/5/2024 7:42 PM, Kefeng Wang wrote:

> Similar to other poison recovery paths, use copy_mc_user_highpage() to
> avoid a potential kernel panic while copying a page in copy_present_page()
> during fork. Once the copy fails due to hwpoison in the source page, we
> need to break out of the copy loop in copy_pte_range() and release the
> prealloc folio, so copy_mc_user_highpage() is called before *prealloc is
> set to NULL.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>   mm/memory.c | 10 +++++++---
>   1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index d310c073a1b3..6e7b78e49d1a 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -926,8 +926,11 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
>   	 * We have a prealloc page, all good!  Take it
>   	 * over and copy the page & arm it.
>   	 */
> +
> +	if (copy_mc_user_highpage(&new_folio->page, page, addr, src_vma))
> +		return -EHWPOISON;
> +
>   	*prealloc = NULL;
> -	copy_user_highpage(&new_folio->page, page, addr, src_vma);
>   	__folio_mark_uptodate(new_folio);
>   	folio_add_new_anon_rmap(new_folio, dst_vma, addr, RMAP_EXCLUSIVE);
>   	folio_add_lru_vma(new_folio, dst_vma);
> @@ -1166,8 +1169,9 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
>   		/*
>   		 * If we need a pre-allocated page for this pte, drop the
>   		 * locks, allocate, and try again.
> +		 * If copy failed due to hwpoison in source page, break out.
>   		 */
> -		if (unlikely(ret == -EAGAIN))
> +		if (unlikely(ret == -EAGAIN || ret == -EHWPOISON))
>   			break;
>   		if (unlikely(prealloc)) {
>   			/*
> @@ -1197,7 +1201,7 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
>   			goto out;
>   		}
>   		entry.val = 0;
> -	} else if (ret == -EBUSY) {
> +	} else if (ret == -EBUSY || unlikely(ret == -EHWPOISON)) {
>   		goto out;
>   	} else if (ret ==  -EAGAIN) {
>   		prealloc = folio_prealloc(src_mm, src_vma, addr, false);

Looks good.

Reviewed-by: Jane Chu <jane.chu@oracle.com>

-jane
Miaohe Lin Sept. 10, 2024, 2:19 a.m. UTC | #2
On 2024/9/6 10:42, Kefeng Wang wrote:
> Similar to other poison recovery paths, use copy_mc_user_highpage() to
> avoid a potential kernel panic while copying a page in copy_present_page()
> during fork. Once the copy fails due to hwpoison in the source page, we
> need to break out of the copy loop in copy_pte_range() and release the
> prealloc folio, so copy_mc_user_highpage() is called before *prealloc is
> set to NULL.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  mm/memory.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index d310c073a1b3..6e7b78e49d1a 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -926,8 +926,11 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
>  	 * We have a prealloc page, all good!  Take it
>  	 * over and copy the page & arm it.
>  	 */
> +
> +	if (copy_mc_user_highpage(&new_folio->page, page, addr, src_vma))
> +		return -EHWPOISON;
> +
>  	*prealloc = NULL;
> -	copy_user_highpage(&new_folio->page, page, addr, src_vma);
>  	__folio_mark_uptodate(new_folio);
>  	folio_add_new_anon_rmap(new_folio, dst_vma, addr, RMAP_EXCLUSIVE);
>  	folio_add_lru_vma(new_folio, dst_vma);
> @@ -1166,8 +1169,9 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
>  		/*
>  		 * If we need a pre-allocated page for this pte, drop the
>  		 * locks, allocate, and try again.
> +		 * If copy failed due to hwpoison in source page, break out.
>  		 */
> -		if (unlikely(ret == -EAGAIN))
> +		if (unlikely(ret == -EAGAIN || ret == -EHWPOISON))

Would it be better to put the check of ret against -EHWPOISON on a new line? The -EAGAIN case will enter the
loop again, but the -EHWPOISON case never does.

>  			break;
>  		if (unlikely(prealloc)) {
>  			/*
> @@ -1197,7 +1201,7 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
>  			goto out;
>  		}
>  		entry.val = 0;
> -	} else if (ret == -EBUSY) {
> +	} else if (ret == -EBUSY || unlikely(ret == -EHWPOISON)) {

The caller of copy_pte_range() always sets the errno to -ENOMEM, so fork will fail with ENOMEM even if the real
cause is a copy failure due to hwpoison in the source page. It's a pity.

Thanks.
.
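
[Editor's note] To illustrate Miaohe's point about the errno collapse, here is a
minimal sketch (invented helper names, not kernel code) of how a distinct inner
error such as -EHWPOISON is flattened into -ENOMEM by the layer above, which is
what fork()'s caller ends up seeing:

/*
 * Illustrative sketch of error-code collapse: the inner layer can fail
 * for different reasons, but the outer layer reports only one errno,
 * so the original cause is invisible to the top-level caller.
 */
#include <errno.h>
#include <stdio.h>

#ifndef EHWPOISON
#define EHWPOISON EIO	/* fallback for libcs that do not define it */
#endif

/* Inner layer: may fail with a specific reason. */
static int copy_range_inner(int simulate_poison)
{
	return simulate_poison ? -EHWPOISON : 0;
}

/* Outer layer: mirrors how copy_pte_range()'s callers treat any nonzero
 * return as -ENOMEM, discarding the original error code. */
static int copy_range_outer(int simulate_poison)
{
	if (copy_range_inner(simulate_poison))
		return -ENOMEM;
	return 0;
}

int main(void)
{
	int ret = copy_range_outer(1);

	/* The top-level caller only ever sees -ENOMEM here, even though
	 * the real cause was hwpoison in the source. */
	printf("outer layer reported %d (-ENOMEM is %d)\n", ret, -ENOMEM);
	return 0;
}
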
Kefeng Wang Sept. 10, 2024, 6:35 a.m. UTC | #3
On 2024/9/10 10:19, Miaohe Lin wrote:
> On 2024/9/6 10:42, Kefeng Wang wrote:
>> Similar to other poison recovery paths, use copy_mc_user_highpage() to
>> avoid a potential kernel panic while copying a page in copy_present_page()
>> during fork. Once the copy fails due to hwpoison in the source page, we
>> need to break out of the copy loop in copy_pte_range() and release the
>> prealloc folio, so copy_mc_user_highpage() is called before *prealloc is
>> set to NULL.
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>>   mm/memory.c | 10 +++++++---
>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index d310c073a1b3..6e7b78e49d1a 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -926,8 +926,11 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
>>   	 * We have a prealloc page, all good!  Take it
>>   	 * over and copy the page & arm it.
>>   	 */
>> +
>> +	if (copy_mc_user_highpage(&new_folio->page, page, addr, src_vma))
>> +		return -EHWPOISON;
>> +
>>   	*prealloc = NULL;
>> -	copy_user_highpage(&new_folio->page, page, addr, src_vma);
>>   	__folio_mark_uptodate(new_folio);
>>   	folio_add_new_anon_rmap(new_folio, dst_vma, addr, RMAP_EXCLUSIVE);
>>   	folio_add_lru_vma(new_folio, dst_vma);
>> @@ -1166,8 +1169,9 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
>>   		/*
>>   		 * If we need a pre-allocated page for this pte, drop the
>>   		 * locks, allocate, and try again.
>> +		 * If copy failed due to hwpoison in source page, break out.
>>   		 */
>> -		if (unlikely(ret == -EAGAIN))
>> +		if (unlikely(ret == -EAGAIN || ret == -EHWPOISON))
> 
> Would it be better to put the check of ret against -EHWPOISON on a new line? The -EAGAIN case will enter the
> loop again, but the -EHWPOISON case never does.

Maybe not a new line, since we recheck ret below and return directly
when copy_present_page() fails with -EHWPOISON.

> 
>>   			break;
>>   		if (unlikely(prealloc)) {
>>   			/*
>> @@ -1197,7 +1201,7 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
>>   			goto out;
>>   		}
>>   		entry.val = 0;
>> -	} else if (ret == -EBUSY) {
>> +	} else if (ret == -EBUSY || unlikely(ret == -EHWPOISON)) {
> 
> The caller of copy_pte_range() always sets the errno to -ENOMEM, so fork will fail with ENOMEM even if the real
> cause is a copy failure due to hwpoison in the source page. It's a pity.

Yes, it's not just the new -EHWPOISON: all other errnos (ENOENT/EIO) are
ignored as well and -ENOMEM is returned instead.

> 
> Thanks.
> .
Miaohe Lin Sept. 12, 2024, 2:06 a.m. UTC | #4
On 2024/9/10 14:35, Kefeng Wang wrote:
> 
> 
> On 2024/9/10 10:19, Miaohe Lin wrote:
>> On 2024/9/6 10:42, Kefeng Wang wrote:
>>> Similar to other poison recovery paths, use copy_mc_user_highpage() to
>>> avoid a potential kernel panic while copying a page in copy_present_page()
>>> during fork. Once the copy fails due to hwpoison in the source page, we
>>> need to break out of the copy loop in copy_pte_range() and release the
>>> prealloc folio, so copy_mc_user_highpage() is called before *prealloc is
>>> set to NULL.
>>>
>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> ---
>>>   mm/memory.c | 10 +++++++---
>>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index d310c073a1b3..6e7b78e49d1a 100644
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -926,8 +926,11 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
>>>        * We have a prealloc page, all good!  Take it
>>>        * over and copy the page & arm it.
>>>        */
>>> +
>>> +    if (copy_mc_user_highpage(&new_folio->page, page, addr, src_vma))
>>> +        return -EHWPOISON;
>>> +
>>>       *prealloc = NULL;
>>> -    copy_user_highpage(&new_folio->page, page, addr, src_vma);
>>>       __folio_mark_uptodate(new_folio);
>>>       folio_add_new_anon_rmap(new_folio, dst_vma, addr, RMAP_EXCLUSIVE);
>>>       folio_add_lru_vma(new_folio, dst_vma);
>>> @@ -1166,8 +1169,9 @@ copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
>>>           /*
>>>            * If we need a pre-allocated page for this pte, drop the
>>>            * locks, allocate, and try again.
>>> +         * If copy failed due to hwpoison in source page, break out.
>>>            */
>>> -        if (unlikely(ret == -EAGAIN))
>>> +        if (unlikely(ret == -EAGAIN || ret == -EHWPOISON))
>>
>> Would it be better to put the check of ret against -EHWPOISON on a new line? The -EAGAIN case will enter the
>> loop again, but the -EHWPOISON case never does.
> 
> Maybe not a new line, since we recheck ret below and return directly when copy_present_page() fails with -EHWPOISON.

Anyway, this patch looks good to me.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Thanks.
.

Patch

diff --git a/mm/memory.c b/mm/memory.c
index d310c073a1b3..6e7b78e49d1a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -926,8 +926,11 @@  copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
 	 * We have a prealloc page, all good!  Take it
 	 * over and copy the page & arm it.
 	 */
+
+	if (copy_mc_user_highpage(&new_folio->page, page, addr, src_vma))
+		return -EHWPOISON;
+
 	*prealloc = NULL;
-	copy_user_highpage(&new_folio->page, page, addr, src_vma);
 	__folio_mark_uptodate(new_folio);
 	folio_add_new_anon_rmap(new_folio, dst_vma, addr, RMAP_EXCLUSIVE);
 	folio_add_lru_vma(new_folio, dst_vma);
@@ -1166,8 +1169,9 @@  copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
 		/*
 		 * If we need a pre-allocated page for this pte, drop the
 		 * locks, allocate, and try again.
+		 * If copy failed due to hwpoison in source page, break out.
 		 */
-		if (unlikely(ret == -EAGAIN))
+		if (unlikely(ret == -EAGAIN || ret == -EHWPOISON))
 			break;
 		if (unlikely(prealloc)) {
 			/*
@@ -1197,7 +1201,7 @@  copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
 			goto out;
 		}
 		entry.val = 0;
-	} else if (ret == -EBUSY) {
+	} else if (ret == -EBUSY || unlikely(ret == -EHWPOISON)) {
 		goto out;
 	} else if (ret ==  -EAGAIN) {
 		prealloc = folio_prealloc(src_mm, src_vma, addr, false);