[12/16] mm/migration: fix potential page refcounts leak in migrate_pages

Message ID 20220304093409.25829-13-linmiaohe@huawei.com (mailing list archive)
State New
Series A few cleanup and fixup patches for migration

Commit Message

Miaohe Lin March 4, 2022, 9:34 a.m. UTC
In the -ENOMEM case, some subpages of fail-to-migrate THPs might be left
on the thp_split_pages list. Move them back to the migration list so that
the caller can put them back on the right list; otherwise their page
refcounts are leaked. Also adjust nr_failed and nr_thp_failed accordingly
so that the vm event accounting is more accurate.

Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/migrate.c | 9 +++++++++
 1 file changed, 9 insertions(+)
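
The list handling described in the commit message can be illustrated with a
small userspace sketch. This is not kernel code: struct list_head and the
helpers below are simplified stand-ins that reuse the kernel names, and
pages_visible_to_caller() is a hypothetical helper invented for this
illustration. It models an early exit with three subpages still parked on
thp_split_pages and reports how many entries the caller can see on the
migration list with and without the splice.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Move every entry of @list to the front of @head, then empty @list. */
static void list_splice_init(struct list_head *list, struct list_head *head)
{
	struct list_head *first, *last, *at;

	assert(list && head);
	if (list_empty(list))
		return;

	first = list->next;
	last = list->prev;
	at = head->next;

	first->prev = head;
	head->next = first;
	last->next = at;
	at->prev = last;
	init_list_head(list);
}

static size_t list_count(const struct list_head *h)
{
	const struct list_head *p;
	size_t n = 0;

	for (p = h->next; p != h; p = p->next)
		n++;
	return n;
}

/*
 * Hypothetical model of the early exit: two pages remain on the
 * migration list and three subpages of a split THP sit on
 * thp_split_pages. The caller only ever walks the migration list.
 */
static size_t pages_visible_to_caller(int splice_back)
{
	struct list_head from, thp_split_pages, page[5];
	size_t i;

	init_list_head(&from);
	init_list_head(&thp_split_pages);
	for (i = 0; i < 2; i++)
		list_add_tail(&page[i], &from);
	for (i = 2; i < 5; i++)
		list_add_tail(&page[i], &thp_split_pages);

	if (splice_back)
		list_splice_init(&thp_split_pages, &from);

	return list_count(&from);
}
```

Without the splice the caller sees only the two pages left on the migration
list, so the three subpages are never put back. With the splice all five are
visible.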

Comments

Zi Yan March 4, 2022, 3:21 p.m. UTC | #1
On 4 Mar 2022, at 4:34, Miaohe Lin wrote:

> In -ENOMEM case, there might be some subpages of fail-to-migrate THPs
> left in thp_split_pages list. We should move them back to migration
> list so that they could be put back to the right list by the caller
> otherwise the page refcnt will be leaked here. Also adjust nr_failed
> and nr_thp_failed accordingly to make vm events account more accurate.
>
> Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/migrate.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index e0db06927f02..6c2dfed2ddb8 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1422,6 +1422,15 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>  				}
>
>  				nr_failed_pages += nr_subpages;
> +				/*
> +				 * There might be some subpages of fail-to-migrate THPs
> +				 * left in thp_split_pages list. Move them back to migration
> +				 * list so that they could be put back to the right list by
> +				 * the caller otherwise the page refcnt will be leaked.
> +				 */
> +				list_splice_init(&thp_split_pages, from);
> +				nr_failed += retry;
> +				nr_thp_failed += thp_retry;
>  				goto out;
>  			case -EAGAIN:
>  				if (is_thp)
> -- 
> 2.23.0

LGTM. Thanks.

Reviewed-by: Zi Yan <ziy@nvidia.com>


--
Best Regards,
Yan, Zi
Baolin Wang March 7, 2022, 1:57 a.m. UTC | #2
Hi Miaohe,

On 3/4/2022 5:34 PM, Miaohe Lin wrote:
> In -ENOMEM case, there might be some subpages of fail-to-migrate THPs
> left in thp_split_pages list. We should move them back to migration
> list so that they could be put back to the right list by the caller
> otherwise the page refcnt will be leaked here. Also adjust nr_failed
> and nr_thp_failed accordingly to make vm events account more accurate.
> 
> Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>   mm/migrate.c | 9 +++++++++
>   1 file changed, 9 insertions(+)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index e0db06927f02..6c2dfed2ddb8 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1422,6 +1422,15 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   				}
>   
>   				nr_failed_pages += nr_subpages;
> +				/*
> +				 * There might be some subpages of fail-to-migrate THPs
> +				 * left in thp_split_pages list. Move them back to migration
> +				 * list so that they could be put back to the right list by
> +				 * the caller otherwise the page refcnt will be leaked.
> +				 */
> +				list_splice_init(&thp_split_pages, from);
> +				nr_failed += retry;
> +				nr_thp_failed += thp_retry;

Yes, I think we missed this case before, and your patch looks right. But 
we should also update the 'rc' to return the correct number of pages 
that were not migrated, right?
Huang, Ying March 7, 2022, 5:01 a.m. UTC | #3
Miaohe Lin <linmiaohe@huawei.com> writes:

> In -ENOMEM case, there might be some subpages of fail-to-migrate THPs
> left in thp_split_pages list. We should move them back to migration
> list so that they could be put back to the right list by the caller
> otherwise the page refcnt will be leaked here. Also adjust nr_failed
> and nr_thp_failed accordingly to make vm events account more accurate.
>
> Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/migrate.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index e0db06927f02..6c2dfed2ddb8 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1422,6 +1422,15 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>  				}
>  
>  				nr_failed_pages += nr_subpages;
> +				/*
> +				 * There might be some subpages of fail-to-migrate THPs
> +				 * left in thp_split_pages list. Move them back to migration
> +				 * list so that they could be put back to the right list by
> +				 * the caller otherwise the page refcnt will be leaked.
> +				 */
> +				list_splice_init(&thp_split_pages, from);
> +				nr_failed += retry;

It appears that we don't need to change nr_failed, because we don't use
it for this situation.  Otherwise looks good to me.

Reviewed-by: "Huang, Ying" <ying.huang@intel.com>

Best Regards,
Huang, Ying

> +				nr_thp_failed += thp_retry;
>  				goto out;
>  			case -EAGAIN:
>  				if (is_thp)
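
The point about nr_failed can be made concrete with a userspace model. It
rests on an assumption about code that the hunk does not show: that the
shared exit path reports nr_failed_pages and nr_thp_failed as vm events,
while nr_failed only feeds the return value on the non-fatal path. The
struct and function names below are hypothetical and exist only for this
sketch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical snapshot of the counters at the exit path. */
struct migrate_counters {
	int nr_failed;        /* assumed to feed rc on the normal path only */
	int nr_thp_failed;    /* assumed to feed the THP failure event */
	int nr_failed_pages;  /* assumed to feed the page failure event */
};

/* What the caller and the vm event counters would observe. */
struct migrate_outcome {
	int rc;
	int pgmigrate_fail;
	int thp_migration_fail;
};

/*
 * fatal_err is 0 on the normal path or a negative errno such as
 * -ENOMEM. On the fatal path rc is the errno, so nr_failed is never
 * consulted; the event counters are reported either way.
 */
static struct migrate_outcome finish_migration(struct migrate_counters c,
					       int fatal_err)
{
	struct migrate_outcome o;

	assert(fatal_err <= 0);
	o.rc = fatal_err ? fatal_err : c.nr_failed;
	o.pgmigrate_fail = c.nr_failed_pages;
	o.thp_migration_fail = c.nr_thp_failed;
	return o;
}
```

Under this model, changing nr_failed before the jump has no observable
effect when -ENOMEM is returned, whereas the nr_thp_failed adjustment does
reach the event counter.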
Huang, Ying March 7, 2022, 5:02 a.m. UTC | #4
Baolin Wang <baolin.wang@linux.alibaba.com> writes:

> Hi Miaohe,
>
> On 3/4/2022 5:34 PM, Miaohe Lin wrote:
>> In -ENOMEM case, there might be some subpages of fail-to-migrate THPs
>> left in thp_split_pages list. We should move them back to migration
>> list so that they could be put back to the right list by the caller
>> otherwise the page refcnt will be leaked here. Also adjust nr_failed
>> and nr_thp_failed accordingly to make vm events account more accurate.
>> Fixes: b5bade978e9b ("mm: migrate: fix the return value of
>> migrate_pages()")
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>   mm/migrate.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index e0db06927f02..6c2dfed2ddb8 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1422,6 +1422,15 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>   				}
>>     				nr_failed_pages += nr_subpages;
>> +				/*
>> +				 * There might be some subpages of fail-to-migrate THPs
>> +				 * left in thp_split_pages list. Move them back to migration
>> +				 * list so that they could be put back to the right list by
>> +				 * the caller otherwise the page refcnt will be leaked.
>> +				 */
>> +				list_splice_init(&thp_split_pages, from);
>> +				nr_failed += retry;
>> +				nr_thp_failed += thp_retry;
>
> Yes, I think we missed this case before, and your patch looks
> right. But we should also update the 'rc' to return the correct number
> of pages that were not migrated, right?

Per my understanding, -ENOMEM should be returned to indicate a fatal
error.

Best Regards,
Huang, Ying
Baolin Wang March 7, 2022, 6 a.m. UTC | #5
On 3/7/2022 1:02 PM, Huang, Ying wrote:
> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
> 
>> Hi Miaohe,
>>
>> On 3/4/2022 5:34 PM, Miaohe Lin wrote:
>>> In -ENOMEM case, there might be some subpages of fail-to-migrate THPs
>>> left in thp_split_pages list. We should move them back to migration
>>> list so that they could be put back to the right list by the caller
>>> otherwise the page refcnt will be leaked here. Also adjust nr_failed
>>> and nr_thp_failed accordingly to make vm events account more accurate.
>>> Fixes: b5bade978e9b ("mm: migrate: fix the return value of
>>> migrate_pages()")
>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>> ---
>>>    mm/migrate.c | 9 +++++++++
>>>    1 file changed, 9 insertions(+)
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index e0db06927f02..6c2dfed2ddb8 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1422,6 +1422,15 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>    				}
>>>      				nr_failed_pages += nr_subpages;
>>> +				/*
>>> +				 * There might be some subpages of fail-to-migrate THPs
>>> +				 * left in thp_split_pages list. Move them back to migration
>>> +				 * list so that they could be put back to the right list by
>>> +				 * the caller otherwise the page refcnt will be leaked.
>>> +				 */
>>> +				list_splice_init(&thp_split_pages, from);
>>> +				nr_failed += retry;
>>> +				nr_thp_failed += thp_retry;
>>
>> Yes, I think we missed this case before, and your patch looks
>> right. But we should also update the 'rc' to return the correct number
>> of pages that were not migrated, right?
> 
> Per my understanding, -ENOMEM should be returned to indicate a fatal
> error.
> 

Ah, right. Sorry for noise.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Miaohe Lin March 7, 2022, 12:01 p.m. UTC | #6
On 2022/3/7 9:57, Baolin Wang wrote:
> Hi Miaohe,
> 
> On 3/4/2022 5:34 PM, Miaohe Lin wrote:
>> In -ENOMEM case, there might be some subpages of fail-to-migrate THPs
>> left in thp_split_pages list. We should move them back to migration
>> list so that they could be put back to the right list by the caller
>> otherwise the page refcnt will be leaked here. Also adjust nr_failed
>> and nr_thp_failed accordingly to make vm events account more accurate.
>>
>> Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>   mm/migrate.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index e0db06927f02..6c2dfed2ddb8 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1422,6 +1422,15 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>                   }
>>                     nr_failed_pages += nr_subpages;
>> +                /*
>> +                 * There might be some subpages of fail-to-migrate THPs
>> +                 * left in thp_split_pages list. Move them back to migration
>> +                 * list so that they could be put back to the right list by
>> +                 * the caller otherwise the page refcnt will be leaked.
>> +                 */
>> +                list_splice_init(&thp_split_pages, from);
>> +                nr_failed += retry;
>> +                nr_thp_failed += thp_retry;
> 
> Yes, I think we missed this case before, and your patch looks right. But we should also update the 'rc' to return the correct number of pages that were not migrated, right?

I'm not sure. The -ENOMEM case has always returned -ENOMEM since commit
95a402c3847c ("[PATCH] page migration: use allocator function for
migrate_pages()"), so I did not change rc. But I think you're right: we
should return the correct number of pages that were not migrated in this
case.

Thanks.

> .
Miaohe Lin March 7, 2022, 12:03 p.m. UTC | #7
On 2022/3/7 14:00, Baolin Wang wrote:
> 
> 
> On 3/7/2022 1:02 PM, Huang, Ying wrote:
>> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>>
>>> Hi Miaohe,
>>>
>>> On 3/4/2022 5:34 PM, Miaohe Lin wrote:
>>>> In -ENOMEM case, there might be some subpages of fail-to-migrate THPs
>>>> left in thp_split_pages list. We should move them back to migration
>>>> list so that they could be put back to the right list by the caller
>>>> otherwise the page refcnt will be leaked here. Also adjust nr_failed
>>>> and nr_thp_failed accordingly to make vm events account more accurate.
>>>> Fixes: b5bade978e9b ("mm: migrate: fix the return value of
>>>> migrate_pages()")
>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>>> ---
>>>>    mm/migrate.c | 9 +++++++++
>>>>    1 file changed, 9 insertions(+)
>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>> index e0db06927f02..6c2dfed2ddb8 100644
>>>> --- a/mm/migrate.c
>>>> +++ b/mm/migrate.c
>>>> @@ -1422,6 +1422,15 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>>                    }
>>>>                      nr_failed_pages += nr_subpages;
>>>> +                /*
>>>> +                 * There might be some subpages of fail-to-migrate THPs
>>>> +                 * left in thp_split_pages list. Move them back to migration
>>>> +                 * list so that they could be put back to the right list by
>>>> +                 * the caller otherwise the page refcnt will be leaked.
>>>> +                 */
>>>> +                list_splice_init(&thp_split_pages, from);
>>>> +                nr_failed += retry;
>>>> +                nr_thp_failed += thp_retry;
>>>
>>> Yes, I think we missed this case before, and your patch looks
>>> right. But we should also update the 'rc' to return the correct number
>>> of pages that were not migrated, right?
>>
>> Per my understanding, -ENOMEM should be returned to indicate a fatal
>> error.
>>
> 
> Ah, right. Sorry for noise.
> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Oh, I missed this email. So we should return -ENOMEM in this case. Many thanks to both of you.

> .
Miaohe Lin March 7, 2022, 12:11 p.m. UTC | #8
On 2022/3/7 13:01, Huang, Ying wrote:
> Miaohe Lin <linmiaohe@huawei.com> writes:
> 
>> In -ENOMEM case, there might be some subpages of fail-to-migrate THPs
>> left in thp_split_pages list. We should move them back to migration
>> list so that they could be put back to the right list by the caller
>> otherwise the page refcnt will be leaked here. Also adjust nr_failed
>> and nr_thp_failed accordingly to make vm events account more accurate.
>>
>> Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/migrate.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index e0db06927f02..6c2dfed2ddb8 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1422,6 +1422,15 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>  				}
>>  
>>  				nr_failed_pages += nr_subpages;
>> +				/*
>> +				 * There might be some subpages of fail-to-migrate THPs
>> +				 * left in thp_split_pages list. Move them back to migration
>> +				 * list so that they could be put back to the right list by
>> +				 * the caller otherwise the page refcnt will be leaked.
>> +				 */
>> +				list_splice_init(&thp_split_pages, from);
>> +				nr_failed += retry;
> 
> It appears that we don't need to change nr_failed, because we don't use
> it for this situation.  Otherwise looks good to me.
> 

You're right. nr_failed is not used for this case.

> Reviewed-by: "Huang, Ying" <ying.huang@intel.com>

Many thanks for your review.

> 
> Best Regards,
> Huang, Ying
> 
>> +				nr_thp_failed += thp_retry;
>>  				goto out;
>>  			case -EAGAIN:
>>  				if (is_thp)
> .
>

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index e0db06927f02..6c2dfed2ddb8 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1422,6 +1422,15 @@  int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				}
 
 				nr_failed_pages += nr_subpages;
+				/*
+				 * There might be some subpages of fail-to-migrate THPs
+				 * left in thp_split_pages list. Move them back to migration
+				 * list so that they could be put back to the right list by
+				 * the caller otherwise the page refcnt will be leaked.
+				 */
+				list_splice_init(&thp_split_pages, from);
+				nr_failed += retry;
+				nr_thp_failed += thp_retry;
 				goto out;
 			case -EAGAIN:
 				if (is_thp)