diff mbox series

[15/16] mm/migration: fix possible do_pages_stat_array racing with memory offline

Message ID 20220304093409.25829-16-linmiaohe@huawei.com (mailing list archive)
State New
Headers show
Series A few cleanup and fixup patches for migration | expand

Commit Message

Miaohe Lin March 4, 2022, 9:34 a.m. UTC
When follow_page peeks a page, the page could be reclaimed under heavy
memory pressure and thus be offlined while it's still being used by the
do_pages_stat_array(). Use FOLL_GET to hold the page refcnt to fix this
potential issue.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/migrate.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Huang, Ying March 7, 2022, 5:21 a.m. UTC | #1
Miaohe Lin <linmiaohe@huawei.com> writes:

> When follow_page peeks a page, the page could be reclaimed under heavy
> memory pressure

I don't think that memory pressure and reclaiming will be an issue.

> and thus be offlined while it's still being used by the
> do_pages_stat_array().

"offline" seems a possible problem.

Best Regards,
Huang, Ying

> Use FOLL_GET to hold the page refcnt to fix this
> potential issue.
>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/migrate.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 7b1c0b988234..98a968e6f465 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1788,13 +1788,18 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
>  			goto set_status;
>  
>  		/* FOLL_DUMP to ignore special (like zero) pages */
> -		page = follow_page(vma, addr, FOLL_DUMP);
> +		page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
>  
>  		err = PTR_ERR(page);
>  		if (IS_ERR(page))
>  			goto set_status;
>  
> -		err = page ? page_to_nid(page) : -ENOENT;
> +		if (page) {
> +			err = page_to_nid(page);
> +			put_page(page);
> +		} else {
> +			err = -ENOENT;
> +		}
>  set_status:
>  		*status = err;
Muchun Song March 7, 2022, 7:01 a.m. UTC | #2
On Mon, Mar 7, 2022 at 1:22 PM Huang, Ying <ying.huang@intel.com> wrote:
>
> Miaohe Lin <linmiaohe@huawei.com> writes:
>
> > When follow_page peeks a page, the page could be reclaimed under heavy
> > memory pressure
>
> I don't think that memory pressure and reclaiming will be an issue.

I think he means a page first to be reclaimed then to be offline
could encounter this issue and reclaiming is a precondition.

Thanks.

>
> > and thus be offlined while it's still being used by the
> > do_pages_stat_array().
>
> "offline" seems a possible problem.
Huang, Ying March 7, 2022, 7:42 a.m. UTC | #3
Muchun Song <songmuchun@bytedance.com> writes:

> On Mon, Mar 7, 2022 at 1:22 PM Huang, Ying <ying.huang@intel.com> wrote:
>>
>> Miaohe Lin <linmiaohe@huawei.com> writes:
>>
>> > When follow_page peeks a page, the page could be reclaimed under heavy
>> > memory pressure
>>
>> I don't think that memory pressure and reclaiming will be an issue.
>
> I think he means a page first to be reclaimed then to be offline
> could encounter this issue and reclaiming is a precondition.

I don't think reclaiming is a precondition.  It seems possible that the
virtual page is migrated, then the physical page is offlined.

Best Regards,
Huang, Ying

> Thanks.
>
>>
>> > and thus be offlined while it's still being used by the
>> > do_pages_stat_array().
>>
>> "offline" seems a possible problem.
Miaohe Lin March 8, 2022, 11:33 a.m. UTC | #4
On 2022/3/7 15:42, Huang, Ying wrote:
> Muchun Song <songmuchun@bytedance.com> writes:
> 
>> On Mon, Mar 7, 2022 at 1:22 PM Huang, Ying <ying.huang@intel.com> wrote:
>>>
>>> Miaohe Lin <linmiaohe@huawei.com> writes:
>>>
>>>> When follow_page peeks a page, the page could be reclaimed under heavy
>>>> memory pressure
>>>
>>> I don't think that memory pressure and reclaiming will be an issue.
>>
>> I think he means a page first to be reclaimed then to be offline
>> could encounter this issue and reclaiming is a precondition.
> 
> I don't think reclaiming is a precondition.  It seems possible that the
> virtual page is migrated, then the physical page is offlined.
> 

What I indeed mean is a page could first be reclaimed, migrated and so on.
And then be offlined. Sorry for confusing.

Thanks both of you.

> Best Regards,
> Huang, Ying
> 
>> Thanks.
>>
>>>
>>>> and thus be offlined while it's still being used by the
>>>> do_pages_stat_array().
>>>
>>> "offline" seems a possible problem.
> .
>
diff mbox series

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index 7b1c0b988234..98a968e6f465 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1788,13 +1788,18 @@  static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 			goto set_status;
 
 		/* FOLL_DUMP to ignore special (like zero) pages */
-		page = follow_page(vma, addr, FOLL_DUMP);
+		page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
 
 		err = PTR_ERR(page);
 		if (IS_ERR(page))
 			goto set_status;
 
-		err = page ? page_to_nid(page) : -ENOENT;
+		if (page) {
+			err = page_to_nid(page);
+			put_page(page);
+		} else {
+			err = -ENOENT;
+		}
 set_status:
 		*status = err;