diff mbox series

mm/migrate_device.c: Fix a misleading and out-dated comment

Message ID 20220825014905.977168-1-apopple@nvidia.com (mailing list archive)
State New
Headers show
Series mm/migrate_device.c: Fix a misleading and out-dated comment | expand

Commit Message

Alistair Popple Aug. 25, 2022, 1:49 a.m. UTC
Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
the way trylock_page() in migrate_vma_collect_pmd() works without
updating the comment. Reword the comment to be less misleading and a
better reflection of what happens.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reported-by: Peter Xu <peterx@redhat.com>
Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
---
 mm/migrate_device.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Peter Xu Aug. 25, 2022, 11:51 p.m. UTC | #1
On Thu, Aug 25, 2022 at 11:49:05AM +1000, Alistair Popple wrote:
> Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
> the way trylock_page() in migrate_vma_collect_pmd() works without
> updating the comment. Reword the comment to be less misleading and a
> better reflection of what happens.
> 
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> Reported-by: Peter Xu <peterx@redhat.com>
> Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
> ---
>  mm/migrate_device.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> index 5052093d0262..0736f846de0b 100644
> --- a/mm/migrate_device.c
> +++ b/mm/migrate_device.c
> @@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  		get_page(page);
>  
>  		/*
> -		 * Optimize for the common case where page is only mapped once
> -		 * in one process. If we can lock the page, then we can safely
> -		 * set up a special migration page table entry now.
> +		 * If we can't lock the page we can't migrate it. If we can it's
> +		 * safe to set up a migration entry now. In the common case
> +		 * where the page is mapped once in a single process setting up
> +		 * the migration entry now is an optimisation to avoid walking
> +		 * the rmap later with try_to_migrate().
>  		 */

IMHO the last sentence can still be a bit confusing - here we 100% rely on
the trylock() to proceed or we'll stop migration right away. IMHO that
means this is not an optimization, since optimizations should always be
optional but not the case here.

Meanwhile it'll be great to also mention about why trylock is needed and no
further attempt to use lock_page().  The comment in prepare() previously
was great but unfortunately that code clip was removed.

In short, do you think something like this might be clearer?

		/*
		 * We rely on the trylock() to migrate the pte.  If this
		 * fails, we'll fail the migration of this page.  IOW, the
		 * migration is very much best-effort, just like we'll also
		 * bail out if we found page pinned by other users after
		 * page being locked.
		 *
		 * We don't use lock_page() here or even later.  It's
		 * because if to do so we may need to take the locks for
		 * multiple pages one by one, and the order to take the
		 * page locks is unclear.  For example, if two processes
		 * want to lock multiple pages but in different order it
		 * can lead to deadlock.
		 */

Thanks,
Alistair Popple Aug. 26, 2022, 12:34 a.m. UTC | #2
Peter Xu <peterx@redhat.com> writes:

> On Thu, Aug 25, 2022 at 11:49:05AM +1000, Alistair Popple wrote:
>> Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
>> the way trylock_page() in migrate_vma_collect_pmd() works without
>> updating the comment. Reword the comment to be less misleading and a
>> better reflection of what happens.
>>
>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>> Reported-by: Peter Xu <peterx@redhat.com>
>> Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
>> ---
>>  mm/migrate_device.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
>> index 5052093d0262..0736f846de0b 100644
>> --- a/mm/migrate_device.c
>> +++ b/mm/migrate_device.c
>> @@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>>  		get_page(page);
>>
>>  		/*
>> -		 * Optimize for the common case where page is only mapped once
>> -		 * in one process. If we can lock the page, then we can safely
>> -		 * set up a special migration page table entry now.
>> +		 * If we can't lock the page we can't migrate it. If we can it's
>> +		 * safe to set up a migration entry now. In the common case
>> +		 * where the page is mapped once in a single process setting up
>> +		 * the migration entry now is an optimisation to avoid walking
>> +		 * the rmap later with try_to_migrate().
>>  		 */
>
> IMHO the last sentence can still be a bit confusing - here we 100% rely on
> the trylock() to proceed or we'll stop migration right away. IMHO that
> means this is not an optimization, since optimizations should always be
> optional but not the case here.

We have to lock the page here, we don't have to install the migration
entries. Installing the migration entries here is optional and is the
optimisation.

> Meanwhile it'll be great to also mention about why trylock is needed and no
> further attempt to use lock_page().  The comment in prepare() previously
> was great but unfortunately that code clip was removed.

Will add.

> In short, do you think something like this might be clearer?

I think it's important to mention the optimisation, otherwise the
temptation is to remove the installation of migration entries here and
rely on try_to_migrate() to do it later. I would actually like to be
able to do that because it simplifies the code in many ways but based on
my testing the optimisation turns out to be very worth while.

> 		/*
> 		 * We rely on the trylock() to migrate the pte.  If this
> 		 * fails, we'll fail the migration of this page.  IOW, the
> 		 * migration is very much best-effort, just like we'll also
> 		 * bail out if we found page pinned by other users after
> 		 * page being locked.

Honestly I think this describes what the code does rather than why and
is likely to become outdated and confusing. IMHO it's quite clear from
the code that the migration will fail here if we can't lock the page.

And the fact that migration can fail is already covered in the existing
documentation. See for example the extensive comment on
migrate_vma_setup() about how the whole flow works. Eg:

 * Once migrate_vma_pages() returns the caller may inspect which pages were
 * successfully migrated, and which were not.  Successfully migrated pages will
 * have the MIGRATE_PFN_MIGRATE flag set for their src array entry.

> 		 *
> 		 * We don't use lock_page() here or even later.  It's
> 		 * because if to do so we may need to take the locks for
> 		 * multiple pages one by one, and the order to take the
> 		 * page locks is unclear.  For example, if two processes
> 		 * want to lock multiple pages but in different order it
> 		 * can lead to deadlock.
> 		 */
>
> Thanks,
Peter Xu Aug. 26, 2022, 3:03 p.m. UTC | #3
On Fri, Aug 26, 2022 at 10:34:15AM +1000, Alistair Popple wrote:
> 
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Thu, Aug 25, 2022 at 11:49:05AM +1000, Alistair Popple wrote:
> >> Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
> >> the way trylock_page() in migrate_vma_collect_pmd() works without
> >> updating the comment. Reword the comment to be less misleading and a
> >> better reflection of what happens.
> >>
> >> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> >> Reported-by: Peter Xu <peterx@redhat.com>
> >> Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
> >> ---
> >>  mm/migrate_device.c | 8 +++++---
> >>  1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> >> index 5052093d0262..0736f846de0b 100644
> >> --- a/mm/migrate_device.c
> >> +++ b/mm/migrate_device.c
> >> @@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
> >>  		get_page(page);
> >>
> >>  		/*
> >> -		 * Optimize for the common case where page is only mapped once
> >> -		 * in one process. If we can lock the page, then we can safely
> >> -		 * set up a special migration page table entry now.
> >> +		 * If we can't lock the page we can't migrate it. If we can it's
> >> +		 * safe to set up a migration entry now. In the common case
> >> +		 * where the page is mapped once in a single process setting up
> >> +		 * the migration entry now is an optimisation to avoid walking
> >> +		 * the rmap later with try_to_migrate().
> >>  		 */
> >
> > IMHO the last sentence can still be a bit confusing - here we 100% rely on
> > the trylock() to proceed or we'll stop migration right away. IMHO that
> > means this is not an optimization, since optimizations should always be
> > optional but not the case here.
> 
> We have to lock the page here, we don't have to install the migration
> entries. Installing the migration entries here is optional and is the
> optimisation.

I see what you mean now.

> 
> > Meanwhile it'll be great to also mention about why trylock is needed and no
> > further attempt to use lock_page().  The comment in prepare() previously
> > was great but unfortunately that code clip was removed.
> 
> Will add.
> 
> > In short, do you think something like this might be clearer?
> 
> I think it's important to mention the optimisation, otherwise the
> temptation is to remove the installation of migration entries here and
> rely on try_to_migrate() to do it later. I would actually like to be
> able to do that because it simplifies the code in many ways but based on
> my testing the optimisation turns out to be very worth while.
> 
> > 		/*
> > 		 * We rely on the trylock() to migrate the pte.  If this
> > 		 * fails, we'll fail the migration of this page.  IOW, the
> > 		 * migration is very much best-effort, just like we'll also
> > 		 * bail out if we found page pinned by other users after
> > 		 * page being locked.
> 
> Honestly I think this describes what the code does rather than why and
> is likely to become outdated and confusing. IMHO it's quite clear from
> the code that the migration will fail here if we can't lock the page.

If you see that's what I was struggling to understand previously, so not
clear at least to me. :) Since normally a function like page migration
should (from the gut feeling) not rely on trylock only.

> 
> And the fact that migration can fail is already covered in the existing
> documentation. See for example the extensive comment on
> migrate_vma_setup() about how the whole flow works. Eg:
> 
>  * Once migrate_vma_pages() returns the caller may inspect which pages were
>  * successfully migrated, and which were not.  Successfully migrated pages will
>  * have the MIGRATE_PFN_MIGRATE flag set for their src array entry.

IIUC trylock failure could be detected even earlier than that.  E.g. the
ppc code has:

	ret = migrate_vma_setup(&mig);
	if (ret)
		return -1;

	spage = migrate_pfn_to_page(*mig.src);
	if (!spage || !(*mig.src & MIGRATE_PFN_MIGRATE))
		goto out_finalize;

So afaict it won't even reach migrate_vma_pages() by missing of the flag
MIGRATE_PFN_VALID in the src mpfn.

Probably because the migrate_vma_*() API is extensive enough so it's hard
to summarize its usage in a short paragraph.  I'm probably asking too much..

No strong opinion on the rest as long as we can add back some explanation
on why lock_page() is not used.

Thanks again for doing this.
David Hildenbrand Aug. 26, 2022, 5:17 p.m. UTC | #4
On 26.08.22 17:03, Peter Xu wrote:
> On Fri, Aug 26, 2022 at 10:34:15AM +1000, Alistair Popple wrote:
>>
>> Peter Xu <peterx@redhat.com> writes:
>>
>>> On Thu, Aug 25, 2022 at 11:49:05AM +1000, Alistair Popple wrote:
>>>> Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
>>>> the way trylock_page() in migrate_vma_collect_pmd() works without
>>>> updating the comment. Reword the comment to be less misleading and a
>>>> better reflection of what happens.
>>>>
>>>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>>>> Reported-by: Peter Xu <peterx@redhat.com>
>>>> Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
>>>> ---
>>>>  mm/migrate_device.c | 8 +++++---
>>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
>>>> index 5052093d0262..0736f846de0b 100644
>>>> --- a/mm/migrate_device.c
>>>> +++ b/mm/migrate_device.c
>>>> @@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>>>>  		get_page(page);
>>>>
>>>>  		/*
>>>> -		 * Optimize for the common case where page is only mapped once
>>>> -		 * in one process. If we can lock the page, then we can safely
>>>> -		 * set up a special migration page table entry now.
>>>> +		 * If we can't lock the page we can't migrate it. If we can it's
>>>> +		 * safe to set up a migration entry now. In the common case
>>>> +		 * where the page is mapped once in a single process setting up
>>>> +		 * the migration entry now is an optimisation to avoid walking
>>>> +		 * the rmap later with try_to_migrate().
>>>>  		 */
>>>
>>> IMHO the last sentence can still be a bit confusing - here we 100% rely on
>>> the trylock() to proceed or we'll stop migration right away. IMHO that
>>> means this is not an optimization, since optimizations should always be
>>> optional but not the case here.
>>
>> We have to lock the page here, we don't have to install the migration
>> entries. Installing the migration entries here is optional and is the
>> optimisation.
> 
> I see what you mean now.
> 
>>
>>> Meanwhile it'll be great to also mention about why trylock is needed and no
>>> further attempt to use lock_page().  The comment in prepare() previously
>>> was great but unfortunately that code clip was removed.
>>
>> Will add.
>>
>>> In short, do you think something like this might be clearer?
>>
>> I think it's important to mention the optimisation, otherwise the
>> temptation is to remove the installation of migration entries here and
>> rely on try_to_migrate() to do it later. I would actually like to be
>> able to do that because it simplifies the code in many ways but based on
>> my testing the optimisation turns out to be very worth while.
>>
>>> 		/*
>>> 		 * We rely on the trylock() to migrate the pte.  If this
>>> 		 * fails, we'll fail the migration of this page.  IOW, the
>>> 		 * migration is very much best-effort, just like we'll also
>>> 		 * bail out if we found page pinned by other users after
>>> 		 * page being locked.
>>
>> Honestly I think this describes what the code does rather than why and
>> is likely to become outdated and confusing. IMHO it's quite clear from
>> the code that the migration will fail here if we can't lock the page.
> 
> If you see that's what I was struggling to understand previously, so not
> clear at least to me. :) Since normally a function like page migration
> should (from the gut feeling) not rely on trylock only.

IIRC, ordinary page migration will also only trylock. See
isolate_movable_page().
Peter Xu Aug. 26, 2022, 9:48 p.m. UTC | #5
On Fri, Aug 26, 2022 at 07:17:06PM +0200, David Hildenbrand wrote:
> On 26.08.22 17:03, Peter Xu wrote:
> > On Fri, Aug 26, 2022 at 10:34:15AM +1000, Alistair Popple wrote:
> >>
> >> Peter Xu <peterx@redhat.com> writes:
> >>
> >>> On Thu, Aug 25, 2022 at 11:49:05AM +1000, Alistair Popple wrote:
> >>>> Commit ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED") changed
> >>>> the way trylock_page() in migrate_vma_collect_pmd() works without
> >>>> updating the comment. Reword the comment to be less misleading and a
> >>>> better reflection of what happens.
> >>>>
> >>>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> >>>> Reported-by: Peter Xu <peterx@redhat.com>
> >>>> Fixes: ab09243aa95a ("mm/migrate.c: remove MIGRATE_PFN_LOCKED")
> >>>> ---
> >>>>  mm/migrate_device.c | 8 +++++---
> >>>>  1 file changed, 5 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> >>>> index 5052093d0262..0736f846de0b 100644
> >>>> --- a/mm/migrate_device.c
> >>>> +++ b/mm/migrate_device.c
> >>>> @@ -179,9 +179,11 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
> >>>>  		get_page(page);
> >>>>
> >>>>  		/*
> >>>> -		 * Optimize for the common case where page is only mapped once
> >>>> -		 * in one process. If we can lock the page, then we can safely
> >>>> -		 * set up a special migration page table entry now.
> >>>> +		 * If we can't lock the page we can't migrate it. If we can it's
> >>>> +		 * safe to set up a migration entry now. In the common case
> >>>> +		 * where the page is mapped once in a single process setting up
> >>>> +		 * the migration entry now is an optimisation to avoid walking
> >>>> +		 * the rmap later with try_to_migrate().
> >>>>  		 */
> >>>
> >>> IMHO the last sentence can still be a bit confusing - here we 100% rely on
> >>> the trylock() to proceed or we'll stop migration right away. IMHO that
> >>> means this is not an optimization, since optimizations should always be
> >>> optional but not the case here.
> >>
> >> We have to lock the page here, we don't have to install the migration
> >> entries. Installing the migration entries here is optional and is the
> >> optimisation.
> > 
> > I see what you mean now.
> > 
> >>
> >>> Meanwhile it'll be great to also mention about why trylock is needed and no
> >>> further attempt to use lock_page().  The comment in prepare() previously
> >>> was great but unfortunately that code clip was removed.
> >>
> >> Will add.
> >>
> >>> In short, do you think something like this might be clearer?
> >>
> >> I think it's important to mention the optimisation, otherwise the
> >> temptation is to remove the installation of migration entries here and
> >> rely on try_to_migrate() to do it later. I would actually like to be
> >> able to do that because it simplifies the code in many ways but based on
> >> my testing the optimisation turns out to be very worth while.
> >>
> >>> 		/*
> >>> 		 * We rely on the trylock() to migrate the pte.  If this
> >>> 		 * fails, we'll fail the migration of this page.  IOW, the
> >>> 		 * migration is very much best-effort, just like we'll also
> >>> 		 * bail out if we found page pinned by other users after
> >>> 		 * page being locked.
> >>
> >> Honestly I think this describes what the code does rather than why and
> >> is likely to become outdated and confusing. IMHO it's quite clear from
> >> the code that the migration will fail here if we can't lock the page.
> > 
> > If you see that's what I was struggling to understand previously, so not
> > clear at least to me. :) Since normally a function like page migration
> > should (from the gut feeling) not rely on trylock only.
> 
> IIRC, ordinary page migration will also only trylock. See
> isolate_movable_page().

Fair enough.

I think it depends on what we want to do with the migration.  I think not
even all users of isolate_movable_page() only depend on trylock, and it can
retry with lock_page later.  E.g.:

- soft_offline_in_use_page
  - isolate_page
    - isolate_movable_page <----- trylock here
  - (if isolate_page fails...) migrate_pages
    - unmap_and_move       <----- lock_page here after 2 unsuccessful rounds

And I also agree migration may always fail (e.g. by spurious page refcounts
being taken), so feel free to ignore the "gut feeling" - that's indeed kind
of subjective.

Thanks,
Alistair Popple Aug. 29, 2022, 7:09 a.m. UTC | #6
Peter Xu <peterx@redhat.com> writes:

[...]

>> >>> Meanwhile it'll be great to also mention about why trylock is needed and no
>> >>> further attempt to use lock_page().  The comment in prepare() previously
>> >>> was great but unfortunately that code clip was removed.
>> >>
>> >> Will add.
>> >>
>> >>> In short, do you think something like this might be clearer?
>> >>
>> >> I think it's important to mention the optimisation, otherwise the
>> >> temptation is to remove the installation of migration entries here and
>> >> rely on try_to_migrate() to do it later. I would actually like to be
>> >> able to do that because it simplifies the code in many ways but based on
>> >> my testing the optimisation turns out to be very worth while.
>> >>
>> >>> 		/*
>> >>> 		 * We rely on the trylock() to migrate the pte.  If this
>> >>> 		 * fails, we'll fail the migration of this page.  IOW, the
>> >>> 		 * migration is very much best-effort, just like we'll also
>> >>> 		 * bail out if we found page pinned by other users after
>> >>> 		 * page being locked.
>> >>
>> >> Honestly I think this describes what the code does rather than why and
>> >> is likely to become outdated and confusing. IMHO it's quite clear from
>> >> the code that the migration will fail here if we can't lock the page.
>> >
>> > If you see that's what I was struggling to understand previously, so not
>> > clear at least to me. :) Since normally a function like page migration
>> > should (from the gut feeling) not rely on trylock only.
>>
>> IIRC, ordinary page migration will also only trylock. See
>> isolate_movable_page().
>
> Fair enough.
>
> I think it depends on what we want to do with the migration.  I think not
> even all users of isolate_movable_page() only depend on trylock, and it can
> retry with lock_page later.  E.g.:
>
> - soft_offline_in_use_page
>   - isolate_page
>     - isolate_movable_page <----- trylock here
>   - (if isolate_page fails...) migrate_pages
>     - unmap_and_move       <----- lock_page here after 2 unsuccessful rounds
>
> And I also agree migration may always fail (e.g. by spurious page refcounts
> being taken), so feel free to ignore the "gut feeling" - that's indeed kind
> of subjective.

So how about the following:

		/*
		 * We rely on trylock_page() to avoid deadlock between
		 * concurrent migrations where each is waiting on the others
		 * page lock. If we can't immediately lock the page we fail this
		 * migration as it is only best effort anyway.
		 *
		 * If we can lock the page it's safe to set up a migration entry
		 * now. In the common case where the page is mapped once in a
		 * single process setting up the migration entry now is an
		 * optimisation to avoid walking the rmap later with
		 * try_to_migrate().
		 */

Thanks.

 - Alistair
Peter Xu Aug. 29, 2022, 3:30 p.m. UTC | #7
On Mon, Aug 29, 2022 at 05:09:25PM +1000, Alistair Popple wrote:
> So how about the following:
> 
> 		/*
> 		 * We rely on trylock_page() to avoid deadlock between
> 		 * concurrent migrations where each is waiting on the others
> 		 * page lock. If we can't immediately lock the page we fail this
> 		 * migration as it is only best effort anyway.
> 		 *
> 		 * If we can lock the page it's safe to set up a migration entry
> 		 * now. In the common case where the page is mapped once in a
> 		 * single process setting up the migration entry now is an
> 		 * optimisation to avoid walking the rmap later with
> 		 * try_to_migrate().
> 		 */

Good to me, thanks.
diff mbox series

Patch

diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 5052093d0262..0736f846de0b 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -179,9 +179,11 @@  static int migrate_vma_collect_pmd(pmd_t *pmdp,
 		get_page(page);
 
 		/*
-		 * Optimize for the common case where page is only mapped once
-		 * in one process. If we can lock the page, then we can safely
-		 * set up a special migration page table entry now.
+		 * If we can't lock the page we can't migrate it. If we can it's
+		 * safe to set up a migration entry now. In the common case
+		 * where the page is mapped once in a single process setting up
+		 * the migration entry now is an optimisation to avoid walking
+		 * the rmap later with try_to_migrate().
 		 */
 		if (trylock_page(page)) {
 			bool anon_exclusive;