diff mbox series

mm/hmm: Prevent infinite loop in hmm_range_fault during EBUSY retries

Message ID 20250128063422.7604-1-sooraj20636@gmail.com (mailing list archive)
State New
Headers show
Series mm/hmm: Prevent infinite loop in hmm_range_fault during EBUSY retries | expand

Commit Message

sooraj Jan. 28, 2025, 6:34 a.m. UTC
When hmm_vma_walk_test() skips a VMA (e.g., unsupported VM_IO/PFNMAP range),
it must update hmm_vma_walk->last to the end of the skipped VMA. Failing to
do so causes hmm_range_fault() to restart from the same address during
-EBUSY retries, reprocessing the skipped VMA indefinitely. This results in
an infinite loop if the VMA remains non-processable.

Update hmm_vma_walk->last to the VMA's end address in hmm_vma_walk_test()
when skipping the range. This ensures subsequent iterations resume correctly
after the skipped VMA, preventing infinite retry loops.

Signed-off-by: sooraj <sooraj20636@gmail.com>
---
 mm/hmm.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Andrew Morton Jan. 28, 2025, 1 a.m. UTC | #1
On Tue, 28 Jan 2025 01:34:22 -0500 sooraj <sooraj20636@gmail.com> wrote:

> When hmm_vma_walk_test() skips a VMA (e.g., unsupported VM_IO/PFNMAP range),
> it must update hmm_vma_walk->last to the end of the skipped VMA. Failing to
> do so causes hmm_range_fault() to restart from the same address during
> -EBUSY retries, reprocessing the skipped VMA indefinitely. This results in
> an infinite loop if the VMA remains non-processable.
> 
> Update hmm_vma_walk->last to the VMA's end address in hmm_vma_walk_test()
> when skipping the range. This ensures subsequent iterations resume correctly
> after the skipped VMA, preventing infinite retry loops.
> 

Well that's unfortunate.  This code seems quite old - can you tell us
what your userspace is doing to trigger this behavior?

> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -547,6 +547,8 @@ static int hmm_vma_walk_test(unsigned long start, unsigned long end,
>  
>  	hmm_pfns_fill(start, end, range, HMM_PFN_ERROR);
>  
> +	/* Update last to the end of the skipped VMA to prevent reprocessing */
> +	hmm_vma_walk->last = end;
>  	/* Skip this vma and continue processing the next vma. */
>  	return 1;
>  }

This appears to deserve a cc:stable, but I suspect the bug is so old
that a Fixes: won't be needed.
Alistair Popple Jan. 28, 2025, 1:16 a.m. UTC | #2
On Tue, Jan 28, 2025 at 01:34:22AM -0500, sooraj wrote:
> When hmm_vma_walk_test() skips a VMA (e.g., unsupported VM_IO/PFNMAP range),
> it must update hmm_vma_walk->last to the end of the skipped VMA. Failing to
> do so causes hmm_range_fault() to restart from the same address during
> -EBUSY retries, reprocessing the skipped VMA indefinitely. This results in
> an infinite loop if the VMA remains non-processable.
> 
> Update hmm_vma_walk->last to the VMA's end address in hmm_vma_walk_test()
> when skipping the range. This ensures subsequent iterations resume correctly
> after the skipped VMA, preventing infinite retry loops.

I might be missing something here but I don't understand how this causes an
infinite loop. If we skip the VMA (ie. hmm_vma_walk_test() returns 1) and
hmm_range_fault() subsequently returns -EBUSY it's true that we will reprocess
the same non-processable VMA. But a non-processable VMA won't return -EBUSY and
therefore won't cause an infinite loop in hmm_range_fault() - it will just fill
out the pfns (which is redundant) and continue on to the next VMA.

So it seems this just prevents ueslessly filling out pfns again rather than an
infinite loop. What have I missed?

 - Alistair

> Signed-off-by: sooraj <sooraj20636@gmail.com>
> ---
>  mm/hmm.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 7e0229ae4a5a..29e3678fede5 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -547,6 +547,8 @@ static int hmm_vma_walk_test(unsigned long start, unsigned long end,
>  
>  	hmm_pfns_fill(start, end, range, HMM_PFN_ERROR);
>  
> +	/* Update last to the end of the skipped VMA to prevent reprocessing */
> +	hmm_vma_walk->last = end;
>  	/* Skip this vma and continue processing the next vma. */
>  	return 1;
>  }
> -- 
> 2.45.2
> 
>
Jason Gunthorpe Jan. 28, 2025, 6:04 p.m. UTC | #3
On Tue, Jan 28, 2025 at 01:34:22AM -0500, sooraj wrote:
> When hmm_vma_walk_test() skips a VMA (e.g., unsupported VM_IO/PFNMAP range),
> it must update hmm_vma_walk->last to the end of the skipped VMA. Failing to
> do so causes hmm_range_fault() to restart from the same address during
> -EBUSY retries, reprocessing the skipped VMA indefinitely. This results in
> an infinite loop if the VMA remains non-processable.

This needs alot more explanation.

The only place last is read is here:

		ret = walk_page_range(mm, hmm_vma_walk.last, range->end,
				      &hmm_walk_ops, &hmm_vma_walk);
		/*
		 * When -EBUSY is returned the loop restarts with
		 * hmm_vma_walk.last set to an address that has not been stored
		 * in pfns. All entries < last in the pfn array are set to their
		 * output, and all >= are still at their input values.
		 */
	} while (ret == -EBUSY);

The idea being that when a lower function returns -EBUSY last tells
this loop where to restart the iteration.

> @@ -547,6 +547,8 @@ static int hmm_vma_walk_test(unsigned long start, unsigned long end,
>  
>  	hmm_pfns_fill(start, end, range, HMM_PFN_ERROR);
>  
> +	/* Update last to the end of the skipped VMA to prevent reprocessing */
> +	hmm_vma_walk->last = end;

This does not return EBUSY, so it does not need to set last.

I just checked against and all the places that return EBUSY, do set last:

static int hmm_vma_fault(unsigned long addr, unsigned long end,
			 unsigned int required_fault, struct mm_walk *walk)
{
	hmm_vma_walk->last = addr;
..
	return -EBUSY;

static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
			      unsigned long end, pmd_t *pmdp, pte_t *ptep,
			      unsigned long *hmm_pfn)
{
...
			hmm_vma_walk->last = addr;
			migration_entry_wait(walk->mm, pmdp, addr);
			return -EBUSY;

static int hmm_vma_walk_pmd(pmd_t *pmdp,
			    unsigned long start,
			    unsigned long end,
			    struct mm_walk *walk)
{
...
			hmm_vma_walk->last = addr;
			pmd_migration_entry_wait(walk->mm, pmdp);
			return -EBUSY;
		}

Thats it. If an -EBUSY case has been missed it should be fixed at the
-EBUSY return point, anything else is wrong.

If you are seeing an infinite loop that this actually fixes please
explain in the commit message exactly how it arises, and where the
-EBUSY came from to miss the last update.

If there is a loop due to a missed -EBUSY, then this is not the
correct way to fix it.

Jason
diff mbox series

Patch

diff --git a/mm/hmm.c b/mm/hmm.c
index 7e0229ae4a5a..29e3678fede5 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -547,6 +547,8 @@  static int hmm_vma_walk_test(unsigned long start, unsigned long end,
 
 	hmm_pfns_fill(start, end, range, HMM_PFN_ERROR);
 
+	/* Update last to the end of the skipped VMA to prevent reprocessing */
+	hmm_vma_walk->last = end;
 	/* Skip this vma and continue processing the next vma. */
 	return 1;
 }