[hmm,0/8] Various error case bug fixes for hmm_range_fault()

Message ID: 20200311183506.3997-1-jgg@ziepe.ca

Message

Jason Gunthorpe March 11, 2020, 6:34 p.m. UTC
From: Jason Gunthorpe <jgg@mellanox.com>

The hmm_range_fault() flow is fairly complicated. The scheme allows the
caller to specify if it needs a usable result for each page, or if it only
needs the current page table status filled in. This mixture of behavior is
useful for a caller that wants to build a 'prefetch around fault'
algorithm.
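
As an illustration of the intended usage (not code from this series): a
'prefetch around fault' caller can ask for a usable result only for the
page the device actually faulted on while merely snapshotting the
neighbouring pages. The sketch below follows the current
default_flags/pfn_flags_mask encoding in mm/hmm.c; the wrapper function
itself is hypothetical:

static long driver_prefetch_around_fault(struct hmm_range *range,
					 unsigned long fault_index)
{
	unsigned long npages = (range->end - range->start) >> PAGE_SHIFT;
	unsigned long i;

	/* Snapshot-only by default: request no faults for the range. */
	range->default_flags = 0;
	/* Allow the per-entry input in pfns[] to request a valid page. */
	range->pfn_flags_mask = range->flags[HMM_PFN_VALID];

	for (i = 0; i < npages; i++)
		range->pfns[i] = 0;
	/* Only the page the device faulted on must come back usable. */
	range->pfns[fault_index] = range->flags[HMM_PFN_VALID];

	/* mmap_sem and mmu notifier retry handling omitted for brevity. */
	return hmm_range_fault(range, 0);
}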

Although we have no in-tree users of this capability, I am working on
having RDMA ODP work in this manner, and removing these bugs from
hmm_range_fault() is a necessary step.

The basic principles are:

 - If the caller did not ask for a VA to be valid then hmm_range_fault()
   should not fail because of that VA

 - If 0 is returned from hmm_range_fault() then the entire pfns array
   contains valid data

 - HMM_PFN_ERROR is set if faulting fails, or if asking for faulting
   would fail

 - A 0 return from hmm_range_fault() does not have HMM_PFN_ERROR set in
   any VAs the caller asked to be valid
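
To make the contract concrete, a caller written against these principles
would interpret the returned entries roughly as below (a sketch using the
pfns/flags/values encoding; the helper name is made up):

static bool driver_entry_usable(struct hmm_range *range, unsigned long i)
{
	/*
	 * After a 0 return, HMM_PFN_ERROR can only appear in entries the
	 * caller did not ask to be valid; requested entries are usable.
	 */
	if (range->pfns[i] == range->values[HMM_PFN_ERROR])
		return false;
	return (range->pfns[i] & range->flags[HMM_PFN_VALID]) != 0;
}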

This series does not get there completely; I have a follow-up series
closing some more complex cases.

I tested this series using the hmm tester Ralph posted a while back;
other testing would be appreciated.

Jason Gunthorpe (8):
  mm/hmm: add missing unmaps of the ptep during hmm_vma_handle_pte()
  mm/hmm: don't free the cached pgmap while scanning
  mm/hmm: do not call hmm_vma_walk_hole() while holding a spinlock
  mm/hmm: add missing pfns set to hmm_vma_walk_pmd()
  mm/hmm: add missing call to hmm_range_need_fault() before returning
    EFAULT
  mm/hmm: reorganize how !pte_present is handled in hmm_vma_handle_pte()
  mm/hmm: return -EFAULT when setting HMM_PFN_ERROR on requested valid
    pages
  mm/hmm: add missing call to hmm_pte_need_fault in HMM_PFN_SPECIAL
    handling

 mm/hmm.c | 166 ++++++++++++++++++++++++++-----------------------------
 1 file changed, 79 insertions(+), 87 deletions(-)

Comments

Jason Gunthorpe March 12, 2020, 7:33 p.m. UTC | #1
pmd_to_hmm_pfn_flags() already checks pmd_protnone() and makes the cpu
flags 0 in that case. If no fault is requested then the pfns should be
returned as not valid.

It should not unconditionally fault if faulting is not requested.

Fixes: 2aee09d8c116 ("mm/hmm: change hmm_vma_fault() to allow write fault on page basis")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 mm/hmm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Bonus patch; this one was found after I made the series.

diff --git a/mm/hmm.c b/mm/hmm.c
index ca33d086bdc190..6d9da4b0f0a9f8 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -226,7 +226,7 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
 	hmm_range_need_fault(hmm_vma_walk, pfns, npages, cpu_flags,
 			     &fault, &write_fault);
 
-	if (pmd_protnone(pmd) || fault || write_fault)
+	if (fault || write_fault)
 		return hmm_vma_walk_hole_(addr, end, fault, write_fault, walk);
 
 	pfn = pmd_pfn(pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
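
For reference, the helper the commit message relies on looks roughly like
this in mm/hmm.c (reconstructed here for context; details may differ from
the tree):

static inline uint64_t pmd_to_hmm_pfn_flags(struct hmm_range *range,
					    pmd_t pmd)
{
	if (pmd_protnone(pmd))
		return 0;	/* no cpu flags: the entry is not valid */
	return pmd_write(pmd) ? range->flags[HMM_PFN_VALID] |
				range->flags[HMM_PFN_WRITE] :
				range->flags[HMM_PFN_VALID];
}

Since protnone yields cpu_flags == 0, hmm_range_need_fault() already
decides whether a fault is needed, so the pmd_protnone() test removed
above forced a fault even when the caller did not request one.
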
Ralph Campbell March 12, 2020, 11:50 p.m. UTC | #2
On 3/12/20 12:33 PM, Jason Gunthorpe wrote:
> pmd_to_hmm_pfn_flags() already checks pmd_protnone() and makes the cpu
> flags 0 in that case. If no fault is requested then the pfns should be
> returned as not valid.
> 
> It should not unconditionally fault if faulting is not requested.
> 
> Fixes: 2aee09d8c116 ("mm/hmm: change hmm_vma_fault() to allow write fault on page basis")
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Looks good to me.
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>

> ---
>   mm/hmm.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Bonus patch; this one was found after I made the series.
> 
> diff --git a/mm/hmm.c b/mm/hmm.c
> index ca33d086bdc190..6d9da4b0f0a9f8 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -226,7 +226,7 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
>   	hmm_range_need_fault(hmm_vma_walk, pfns, npages, cpu_flags,
>   			     &fault, &write_fault);
>   
> -	if (pmd_protnone(pmd) || fault || write_fault)
> +	if (fault || write_fault)
>   		return hmm_vma_walk_hole_(addr, end, fault, write_fault, walk);
>   
>   	pfn = pmd_pfn(pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
>
Jason Gunthorpe March 16, 2020, 6:25 p.m. UTC | #3
On Wed, Mar 11, 2020 at 03:34:58PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> The hmm_range_fault() flow is fairly complicated. The scheme allows the
> caller to specify if it needs a usable result for each page, or if it only
> needs the current page table status filled in. This mixture of behavior is
> useful for a caller that wants to build a 'prefetch around fault'
> algorithm.
> 
> Although we have no in-tree users of this capability, I am working on
> having RDMA ODP work in this manner, and removing these bugs from
> hmm_range_fault() is a necessary step.
> 
> The basic principles are:
> 
>  - If the caller did not ask for a VA to be valid then hmm_range_fault()
>    should not fail because of that VA
> 
>  - If 0 is returned from hmm_range_fault() then the entire pfns array
>    contains valid data
> 
>  - HMM_PFN_ERROR is set if faulting fails, or if asking for faulting
>    would fail
> 
>  - A 0 return from hmm_range_fault() does not have HMM_PFN_ERROR set in
>    any VAs the caller asked to be valid
> 
> This series does not get there completely; I have a follow-up series
> closing some more complex cases.
> 
> I tested this series using the hmm tester Ralph posted a while back;
> other testing would be appreciated.
> 
> Jason Gunthorpe (8):
>   mm/hmm: add missing unmaps of the ptep during hmm_vma_handle_pte()
>   mm/hmm: do not call hmm_vma_walk_hole() while holding a spinlock
>   mm/hmm: add missing pfns set to hmm_vma_walk_pmd()
>   mm/hmm: add missing call to hmm_range_need_fault() before returning
>     EFAULT
>   mm/hmm: reorganize how !pte_present is handled in hmm_vma_handle_pte()
>   mm/hmm: return -EFAULT when setting HMM_PFN_ERROR on requested valid
>     pages
>   mm/hmm: add missing call to hmm_pte_need_fault in HMM_PFN_SPECIAL
>     handling
>   mm/hmm: do not check pmd_protnone twice in hmm_vma_handle_pmd()

I moved these toward linux-next; if others have remarks or tags please
feel free to continue.

>   mm/hmm: don't free the cached pgmap while scanning

I will respin

Thank you all for the reviews!

Regards,
Jason