Message ID | 20200311183506.3997-4-jgg@ziepe.ca (mailing list archive) |
---|---|
State | New, archived |
Series | Various error case bug fixes for hmm_range_fault() |
On 3/11/20 11:35 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> This eventually calls into handle_mm_fault() which is a sleeping function.
> Release the lock first.
>
> hmm_vma_walk_hole() does not touch the contents of the PUD, so it does not
> need the lock.
>
> Fixes: 3afc423632a1 ("mm: pagewalk: add p4d_entry() and pgd_entry()")
> Cc: Steven Price <steven.price@arm.com>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
On 11/03/2020 18:35, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> This eventually calls into handle_mm_fault() which is a sleeping function.
> Release the lock first.
>
> hmm_vma_walk_hole() does not touch the contents of the PUD, so it does not
> need the lock.
>
> Fixes: 3afc423632a1 ("mm: pagewalk: add p4d_entry() and pgd_entry()")
> Cc: Steven Price <steven.price@arm.com>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Sorry about that, thanks for fixing.

Reviewed-by: Steven Price <steven.price@arm.com>
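The commit message above describes the classic "blocking call under a spinlock"
bug. As a rough illustration only - everything below is an invented userspace
mock, with pthread_spin_lock() standing in for the page-table lock and
fault_in() standing in for handle_mm_fault() - the pattern the patch enforces
looks like this: release the lock before any call that can sleep.

/*
 * Illustrative userspace mock, not kernel code.  In the kernel,
 * spin_lock() disables preemption, so sleeping with the lock held
 * triggers "BUG: scheduling while atomic"; here it would merely make
 * other contenders spin.  The fix pattern is the same either way:
 * unlock, then call the function that may block.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_spinlock_t ptl;   /* stands in for the PUD page-table lock */
static unsigned long entry;      /* stands in for the PUD entry           */

/* Stands in for handle_mm_fault(): may block, so never call it locked. */
static int fault_in(void)
{
	usleep(1000);
	return 0;
}

static int walk_entry(void)
{
	pthread_spin_lock(&ptl);

	if (entry == 0) {                   /* the "pud_none()" case      */
		pthread_spin_unlock(&ptl);  /* release the lock first...  */
		return fault_in();          /* ...then do blocking work   */
	}

	/* handle the populated entry while still holding the lock */
	pthread_spin_unlock(&ptl);
	return 0;
}

int main(void)
{
	pthread_spin_init(&ptl, PTHREAD_PROCESS_PRIVATE);
	printf("walk_entry() -> %d\n", walk_entry());
	pthread_spin_destroy(&ptl);
	return 0;
}

Built with gcc -pthread, this runs and prints walk_entry() -> 0; in the kernel
the same shape is what avoids sleeping in atomic context.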
On Wed, Mar 11, 2020 at 03:35:01PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> This eventually calls into handle_mm_fault() which is a sleeping function.
> Release the lock first.
>
> hmm_vma_walk_hole() does not touch the contents of the PUD, so it does not
> need the lock.

So how did this manage to not be noticed before?

The fix looks fine assuming we want something backportable before
starting the cleanups:

Reviewed-by: Christoph Hellwig <hch@lst.de>
On Mon, Mar 16, 2020 at 10:05:03AM +0100, Christoph Hellwig wrote:
> On Wed, Mar 11, 2020 at 03:35:01PM -0300, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> >
> > This eventually calls into handle_mm_fault() which is a sleeping function.
> > Release the lock first.
> >
> > hmm_vma_walk_hole() does not touch the contents of the PUD, so it does not
> > need the lock.
>
> So how did this manage to not be noticed before?

I'm still struggling a bit with how this PUD stuff works.. However AFAICT:

1) The first hunk around pud_none() is actually covering a race. In the
   non-race case the page walker will directly call hmm_vma_walk_hole()
   for pud_none, so it would be very hard to hit this.

2) I'm not 100% sure, but I think pud_present() == !pud_none(), as there
   is no swap entry for PUD. I don't know what a non-present but non-none
   entry is or how to create one. This is possibly dead code due to #1.

3) To hit hmm_range_need_fault() requesting fault you would need a
   read-only huge PUD and a fault requesting write. I suspect creating a
   read-only huge PUD entry is very rare - not something someone would
   deliberately construct.

4) To even trigger the PUD flow at all you need 1G THP or the special 1G
   DAX pages, which I strongly suspect people are not testing with.

> The fix looks fine assuming we want something backportable before
> starting the cleanups:

I found it easier to make the other cleanup patches make sense if they
didn't introduce all kinds of new logic too..

Thanks,
Jason
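Jason's point 1 is easier to see laid out as code. Below is a hedged,
single-threaded C sketch with invented stand-ins (this pud_t, lock_ptl(),
and walk_hole() are not the real kernel definitions); the zap that a
concurrent task would perform in the kernel is done inline between the
walker's lockless check and the callback's locked re-read.

/*
 * Sketch of the race in point 1, with invented stand-in types.  The
 * generic page walker checks the PUD locklessly and skips none
 * entries; by the time the callback takes the lock and re-reads the
 * entry, a concurrent zap may have cleared it, so the callback must
 * still handle the none case -- and after the patch it does so by
 * unlocking before calling the (sleeping) hole/fault path.
 */
#include <stdbool.h>
#include <stdio.h>

typedef unsigned long pud_t;

static pud_t pud_entry = 0x1000;            /* initially populated      */

static bool pud_none(pud_t p) { return p == 0; }

static void lock_ptl(void)   { /* spin_lock(ptl) in the kernel   */ }
static void unlock_ptl(void) { /* spin_unlock(ptl) in the kernel */ }

static int walk_hole(void)   { return 0; /* may sleep in the kernel */ }

int main(void)
{
	/* 1. Generic walker: lockless check sees a populated entry. */
	if (pud_none(pud_entry))
		return 0;               /* walker would skip the callback */

	/* 2. A concurrent zap lands here (simulated inline). */
	pud_entry = 0;

	/* 3. Callback: re-read under the lock; the entry is now none. */
	lock_ptl();
	pud_t pud = pud_entry;          /* READ_ONCE(*pudp) in the kernel */
	if (pud_none(pud)) {
		unlock_ptl();           /* the fix: drop the lock first */
		printf("raced to none -> fault path\n");
		return walk_hole();
	}
	unlock_ptl();
	return 0;
}

That final pud_none() branch is exactly where the patch below now unlocks
before returning into the fault path.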
diff --git a/mm/hmm.c b/mm/hmm.c
index 9e8f68eb83287a..32dcbfd3908315 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -473,8 +473,8 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
 
 	pud = READ_ONCE(*pudp);
 	if (pud_none(pud)) {
-		ret = hmm_vma_walk_hole(start, end, -1, walk);
-		goto out_unlock;
+		spin_unlock(ptl);
+		return hmm_vma_walk_hole(start, end, -1, walk);
 	}
 
 	if (pud_huge(pud) && pud_devmap(pud)) {
@@ -483,8 +483,8 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
 		bool fault, write_fault;
 
 		if (!pud_present(pud)) {
-			ret = hmm_vma_walk_hole(start, end, -1, walk);
-			goto out_unlock;
+			spin_unlock(ptl);
+			return hmm_vma_walk_hole(start, end, -1, walk);
 		}
 
 		i = (addr - range->start) >> PAGE_SHIFT;
@@ -495,9 +495,9 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
 		hmm_range_need_fault(hmm_vma_walk, pfns, npages,
 				     cpu_flags, &fault, &write_fault);
 		if (fault || write_fault) {
-			ret = hmm_vma_walk_hole_(addr, end, fault,
-						 write_fault, walk);
-			goto out_unlock;
+			spin_unlock(ptl);
+			return hmm_vma_walk_hole_(addr, end, fault, write_fault,
+						  walk);
 		}
 
 		pfn = pud_pfn(pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);