| Message ID | 20190103002126.GM6310@bombadil.infradead.org (mailing list archive) |
|---|---|
| State | New, archived |
| Series | Initialise mmu_notifier_range correctly |
On Wed, Jan 02, 2019 at 04:21:26PM -0800, Matthew Wilcox wrote:
> 
> One of the paths in follow_pte_pmd() initialised the mmu_notifier_range
> incorrectly.
> 
> Signed-off-by: Matthew Wilcox <willy@infradead.org>
> Fixes: ac46d4f3c432 ("mm/mmu_notifier: use structure for invalidate_range_start/end calls v2")
> Tested-by: Dave Chinner <dchinner@redhat.com>

Reviewed-by: Jérôme Glisse <jglisse@redhat.com>

> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 2dd2f9ab57f4..21a650368be0 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4078,8 +4078,8 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
>  		goto out;
>  
>  	if (range) {
> -		range->start = address & PAGE_MASK;
> -		range->end = range->start + PAGE_SIZE;
> +		mmu_notifier_range_init(range, mm, address & PAGE_MASK,
> +					(address & PAGE_MASK) + PAGE_SIZE);
>  		mmu_notifier_invalidate_range_start(range);
>  	}
>  	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
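[Editor's note: the point of the fix is that mmu_notifier_range_init() initialises the whole structure, including range->mm, whereas the open-coded assignments in this path only set start and end. The following userspace sketch illustrates that contract with simplified stand-in types, not the real kernel definitions:]

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; the real struct
 * mmu_notifier_range carries more state in later kernels. */
struct mm_struct { int dummy; };

struct mmu_notifier_range {
	struct mm_struct *mm;
	unsigned long start;
	unsigned long end;
};

/* Mirror of the helper's contract: it initialises *every* field,
 * including ->mm, which the buggy open-coded path left untouched. */
static void mmu_notifier_range_init(struct mmu_notifier_range *range,
				    struct mm_struct *mm,
				    unsigned long start, unsigned long end)
{
	range->mm = mm;
	range->start = start;
	range->end = end;
}

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))
```

In the patch, the start/end arguments are the faulting address masked to a page boundary and one page past it, matching what the removed assignments computed.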
On 1/2/19 5:56 PM, Jerome Glisse wrote:
> On Wed, Jan 02, 2019 at 04:21:26PM -0800, Matthew Wilcox wrote:
>>
>> One of the paths in follow_pte_pmd() initialised the mmu_notifier_range
>> incorrectly.
>>
>> Signed-off-by: Matthew Wilcox <willy@infradead.org>
>> Fixes: ac46d4f3c432 ("mm/mmu_notifier: use structure for invalidate_range_start/end calls v2")
>> Tested-by: Dave Chinner <dchinner@redhat.com>
> 
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> 
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 2dd2f9ab57f4..21a650368be0 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4078,8 +4078,8 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
>>  		goto out;
>>  
>>  	if (range) {
>> -		range->start = address & PAGE_MASK;
>> -		range->end = range->start + PAGE_SIZE;
>> +		mmu_notifier_range_init(range, mm, address & PAGE_MASK,
>> +					(address & PAGE_MASK) + PAGE_SIZE);
>>  		mmu_notifier_invalidate_range_start(range);
>>  	}
>>  	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);

Looks correct to me, as well.

Having the range struct declared in separate places from the
mmu_notifier_range_init() calls is not great. But I'm not sure I see a
way to make it significantly cleaner, given that __follow_pte_pmd uses
the range pointer as a way to decide to issue the mmn calls.

thanks,
On Wed, Jan 02, 2019 at 07:32:08PM -0800, John Hubbard wrote:
> Having the range struct declared in separate places from the mmu_notifier_range_init()
> calls is not great. But I'm not sure I see a way to make it significantly cleaner, given
> that __follow_pte_pmd uses the range pointer as a way to decide to issue the mmn calls.

Yeah, I don't think there's anything we can do. But I started reviewing
the comments, and they don't make sense together:

			/*
			 * Note because we provide range to follow_pte_pmd it will
			 * call mmu_notifier_invalidate_range_start() on our behalf
			 * before taking any lock.
			 */
			if (follow_pte_pmd(vma->vm_mm, address, &range,
					   &ptep, &pmdp, &ptl))
				continue;

			/*
			 * No need to call mmu_notifier_invalidate_range() as we are
			 * downgrading page table protection not changing it to point
			 * to a new page.
			 *
			 * See Documentation/vm/mmu_notifier.rst
			 */

So if we don't call mmu_notifier_invalidate_range, why are we calling
mmu_notifier_invalidate_range_start and mmu_notifier_invalidate_range_end?
ie, why not this ...

diff --git a/fs/dax.c b/fs/dax.c
index 6959837cc465..905340149924 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -777,7 +777,6 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
 
 	i_mmap_lock_read(mapping);
 	vma_interval_tree_foreach(vma, &mapping->i_mmap, index, index) {
-		struct mmu_notifier_range range;
 		unsigned long address;
 
 		cond_resched();
@@ -787,12 +786,7 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
 
 		address = pgoff_address(index, vma);
 
-		/*
-		 * Note because we provide start/end to follow_pte_pmd it will
-		 * call mmu_notifier_invalidate_range_start() on our behalf
-		 * before taking any lock.
-		 */
-		if (follow_pte_pmd(vma->vm_mm, address, &range,
+		if (follow_pte_pmd(vma->vm_mm, address, NULL,
 				   &ptep, &pmdp, &ptl))
 			continue;
 
@@ -834,8 +828,6 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
 unlock_pte:
 		pte_unmap_unlock(ptep, ptl);
 		}
-
-		mmu_notifier_invalidate_range_end(&range);
 	}
 	i_mmap_unlock_read(mapping);
 }
On Wed, Jan 02, 2019 at 08:18:33PM -0800, Matthew Wilcox wrote:
> On Wed, Jan 02, 2019 at 07:32:08PM -0800, John Hubbard wrote:
> > Having the range struct declared in separate places from the mmu_notifier_range_init()
> > calls is not great. But I'm not sure I see a way to make it significantly cleaner, given
> > that __follow_pte_pmd uses the range pointer as a way to decide to issue the mmn calls.
> 
> Yeah, I don't think there's anything we can do. But I started reviewing
> the comments, and they don't make sense together:
> 
> 			/*
> 			 * Note because we provide range to follow_pte_pmd it will
> 			 * call mmu_notifier_invalidate_range_start() on our behalf
> 			 * before taking any lock.
> 			 */
> 			if (follow_pte_pmd(vma->vm_mm, address, &range,
> 					   &ptep, &pmdp, &ptl))
> 				continue;
> 
> 			/*
> 			 * No need to call mmu_notifier_invalidate_range() as we are
> 			 * downgrading page table protection not changing it to point
> 			 * to a new page.
> 			 *
> 			 * See Documentation/vm/mmu_notifier.rst
> 			 */
> 
> So if we don't call mmu_notifier_invalidate_range, why are we calling
> mmu_notifier_invalidate_range_start and mmu_notifier_invalidate_range_end?
> ie, why not this ...

Those comments look wrong to me ... we need to call
mmu_notifier_invalidate_range(); those are used by the
IOMMU. I might be to blame for those comments though.

> 
> diff --git a/fs/dax.c b/fs/dax.c
> index 6959837cc465..905340149924 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -777,7 +777,6 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
>  
>  	i_mmap_lock_read(mapping);
>  	vma_interval_tree_foreach(vma, &mapping->i_mmap, index, index) {
> -		struct mmu_notifier_range range;
>  		unsigned long address;
>  
>  		cond_resched();
> @@ -787,12 +786,7 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
>  
>  		address = pgoff_address(index, vma);
>  
> -		/*
> -		 * Note because we provide start/end to follow_pte_pmd it will
> -		 * call mmu_notifier_invalidate_range_start() on our behalf
> -		 * before taking any lock.
> -		 */
> -		if (follow_pte_pmd(vma->vm_mm, address, &range,
> +		if (follow_pte_pmd(vma->vm_mm, address, NULL,
>  				   &ptep, &pmdp, &ptl))
>  			continue;
>  
> @@ -834,8 +828,6 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
>  unlock_pte:
>  		pte_unmap_unlock(ptep, ptl);
>  		}
> -
> -		mmu_notifier_invalidate_range_end(&range);
>  	}
>  	i_mmap_unlock_read(mapping);
>  }
On Wed, Jan 02, 2019 at 04:21:26PM -0800, Matthew Wilcox wrote:
> 
> One of the paths in follow_pte_pmd() initialised the mmu_notifier_range
> incorrectly.
> 
> Signed-off-by: Matthew Wilcox <willy@infradead.org>
> Fixes: ac46d4f3c432 ("mm/mmu_notifier: use structure for invalidate_range_start/end calls v2")
> Tested-by: Dave Chinner <dchinner@redhat.com>

Actually, now that I have read the code again, this is not OK to do.
The caller of follow_pte_pmd() will call range_init() and
follow_pte_pmd() will only update the range address. So the existing
code is OK.

I know this is kind of ugly but I do not see a way around that
ugliness.

> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 2dd2f9ab57f4..21a650368be0 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4078,8 +4078,8 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
>  		goto out;
>  
>  	if (range) {
> -		range->start = address & PAGE_MASK;
> -		range->end = range->start + PAGE_SIZE;
> +		mmu_notifier_range_init(range, mm, address & PAGE_MASK,
> +					(address & PAGE_MASK) + PAGE_SIZE);
>  		mmu_notifier_invalidate_range_start(range);
>  	}
>  	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
On Thu, Jan 03, 2019 at 09:31:16AM -0500, Jerome Glisse wrote:
> On Wed, Jan 02, 2019 at 04:21:26PM -0800, Matthew Wilcox wrote:
> > 
> > One of the paths in follow_pte_pmd() initialised the mmu_notifier_range
> > incorrectly.
> > 
> > Signed-off-by: Matthew Wilcox <willy@infradead.org>
> > Fixes: ac46d4f3c432 ("mm/mmu_notifier: use structure for invalidate_range_start/end calls v2")
> > Tested-by: Dave Chinner <dchinner@redhat.com>
> 
> Actually, now that I have read the code again, this is not OK to do.
> The caller of follow_pte_pmd() will call range_init() and
> follow_pte_pmd() will only update the range address. So the existing
> code is OK.

The only caller of follow_pte_pmd() does not call range_init() because
it doesn't know the address. That's the point of follow_pte_pmd().

> I know this is kind of ugly but I do not see a way around that
> ugliness.

You wrote the code ...
On Thu, Jan 03, 2019 at 09:29:59AM -0500, Jerome Glisse wrote:
> On Wed, Jan 02, 2019 at 08:18:33PM -0800, Matthew Wilcox wrote:
> > On Wed, Jan 02, 2019 at 07:32:08PM -0800, John Hubbard wrote:
> > > Having the range struct declared in separate places from the mmu_notifier_range_init()
> > > calls is not great. But I'm not sure I see a way to make it significantly cleaner, given
> > > that __follow_pte_pmd uses the range pointer as a way to decide to issue the mmn calls.
> > 
> > Yeah, I don't think there's anything we can do. But I started reviewing
> > the comments, and they don't make sense together:
> > 
> > 			/*
> > 			 * Note because we provide range to follow_pte_pmd it will
> > 			 * call mmu_notifier_invalidate_range_start() on our behalf
> > 			 * before taking any lock.
> > 			 */
> > 			if (follow_pte_pmd(vma->vm_mm, address, &range,
> > 					   &ptep, &pmdp, &ptl))
> > 				continue;
> > 
> > 			/*
> > 			 * No need to call mmu_notifier_invalidate_range() as we are
> > 			 * downgrading page table protection not changing it to point
> > 			 * to a new page.
> > 			 *
> > 			 * See Documentation/vm/mmu_notifier.rst
> > 			 */
> > 
> > So if we don't call mmu_notifier_invalidate_range, why are we calling
> > mmu_notifier_invalidate_range_start and mmu_notifier_invalidate_range_end?
> > ie, why not this ...
> 
> Those comments look wrong to me ... we need to call
> mmu_notifier_invalidate_range(); those are used by the
> IOMMU. I might be to blame for those comments though.

Yes, you're to blame for both of them.

a4d1a88525138 (Jérôme Glisse 2017-08-31 17:17:26 -0400 791)		 * Note because we provide start/end to follow_pte_pmd it will
a4d1a88525138 (Jérôme Glisse 2017-08-31 17:17:26 -0400 792)		 * call mmu_notifier_invalidate_range_start() on our behalf
a4d1a88525138 (Jérôme Glisse 2017-08-31 17:17:26 -0400 793)		 * before taking any lock.

0f10851ea475e (Jérôme Glisse 2017-11-15 17:34:07 -0800 794)		 * No need to call mmu_notifier_invalidate_range() as we are
0f10851ea475e (Jérôme Glisse 2017-11-15 17:34:07 -0800 795)		 * downgrading page table protection not changing it to point
0f10851ea475e (Jérôme Glisse 2017-11-15 17:34:07 -0800 796)		 * to a new page.
On Thu, Jan 03, 2019 at 09:31:16AM -0500, Jerome Glisse wrote:
> On Wed, Jan 02, 2019 at 04:21:26PM -0800, Matthew Wilcox wrote:
> > 
> > One of the paths in follow_pte_pmd() initialised the mmu_notifier_range
> > incorrectly.
> > 
> > Signed-off-by: Matthew Wilcox <willy@infradead.org>
> > Fixes: ac46d4f3c432 ("mm/mmu_notifier: use structure for invalidate_range_start/end calls v2")
> > Tested-by: Dave Chinner <dchinner@redhat.com>
> 
> Actually, now that I have read the code again, this is not OK to do.
> The caller of follow_pte_pmd() will call range_init() and
> follow_pte_pmd() will only update the range address. So the existing
> code is OK.

I think you need to re-read your own patch.

`git show ac46d4f3c43241ffa23d5bf36153a0830c0e02cc`

@@ -4058,10 +4059,10 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 		if (!pmdpp)
 			goto out;
 
-		if (start && end) {
-			*start = address & PMD_MASK;
-			*end = *start + PMD_SIZE;
-			mmu_notifier_invalidate_range_start(mm, *start, *end);
+		if (range) {
+			mmu_notifier_range_init(range, mm, address & PMD_MASK,
+						(address & PMD_MASK) + PMD_SIZE);
+			mmu_notifier_invalidate_range_start(range);

... so it's fine to call range_init() *here*.

@@ -4069,17 +4070,17 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
[...]
 	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
 		goto out;
 
-	if (start && end) {
-		*start = address & PAGE_MASK;
-		*end = *start + PAGE_SIZE;
-		mmu_notifier_invalidate_range_start(mm, *start, *end);
+	if (range) {
+		range->start = address & PAGE_MASK;
+		range->end = range->start + PAGE_SIZE;
+		mmu_notifier_invalidate_range_start(range);

... but then *not* here later in the same function? You're not making
any sense.
On Thu, Jan 03, 2019 at 06:39:08AM -0800, Matthew Wilcox wrote:
> On Thu, Jan 03, 2019 at 09:29:59AM -0500, Jerome Glisse wrote:
> > On Wed, Jan 02, 2019 at 08:18:33PM -0800, Matthew Wilcox wrote:
> > > On Wed, Jan 02, 2019 at 07:32:08PM -0800, John Hubbard wrote:
> > > > Having the range struct declared in separate places from the mmu_notifier_range_init()
> > > > calls is not great. But I'm not sure I see a way to make it significantly cleaner, given
> > > > that __follow_pte_pmd uses the range pointer as a way to decide to issue the mmn calls.
> > > 
> > > Yeah, I don't think there's anything we can do. But I started reviewing
> > > the comments, and they don't make sense together:
> > > 
> > > 			/*
> > > 			 * Note because we provide range to follow_pte_pmd it will
> > > 			 * call mmu_notifier_invalidate_range_start() on our behalf
> > > 			 * before taking any lock.
> > > 			 */
> > > 			if (follow_pte_pmd(vma->vm_mm, address, &range,
> > > 					   &ptep, &pmdp, &ptl))
> > > 				continue;
> > > 
> > > 			/*
> > > 			 * No need to call mmu_notifier_invalidate_range() as we are
> > > 			 * downgrading page table protection not changing it to point
> > > 			 * to a new page.
> > > 			 *
> > > 			 * See Documentation/vm/mmu_notifier.rst
> > > 			 */
> > > 
> > > So if we don't call mmu_notifier_invalidate_range, why are we calling
> > > mmu_notifier_invalidate_range_start and mmu_notifier_invalidate_range_end?
> > > ie, why not this ...
> > 
> > Those comments look wrong to me ... we need to call
> > mmu_notifier_invalidate_range(); those are used by the
> > IOMMU. I might be to blame for those comments though.
> 
> Yes, you're to blame for both of them.
> 
> a4d1a88525138 (Jérôme Glisse 2017-08-31 17:17:26 -0400 791)		 * Note because we provide start/end to follow_pte_pmd it will
> a4d1a88525138 (Jérôme Glisse 2017-08-31 17:17:26 -0400 792)		 * call mmu_notifier_invalidate_range_start() on our behalf
> a4d1a88525138 (Jérôme Glisse 2017-08-31 17:17:26 -0400 793)		 * before taking any lock.
> 
> 0f10851ea475e (Jérôme Glisse 2017-11-15 17:34:07 -0800 794)		 * No need to call mmu_notifier_invalidate_range() as we are
> 0f10851ea475e (Jérôme Glisse 2017-11-15 17:34:07 -0800 795)		 * downgrading page table protection not changing it to point
> 0f10851ea475e (Jérôme Glisse 2017-11-15 17:34:07 -0800 796)		 * to a new page.

I remember now: we do not need to call invalidate_range() explicitly
because invalidate_range_end() calls invalidate_range() itself, so it
is fine. The comments should be better though.

So the existing code is fine.

Cheers,
Jérôme
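[Editor's note: Jérôme's observation — that invalidate_range_end() subsumes the explicit invalidate_range() call — can be sketched in a few lines. This is a userspace model with stand-in functions, not the real mm/mmu_notifier.c implementation, which iterates registered notifiers and also supports an "only end" mode that skips this step:]

```c
#include <assert.h>

/* Counter standing in for the per-notifier ->invalidate_range()
 * callback (the one IOMMU drivers rely on). */
static int range_calls;

static void mmu_notifier_invalidate_range(unsigned long start,
					  unsigned long end)
{
	range_calls++;
}

/* Model of the behaviour noted above: _end() itself invokes
 * invalidate_range(), so a caller that brackets a pure protection
 * downgrade with _start()/_end() does not need an extra explicit
 * invalidate_range() call in between. */
static void mmu_notifier_invalidate_range_end(unsigned long start,
					      unsigned long end)
{
	mmu_notifier_invalidate_range(start, end);
	/* ... then the end-only notifier callbacks would run ... */
}
```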
On Thu, Jan 03, 2019 at 06:43:13AM -0800, Matthew Wilcox wrote:
> On Thu, Jan 03, 2019 at 09:31:16AM -0500, Jerome Glisse wrote:
> > On Wed, Jan 02, 2019 at 04:21:26PM -0800, Matthew Wilcox wrote:
> > > 
> > > One of the paths in follow_pte_pmd() initialised the mmu_notifier_range
> > > incorrectly.
> > > 
> > > Signed-off-by: Matthew Wilcox <willy@infradead.org>
> > > Fixes: ac46d4f3c432 ("mm/mmu_notifier: use structure for invalidate_range_start/end calls v2")
> > > Tested-by: Dave Chinner <dchinner@redhat.com>
> > 
> > Actually, now that I have read the code again, this is not OK to do.
> > The caller of follow_pte_pmd() will call range_init() and
> > follow_pte_pmd() will only update the range address. So the existing
> > code is OK.
> 
> I think you need to re-read your own patch.
> 
> `git show ac46d4f3c43241ffa23d5bf36153a0830c0e02cc`
> 
> @@ -4058,10 +4059,10 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
>  		if (!pmdpp)
>  			goto out;
>  
> -		if (start && end) {
> -			*start = address & PMD_MASK;
> -			*end = *start + PMD_SIZE;
> -			mmu_notifier_invalidate_range_start(mm, *start, *end);
> +		if (range) {
> +			mmu_notifier_range_init(range, mm, address & PMD_MASK,
> +						(address & PMD_MASK) + PMD_SIZE);
> +			mmu_notifier_invalidate_range_start(range);
> 
> ... so it's fine to call range_init() *here*.
> 
> @@ -4069,17 +4070,17 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
> [...]
>  	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
>  		goto out;
>  
> -	if (start && end) {
> -		*start = address & PAGE_MASK;
> -		*end = *start + PAGE_SIZE;
> -		mmu_notifier_invalidate_range_start(mm, *start, *end);
> +	if (range) {
> +		range->start = address & PAGE_MASK;
> +		range->end = range->start + PAGE_SIZE;
> +		mmu_notifier_invalidate_range_start(range);
> 
> ... but then *not* here later in the same function? You're not making
> any sense.

Ah, I see now that the patch that adds the reason why the mmu notifier
is called has been dropped. So yes, using range_init() in
follow_pte_pmd() is fine. With that other patch, the reason is set by
the caller of follow_pte_pmd(), and using range_init() would have
overwritten it.

So this patch is fine for the current tree. Sorry, I was thinking with
the other patch included in mind.

Cheers,
Jérôme
diff --git a/mm/memory.c b/mm/memory.c
index 2dd2f9ab57f4..21a650368be0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4078,8 +4078,8 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 		goto out;
 
 	if (range) {
-		range->start = address & PAGE_MASK;
-		range->end = range->start + PAGE_SIZE;
+		mmu_notifier_range_init(range, mm, address & PAGE_MASK,
+					(address & PAGE_MASK) + PAGE_SIZE);
 		mmu_notifier_invalidate_range_start(range);
 	}
 	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);