Message ID | 20191030224930.3990755-15-jhubbard@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm/gup: track dma-pinned pages: FOLL_PIN, FOLL_LONGTERM |
On 10/30/19 3:49 PM, John Hubbard wrote:
> This also fixes one or two likely bugs.

Well, actually just one...

>
> 1. Change vfio from get_user_pages(FOLL_LONGTERM), to
> pin_longterm_pages(), which sets both FOLL_LONGTERM and FOLL_PIN.
>
> Note that this is a change in behavior, because the
> get_user_pages_remote() call was not setting FOLL_LONGTERM, but the
> new pin_user_pages_remote() call that replaces it, *is* setting
> FOLL_LONGTERM. It is important to set FOLL_LONGTERM, because the
> DMA case requires it. Please see the FOLL_PIN documentation in
> include/linux/mm.h, and Documentation/pin_user_pages.rst for details.

Correction: the above comment is stale and wrong. I wrote it before
getting further into the details, and the patch doesn't do this.

Instead, it keeps exactly the old behavior: pin_longterm_pages_remote()
is careful to avoid setting FOLL_LONGTERM. Instead of setting that flag,
it drops in a "TODO" comment nearby. :)

I'll update the commit description in the next version of the series.

thanks,

John Hubbard
NVIDIA

>
> 2. Because all FOLL_PIN-acquired pages must be released via
> put_user_page(), also convert the put_page() call over to
> put_user_pages().
>
> Note that this effectively changes the code's behavior in
> vfio_iommu_type1.c: put_pfn(): it now ultimately calls
> set_page_dirty_lock(), instead of set_page_dirty(). This is
> probably more accurate.
>
> As Christoph Hellwig put it, "set_page_dirty() is only safe if we are
> dealing with a file backed page where we have reference on the inode it
> hangs off." [1]
>
> [1] https://lore.kernel.org/r/20190723153640.GB720@lst.de
>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/vfio/vfio_iommu_type1.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index d864277ea16f..795e13f3ef08 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -327,9 +327,8 @@ static int put_pfn(unsigned long pfn, int prot)
>  {
>  	if (!is_invalid_reserved_pfn(pfn)) {
>  		struct page *page = pfn_to_page(pfn);
> -		if (prot & IOMMU_WRITE)
> -			SetPageDirty(page);
> -		put_page(page);
> +
> +		put_user_pages_dirty_lock(&page, 1, prot & IOMMU_WRITE);
>  		return 1;
>  	}
>  	return 0;
> @@ -349,11 +348,11 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
>
>  	down_read(&mm->mmap_sem);
>  	if (mm == current->mm) {
> -		ret = get_user_pages(vaddr, 1, flags | FOLL_LONGTERM, page,
> -				     vmas);
> +		ret = pin_longterm_pages(vaddr, 1, flags, page, vmas);
>  	} else {
> -		ret = get_user_pages_remote(NULL, mm, vaddr, 1, flags, page,
> -					    vmas, NULL);
> +		ret = pin_longterm_pages_remote(NULL, mm, vaddr, 1,
> +						flags, page, vmas,
> +						NULL);
>  		/*
>  		 * The lifetime of a vaddr_get_pfn() page pin is
>  		 * userspace-controlled. In the fs-dax case this could
> @@ -363,7 +362,7 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
>  		 */
>  		if (ret > 0 && vma_is_fsdax(vmas[0])) {
>  			ret = -EOPNOTSUPP;
> -			put_page(page[0]);
> +			put_user_page(page[0]);
>  		}
>  	}
>  	up_read(&mm->mmap_sem);
>
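The difference between the two wrappers, as the correction above describes it, can be sketched as a small userspace model. This is not the kernel implementation. The MODEL_* flag values and the model_* function names are hypothetical stand-ins invented for illustration. It is also an assumption, based on the series title, that the remote wrapper still sets FOLL_PIN while leaving FOLL_LONGTERM unset.

```c
#include <assert.h>

/*
 * Illustrative flag bits only (assumption): the real FOLL_* values are
 * defined in include/linux/mm.h and may differ between kernel versions.
 */
#define MODEL_FOLL_WRITE	0x01u
#define MODEL_FOLL_LONGTERM	0x10000u
#define MODEL_FOLL_PIN		0x40000u

/*
 * Models the local wrapper as the commit message describes it:
 * both FOLL_PIN and FOLL_LONGTERM are added to the caller's flags.
 */
static unsigned int model_pin_longterm_flags(unsigned int gup_flags)
{
	return gup_flags | MODEL_FOLL_PIN | MODEL_FOLL_LONGTERM;
}

/*
 * Models the remote wrapper as the correction describes it:
 * FOLL_LONGTERM is deliberately left unset (a TODO in the real patch).
 * Setting FOLL_PIN here is an assumption, not confirmed by the message.
 */
static unsigned int model_pin_longterm_remote_flags(unsigned int gup_flags)
{
	unsigned int flags = gup_flags | MODEL_FOLL_PIN;

	/* The old get_user_pages_remote() behavior is preserved. */
	assert((flags & MODEL_FOLL_LONGTERM) ==
	       (gup_flags & MODEL_FOLL_LONGTERM));
	return flags;
}
```

In this model, a caller passing only MODEL_FOLL_WRITE gets FOLL_LONGTERM from the local path but not from the remote path, which matches the "keeps exactly the old behavior" statement in the reply.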
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d864277ea16f..795e13f3ef08 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -327,9 +327,8 @@ static int put_pfn(unsigned long pfn, int prot)
 {
 	if (!is_invalid_reserved_pfn(pfn)) {
 		struct page *page = pfn_to_page(pfn);
-		if (prot & IOMMU_WRITE)
-			SetPageDirty(page);
-		put_page(page);
+
+		put_user_pages_dirty_lock(&page, 1, prot & IOMMU_WRITE);
 		return 1;
 	}
 	return 0;
@@ -349,11 +348,11 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
 
 	down_read(&mm->mmap_sem);
 	if (mm == current->mm) {
-		ret = get_user_pages(vaddr, 1, flags | FOLL_LONGTERM, page,
-				     vmas);
+		ret = pin_longterm_pages(vaddr, 1, flags, page, vmas);
 	} else {
-		ret = get_user_pages_remote(NULL, mm, vaddr, 1, flags, page,
-					    vmas, NULL);
+		ret = pin_longterm_pages_remote(NULL, mm, vaddr, 1,
+						flags, page, vmas,
+						NULL);
 		/*
 		 * The lifetime of a vaddr_get_pfn() page pin is
 		 * userspace-controlled. In the fs-dax case this could
@@ -363,7 +362,7 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
 		 */
 		if (ret > 0 && vma_is_fsdax(vmas[0])) {
 			ret = -EOPNOTSUPP;
-			put_page(page[0]);
+			put_user_page(page[0]);
 		}
 	}
 	up_read(&mm->mmap_sem);
This also fixes one or two likely bugs.

1. Change vfio from get_user_pages(FOLL_LONGTERM), to
pin_longterm_pages(), which sets both FOLL_LONGTERM and FOLL_PIN.

Note that this is a change in behavior, because the
get_user_pages_remote() call was not setting FOLL_LONGTERM, but the
new pin_user_pages_remote() call that replaces it, *is* setting
FOLL_LONGTERM. It is important to set FOLL_LONGTERM, because the
DMA case requires it. Please see the FOLL_PIN documentation in
include/linux/mm.h, and Documentation/pin_user_pages.rst for details.

2. Because all FOLL_PIN-acquired pages must be released via
put_user_page(), also convert the put_page() call over to
put_user_pages().

Note that this effectively changes the code's behavior in
vfio_iommu_type1.c: put_pfn(): it now ultimately calls
set_page_dirty_lock(), instead of set_page_dirty(). This is
probably more accurate.

As Christoph Hellwig put it, "set_page_dirty() is only safe if we are
dealing with a file backed page where we have reference on the inode it
hangs off." [1]

[1] https://lore.kernel.org/r/20190723153640.GB720@lst.de

Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)
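The pairing rule in item 2 (pages acquired through a pin call are released through the put_user_page() family, with dirtying folded into the release) can be illustrated with a userspace toy model. Everything below is a hypothetical sketch: struct model_page, the model_* functions and MODEL_IOMMU_WRITE are invented names, the pin accounting is a plain counter, and no locking is modeled, so it says nothing about how set_page_dirty_lock() actually works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative protection bit (assumption); the real one is IOMMU_WRITE. */
#define MODEL_IOMMU_WRITE	0x2

/* Toy stand-in for struct page: tracks only pins and the dirty state. */
struct model_page {
	int pin_count;	/* outstanding pins taken through the pin path */
	bool dirty;
};

/* Models acquiring a page through a FOLL_PIN-style call. */
static void model_pin_page(struct model_page *page)
{
	page->pin_count++;
}

/*
 * Models the shape of put_user_pages_dirty_lock(pages, npages, make_dirty):
 * optionally mark each page dirty, then drop the pin taken earlier.
 */
static void model_put_user_pages_dirty_lock(struct model_page **pages,
					    size_t npages, bool make_dirty)
{
	for (size_t i = 0; i < npages; i++) {
		if (make_dirty)
			pages[i]->dirty = true;
		/* Releasing a page that was never pinned is a caller bug. */
		assert(pages[i]->pin_count > 0);
		pages[i]->pin_count--;
	}
}

/*
 * Mirrors the structure of the patched put_pfn(): a NULL page stands in
 * for the is_invalid_reserved_pfn() case, and the write bit decides
 * whether the page is dirtied on release.
 */
static int model_put_pfn(struct model_page *page, int prot)
{
	if (page == NULL)
		return 0;
	model_put_user_pages_dirty_lock(&page, 1,
					(prot & MODEL_IOMMU_WRITE) != 0);
	return 1;
}
```

A page pinned once with model_pin_page() and released with model_put_pfn(page, MODEL_IOMMU_WRITE) ends up dirty with a pin count of zero, while the same release without the write bit leaves the dirty state untouched.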