mbox series

[v2,00/19] pin_user_pages(): reduced-risk series for Linux 5.5

Message ID 20191125231035.1539120-1-jhubbard@nvidia.com (mailing list archive)
Headers show
Series pin_user_pages(): reduced-risk series for Linux 5.5 | expand

Message

John Hubbard Nov. 25, 2019, 11:10 p.m. UTC
Hi,

Changes since v1:

* Fixed up ppc in response to Jan Kara's review comments (thanks for
  those!).

* Fixed a kbuilt robot-detected build failure: added a stub function for
  the !CONFIG_MMU case.

* Cover letter: now refers to "unpin_user_page()", reflecting the name
  change in the last patch (instead of put_user_page() ).

* Rebased onto today's linux-next: c165016bac27 ("Add linux-next
  specific files for 20191125")

========================================================================
Here is a set of well-reviewed (expect for one patch), lower-risk  items
that can go into Linux 5.5. (Update: the powerpc conversion patch has
had some initial review now, since v1 was posted.)

This is essentially a cut-down v8 of "mm/gup: track dma-pinned pages:
FOLL_PIN" [1], and with one of the VFIO patches split into two patches.
The idea here is to get this long list of "noise" checked into 5.5, so
that the actual, higher-risk "track FOLL_PIN pages" (which is deferred:
not part of this series) will be a much shorter patchset to review.

For the v4l2-core changes, I've left those here (instead of sending
them separately to the -media tree), in order to get the name change
done now (put_user_page --> unpin_user_page). However, I've added a Cc
stable, as recommended during the last round of reviews.

Here are the relevant notes from the original cover letter, edited to
match the current situation:

This is a prerequisite to tracking dma-pinned pages. That in turn is a
prerequisite to solving the larger problem of proper interactions
between file-backed pages, and [R]DMA activities, as discussed in [1],
[2], [3], and in a remarkable number of email threads since about
2017. :)

A new internal gup flag, FOLL_PIN is introduced, and thoroughly
documented in the last patch's Documentation/vm/pin_user_pages.rst.

I believe that this will provide a good starting point for doing the
layout lease work that Ira Weiny has been working on. That's because
these new wrapper functions provide a clean, constrained, systematically
named set of functionality that, again, is required in order to even
know if a page is "dma-pinned".

In contrast to earlier approaches, the page tracking can be
incrementally applied to the kernel call sites that, until now, have
been simply calling get_user_pages() ("gup"). In other words, opt-in by
changing from this:

    get_user_pages() (sets FOLL_GET)
    put_page()

to this:
    pin_user_pages() (sets FOLL_PIN)
    unpin_user_page()

Because there are interdependencies with FOLL_LONGTERM, a similar
conversion as for FOLL_PIN, was applied. The change was from this:

    get_user_pages(FOLL_LONGTERM) (also sets FOLL_GET)
    put_page()

to this:
    pin_longterm_pages() (sets FOLL_PIN | FOLL_LONGTERM)
    unpin_user_page()

[1] https://lore.kernel.org/r/20191121071354.456618-1-jhubbard@nvidia.com

thanks,
John Hubbard
NVIDIA


Dan Williams (1):
  mm: Cleanup __put_devmap_managed_page() vs ->page_free()

John Hubbard (18):
  mm/gup: factor out duplicate code from four routines
  mm/gup: move try_get_compound_head() to top, fix minor issues
  goldish_pipe: rename local pin_user_pages() routine
  mm: fix get_user_pages_remote()'s handling of FOLL_LONGTERM
  vfio: fix FOLL_LONGTERM use, simplify get_user_pages_remote() call
  mm/gup: introduce pin_user_pages*() and FOLL_PIN
  goldish_pipe: convert to pin_user_pages() and put_user_page()
  IB/{core,hw,umem}: set FOLL_PIN via pin_user_pages*(), fix up ODP
  mm/process_vm_access: set FOLL_PIN via pin_user_pages_remote()
  drm/via: set FOLL_PIN via pin_user_pages_fast()
  fs/io_uring: set FOLL_PIN via pin_user_pages()
  net/xdp: set FOLL_PIN via pin_user_pages()
  media/v4l2-core: set pages dirty upon releasing DMA buffers
  media/v4l2-core: pin_user_pages (FOLL_PIN) and put_user_page()
    conversion
  vfio, mm: pin_user_pages (FOLL_PIN) and put_user_page() conversion
  powerpc: book3s64: convert to pin_user_pages() and put_user_page()
  mm/gup_benchmark: use proper FOLL_WRITE flags instead of hard-coding
    "1"
  mm, tree-wide: rename put_user_page*() to unpin_user_page*()

 Documentation/core-api/index.rst            |   1 +
 Documentation/core-api/pin_user_pages.rst   | 233 ++++++++++++++
 arch/powerpc/mm/book3s64/iommu_api.c        |  12 +-
 drivers/gpu/drm/via/via_dmablit.c           |   6 +-
 drivers/infiniband/core/umem.c              |   4 +-
 drivers/infiniband/core/umem_odp.c          |  13 +-
 drivers/infiniband/hw/hfi1/user_pages.c     |   4 +-
 drivers/infiniband/hw/mthca/mthca_memfree.c |   8 +-
 drivers/infiniband/hw/qib/qib_user_pages.c  |   4 +-
 drivers/infiniband/hw/qib/qib_user_sdma.c   |   8 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c    |   4 +-
 drivers/infiniband/sw/siw/siw_mem.c         |   4 +-
 drivers/media/v4l2-core/videobuf-dma-sg.c   |   8 +-
 drivers/nvdimm/pmem.c                       |   6 -
 drivers/platform/goldfish/goldfish_pipe.c   |  35 +-
 drivers/vfio/vfio_iommu_type1.c             |  35 +-
 fs/io_uring.c                               |   6 +-
 include/linux/mm.h                          |  77 +++--
 mm/gup.c                                    | 340 +++++++++++++-------
 mm/gup_benchmark.c                          |   9 +-
 mm/memremap.c                               |  80 ++---
 mm/process_vm_access.c                      |  28 +-
 net/xdp/xdp_umem.c                          |   4 +-
 tools/testing/selftests/vm/gup_benchmark.c  |   6 +-
 24 files changed, 650 insertions(+), 285 deletions(-)
 create mode 100644 Documentation/core-api/pin_user_pages.rst