[0/6] block, fs: convert most Direct IO cases to FOLL_PIN

Message ID: 20220227093434.2889464-1-jhubbard@nvidia.com

jhubbard.send.patches@gmail.com Feb. 27, 2022, 9:34 a.m. UTC
From: John Hubbard <jhubbard@nvidia.com>

Hi,

The feedback on the RFC [1] prompted me to convert the core Direct IO
subsystem all at once. The key differences here, as compared to the RFC,
are:

    * no dio_w_*() wrapper routines;

    * no CONFIG parameter; and

    * new iov_iter_pin_pages*() routines that pin pages without
      affecting other callers of iov_iter_get_pages*(). Those other
      callers (ceph, rds, net, ...) can be converted separately.

Also, many pre-existing callers of unpin_user_pages_dirty_lock() are
wrong, and this series adds a few more callers, so readers may naturally
wonder about that. I recently had a very productive discussion with Ted
Ts'o, who suggested a way to fix the problem, and I plan to implement it
next. However, I think it's best to do that fix separately from this
series, probably layered on top, although it could go either before or
after.

As part of fixing the "get_user_pages() + file-backed memory" problem
[2], and to support various COW-related fixes as well [3], we need to
convert the Direct IO code from get_user_pages_fast() to
pin_user_pages_fast(). Because pin_user_pages*() calls require a
corresponding call to unpin_user_page(), the conversion is more
elaborate than a simple substitution.
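
To illustrate why the release side has to be converted along with the
acquire side, here is a small userspace model. It is only a sketch and
not kernel code: every model_*() name and MODEL_PIN_BIAS are hypothetical
stand-ins invented for this example. It assumes the small-page case, in
which a FOLL_PIN pin is accounted by adding GUP_PIN_COUNTING_BIAS (1024)
to the page refcount, while a plain get adds 1.

```c
/* Userspace model only. None of these names exist in the kernel. */
#include <assert.h>

/* Assumed to mirror GUP_PIN_COUNTING_BIAS (1 << 10) for small pages. */
#define MODEL_PIN_BIAS 1024

struct model_page {
	int refcount;	/* stands in for the page reference count */
};

/* Models the effect of get_user_pages_fast() on one page. */
void model_get_page(struct model_page *p)
{
	p->refcount += 1;
}

/* Models put_page(): drops one plain reference. */
void model_put_page(struct model_page *p)
{
	assert(p->refcount >= 1);
	p->refcount -= 1;
}

/* Models the effect of pin_user_pages_fast() on one page. */
void model_pin_page(struct model_page *p)
{
	p->refcount += MODEL_PIN_BIAS;
}

/* Models unpin_user_page(): drops one pin. */
void model_unpin_page(struct model_page *p)
{
	assert(p->refcount >= MODEL_PIN_BIAS);
	p->refcount -= MODEL_PIN_BIAS;
}

/* Models a "maybe pinned" heuristic based on the refcount alone. */
int model_maybe_pinned(const struct model_page *p)
{
	return p->refcount >= MODEL_PIN_BIAS;
}

/* Fully converted path: pin, do the I/O, unpin. Returns the net change. */
int model_converted_io(struct model_page *p)
{
	int before = p->refcount;

	model_pin_page(p);
	/* ... Direct IO would happen here ... */
	model_unpin_page(p);
	return p->refcount - before;
}

/*
 * Half-converted path: the acquire side was switched to pin, but the
 * release side still uses the old put. Returns the net change.
 */
int model_half_converted_io(struct model_page *p)
{
	int before = p->refcount;

	model_pin_page(p);
	/* ... Direct IO would happen here ... */
	model_put_page(p);
	return p->refcount - before;
}
```

In this model, the fully converted path leaves the refcount unchanged,
whereas the half-converted path leaks MODEL_PIN_BIAS - 1 references and
leaves the page looking pinned indefinitely. That is why each release
site has to be located and converted together with its acquire site.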

In the main patch (patch 4) I'm a little concerned about the
bio_map_user_iov() changes, because the sole caller,
blk_rq_map_user_iov(), has either a direct-mapped case or a
copy-from-user case, and I'm still not sure that these are properly kept
separate from a page-unpinning point of view. So a close look there by
reviewers would be welcome.

Testing: this needs lots of filesystem testing.

In this patchset:

Patches 1, 2: provide a few new routines that will be used by the
conversion: pin_user_page(), iov_iter_pin_pages(),
iov_iter_pin_pages_alloc().

Patch 3: provide a few asserts that only user space pages are being
passed in for Direct IO. (This patch could be folded into another
patch.)

Patch 4: convert all Direct IO callers that use iomap, or
blockdev_direct_IO(), or bio_iov_iter_get_pages().

Patches 5, 6: convert a few other callers to the new system:
NFS-Direct and fuse.

This is based on linux-next (next-20220225). I've also stashed it here:

    https://github.com/johnhubbard/linux bio_pup_next_20220225


[1] https://lore.kernel.org/r/20220225085025.3052894-1-jhubbard@nvidia.com

[2] https://lwn.net/Articles/753027/ "The trouble with get_user_pages()"

[3] https://lore.kernel.org/all/20211217113049.23850-1-david@redhat.com/T/#u
    (David Hildenbrand's mm/COW fixes)

John Hubbard (6):
  mm/gup: introduce pin_user_page()
  iov_iter: new iov_iter_pin_pages*(), for FOLL_PIN pages
  block, fs: assert that key paths use iovecs, and nothing else
  block, bio, fs: convert most filesystems to pin_user_pages_fast()
  NFS: direct-io: convert to FOLL_PIN pages
  fuse: convert direct IO paths to use FOLL_PIN

 block/bio.c          | 29 ++++++++--------
 block/blk-map.c      |  6 ++--
 fs/direct-io.c       | 28 ++++++++--------
 fs/fuse/dev.c        |  7 ++--
 fs/fuse/file.c       | 38 +++++----------------
 fs/iomap/direct-io.c |  2 +-
 fs/nfs/direct.c      | 15 +++------
 include/linux/mm.h   |  1 +
 include/linux/uio.h  |  4 +++
 lib/iov_iter.c       | 78 ++++++++++++++++++++++++++++++++++++++++++++
 mm/gup.c             | 34 +++++++++++++++++++
 11 files changed, 170 insertions(+), 72 deletions(-)


base-commit: 06aeb1495c39c86ccfaf1adadc1d2200179f16eb