[v2,0/7] convert most filesystems to pin_user_pages_fast()

Message ID 20220831041843.973026-1-jhubbard@nvidia.com (mailing list archive)

Message

John Hubbard Aug. 31, 2022, 4:18 a.m. UTC
This is v2. Changes since v1 are:

* Incorporated feedback from Al Viro and Jan Kara: this approach now
pins both bvecs (ITER_BVEC) and user pages (user_backed_iter()) with
FOLL_PIN.

* Incorporated David Hildenbrand's feedback: Rewrote the pin_user_page()
documentation and added a WARN_ON_ONCE() to somewhat enforce the rule
that this new function is only intended for use on file-backed pages.

* Added a tiny new patch to fix up the release_pages() number of pages
argument, so as to avoid a lot of impedance-matching checks in
subsequent patches.

v1 is here:

https://lore.kernel.org/all/20220827083607.2345453-1-jhubbard@nvidia.com/

Original cover letter still applies, here it is for convenience:

This converts the iomap core and bio_release_pages() to
pin_user_pages_fast(), also referred to as FOLL_PIN here.

The conversion is temporarily guarded by
CONFIG_BLK_USE_PIN_USER_PAGES_FOR_DIO. In the future (not part of this
series), when we are certain that all filesystems have converted their
Direct IO paths to FOLL_PIN, then we can do the final step, which is to
get rid of CONFIG_BLK_USE_PIN_USER_PAGES_FOR_DIO and search-and-replace
the dio_w_*() functions with their final names (see bvec.h changes).
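
As a rough illustration of the guarded-wrapper idea described above, here
is a minimal userspace sketch. It is not the code from this series:
struct mock_page, MOCK_PIN_BIAS, MOCK_USE_PIN_FOR_DIO and every mock_*
name are hypothetical stand-ins, and the refcount arithmetic is a
simplified model of get/put versus pin/unpin. The only point is that a
compile-time switch selects which acquire/release pair the wrappers
expand to, so callers keep using one name during the transition.

```c
#include <assert.h>

/* Simplified userspace stand-in for a page (assumption, not struct page). */
struct mock_page {
	int refcount;
};

/* Model: a pin adds a large bias to the refcount, a plain get adds 1. */
#define MOCK_PIN_BIAS 1024

static inline void mock_get_page(struct mock_page *p)
{
	p->refcount += 1;
}

static inline void mock_put_page(struct mock_page *p)
{
	assert(p->refcount >= 1);
	p->refcount -= 1;
}

static inline void mock_pin_page(struct mock_page *p)
{
	p->refcount += MOCK_PIN_BIAS;
}

static inline void mock_unpin_page(struct mock_page *p)
{
	assert(p->refcount >= MOCK_PIN_BIAS);
	p->refcount -= MOCK_PIN_BIAS;
}

/*
 * Hypothetical wrappers in the spirit of dio_w_*(): the config symbol
 * (here MOCK_USE_PIN_FOR_DIO) picks the pin or the get flavor for every
 * caller at once.
 */
#ifdef MOCK_USE_PIN_FOR_DIO
#define mock_dio_w_acquire(p) mock_pin_page(p)
#define mock_dio_w_release(p) mock_unpin_page(p)
#else
#define mock_dio_w_acquire(p) mock_get_page(p)
#define mock_dio_w_release(p) mock_put_page(p)
#endif

/*
 * Acquire and release one page through the wrappers. Returns how much
 * the refcount grew while the page was held.
 */
static inline int mock_dio_roundtrip(struct mock_page *p)
{
	int before = p->refcount;
	int held;

	mock_dio_w_acquire(p);
	held = p->refcount - before;
	mock_dio_w_release(p);
	return held;
}
```

Building the sketch with -DMOCK_USE_PIN_FOR_DIO switches every caller of
the wrappers from the get/put pair to the pin/unpin pair in one step.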

I'd like to get this part committed at some point, because it seems to
work well already. And this will help get the remaining items, below,
converted.

Status: although many filesystems have been converted, some remain to be
investigated. These include (you can recreate this list by grepping for
iov_iter_get_pages):

	cephfs
	cifs
	9P
	RDS
	net/core: datagram.c, skmsg.c
	net/tls
	fs/splice.c

Testing: this passes some light LTP and xfstests runs, fio, and a few
other things like that, on my local x86_64 test machine, both with and
without CONFIG_BLK_USE_PIN_USER_PAGES_FOR_DIO being set.

Conflicts: Logan, the iov_iter parts of this will conflict with your
[PATCH v9 2/8] iov_iter: introduce iov_iter_get_pages_[alloc_]flags(),
but I think it's easy to resolve.


John Hubbard (7):
  mm: change release_pages() to use unsigned long for npages
  mm/gup: introduce pin_user_page()
  block: add dio_w_*() wrappers for pin, unpin user pages
  iov_iter: new iov_iter_pin_pages*() routines
  block, bio, fs: convert most filesystems to pin_user_pages_fast()
  NFS: direct-io: convert to FOLL_PIN pages
  fuse: convert direct IO paths to use FOLL_PIN

 block/Kconfig        | 24 +++++++++++++
 block/bio.c          | 27 +++++++-------
 block/blk-map.c      |  7 ++--
 fs/direct-io.c       | 40 ++++++++++-----------
 fs/fuse/dev.c        | 11 ++++--
 fs/fuse/file.c       | 32 +++++++++++------
 fs/fuse/fuse_i.h     |  1 +
 fs/iomap/direct-io.c |  2 +-
 fs/nfs/direct.c      | 22 ++++++------
 include/linux/bvec.h | 37 +++++++++++++++++++
 include/linux/mm.h   |  3 +-
 include/linux/uio.h  |  4 +++
 lib/iov_iter.c       | 86 ++++++++++++++++++++++++++++++++++++++++----
 mm/gup.c             | 50 ++++++++++++++++++++++++++
 mm/swap.c            |  6 ++--
 15 files changed, 282 insertions(+), 70 deletions(-)


base-commit: dcf8e5633e2e69ad60b730ab5905608b756a032f

Comments

Christoph Hellwig Sept. 6, 2022, 6:36 a.m. UTC | #1
On Tue, Aug 30, 2022 at 09:18:36PM -0700, John Hubbard wrote:
> The conversion is temporarily guarded by
> CONFIG_BLK_USE_PIN_USER_PAGES_FOR_DIO. In the future (not part of this
> series), when we are certain that all filesystems have converted their
> Direct IO paths to FOLL_PIN, then we can do the final step, which is to
> get rid of CONFIG_BLK_USE_PIN_USER_PAGES_FOR_DIO and search-and-replace
> the dio_w_*() functions with their final names (see bvec.h changes).

What is the point of these wrappers?  We should be able to
convert one caller at a time in an entirely safe way.

John Hubbard Sept. 6, 2022, 7:10 a.m. UTC | #2
On 9/5/22 23:36, Christoph Hellwig wrote:
> On Tue, Aug 30, 2022 at 09:18:36PM -0700, John Hubbard wrote:
>> The conversion is temporarily guarded by
>> CONFIG_BLK_USE_PIN_USER_PAGES_FOR_DIO. In the future (not part of this
>> series), when we are certain that all filesystems have converted their
>> Direct IO paths to FOLL_PIN, then we can do the final step, which is to
>> get rid of CONFIG_BLK_USE_PIN_USER_PAGES_FOR_DIO and search-and-replace
>> the dio_w_*() functions with their final names (see bvec.h changes).
> 
> What is the point of these wrappers?  We should be able to
> convert one caller at a time in an entirely safe way.

I would be delighted if that were somehow possible. Every time I think
it's possible, it has fallen apart. The fact that bio_release_pages()
will need to switch over from put_page() to unpin_user_page(), combined
with the fact that there are a lot of callers that submit bios, has
led me to the current approach.

What did you have in mind?

thanks,

Christoph Hellwig Sept. 6, 2022, 7:22 a.m. UTC | #3
On Tue, Sep 06, 2022 at 12:10:54AM -0700, John Hubbard wrote:
> I would be delighted if that were somehow possible. Every time I think
> it's possible, it has fallen apart. The fact that bio_release_pages()
> will need to switch over from put_page() to unpin_user_page(), combined
> with the fact that there are a lot of callers that submit bios, has
> led me to the current approach.

We can (temporarily) pass the gup flag to bio_release_pages or even
better add a new bio_unpin_pages helper that undoes the pin side.
That is: don't try to reuse the old APIs, but add new ones, just like
we do on the lower layers.
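
To make the first option above concrete (temporarily passing the gup flag
to the release path), here is a minimal userspace sketch. It is only a
model under stated assumptions: struct mock_bio, struct mock_page,
MOCK_FOLL_PIN, MOCK_PIN_BIAS and all mock_* functions are hypothetical
names invented for illustration, not kernel APIs. It shows how a per-call
flag lets converted and unconverted callers coexist, because each caller
tells the release helper which kind of reference it took.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: none of these mock_* names exist in the kernel. */
#define MOCK_PIN_BIAS  1024   /* refcount units added per pin in this model */
#define MOCK_FOLL_PIN  0x1u   /* hypothetical flag value for this sketch */
#define MOCK_MAX_PAGES 4

struct mock_page {
	int refcount;
};

struct mock_bio {
	struct mock_page *pages[MOCK_MAX_PAGES];
	size_t nr_pages;
};

/*
 * Take a reference on a page (a pin if MOCK_FOLL_PIN is set, a plain
 * get otherwise) and attach it to the bio. Returns 0, or -1 if full.
 */
static inline int mock_bio_add_page(struct mock_bio *bio,
				    struct mock_page *page,
				    unsigned int gup_flags)
{
	if (bio->nr_pages >= MOCK_MAX_PAGES)
		return -1;
	page->refcount += (gup_flags & MOCK_FOLL_PIN) ? MOCK_PIN_BIAS : 1;
	bio->pages[bio->nr_pages++] = page;
	return 0;
}

/*
 * Release every page in the bio, undoing whichever kind of reference
 * the caller says it took. The flag must match the one used to add.
 */
static inline void mock_bio_release_pages(struct mock_bio *bio,
					  unsigned int gup_flags)
{
	int drop = (gup_flags & MOCK_FOLL_PIN) ? MOCK_PIN_BIAS : 1;
	size_t i;

	for (i = 0; i < bio->nr_pages; i++) {
		assert(bio->pages[i]->refcount >= drop);
		bio->pages[i]->refcount -= drop;
	}
	bio->nr_pages = 0;
}
```

In this model a legacy caller passes 0 on both sides and a converted
caller passes MOCK_FOLL_PIN on both sides, so callers can be switched
one at a time.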

John Hubbard Sept. 6, 2022, 7:37 a.m. UTC | #4
On 9/6/22 00:22, Christoph Hellwig wrote:
> On Tue, Sep 06, 2022 at 12:10:54AM -0700, John Hubbard wrote:
>> I would be delighted if that were somehow possible. Every time I think
>> it's possible, it has fallen apart. The fact that bio_release_pages()
>> will need to switch over from put_page() to unpin_user_page(), combined
>> with the fact that there are a lot of callers that submit bios, has
>> led me to the current approach.
> 
> We can (temporarily) pass the gup flag to bio_release_pages or even
> better add a new bio_unpin_pages helper that undoes the pin side.
> That is: don't try to reuse the old APIs, but add new ones, just like
> we do on the lower layers.

OK...so, to confirm: the idea is to convert these callsites (below) to
call a new bio_unpin_pages() routine that does unpin_user_page().

$ git grep -nw bio_release_pages
block/bio.c:1474:               bio_release_pages(bio, true);
block/bio.c:1490:       bio_release_pages(bio, false);
block/blk-map.c:308:    bio_release_pages(bio, false);
block/blk-map.c:610:                    bio_release_pages(bio, bio_data_dir(bio) == READ);
block/fops.c:99:        bio_release_pages(&bio, should_dirty);
block/fops.c:165:               bio_release_pages(bio, false);
block/fops.c:289:               bio_release_pages(bio, false);
fs/direct-io.c:510:             bio_release_pages(bio, should_dirty);
fs/iomap/direct-io.c:185:               bio_release_pages(bio, false);
fs/zonefs/super.c:793:  bio_release_pages(bio, false);


And these can probably be done in groups, not as one big patch--your
other email also asked to break them up into block, iomap, and legacy.

That would be nice, I really hate the ugly dio_w_*() wrappers, thanks
for this help. :)

thanks,

Christoph Hellwig Sept. 6, 2022, 7:46 a.m. UTC | #5
On Tue, Sep 06, 2022 at 12:37:00AM -0700, John Hubbard wrote:
> On 9/6/22 00:22, Christoph Hellwig wrote:
> > On Tue, Sep 06, 2022 at 12:10:54AM -0700, John Hubbard wrote:
> >> I would be delighted if that were somehow possible. Every time I think
> >> it's possible, it has fallen apart. The fact that bio_release_pages()
> >> will need to switch over from put_page() to unpin_user_page(), combined
> >> with the fact that there are a lot of callers that submit bios, has
> >> led me to the current approach.
> > 
> > We can (temporarily) pass the gup flag to bio_release_pages or even
> > better add a new bio_unpin_pages helper that undoes the pin side.
> > That is: don't try to reuse the old APIs, but add new ones, just like
> > we do on the lower layers.
> 
> OK...so, to confirm: the idea is to convert these callsites (below) to
> call a new bio_unpin_pages() routine that does unpin_user_page().

Yeah.  And to stay symmetric also a new bio_iov_iter_pin_pages for
the pin side.
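
A symmetric pin/unpin helper pair along those lines might look roughly
like the userspace sketch below. Everything in it is an assumption made
for illustration: the mock_* functions only borrow the names proposed in
this thread, a plain array stands in for the iov_iter, and the refcount
bias is a simplified model of a pin. It is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: all names and values here are hypothetical. */
#define MOCK_PIN_BIAS  1024   /* refcount units added per pin in this model */
#define MOCK_MAX_PAGES 4

struct mock_page {
	int refcount;
};

struct mock_bio {
	struct mock_page *pages[MOCK_MAX_PAGES];
	size_t nr_pages;
};

/*
 * Pin side: pin pages from src (standing in for an iov_iter) into the
 * bio until either n pages are pinned or the bio is full. Returns the
 * number of pages actually pinned.
 */
static inline size_t mock_bio_iov_iter_pin_pages(struct mock_bio *bio,
						 struct mock_page *src,
						 size_t n)
{
	size_t done = 0;

	while (done < n && bio->nr_pages < MOCK_MAX_PAGES) {
		src[done].refcount += MOCK_PIN_BIAS;
		bio->pages[bio->nr_pages++] = &src[done];
		done++;
	}
	return done;
}

/* Unpin side: undo exactly what the pin side did, then empty the bio. */
static inline void mock_bio_unpin_pages(struct mock_bio *bio)
{
	size_t i;

	for (i = 0; i < bio->nr_pages; i++) {
		assert(bio->pages[i]->refcount >= MOCK_PIN_BIAS);
		bio->pages[i]->refcount -= MOCK_PIN_BIAS;
	}
	bio->nr_pages = 0;
}
```

Because the new pair never touches the plain get/put helpers, an
unconverted caller is unaffected until it is switched over.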