Message ID | 20190724042518.14363-1-jhubbard@nvidia.com (mailing list archive) |
---|---|
Headers | show |
Series | block/bio, fs: convert put_page() to put_user_page*() | expand |
On Tue, Jul 23, 2019 at 09:25:06PM -0700, john.hubbard@gmail.com wrote: > * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter. > Then, use the new iov_iter_get_pages_use_gup() to retrieve it when > it is time to release the pages. That allows choosing between put_page() > and put_user_page*(). > > * Pass in one more piece of information to bio_release_pages: a "from_gup" > parameter. Similar use as above. > > * Change the block layer, and several file systems, to use > put_user_page*(). I think we can do this in a simple and better way. We have 5 ITER_* types. Of those ITER_DISCARD as the name suggests never uses pages, so we can skip handling it. ITER_PIPE is rejected іn the direct I/O path, which leaves us with three. Out of those ITER_BVEC needs a user page reference, so we want to call put_user_page* on it. ITER_BVEC always already has page reference, which means in the block direct I/O path path we alread don't take a page reference. We should extent that handling to all other calls of iov_iter_get_pages / iov_iter_get_pages_alloc. I think we should just reject ITER_KVEC for direct I/O as well as we have no users and it is rather pointless. Alternatively if we see a use for it the callers should always have a life page reference anyway (or might be on kmalloc memory), so we really should not take a reference either. In other words: the only time we should ever have to put a page in this patch is when they are user pages. We'll need to clean up various bits of code for that, but that can be done gradually before even getting to the actual put_user_pages conversion.
On 7/23/19 11:17 PM, Christoph Hellwig wrote: > On Tue, Jul 23, 2019 at 09:25:06PM -0700, john.hubbard@gmail.com wrote: >> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter. >> Then, use the new iov_iter_get_pages_use_gup() to retrieve it when >> it is time to release the pages. That allows choosing between put_page() >> and put_user_page*(). >> >> * Pass in one more piece of information to bio_release_pages: a "from_gup" >> parameter. Similar use as above. >> >> * Change the block layer, and several file systems, to use >> put_user_page*(). > > I think we can do this in a simple and better way. We have 5 ITER_* > types. Of those ITER_DISCARD as the name suggests never uses pages, so > we can skip handling it. ITER_PIPE is rejected іn the direct I/O path, > which leaves us with three. > > Out of those ITER_BVEC needs a user page reference, so we want to call ^ ITER_IOVEC, I hope. Otherwise I'm hopeless lost. :) > put_user_page* on it. ITER_BVEC always already has page reference, > which means in the block direct I/O path path we alread don't take > a page reference. We should extent that handling to all other calls > of iov_iter_get_pages / iov_iter_get_pages_alloc. I think we should > just reject ITER_KVEC for direct I/O as well as we have no users and > it is rather pointless. Alternatively if we see a use for it the > callers should always have a life page reference anyway (or might > be on kmalloc memory), so we really should not take a reference either. > > In other words: the only time we should ever have to put a page in > this patch is when they are user pages. We'll need to clean up > various bits of code for that, but that can be done gradually before > even getting to the actual put_user_pages conversion. > Sounds great. I'm part way into it and it doesn't look too bad. The main question is where to scatter various checks and assertions, to keep the kvecs out of direct I/0. Or at least keep the gups away from direct I/0. thanks,
On 7/24/19 12:25 PM, john.hubbard@gmail.com wrote: > From: John Hubbard <jhubbard@nvidia.com> > > Hi, > > This is mostly Jerome's work, converting the block/bio and related areas > to call put_user_page*() instead of put_page(). Because I've changed > Jerome's patches, in some cases significantly, I'd like to get his > feedback before we actually leave him listed as the author (he might > want to disown some or all of these). > Could you add some background to the commit log for people don't have the context.. Why this converting? What's the main differences? Regards, -Bob > I added a new patch, in order to make this work with Christoph Hellwig's > recent overhaul to bio_release_pages(): "block: bio_release_pages: use > flags arg instead of bool". > > I've started the series with a patch that I've posted in another > series ("mm/gup: add make_dirty arg to put_user_pages_dirty_lock()"[1]), > because I'm not sure which of these will go in first, and this allows each > to stand alone. > > Testing: not much beyond build and boot testing has been done yet. And > I'm not set up to even exercise all of it (especially the IB parts) at > run time. > > Anyway, changes here are: > > * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter. > Then, use the new iov_iter_get_pages_use_gup() to retrieve it when > it is time to release the pages. That allows choosing between put_page() > and put_user_page*(). > > * Pass in one more piece of information to bio_release_pages: a "from_gup" > parameter. Similar use as above. > > * Change the block layer, and several file systems, to use > put_user_page*(). > > [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_r_20190724012606.25844-2D2-2Djhubbard-40nvidia.com&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=1ktT0U2YS_I8Zz2o-MS1YcCAzWZ6hFGtyTgvVMGM7gI&m=FpFhv2rjbKCAYGmO6Hy8WJAottr1Qz_mDKDLObQ40FU&s=q-_mX3daEr22WbdZMElc_ZbD8L9oGLD7U0xLeyJ661Y&e= > And please note the correction email that I posted as a follow-up, > if you're looking closely at that patch. :) The fixed version is > included here. > > John Hubbard (3): > mm/gup: add make_dirty arg to put_user_pages_dirty_lock() > block: bio_release_pages: use flags arg instead of bool > fs/ceph: fix a build warning: returning a value from void function > > Jérôme Glisse (9): > iov_iter: add helper to test if an iter would use GUP v2 > block: bio_release_pages: convert put_page() to put_user_page*() > block_dev: convert put_page() to put_user_page*() > fs/nfs: convert put_page() to put_user_page*() > vhost-scsi: convert put_page() to put_user_page*() > fs/cifs: convert put_page() to put_user_page*() > fs/fuse: convert put_page() to put_user_page*() > fs/ceph: convert put_page() to put_user_page*() > 9p/net: convert put_page() to put_user_page*() > > block/bio.c | 81 ++++++++++++--- > drivers/infiniband/core/umem.c | 5 +- > drivers/infiniband/hw/hfi1/user_pages.c | 5 +- > drivers/infiniband/hw/qib/qib_user_pages.c | 5 +- > drivers/infiniband/hw/usnic/usnic_uiom.c | 5 +- > drivers/infiniband/sw/siw/siw_mem.c | 8 +- > drivers/vhost/scsi.c | 13 ++- > fs/block_dev.c | 22 +++- > fs/ceph/debugfs.c | 2 +- > fs/ceph/file.c | 62 ++++++++--- > fs/cifs/cifsglob.h | 3 + > fs/cifs/file.c | 22 +++- > fs/cifs/misc.c | 19 +++- > fs/direct-io.c | 2 +- > fs/fuse/dev.c | 22 +++- > fs/fuse/file.c | 53 +++++++--- > fs/nfs/direct.c | 10 +- > include/linux/bio.h | 22 +++- > include/linux/mm.h | 5 +- > include/linux/uio.h | 11 ++ > mm/gup.c | 115 +++++++++------------ > net/9p/trans_common.c | 14 ++- > net/9p/trans_common.h | 3 +- > net/9p/trans_virtio.c | 18 +++- > 24 files changed, 357 insertions(+), 170 deletions(-) >
On 7/24/19 5:41 PM, Bob Liu wrote: > On 7/24/19 12:25 PM, john.hubbard@gmail.com wrote: >> From: John Hubbard <jhubbard@nvidia.com> >> >> Hi, >> >> This is mostly Jerome's work, converting the block/bio and related areas >> to call put_user_page*() instead of put_page(). Because I've changed >> Jerome's patches, in some cases significantly, I'd like to get his >> feedback before we actually leave him listed as the author (he might >> want to disown some or all of these). >> > > Could you add some background to the commit log for people don't have the context.. > Why this converting? What's the main differences? > Hi Bob, 1. Many of the patches have a blurb like this: For pages that were retained via get_user_pages*(), release those pages via the new put_user_page*() routines, instead of via put_page(). This is part a tree-wide conversion, as described in commit fc1d8e7cca2d ("mm: introduce put_user_page*(), placeholder versions"). ...and if you look at that commit, you'll find several pages of information in its commit description, which should address your point. 2. This whole series has to be re-worked, as per the other feedback thread. So I'll keep your comment in mind when I post a new series. thanks,
On 7/23/19 11:17 PM, Christoph Hellwig wrote: > On Tue, Jul 23, 2019 at 09:25:06PM -0700, john.hubbard@gmail.com wrote: >> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter. >> Then, use the new iov_iter_get_pages_use_gup() to retrieve it when >> it is time to release the pages. That allows choosing between put_page() >> and put_user_page*(). >> >> * Pass in one more piece of information to bio_release_pages: a "from_gup" >> parameter. Similar use as above. >> >> * Change the block layer, and several file systems, to use >> put_user_page*(). > > I think we can do this in a simple and better way. We have 5 ITER_* > types. Of those ITER_DISCARD as the name suggests never uses pages, so > we can skip handling it. ITER_PIPE is rejected іn the direct I/O path, > which leaves us with three. > Hi Christoph, Are you working on anything like this? Or on the put_user_bvec() idea? Please let me know, otherwise I'll go in and implement something here. thanks,
On Mon, Aug 05, 2019 at 03:54:35PM -0700, John Hubbard wrote: > On 7/23/19 11:17 PM, Christoph Hellwig wrote: > > On Tue, Jul 23, 2019 at 09:25:06PM -0700, john.hubbard@gmail.com wrote: > >> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter. > >> Then, use the new iov_iter_get_pages_use_gup() to retrieve it when > >> it is time to release the pages. That allows choosing between put_page() > >> and put_user_page*(). > >> > >> * Pass in one more piece of information to bio_release_pages: a "from_gup" > >> parameter. Similar use as above. > >> > >> * Change the block layer, and several file systems, to use > >> put_user_page*(). > > > > I think we can do this in a simple and better way. We have 5 ITER_* > > types. Of those ITER_DISCARD as the name suggests never uses pages, so > > we can skip handling it. ITER_PIPE is rejected іn the direct I/O path, > > which leaves us with three. > > > > Hi Christoph, > > Are you working on anything like this? I was hoping I could steer you towards it. But if you don't want to do it yourself I'll add it to my ever growing todo list. > Or on the put_user_bvec() idea? I have a prototype from two month ago: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/gup-bvec but that only survived the most basic testing, so it'll need more work, which I'm not sure when I'll find time for.
On 8/6/19 11:34 PM, Christoph Hellwig wrote: > On Mon, Aug 05, 2019 at 03:54:35PM -0700, John Hubbard wrote: >> On 7/23/19 11:17 PM, Christoph Hellwig wrote: ... >>> I think we can do this in a simple and better way. We have 5 ITER_* >>> types. Of those ITER_DISCARD as the name suggests never uses pages, so >>> we can skip handling it. ITER_PIPE is rejected іn the direct I/O path, >>> which leaves us with three. >>> >> >> Hi Christoph, >> >> Are you working on anything like this? > > I was hoping I could steer you towards it. But if you don't want to do > it yourself I'll add it to my ever growing todo list. > Sure, I'm up for this. The bvec-related items are the next logical part of the gup/dma conversions to work on, and I just wanted to avoid solving the same problem if you were already in the code. >> Or on the put_user_bvec() idea? > > I have a prototype from two month ago: > > http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/gup-bvec > > but that only survived the most basic testing, so it'll need more work, > which I'm not sure when I'll find time for. > I'll take a peek, and probably pester you with a few questions if I get confused. :) thanks,
From: John Hubbard <jhubbard@nvidia.com> Hi, This is mostly Jerome's work, converting the block/bio and related areas to call put_user_page*() instead of put_page(). Because I've changed Jerome's patches, in some cases significantly, I'd like to get his feedback before we actually leave him listed as the author (he might want to disown some or all of these). I added a new patch, in order to make this work with Christoph Hellwig's recent overhaul to bio_release_pages(): "block: bio_release_pages: use flags arg instead of bool". I've started the series with a patch that I've posted in another series ("mm/gup: add make_dirty arg to put_user_pages_dirty_lock()"[1]), because I'm not sure which of these will go in first, and this allows each to stand alone. Testing: not much beyond build and boot testing has been done yet. And I'm not set up to even exercise all of it (especially the IB parts) at run time. Anyway, changes here are: * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter. Then, use the new iov_iter_get_pages_use_gup() to retrieve it when it is time to release the pages. That allows choosing between put_page() and put_user_page*(). * Pass in one more piece of information to bio_release_pages: a "from_gup" parameter. Similar use as above. * Change the block layer, and several file systems, to use put_user_page*(). [1] https://lore.kernel.org/r/20190724012606.25844-2-jhubbard@nvidia.com And please note the correction email that I posted as a follow-up, if you're looking closely at that patch. :) The fixed version is included here. John Hubbard (3): mm/gup: add make_dirty arg to put_user_pages_dirty_lock() block: bio_release_pages: use flags arg instead of bool fs/ceph: fix a build warning: returning a value from void function Jérôme Glisse (9): iov_iter: add helper to test if an iter would use GUP v2 block: bio_release_pages: convert put_page() to put_user_page*() block_dev: convert put_page() to put_user_page*() fs/nfs: convert put_page() to put_user_page*() vhost-scsi: convert put_page() to put_user_page*() fs/cifs: convert put_page() to put_user_page*() fs/fuse: convert put_page() to put_user_page*() fs/ceph: convert put_page() to put_user_page*() 9p/net: convert put_page() to put_user_page*() block/bio.c | 81 ++++++++++++--- drivers/infiniband/core/umem.c | 5 +- drivers/infiniband/hw/hfi1/user_pages.c | 5 +- drivers/infiniband/hw/qib/qib_user_pages.c | 5 +- drivers/infiniband/hw/usnic/usnic_uiom.c | 5 +- drivers/infiniband/sw/siw/siw_mem.c | 8 +- drivers/vhost/scsi.c | 13 ++- fs/block_dev.c | 22 +++- fs/ceph/debugfs.c | 2 +- fs/ceph/file.c | 62 ++++++++--- fs/cifs/cifsglob.h | 3 + fs/cifs/file.c | 22 +++- fs/cifs/misc.c | 19 +++- fs/direct-io.c | 2 +- fs/fuse/dev.c | 22 +++- fs/fuse/file.c | 53 +++++++--- fs/nfs/direct.c | 10 +- include/linux/bio.h | 22 +++- include/linux/mm.h | 5 +- include/linux/uio.h | 11 ++ mm/gup.c | 115 +++++++++------------ net/9p/trans_common.c | 14 ++- net/9p/trans_common.h | 3 +- net/9p/trans_virtio.c | 18 +++- 24 files changed, 357 insertions(+), 170 deletions(-)