Message ID | 20190807013340.9706-1-jhubbard@nvidia.com (mailing list archive) |
---|---|
Headers | show |
Series | put_user_pages(): miscellaneous call sites | expand |
On 8/6/19 6:32 PM, john.hubbard@gmail.com wrote: > From: John Hubbard <jhubbard@nvidia.com> > ... > > John Hubbard (38): > mm/gup: add make_dirty arg to put_user_pages_dirty_lock() ... > 54 files changed, 191 insertions(+), 323 deletions(-) > ahem, yes, apparently this is what happens if I add a few patches while editing the cover letter... :) The subject line should read "00/41", and the list of files affected here is therefore under-reported in this cover letter. However, the patch series itself is intact and ready for submission. thanks,
Hi John... I added this patch series on top of Linux 5.3rc6 and ran xfstests with no regressions... Acked-by: Mike Marshall <hubcap@omnibond.com> -Mike On Tue, Aug 6, 2019 at 9:50 PM John Hubbard <jhubbard@nvidia.com> wrote: > > On 8/6/19 6:32 PM, john.hubbard@gmail.com wrote: > > From: John Hubbard <jhubbard@nvidia.com> > > ... > > > > John Hubbard (38): > > mm/gup: add make_dirty arg to put_user_pages_dirty_lock() > ... > > 54 files changed, 191 insertions(+), 323 deletions(-) > > > ahem, yes, apparently this is what happens if I add a few patches while editing > the cover letter... :) > > The subject line should read "00/41", and the list of files affected here is > therefore under-reported in this cover letter. However, the patch series itself is > intact and ready for submission. > > thanks, > -- > John Hubbard > NVIDIA
On 8/29/2019 6:29 PM, Mike Marshall wrote: > Hi John... > > I added this patch series on top of Linux 5.3rc6 and ran > xfstests with no regressions... > > Acked-by: Mike Marshall <hubcap@omnibond.com> > Hi Mike (and I hope Ira and others are reading as well, because I'm making a bunch of claims further down), That's great news, thanks for running that test suite and for the report and the ACK. There is an interesting pause right now, due to the fact that we've made some tentative decisions about gup pinning, that affect the call sites. A key decision is that only pages that were requested via FOLL_PIN, will require put_user_page*() to release them. There are 4 main cases, which were first explained by Jan Kara and Vlastimil Babka, and are now written up in my FOLL_PIN patch [1]. So, what that means for this series is that: 1. Some call sites (mlock.c for example, and a lot of the mm/ files in fact, and more) will not be converted: some of these patches will get dropped, especially in mm/. 2. Call sites that do DirectIO or RDMA will need to set FOLL_PIN, and will also need to call put_user_page(). 3. Call sites that do RDMA will need to set FOLL_LONGTERM *and* FOLL_PIN, 3.a. ...and will at least in some cases need to provide a link to a vaddr_pin object, and thus back to a struct file*...maybe. Still under discussion. 4. It's desirable to keep FOLL_* flags (or at least FOLL_PIN) internal to the gup() calls. That implies using a wrapper call such as Ira's vaddr_pin_[user]_pages(), instead of gup(), and vaddr_unpin_[user]_pages() instead of put_user_page*(). 5. We don't want to churn the call sites unnecessarily. With that in mind, I've taken another pass through all these patches and narrowed it down to: a) 12 call sites that I'd like to convert soon, but even those really look cleaner with a full conversion to a wrapper call similar to (identical to?) vaddr_pin_[user]_pages(), probably just the FOLL_PIN only variant (not FOLL_LONGTERM). That wrapper call is not ready yet, though. b) Some more call sites that require both FOLL_PIN and FOLL_LONGTERM. Definitely will wait to use the wrapper calls for these, because they may also require hooking up to a struct file*. c) A few more that were already applied, which is fine, because they show where to convert, and simplify a few sites anyway. But they'll need follow-on changes to, one way or another, set FOLL_PIN. d) And of course a few sites whose patches get dropped, as mentioned above. [1] https://lore.kernel.org/r/20190821040727.19650-3-jhubbard@nvidia.com thanks,
From: John Hubbard <jhubbard@nvidia.com> Hi, This consolidates everything into a "here's what's remaining for Andrew to add to his tree (for now)" series: * The first patch is an updated version of one that is already in the akpm tree. * The next two patches are already in the akpm tree, included here for completeness. * The last 5 patches are new to this series, but were previously posted. Changes since v2: * Updated patch 1 * Review feedback from Ira. (This only affected the code comments.) Added Ira's Reviewed-by. * Review feedback: Further collapsed the siw_umem code: siw_free_plist() is gone entirely. * Added 7 patches: * 3 patches from the "mm/: 3 more put_user_page() conversions" series: "mm/ksm: convert put_page() to put_user_page*()" "mm/mempolicy.c: convert put_page() to put_user_page*()" "mm/mlock.c: convert put_page() to put_user_page*()" * "security/tomoyo: convert put_page() to put_user_page*()", now that Tetsuo has ACK'd it for going in via Andrew's tree. * "powerpc: convert put_page() to put_user_page*()": no reviews yet. * two patches that were already accepted: "drivers/gpu/drm/via: convert put_page() to put_user_page*()" "net/xdp: convert put_page() to put_user_page*()" * Continued to omit 1 patch ("fs/io_uring.c: convert put_page() to put_user_page*()"), sent separately, because Jens Axboe is putting it into his tree. * Added Rodrigo Vivi's ACK for the i915 patch. * Added Tetsuo Handa's ACK for the security/tomoyo patch * Juergen Gross has verified that his Signed-off-by is valid. * Added Calum Mackay's Reviewed-by. Changes since v1: * 9 out of 34 patches have been reviewed or ack'd or changed: * Picked up Keith's Reviewed-by for patch 26 (gup_benchmark). * Picked up ACKs for patches 3, 10, 15, 16 (ceph, genwqe, staging/vc04_services, drivers/tee). * Patch 6 (i915): adjusted drivers/gpu/drm/i915/gem/i915_gem_userptr.c to match the latest linux.git: the code has already been fixed in linux.git, as of the latest -rc, to do a set_page_dirty_lock(), instead of set_page_dirty(). So all that it needs now is a conversion to put_user_page(). I've done that in a way (avoiding the changed API call) that allows patch 6 to go up via either Andrew's -mm tree, or the drm tree, just in case. See that patch's comments for slightly more detail. * Patch 20 (xen): applied Juergen's recommended fix, and speculatively (pending his approval) added his Signed-off-by (also noted in the patch comments). * Improved patch 31 (NFS) as recommended by Calum Mackay. * Includes the latest version of patch 1. (Patch 1 has been separately reposted [3], with those updates. And it's included here in order to make this series apply directly to linux.git, as noted in the original cover letter below.) Cover letter from v1: These are best characterized as miscellaneous conversions: many (not all) call sites that don't involve biovec or iov_iter, nor mm/. It also leaves out a few call sites that require some more work. These are mostly pretty simple ones. It's probably best to send all of these via Andrew's -mm tree, assuming that there are no significant merge conflicts with ongoing work in other trees (which I doubt, given that these are small changes). These patches apply to the latest linux.git. Patch #1 is also already in Andrew's tree, but given the broad non-linux-mm Cc list, I thought it would be more convenient to just include that patch here, so that people can use linux.git as the base--even though these are probably destined for linux-mm. This is part a tree-wide conversion, as described in commit fc1d8e7cca2d ("mm: introduce put_user_page*(), placeholder versions"). That commit has an extensive description of the problem and the planned steps to solve it, but the highlites are: 1) Provide put_user_page*() routines, intended to be used for releasing pages that were pinned via get_user_pages*(). 2) Convert all of the call sites for get_user_pages*(), to invoke put_user_page*(), instead of put_page(). This involves dozens of call sites, and will take some time. 3) After (2) is complete, use get_user_pages*() and put_user_page*() to implement tracking of these pages. This tracking will be separate from the existing struct page refcounting. 4) Use the tracking and identification of these pages, to implement special handling (especially in writeback paths) when the pages are backed by a filesystem. And a few references, also from that commit: [1] https://lwn.net/Articles/774411/ : "DMA and get_user_pages()" [2] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()" [3] "mm/gup: add make_dirty arg to put_user_pages_dirty_lock()" https://lore.kernel.org/r/20190804214042.4564-1-jhubbard@nvidia.com Ira Weiny (1): fs/binfmt_elf: convert put_page() to put_user_page*() John Hubbard (38): mm/gup: add make_dirty arg to put_user_pages_dirty_lock() net/rds: convert put_page() to put_user_page*() net/ceph: convert put_page() to put_user_page*() x86/kvm: convert put_page() to put_user_page*() drm/etnaviv: convert release_pages() to put_user_pages() drm/i915: convert put_page() to put_user_page*() drm/radeon: convert put_page() to put_user_page*() media/ivtv: convert put_page() to put_user_page*() media/v4l2-core/mm: convert put_page() to put_user_page*() genwqe: convert put_page() to put_user_page*() scif: convert put_page() to put_user_page*() vmci: convert put_page() to put_user_page*() rapidio: convert put_page() to put_user_page*() oradax: convert put_page() to put_user_page*() staging/vc04_services: convert put_page() to put_user_page*() drivers/tee: convert put_page() to put_user_page*() vfio: convert put_page() to put_user_page*() fbdev/pvr2fb: convert put_page() to put_user_page*() fsl_hypervisor: convert put_page() to put_user_page*() xen: convert put_page() to put_user_page*() fs/exec.c: convert put_page() to put_user_page*() orangefs: convert put_page() to put_user_page*() uprobes: convert put_page() to put_user_page*() futex: convert put_page() to put_user_page*() mm/frame_vector.c: convert put_page() to put_user_page*() mm/gup_benchmark.c: convert put_page() to put_user_page*() mm/memory.c: convert put_page() to put_user_page*() mm/madvise.c: convert put_page() to put_user_page*() mm/process_vm_access.c: convert put_page() to put_user_page*() crypt: convert put_page() to put_user_page*() fs/nfs: convert put_page() to put_user_page*() goldfish_pipe: convert put_page() to put_user_page*() kernel/events/core.c: convert put_page() to put_user_page*() security/tomoyo: convert put_page() to put_user_page*() powerpc: convert put_page() to put_user_page*() mm/mlock.c: convert put_page() to put_user_page*() mm/mempolicy.c: convert put_page() to put_user_page*() mm/ksm: convert put_page() to put_user_page*() arch/powerpc/kvm/book3s_64_mmu_hv.c | 4 +- arch/powerpc/kvm/book3s_64_mmu_radix.c | 19 ++- arch/powerpc/kvm/e500_mmu.c | 3 +- arch/powerpc/mm/book3s64/iommu_api.c | 11 +- arch/x86/kvm/svm.c | 4 +- crypto/af_alg.c | 7 +- drivers/gpu/drm/etnaviv/etnaviv_gem.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 6 +- drivers/gpu/drm/radeon/radeon_ttm.c | 2 +- drivers/infiniband/core/umem.c | 5 +- drivers/infiniband/hw/hfi1/user_pages.c | 5 +- drivers/infiniband/hw/qib/qib_user_pages.c | 13 +-- drivers/infiniband/hw/usnic/usnic_uiom.c | 5 +- drivers/infiniband/sw/siw/siw_mem.c | 19 +-- drivers/media/pci/ivtv/ivtv-udma.c | 14 +-- drivers/media/pci/ivtv/ivtv-yuv.c | 11 +- drivers/media/v4l2-core/videobuf-dma-sg.c | 3 +- drivers/misc/genwqe/card_utils.c | 17 +-- drivers/misc/mic/scif/scif_rma.c | 17 ++- drivers/misc/vmw_vmci/vmci_context.c | 2 +- drivers/misc/vmw_vmci/vmci_queue_pair.c | 11 +- drivers/platform/goldfish/goldfish_pipe.c | 9 +- drivers/rapidio/devices/rio_mport_cdev.c | 9 +- drivers/sbus/char/oradax.c | 2 +- .../interface/vchiq_arm/vchiq_2835_arm.c | 10 +- drivers/tee/tee_shm.c | 10 +- drivers/vfio/vfio_iommu_type1.c | 8 +- drivers/video/fbdev/pvr2fb.c | 3 +- drivers/virt/fsl_hypervisor.c | 7 +- drivers/xen/privcmd.c | 32 ++--- fs/binfmt_elf.c | 2 +- fs/binfmt_elf_fdpic.c | 2 +- fs/exec.c | 2 +- fs/nfs/direct.c | 11 +- fs/orangefs/orangefs-bufmap.c | 7 +- include/linux/mm.h | 5 +- kernel/events/core.c | 2 +- kernel/events/uprobes.c | 6 +- kernel/futex.c | 10 +- mm/frame_vector.c | 4 +- mm/gup.c | 109 +++++++----------- mm/gup_benchmark.c | 2 +- mm/ksm.c | 14 +-- mm/madvise.c | 2 +- mm/memory.c | 2 +- mm/mempolicy.c | 2 +- mm/mlock.c | 6 +- mm/process_vm_access.c | 18 +-- net/ceph/pagevec.c | 8 +- net/rds/info.c | 5 +- net/rds/message.c | 2 +- net/rds/rdma.c | 15 ++- security/tomoyo/domain.c | 2 +- virt/kvm/kvm_main.c | 4 +- 54 files changed, 191 insertions(+), 323 deletions(-)