
[v11,0/9] Userspace P2PDMA with O_DIRECT NVMe devices

Message ID 20221021174116.7200-1-logang@deltatee.com (mailing list archive)

Message

Logan Gunthorpe Oct. 21, 2022, 5:41 p.m. UTC
Hi,

This is the latest P2PDMA userspace patch set. This version includes
some cleanup based on feedback on the last posting[1].

This patch set enables userspace P2PDMA by allowing userspace to mmap()
allocated chunks of the CMB. The resulting VMA can be passed only
to O_DIRECT IO on NVMe backed files or block devices. Patch 1 is a
cleanup so that try_grab_page() returns a proper error code. A flag is
added to GUP() in Patch 2, then Patches 3 through 7 wire this flag up
based on whether the block queue indicates P2PDMA support. Patch 8
creates the sysfs resource that can hand out the VMAs and Patch 9
adds brief documentation for the new interface.
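
As a rough illustration of the flow described above, the following
userspace sketch maps a chunk of P2PDMA memory and uses it as the buffer
for an O_DIRECT read. It is a sketch under stated assumptions, not code
from the series: the helper name p2pdma_read() is hypothetical, both
paths are supplied by the caller, and the expectation that the sysfs file
is a "p2pmem/allocate" attribute under the PCI device directory is
inferred from the patch titles. The use of MAP_SHARED and offset 0 is
likewise an assumption about what the interface accepts.

```c
#define _GNU_SOURCE             /* needed for O_DIRECT */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read `len` bytes at offset `off` from a block device directly into
 * P2PDMA memory obtained by mmap()ing the sysfs allocate file.
 *
 * alloc_path: assumed to be the device's ".../p2pmem/allocate" file
 * blk_path:   an NVMe backed block device (or file) opened with O_DIRECT
 *
 * Returns the number of bytes read, or a negative errno value.
 */
ssize_t p2pdma_read(const char *alloc_path, const char *blk_path,
                    size_t len, off_t off)
{
    ssize_t ret;
    void *buf;
    int alloc_fd, blk_fd;

    assert(len > 0);

    alloc_fd = open(alloc_path, O_RDWR);
    if (alloc_fd < 0)
        return -errno;

    /* mmap() on the sysfs file hands out a chunk of the CMB as a VMA. */
    buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, alloc_fd, 0);
    if (buf == MAP_FAILED) {
        ret = -errno;
        goto out_close_alloc;
    }

    /* The resulting VMA can only be used as an O_DIRECT IO buffer. */
    blk_fd = open(blk_path, O_RDONLY | O_DIRECT);
    if (blk_fd < 0) {
        ret = -errno;
        goto out_unmap;
    }

    ret = pread(blk_fd, buf, len, off);
    if (ret < 0)
        ret = -errno;

    close(blk_fd);
out_unmap:
    munmap(buf, len);
out_close_alloc:
    close(alloc_fd);
    return ret;
}
```

A caller would pass the allocate file of the device that owns the CMB
and a block device whose queue supports P2PDMA; `len` and `off` must
still satisfy the usual O_DIRECT alignment rules.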

Feedback welcome.

This series is based on v6.1-rc1. A git branch is available here:

  https://github.com/sbates130272/linux-p2pmem/  p2pdma_user_cmb_v11

Thanks,

Logan

[1] https://lkml.kernel.org/r/20220922163926.7077-1-logang@deltatee.com

--

Changes in v11:
  - Rebased onto v6.1-rc1, fixed minor conflict in bio_map_user_iov
  - The GUP test was moved to try_grab_page() and try_grab_folio().
    This ought to be a bit more future proof. It required adding a new
    cleanup patch to return a proper error code from try_grab_page().
    (Per Jason)
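
The gating idea behind this change can be sketched as a small userspace
mock. This is not the kernel code: the struct, the flag values, and the
mock_try_grab_page() name are stand-ins invented for illustration, and
the specific error value returned (-EREMOTEIO) is an assumption. It only
shows the shape of the check: a P2PDMA page is refused with an error
code unless the caller passed the opt-in flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins; not the kernel's definitions or values. */
#define MOCK_FOLL_GET         0x1u
#define MOCK_FOLL_PCI_P2PDMA  0x2u

struct mock_page {
    bool is_pci_p2pdma;     /* stands in for is_pci_p2pdma_page() */
    int refcount;
};

/* Returns 0 and takes a reference, or a negative errno if refused. */
int mock_try_grab_page(struct mock_page *page, unsigned int flags)
{
    assert(page != NULL);

    /* P2PDMA pages are refused unless the caller explicitly opted in. */
    if (page->is_pci_p2pdma && !(flags & MOCK_FOLL_PCI_P2PDMA))
        return -EREMOTEIO;

    if (flags & MOCK_FOLL_GET)
        page->refcount++;
    return 0;
}
```

Returning an error code rather than a boolean lets callers tell a
refused P2PDMA page apart from other failures, which is why the cleanup
patch was needed first.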

Changes in v10:
  - Rebased onto v6.0-rc6
  - Reworked iov iter changes to reuse the code better and
    name them without the _flags() suffix (per Christoph)
  - Renamed a number of flags variables to gup_flags (per John)
  - Minor fixups to the last documentation patch (from Greg and John)

Changes in v9:
  - Rebased onto v6.0-rc2, included reworking the iov_iter patch
    due to changes there
  - Drop the char device mmap implementation in favour of a sysfs
    based interface. (per Christoph)

 (v8 only included the first half of the series and was merged for v6.0)

Changes in v8:
  - Rebased onto v5.19-rc1
  - Rework how the pages are stored in the VMA per Jason's suggestion

Changes in v7:
  - Rebased onto v5.18-rc1 which includes Christoph's cleanup to
    free_zone_device_page() (similar to Ralph's patch).
  - Fix bug with concurrent first calls to pci_p2pdma_vma_fault()
    that caused a double allocation and lost p2p memory. Noticed
    by Andrew Maier.
  - Collected a Reviewed-by tag from Chaitanya.
  - Numerous minor fixes to commit messages

--

Logan Gunthorpe (9):
  mm: allow multiple error returns in try_grab_page()
  mm: introduce FOLL_PCI_P2PDMA to gate getting PCI P2PDMA pages
  iov_iter: introduce iov_iter_get_pages_[alloc_]flags()
  block: add check when merging zone device pages
  lib/scatterlist: add check when merging zone device pages
  block: set FOLL_PCI_P2PDMA in __bio_iov_iter_get_pages()
  block: set FOLL_PCI_P2PDMA in bio_map_user_iov()
  PCI/P2PDMA: Allow userspace VMA allocations through sysfs
  ABI: sysfs-bus-pci: add documentation for p2pmem allocate

 Documentation/ABI/testing/sysfs-bus-pci |  10 ++
 block/bio.c                             |  11 ++-
 block/blk-map.c                         |  12 ++-
 drivers/pci/p2pdma.c                    | 124 ++++++++++++++++++++++++
 include/linux/mm.h                      |   3 +-
 include/linux/mmzone.h                  |  24 +++++
 include/linux/uio.h                     |   6 ++
 lib/iov_iter.c                          |  32 ++++--
 lib/scatterlist.c                       |  25 +++--
 mm/gup.c                                |  45 ++++++---
 mm/huge_memory.c                        |  19 ++--
 mm/hugetlb.c                            |  23 +++--
 12 files changed, 280 insertions(+), 54 deletions(-)


base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780
--
2.30.2

Comments

Christoph Hellwig Oct. 24, 2022, 3:03 p.m. UTC | #1
The series looks good to me now. How do we want to handle it?  I think
we need a special branch somewhere (maybe in the block or mm trees?)
so that we can base the other iov_iter work from John on it.  Also
Al has a whole bunch of iov_iter changes that we probably want on
the same branch as well, although some of those (READ vs WRITE fixups)
look like 6.1 material to me.
John Hubbard Oct. 24, 2022, 7:15 p.m. UTC | #2
On 10/24/22 08:03, Christoph Hellwig wrote:
> The series looks good to me now. How do we want to handle it?  I think
> we need a special branch somewhere (maybe in the block or mm trees?)
> so that we can base the other iov_iter work from John on it.  Also
> Al has a whole bunch of iov_iter changes that we probably want on
> the same branch as well, although some of those (READ vs WRITE fixups)
> look like 6.1 material to me.
> 

A little earlier, Jens graciously offered [1] to provide a topic branch,
such as:

     for-6.2/block-gup [2]

(I've moved the name forward from 6.1 to 6.2, because that discussion
was 7 weeks ago.)


[1] https://lore.kernel.org/ae675a01-90e6-4af1-6c43-660b3a6c7b72@kernel.dk
[2] https://lore.kernel.org/55a2d67f-9a12-9fe6-d73b-8c3f5eb36f31@kernel.dk

thanks,
Christoph Hellwig Nov. 8, 2022, 6:56 a.m. UTC | #3
On Mon, Oct 24, 2022 at 12:15:56PM -0700, John Hubbard wrote:
> A little earlier, Jens graciously offered [1] to provide a topic branch,
> such as:
>
>     for-6.2/block-gup [2]
>
> (I've moved the name forward from 6.1 to 6.2, because that discussion
> was 7 weeks ago.)

So what are we going to do with this series?  It would be sad to miss
the merge window again.
Logan Gunthorpe Nov. 9, 2022, 5:28 p.m. UTC | #4
@add Jens

On 2022-11-07 23:56, Christoph Hellwig wrote:
> On Mon, Oct 24, 2022 at 12:15:56PM -0700, John Hubbard wrote:
>> A little earlier, Jens graciously offered [1] to provide a topic branch,
>> such as:
>>
>>     for-6.2/block-gup [2]
>>
>> (I've moved the name forward from 6.1 to 6.2, because that discussion
>> was 7 weeks ago.)
> 
> So what are we going to do with this series?  It would be sad to miss
> the merge window again.

I noticed Jens wasn't copied on this series. I've added him. It would be
nice to get this in someone's tree soon.

Thanks!

Logan
Jens Axboe Nov. 9, 2022, 6:33 p.m. UTC | #5
On 11/9/22 10:28 AM, Logan Gunthorpe wrote:
> @add Jens
> 
> On 2022-11-07 23:56, Christoph Hellwig wrote:
>> On Mon, Oct 24, 2022 at 12:15:56PM -0700, John Hubbard wrote:
>>> A little earlier, Jens graciously offered [1] to provide a topic branch,
>>> such as:
>>>
>>>     for-6.2/block-gup [2]
>>>
>>> (I've moved the name forward from 6.1 to 6.2, because that discussion
>>> was 7 weeks ago.)
>>
>> So what are we going to do with this series?  It would be sad to miss
>> the merge window again.
> 
> I noticed Jens wasn't copied on this series. I've added him. It would be
> nice to get this in someone's tree soon.

I took a look and the series looks fine to me.
Jens Axboe Nov. 9, 2022, 6:44 p.m. UTC | #6
On Fri, 21 Oct 2022 11:41:07 -0600, Logan Gunthorpe wrote:
> This is the latest P2PDMA userspace patch set. This version includes
> some cleanup from feedback from the last posting[1].
> 
> This patch set enables userspace P2PDMA by allowing userspace to mmap()
> allocated chunks of the CMB. The resulting VMA can be passed only
> to O_DIRECT IO on NVMe backed files or block devices. A flag is added
> to GUP() in Patch 1, then Patches 2 through 6 wire this flag up based
> on whether the block queue indicates P2PDMA support. Patches 7
> creates the sysfs resource that can hand out the VMAs and Patch 8
> adds brief documentation for the new interface.
> 
> [...]

Applied, thanks!

[1/9] mm: allow multiple error returns in try_grab_page()
      commit: 0f0892356fa174bdd8bd655c820ee3658c4c9f01
[2/9] mm: introduce FOLL_PCI_P2PDMA to gate getting PCI P2PDMA pages
      commit: 4003f107fa2eabb0aab90e37a1ed7b74c6f0d132
[3/9] iov_iter: introduce iov_iter_get_pages_[alloc_]flags()
      commit: d82076403cef7fcd1e7617c9db48bf21ebdc1f9c
[4/9] block: add check when merging zone device pages
      commit: 49580e690755d0e51ed7aa2c33225dd884fa738a
[5/9] lib/scatterlist: add check when merging zone device pages
      commit: 1567b49d1a4081ba7e1a0ff2dc39bc58c59f2a51
[6/9] block: set FOLL_PCI_P2PDMA in __bio_iov_iter_get_pages()
      commit: 5e3e3f2e15df46abcab1959f93f214f778b6ec49
[7/9] block: set FOLL_PCI_P2PDMA in bio_map_user_iov()
      commit: 7ee4ccf57484d260c37b29f9a48b65c4101403e8
[8/9] PCI/P2PDMA: Allow userspace VMA allocations through sysfs
      commit: 7e9c7ef83d785236f5a8c3761dd053fae9b92fb8
[9/9] ABI: sysfs-bus-pci: add documentation for p2pmem allocate
      commit: 6d4338cb4070a762dba0cadee00b7ec206d9f868

Best regards,