Message ID: 20241216095920.237117-1-wguay@fb.com (mailing list archive)
Series: cover-letter: Allow MMIO regions to be exported through dmabuf
Hi Vivek,

> On 16 Dec 2024, at 18:34, Kasireddy, Vivek <vivek.kasireddy@intel.com> wrote:
>
> Hi Wei Lin,
>
>> Subject: [PATCH 0/4] cover-letter: Allow MMIO regions to be exported
>> through dmabuf
>>
>> From: Wei Lin Guay <wguay@meta.com>
>>
>> This is another attempt to revive the patches posted by Jason
>> Gunthorpe and Vivek Kasireddy, at
>> https://patchwork.kernel.org/project/linux-media/cover/0-v2-472615b3877e+28f7-vfio_dma_buf_jgg@nvidia.com/
>> https://lwn.net/Articles/970751/
>
> v2: https://lore.kernel.org/dri-devel/20240624065552.1572580-1-vivek.kasireddy@intel.com/
> addresses review comments from Alex and Jason and also includes the
> ability to create the dmabuf from multiple ranges. This is really
> needed to future-proof the feature.

The good thing is that your patch series addressed Christian's concern
about adding dma_buf_try_get().

However, I have several questions regarding your v2 patch:

- Does the dma-buf still support the redundant mmap handler? (This was
  one of the review comments on v1.)
- Rather than handling different regions within a single dma-buf, would
  it work for vfio-user to open multiple distinct file descriptors? For
  our specific use case, we don't require multiple regions and prefer
  Jason's original patch.

> Also, my understanding is that this patchset cannot proceed until
> Leon's series is merged:
> https://lore.kernel.org/kvm/cover.1733398913.git.leon@kernel.org/

Thanks for the pointer.
I will rebase my local patch series on top of that.

Thanks,
Wei Lin
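For readers following along: the move_notify contract that the last patch
in this series relaxes is the one a dynamic dma-buf importer normally signs
up for. Below is a minimal kernel-side sketch against the existing dma-buf
dynamic-attach API; the my_* names and the quiesce/remap policy are
illustrative assumptions, not code from the series.

    #include <linux/dma-buf.h>
    #include <linux/dma-resv.h>
    #include <linux/err.h>

    struct my_importer {
            struct dma_buf_attachment *attach;
            struct sg_table *sgt;
    };

    /*
     * Called by the exporter (e.g. vfio-pci on revoke/reset) with the
     * dma_resv lock already held; the importer must stop using its
     * mapping before returning.
     */
    static void my_move_notify(struct dma_buf_attachment *attach)
    {
            struct my_importer *imp = attach->importer_priv;

            if (imp->sgt) {
                    dma_buf_unmap_attachment(attach, imp->sgt,
                                             DMA_BIDIRECTIONAL);
                    imp->sgt = NULL;
            }
            /* ...quiesce device DMA here; re-map on next use... */
    }

    static const struct dma_buf_attach_ops my_attach_ops = {
            .allow_peer2peer = true,  /* required for MMIO/P2P exporters */
            .move_notify = my_move_notify,
    };

    static int my_import(struct my_importer *imp, struct device *dev, int fd)
    {
            struct dma_buf *dmabuf = dma_buf_get(fd);

            if (IS_ERR(dmabuf))
                    return PTR_ERR(dmabuf);

            imp->attach = dma_buf_dynamic_attach(dmabuf, dev,
                                                 &my_attach_ops, imp);
            if (IS_ERR(imp->attach)) {
                    dma_buf_put(dmabuf);
                    return PTR_ERR(imp->attach);
            }

            /* Dynamic importers map under the exporter's dma_resv lock. */
            dma_resv_lock(dmabuf->resv, NULL);
            imp->sgt = dma_buf_map_attachment(imp->attach,
                                              DMA_BIDIRECTIONAL);
            dma_resv_unlock(dmabuf->resv);
            return PTR_ERR_OR_ZERO(imp->sgt);
    }

Importers that cannot quiesce DMA on demand cannot usefully implement
move_notify, which appears to be the gap Wei Lin's last patch addresses
via revoke semantics instead.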
> Thanks,
> Vivek
>
>> In addition to the initial proposal by Jason, another promising
>> application is exposing memory from an AI accelerator (bound to VFIO)
>> to an RDMA device. This would allow the RDMA device to directly access
>> the accelerator's memory, thereby facilitating direct data
>> transactions between the RDMA device and the accelerator.
>>
>> Below is the text/motivation from the original cover letter.
>>
>> dma-buf has become a way to safely acquire a handle to non-struct-page
>> memory that can still have its lifetime controlled by the exporter.
>> Notably, RDMA can now import dma-buf FDs and build them into MRs,
>> which allows for PCI P2P operations. Extend this to allow vfio-pci to
>> export MMIO memory from PCI device BARs.
>>
>> This series supports a use case for SPDK where an NVMe device will be
>> owned by SPDK through VFIO but interacting with an RDMA device. The
>> RDMA device may directly access the NVMe CMB or directly manipulate
>> the NVMe device's doorbell using PCI P2P.
>>
>> However, as a general mechanism, it can support many other scenarios
>> with VFIO. I imagine this dmabuf approach to be usable by iommufd as
>> well for generic and safe P2P mappings.
>>
>> This series goes after the "Break up ioctl dispatch functions to one
>> function per ioctl" series.
>>
>> v2:
>> - Name the new file dma_buf.c
>> - Restore orig_nents before freeing
>> - Fix reversed logic around priv->revoked
>> - Set priv->index
>> - Rebased on v2 "Break up ioctl dispatch functions"
>> v1: https://lore.kernel.org/r/0-v1-9e6e1739ed95+5fa-vfio_dma_buf_jgg@nvidia.com
>>
>> Cc: linux-rdma@vger.kernel.org
>> Cc: Oded Gabbay <ogabbay@kernel.org>
>> Cc: Christian König <christian.koenig@amd.com>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Leon Romanovsky <leon@kernel.org>
>> Cc: Maor Gottlieb <maorg@nvidia.com>
>> Cc: dri-devel@lists.freedesktop.org
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>>
>> Jason Gunthorpe (3):
>>   vfio: Add vfio_device_get()
>>   dma-buf: Add dma_buf_try_get()
>>   vfio/pci: Allow MMIO regions to be exported through dma-buf
>>
>> Wei Lin Guay (1):
>>   vfio/pci: Allow export dmabuf without move_notify from importer
>>
>>  drivers/vfio/pci/Makefile          |   1 +
>>  drivers/vfio/pci/dma_buf.c         | 291 +++++++++++++++++++++++++++++
>>  drivers/vfio/pci/vfio_pci_config.c |   8 +-
>>  drivers/vfio/pci/vfio_pci_core.c   |  44 ++++-
>>  drivers/vfio/pci/vfio_pci_priv.h   |  30 +++
>>  drivers/vfio/vfio_main.c           |   1 +
>>  include/linux/dma-buf.h            |  13 ++
>>  include/linux/vfio.h               |   6 +
>>  include/linux/vfio_pci_core.h      |   1 +
>>  include/uapi/linux/vfio.h          |  18 ++
>>  10 files changed, 405 insertions(+), 8 deletions(-)
>>  create mode 100644 drivers/vfio/pci/dma_buf.c
>>
>> --
>> 2.43.5
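The cover letter's note that "RDMA can now import dma-buf FDs and build
them into MRs" maps to an existing rdma-core API. A minimal userspace
sketch, assuming dmabuf_fd, len and iova are produced by the exporter
(e.g. by the VFIO export ioctl this series adds):

    #include <stdint.h>
    #include <stddef.h>
    #include <infiniband/verbs.h>

    /*
     * Import a dma-buf as an RDMA MR; the HCA then performs PCI P2P
     * reads/writes directly against the exported BAR memory.
     */
    static struct ibv_mr *import_dmabuf_mr(struct ibv_pd *pd, int dmabuf_fd,
                                           size_t len, uint64_t iova)
    {
            return ibv_reg_dmabuf_mr(pd, 0 /* offset inside the dmabuf */,
                                     len, iova, dmabuf_fd,
                                     IBV_ACCESS_LOCAL_WRITE |
                                     IBV_ACCESS_REMOTE_READ |
                                     IBV_ACCESS_REMOTE_WRITE);
    }

The returned MR is usable like any other, which is what enables the
accelerator-to-RDMA and SPDK/NVMe-CMB scenarios described above without
bouncing data through system RAM.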
Hi Wei Lin,

> Subject: Re: [PATCH 0/4] cover-letter: Allow MMIO regions to be exported
> through dmabuf
>
> >> From: Wei Lin Guay <wguay@meta.com>
> >>
> >> This is another attempt to revive the patches posted by Jason
> >> Gunthorpe and Vivek Kasireddy, at
> >> https://patchwork.kernel.org/project/linux-media/cover/0-v2-472615b3877e+28f7-vfio_dma_buf_jgg@nvidia.com/
> >> https://lwn.net/Articles/970751/
> >
> > v2: https://lore.kernel.org/dri-devel/20240624065552.1572580-1-vivek.kasireddy@intel.com/
> > addresses review comments from Alex and Jason and also includes the
> > ability to create the dmabuf from multiple ranges. This is really
> > needed to future-proof the feature.
>
> The good thing is that your patch series addressed Christian's concern
> about adding dma_buf_try_get().

No, it did not address his concern. In v2, instead of adding a new
function, we are now calling get_file_rcu() directly, which is what
Christian suggested in the review of the original series.

> However, I have several questions regarding your v2 patch:
>
> - Does the dma-buf still support the redundant mmap handler? (This was
>   one of the review comments on v1.)

Yeah, the mmap handler is really needed as a debugging tool, given that
the importer would not be able to provide CPU access to the dmabuf's
underlying memory in any other way. (A userspace sketch of this debug
path appears below, after the quoted trailer.)

> - Rather than handling different regions within a single dma-buf, would
>   it work for vfio-user to open multiple distinct file descriptors? For
>   our specific use case, we don't require multiple regions and prefer
>   Jason's original patch.

Restricting the dmabuf to a single region (or having to create multiple
dmabufs to represent the multiple regions/ranges associated with a
single scattered buffer) would not be feasible or ideal in all cases.
For instance, in my use-case, I am sharing a large framebuffer (FB)
located in the GPU's VRAM, and allocating a large FB contiguously
(nr_ranges = 1) in VRAM is not possible under memory pressure.

Furthermore, since we are adding a new UAPI with this patch/feature, we
cannot go back and tweak it later (to add support for nr_ranges > 1)
should the need arise, but you can always use nr_ranges = 1 anytime.
That is why it makes sense to be flexible about the number of
ranges/regions.

> > Also, my understanding is that this patchset cannot proceed until
> > Leon's series is merged:
> > https://lore.kernel.org/kvm/cover.1733398913.git.leon@kernel.org/
>
> Thanks for the pointer.
> I will rebase my local patch series on top of that.

AFAIK, Leon's work includes new mechanisms/APIs to do P2P DMA, which we
should be using in this patchset. And I think he is also planning to use
the new APIs to augment dmabuf usage, so that importers are not forced
to use the scatter-gather list, particularly in cases where the
underlying memory is not backed by struct page. I was just waiting for
all of this to happen before posting a v3.

Thanks,
Vivek
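To make the nr_ranges trade-off above concrete: a hypothetical sketch of
a multi-range export request. The struct and field names below are
modeled on the v2 description and are assumptions, not the actual UAPI;
the real layout is in the series' include/uapi/linux/vfio.h changes.

    #include <linux/types.h>

    struct vfio_region_dma_range {
            __u32 region_index;   /* device region (BAR) to export from */
            __u32 __pad;
            __u64 offset;         /* byte offset within that region */
            __u64 length;         /* length of this range */
    };

    /* Scattered FB in VRAM: one dmabuf stitched from two ranges. */
    static const struct vfio_region_dma_range fb_ranges[] = {
            { .region_index = 0, .offset = 0x000000, .length = 0x200000 },
            { .region_index = 0, .offset = 0x400000, .length = 0x200000 },
    };

    /* A contiguous buffer (e.g. an NVMe CMB) stays trivial: one range. */
    static const struct vfio_region_dma_range cmb_range = {
            .region_index = 2, .offset = 0, .length = 0x4000,
    };

With a range array in the request, nr_ranges = 1 remains the trivial
special case, which is the flexibility argument made above.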
> Thanks,
> Wei Lin
>
> > Thanks,
> > Vivek
> >
> >> In addition to the initial proposal by Jason, another promising
> >> application is exposing memory from an AI accelerator (bound to VFIO)
> >> to an RDMA device. This would allow the RDMA device to directly
> >> access the accelerator's memory, thereby facilitating direct data
> >> transactions between the RDMA device and the accelerator.
> >>
> >> Below is the text/motivation from the original cover letter.
> >>
> >> dma-buf has become a way to safely acquire a handle to non-struct-page
> >> memory that can still have its lifetime controlled by the exporter.
> >> Notably, RDMA can now import dma-buf FDs and build them into MRs,
> >> which allows for PCI P2P operations. Extend this to allow vfio-pci to
> >> export MMIO memory from PCI device BARs.
> >>
> >> This series supports a use case for SPDK where an NVMe device will be
> >> owned by SPDK through VFIO but interacting with an RDMA device. The
> >> RDMA device may directly access the NVMe CMB or directly manipulate
> >> the NVMe device's doorbell using PCI P2P.
> >>
> >> However, as a general mechanism, it can support many other scenarios
> >> with VFIO. I imagine this dmabuf approach to be usable by iommufd as
> >> well for generic and safe P2P mappings.
> >>
> >> This series goes after the "Break up ioctl dispatch functions to one
> >> function per ioctl" series.
> >>
> >> v2:
> >> - Name the new file dma_buf.c
> >> - Restore orig_nents before freeing
> >> - Fix reversed logic around priv->revoked
> >> - Set priv->index
> >> - Rebased on v2 "Break up ioctl dispatch functions"
> >> v1: https://lore.kernel.org/r/0-v1-9e6e1739ed95+5fa-vfio_dma_buf_jgg@nvidia.com
> >>
> >> Cc: linux-rdma@vger.kernel.org
> >> Cc: Oded Gabbay <ogabbay@kernel.org>
> >> Cc: Christian König <christian.koenig@amd.com>
> >> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >> Cc: Leon Romanovsky <leon@kernel.org>
> >> Cc: Maor Gottlieb <maorg@nvidia.com>
> >> Cc: dri-devel@lists.freedesktop.org
> >> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> >>
> >> Jason Gunthorpe (3):
> >>   vfio: Add vfio_device_get()
> >>   dma-buf: Add dma_buf_try_get()
> >>   vfio/pci: Allow MMIO regions to be exported through dma-buf
> >>
> >> Wei Lin Guay (1):
> >>   vfio/pci: Allow export dmabuf without move_notify from importer
> >>
> >>  drivers/vfio/pci/Makefile          |   1 +
> >>  drivers/vfio/pci/dma_buf.c         | 291 +++++++++++++++++++++++++++++
> >>  drivers/vfio/pci/vfio_pci_config.c |   8 +-
> >>  drivers/vfio/pci/vfio_pci_core.c   |  44 ++++-
> >>  drivers/vfio/pci/vfio_pci_priv.h   |  30 +++
> >>  drivers/vfio/vfio_main.c           |   1 +
> >>  include/linux/dma-buf.h            |  13 ++
> >>  include/linux/vfio.h               |   6 +
> >>  include/linux/vfio_pci_core.h      |   1 +
> >>  include/uapi/linux/vfio.h          |  18 ++
> >>  10 files changed, 405 insertions(+), 8 deletions(-)
> >>  create mode 100644 drivers/vfio/pci/dma_buf.c
> >>
> >> --
> >> 2.43.5
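As a footnote to the mmap-as-debug-tool point above: with the exporter's
mmap handler in place, CPU-side inspection of the exported MMIO reduces
to mapping the dmabuf fd. A minimal sketch, assuming dmabuf_fd and len
come from the (not shown) export step:

    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>

    /* Dump the first few words of the exported region (read-only). */
    static int dump_mmio_words(int dmabuf_fd, size_t len)
    {
            volatile uint32_t *p = mmap(NULL, len, PROT_READ, MAP_SHARED,
                                        dmabuf_fd, 0);

            if (p == MAP_FAILED)
                    return -1;

            for (unsigned int i = 0; i < 4; i++)
                    printf("word[%u] = 0x%08x\n", i, (unsigned int)p[i]);

            munmap((void *)p, len);
            return 0;
    }

Note that reads of device MMIO can have side effects on some hardware,
so this path is meant for debugging rather than data movement.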