| Message ID | 20190624194908.121273-3-john.stultz@linaro.org |
|---|---|
| State | New, archived |
| Series | DMA-BUF Heaps (destaging ION) |
On Thu, Jul 18, 2019 at 3:06 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> > +void INIT_HEAP_HELPER_BUFFER(struct heap_helper_buffer *buffer,
> > +			     void (*free)(struct heap_helper_buffer *))
>
> Please use a lower case naming following the naming scheme for the
> rest of the file.

Yes! Apologies, as this was a hold-over from when the initialization
function was an inline function in the style of INIT_WORK/INIT_LIST_HEAD.
That is no longer appropriate now that it's a regular function. I'll change it.

> > +static void *dma_heap_map_kernel(struct heap_helper_buffer *buffer)
> > +{
> > +	void *vaddr;
> > +
> > +	vaddr = vmap(buffer->pages, buffer->pagecount, VM_MAP, PAGE_KERNEL);
> > +	if (!vaddr)
> > +		return ERR_PTR(-ENOMEM);
> > +
> > +	return vaddr;
> > +}
>
> Unless I'm misreading the patches this is used for the same pages that
> also might be dma mapped. In this case you need to use
> flush_kernel_vmap_range and invalidate_kernel_vmap_range in the right
> places to ensure coherency between the vmap and device view. Please
> also document the buffer ownership, as this really can get complicated.

Forgive me, I wasn't familiar with those calls, but this seems like it
would apply to all dma-buf exporters if so, and I don't see any similar
flush_kernel_vmap_range calls there (both functions are seemingly only
used by xfs, md and bio).

We do have the dma_heap_dma_buf_begin_cpu_access()/
dma_heap_dma_buf_end_cpu_access() hooks (see more on these below), which
sync the buffers for each attachment (via dma_sync_sg_for_cpu/device)
and are used around cpu access to the buffers. Are you suggesting the
flush_kernel_vmap_range() call be added to those hooks, or are the
dma_sync_sg_for_* calls sufficient there?

> > +static vm_fault_t dma_heap_vm_fault(struct vm_fault *vmf)
> > +{
> > +	struct vm_area_struct *vma = vmf->vma;
> > +	struct heap_helper_buffer *buffer = vma->vm_private_data;
> > +
> > +	vmf->page = buffer->pages[vmf->pgoff];
> > +	get_page(vmf->page);
> > +
> > +	return 0;
> > +}
>
> Is there any exclusion between mmap / vmap and the device accessing
> the data? Without that you are going to run into a lot of coherency
> problems.

This has actually been a concern of mine recently, but at the higher
dma-buf core level. Conceptually, there are the dma_buf_map_attachment()
and dma_buf_unmap_attachment() calls drivers use to map the buffer to an
attached device, and there are the dma_buf_begin_cpu_access() and
dma_buf_end_cpu_access() calls which are to be done when touching the
cpu mapped pages. These look like serializing functions, but actually
don't seem to have any enforcement mechanism to exclude parallel access.

To me it seems like adding the exclusion between those operations should
be done at the dmabuf core level, and would actually be helpful for
optimizing some of the cache maintenance rules w/ dmabuf. Does this
sound like something closer to what you're suggesting, or am I
misunderstanding your point?

Again, I really appreciate the review and feedback!

Thanks so much!
-john
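[Editor's note: for concreteness, here is a minimal sketch of the question being
asked above, i.e. what folding invalidate_kernel_vmap_range() into the
begin_cpu_access hook might look like (with the mirror-image
flush_kernel_vmap_range() in end_cpu_access). It reuses the helper types from
the patch below, but the placement of the vmap-range call is only an
illustration of the open question, not code from the series.]

-------------------------------------------------------------------
/*
 * Hypothetical variant of dma_heap_dma_buf_begin_cpu_access() showing where
 * invalidate_kernel_vmap_range() could sit next to the existing
 * dma_sync_sg_for_cpu() calls.  Illustration only, not part of the series.
 */
static int dma_heap_dma_buf_begin_cpu_access_sketch(struct dma_buf *dmabuf,
					enum dma_data_direction direction)
{
	struct dma_heap_buffer *heap_buffer = dmabuf->priv;
	struct heap_helper_buffer *buffer = to_helper_buffer(heap_buffer);
	struct dma_heaps_attachment *a;

	mutex_lock(&buffer->lock);

	/* Existing behavior: sync each attached device's mapping for the CPU */
	list_for_each_entry(a, &buffer->attachments, list)
		dma_sync_sg_for_cpu(a->dev, a->table.sgl, a->table.nents,
				    direction);

	/*
	 * Possible addition: if the buffer is currently vmap()ed, also
	 * invalidate the vmap alias so CPU reads through buffer->vaddr see
	 * what the device wrote (relevant on aliasing/VIVT caches).
	 */
	if (buffer->vmap_cnt)
		invalidate_kernel_vmap_range(buffer->vaddr,
					     buffer->pagecount << PAGE_SHIFT);

	mutex_unlock(&buffer->lock);
	return 0;
}
-------------------------------------------------------------------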
On Mon, Jul 22, 2019 at 9:09 PM John Stultz <john.stultz@linaro.org> wrote:
>
> On Thu, Jul 18, 2019 at 3:06 AM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > Is there any exclusion between mmap / vmap and the device accessing
> > the data? Without that you are going to run into a lot of coherency
> > problems.

dma_fence is basically the way to handle exclusion between different
device accesses (since device access tends to be asynchronous). For
device<->device access, each driver is expected to take care of any
cache(s) that the device might have. (I.e. a device writing to a buffer
should flush its caches if needed before signalling the fence to let the
reading device know that it is safe to read, etc.)

_begin/end_cpu_access() is intended to be the exclusion for CPU access
(which is synchronous).

BR,
-R

> This has actually been a concern of mine recently, but at the higher
> dma-buf core level. Conceptually, there are the dma_buf_map_attachment()
> and dma_buf_unmap_attachment() calls drivers use to map the buffer to an
> attached device, and there are the dma_buf_begin_cpu_access() and
> dma_buf_end_cpu_access() calls which are to be done when touching the
> cpu mapped pages. These look like serializing functions, but actually
> don't seem to have any enforcement mechanism to exclude parallel access.
>
> To me it seems like adding the exclusion between those operations should
> be done at the dmabuf core level, and would actually be helpful for
> optimizing some of the cache maintenance rules w/ dmabuf. Does this
> sound like something closer to what you're suggesting, or am I
> misunderstanding your point?
>
> Again, I really appreciate the review and feedback!
>
> Thanks so much!
> -john
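[Editor's note: the CPU-access discipline Rob describes looks roughly like the
sketch below from an importing driver's point of view — CPU reads/writes are
bracketed with begin/end_cpu_access so the exporter can do cache maintenance.
The function name and error handling are invented for illustration; the
dma_buf_* calls are the standard (circa 2019) dma-buf API.]

-------------------------------------------------------------------
static int read_buffer_on_cpu(struct dma_buf *dmabuf, void *out, size_t len)
{
	void *vaddr;
	int ret;

	/* Tell the exporter the CPU is about to read the buffer */
	ret = dma_buf_begin_cpu_access(dmabuf, DMA_FROM_DEVICE);
	if (ret)
		return ret;

	vaddr = dma_buf_vmap(dmabuf);	/* vmap is optional; may return NULL */
	if (vaddr) {
		memcpy(out, vaddr, len);
		dma_buf_vunmap(dmabuf, vaddr);
	}

	/* CPU access finished; exporter may flush and hand back to devices */
	dma_buf_end_cpu_access(dmabuf, DMA_FROM_DEVICE);

	return vaddr ? 0 : -ENOMEM;
}
-------------------------------------------------------------------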
On 7/24/19 2:55 AM, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 01:09:55PM -0700, Rob Clark wrote:
>> On Mon, Jul 22, 2019 at 9:09 PM John Stultz <john.stultz@linaro.org> wrote:
>>>
>>> On Thu, Jul 18, 2019 at 3:06 AM Christoph Hellwig <hch@infradead.org> wrote:
>>>>
>>>> Is there any exclusion between mmap / vmap and the device accessing
>>>> the data? Without that you are going to run into a lot of coherency
>>>> problems.
>>
>> dma_fence is basically the way to handle exclusion between different
>> device accesses (since device access tends to be asynchronous). For
>> device<->device access, each driver is expected to take care of any
>> cache(s) that the device might have. (I.e. a device writing to a buffer
>> should flush its caches if needed before signalling the fence to let the
>> reading device know that it is safe to read, etc.)
>>
>> _begin/end_cpu_access() is intended to be the exclusion for CPU access
>> (which is synchronous)
>
> What I mean is that we need a clear state machine (preferably including
> ownership tracking ala dma-debug) where a piece of memory has one
> owner at a time that can access it. Only the owner can access it at
> that time, and at any owner switch we need to flush/invalidate all
> relevant caches. And with memory that is vmapped and mapped to userspace
> that can get really complicated.
>
> The above sounds like you have some of that in place, but we'll really
> need clear rules to make sure we don't have holes in the scheme.

Well then let's think on this. A given buffer can have 3 owner states
(CPU-owned, Device-owned, and Un-owned). These are based on the caching
state from the CPU perspective.

If a buffer is CPU-owned then we (Linux) can write to the buffer safely
without worry that the data is stale or that it will be accessed by the
device without having been flushed. Device-owned buffers should not be
accessed by the CPU, and inter-device synchronization should be handled
by fencing, as Rob points out. Un-owned is how a buffer starts, for
consistency and to prevent unneeded cache operations on unwritten buffers.

We also need to track the mapping states; there are 4 of these:
CPU-mapped, Device-mapped, CPU/Device-mapped, and un-mapped. These should
be self-explanatory: map_dma_buf maps towards the device,
mmap/vmap/kmap towards the CPU. Leaving a buffer mapped by the CPU while
device access takes place is safe as long as ownership is taken before
any access. One more point: we assume reference counting in the
discussion below; for instance, unmap_dma_buf refers to the last device
unmapping, and map_dma_buf only to the first.

This gives 12 combined states. If we assume a buffer will always be
owned when it has someone mapping it, either CPU or device or both, then
we can drop 3 states. If a buffer is only mapped into one space, then
that space owns it; this drops 2 cross-owned states. Lastly, if not
mapped by either space, the buffer becomes un-owned (and the backing
memory can be freed or migrated as needed). That leaves us 5 valid
states:

 * Un-Owned Un-Mapped
 * Device-Owned Device-Mapped
 * Device-Owned CPU/Device-Mapped
 * CPU-Owned CPU-Mapped
 * CPU-Owned CPU/Device-Mapped

There are 6 DMA-BUF operations (classes) on a buffer:

 * map_dma_buf
 * unmap_dma_buf
 * begin_cpu_access
 * end_cpu_access
 * mmap/vmap/kmap
 * munmap/vunmap/kunmap

From all this I'd suggest the following state machine (in DOT language):

Note: Buffers start in "Un-Owned Un-Mapped" and can only be freed from
that state.
Note: Commented out states/transitions are not valid but are here to
prove completeness.

-------------------------------------------------------------------

digraph dma_buf_buffer_states
{
	label = "DMA-BUF Buffer states";

	uo_um [ label="Un-Owned\nUn-Mapped" ];
//	uo_dm [ label="Un-Owned\nDevice-Mapped" ];
//	uo_cm [ label="Un-Owned\nCPU-Mapped" ];
//	uo_cdm [ label="Un-Owned\nCPU/Device-Mapped" ];

//	do_um [ label="Device-Owned\nUn-Mapped" ];
	do_dm [ label="Device-Owned\nDevice-Mapped" ];
//	do_cm [ label="Device-Owned\nCPU-Mapped" ];
	do_cdm [ label="Device-Owned\nCPU/Device-Mapped" ];

//	co_um [ label="CPU-Owned\nUn-Mapped" ];
//	co_dm [ label="CPU-Owned\nDevice-Mapped" ];
	co_cm [ label="CPU-Owned\nCPU-Mapped" ];
	co_cdm [ label="CPU-Owned\nCPU/Device-Mapped" ];

	/* From Un-Owned Un-Mapped */
		uo_um -> do_dm [ label="map_dma_buf" ];
//		uo_um ->       [ label="unmap_dma_buf" ];
//		uo_um ->       [ label="begin_cpu_access" ];
//		uo_um ->       [ label="end_cpu_access" ];
		uo_um -> co_cm [ label="mmap/vmap/kmap" ];
//		uo_um ->       [ label="munmap/vunmap/kunmap" ];

	/* From Device-Owned Device-Mapped */
		do_dm -> do_dm  [ label="map_dma_buf" ];
		do_dm -> uo_um  [ label="unmap_dma_buf" ];
//		do_dm ->        [ label="begin_cpu_access" ];
//		do_dm ->        [ label="end_cpu_access" ];
		do_dm -> do_cdm [ label="mmap/vmap/kmap" ];
//		do_dm ->        [ label="munmap/vunmap/kunmap" ];

	/* From Device-Owned CPU/Device-Mapped */
		do_cdm -> do_cdm [ label="map_dma_buf" ];
		do_cdm -> co_cm  [ label="unmap_dma_buf" ];
		do_cdm -> co_cdm [ label="begin_cpu_access" ];
//		do_cdm ->        [ label="end_cpu_access" ];
		do_cdm -> do_cdm [ label="mmap/vmap/kmap" ];
		do_cdm -> do_dm  [ label="munmap/vunmap/kunmap" ];

	/* From CPU-Owned CPU-Mapped */
		co_cm -> co_cdm [ label="map_dma_buf" ];
//		co_cm ->        [ label="unmap_dma_buf" ];
//		co_cm ->        [ label="begin_cpu_access" ];
		co_cm -> co_cm  [ label="end_cpu_access" ];
//		co_cm ->        [ label="mmap/vmap/kmap" ];
		co_cm -> uo_um  [ label="munmap/vunmap/kunmap" ];

	/* From CPU-Owned CPU/Device-Mapped */
		co_cdm -> co_cdm [ label="map_dma_buf" ];
		co_cdm -> co_cm  [ label="unmap_dma_buf" ];
//		co_cdm ->        [ label="begin_cpu_access" ];
		co_cdm -> do_cdm [ label="end_cpu_access" ];
		co_cdm -> co_cdm [ label="mmap/vmap/kmap" ];
//		co_cdm ->        [ label="munmap/vunmap/kunmap" ];

	{
		rank = same;
		co_cm -> do_dm [ style=invis ];
		rankdir = LR;
	}

	{
		rank = same;
		co_cdm -> do_cdm [ style=invis ];
		rankdir = LR;
	}
}

-------------------------------------------------------------------

If we consider this the "official" model, then we can start optimizing
cache operations, and start forbidding some nonsensical operations.

What do y'all think?

Andrew
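[Editor's note: the proposed graph translates fairly directly into code. The
sketch below shows one hypothetical way to track and enforce the 5 valid
states, e.g. as dma-debug-style checking in the dma-buf core; all identifiers
are made up for illustration, and only two of the six operation classes are
shown.]

-------------------------------------------------------------------
#include <linux/errno.h>

enum dma_buf_buffer_state {
	BUF_UNOWNED_UNMAPPED,
	BUF_DEV_OWNED_DEV_MAPPED,
	BUF_DEV_OWNED_CPU_DEV_MAPPED,
	BUF_CPU_OWNED_CPU_MAPPED,
	BUF_CPU_OWNED_CPU_DEV_MAPPED,
};

/* map_dma_buf edges from the graph above; -EINVAL marks forbidden moves */
static int dma_buf_state_map_dma_buf(enum dma_buf_buffer_state *state)
{
	switch (*state) {
	case BUF_UNOWNED_UNMAPPED:
		*state = BUF_DEV_OWNED_DEV_MAPPED;	/* first device map */
		return 0;
	case BUF_CPU_OWNED_CPU_MAPPED:
		*state = BUF_CPU_OWNED_CPU_DEV_MAPPED;	/* CPU keeps ownership */
		return 0;
	case BUF_DEV_OWNED_DEV_MAPPED:
	case BUF_DEV_OWNED_CPU_DEV_MAPPED:
	case BUF_CPU_OWNED_CPU_DEV_MAPPED:
		return 0;				/* already device-mapped */
	}
	return -EINVAL;
}

/* begin_cpu_access is only valid while device-owned and CPU/Device-mapped */
static int dma_buf_state_begin_cpu_access(enum dma_buf_buffer_state *state)
{
	if (*state != BUF_DEV_OWNED_CPU_DEV_MAPPED)
		return -EINVAL;
	*state = BUF_CPU_OWNED_CPU_DEV_MAPPED;		/* CPU takes ownership */
	return 0;
}
-------------------------------------------------------------------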
On Thu, Jul 25, 2019 at 5:41 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Jul 24, 2019 at 11:20:31AM -0400, Andrew F. Davis wrote:
> > Well then let's think on this. A given buffer can have 3 owner states
> > (CPU-owned, Device-owned, and Un-owned). These are based on the caching
> > state from the CPU perspective.
> >
> > If a buffer is CPU-owned then we (Linux) can write to the buffer safely
> > without worry that the data is stale or that it will be accessed by the
> > device without having been flushed. Device-owned buffers should not be
> > accessed by the CPU, and inter-device synchronization should be handled
> > by fencing, as Rob points out. Un-owned is how a buffer starts, for
> > consistency and to prevent unneeded cache operations on unwritten buffers.
>
> CPU owned also needs to be split into which mapping owns it - in the
> normal DMA case this is the kernel direct mapping, but in dma-buf it
> seems the primary way of using it in kernel space is the vmap. In
> addition to that, the mappings can also be exported to userspace, which
> is another mapping that is possibly not cache coherent with the kernel
> one.

Just for some history, dmabuf->vmap() is optional, and was mostly added
for the benefit of drm/udl (USB display, where the CPU needs to read out
and encode (?) the video stream). Most of the drm drivers are using
tmpfs to get backing pages, and if there is any kernel direct mapping it
is unused.

It is probably ok for any allocator that does care about a kernel direct
mapping to simply not implement vmap.

BR,
-R
diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 1cb3dd104825..e3e3dca29e46 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
	 reservation.o seqno-fence.o
+obj-$(CONFIG_DMABUF_HEAPS)	+= heaps/
 obj-$(CONFIG_DMABUF_HEAPS)	+= dma-heap.o
 obj-$(CONFIG_SYNC_FILE)		+= sync_file.o
 obj-$(CONFIG_SW_SYNC)		+= sw_sync.o sync_debug.o
diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile
new file mode 100644
index 000000000000..de49898112db
--- /dev/null
+++ b/drivers/dma-buf/heaps/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-y	+= heap-helpers.o
diff --git a/drivers/dma-buf/heaps/heap-helpers.c b/drivers/dma-buf/heaps/heap-helpers.c
new file mode 100644
index 000000000000..fba1895f3bf0
--- /dev/null
+++ b/drivers/dma-buf/heaps/heap-helpers.c
@@ -0,0 +1,262 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/device.h>
+#include <linux/dma-buf.h>
+#include <linux/err.h>
+#include <linux/idr.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <uapi/linux/dma-heap.h>
+
+#include "heap-helpers.h"
+
+void INIT_HEAP_HELPER_BUFFER(struct heap_helper_buffer *buffer,
+			     void (*free)(struct heap_helper_buffer *))
+{
+	buffer->private_flags = 0;
+	buffer->priv_virt = NULL;
+	mutex_init(&buffer->lock);
+	buffer->vmap_cnt = 0;
+	buffer->vaddr = NULL;
+	buffer->pagecount = 0;
+	buffer->pages = NULL;;
+	INIT_LIST_HEAD(&buffer->attachments);
+	buffer->free = free;
+}
+
+static void *dma_heap_map_kernel(struct heap_helper_buffer *buffer)
+{
+	void *vaddr;
+
+	vaddr = vmap(buffer->pages, buffer->pagecount, VM_MAP, PAGE_KERNEL);
+	if (!vaddr)
+		return ERR_PTR(-ENOMEM);
+
+	return vaddr;
+}
+
+static void dma_heap_buffer_destroy(struct dma_heap_buffer *heap_buffer)
+{
+	struct heap_helper_buffer *buffer = to_helper_buffer(heap_buffer);
+
+	if (buffer->vmap_cnt > 0) {
+		WARN("%s: buffer still mapped in the kernel\n",
+		     __func__);
+		vunmap(buffer->vaddr);
+	}
+
+	buffer->free(buffer);
+}
+
+static void *dma_heap_buffer_vmap_get(struct dma_heap_buffer *heap_buffer)
+{
+	struct heap_helper_buffer *buffer = to_helper_buffer(heap_buffer);
+	void *vaddr;
+
+	if (buffer->vmap_cnt) {
+		buffer->vmap_cnt++;
+		return buffer->vaddr;
+	}
+	vaddr = dma_heap_map_kernel(buffer);
+	if (WARN_ONCE(!vaddr,
+		      "heap->ops->map_kernel should return ERR_PTR on error"))
+		return ERR_PTR(-EINVAL);
+	if (IS_ERR(vaddr))
+		return vaddr;
+	buffer->vaddr = vaddr;
+	buffer->vmap_cnt++;
+	return vaddr;
+}
+
+static void dma_heap_buffer_vmap_put(struct dma_heap_buffer *heap_buffer)
+{
+	struct heap_helper_buffer *buffer = to_helper_buffer(heap_buffer);
+
+	if (!--buffer->vmap_cnt) {
+		vunmap(buffer->vaddr);
+		buffer->vaddr = NULL;
+	}
+}
+
+struct dma_heaps_attachment {
+	struct device *dev;
+	struct sg_table table;
+	struct list_head list;
+};
+
+static int dma_heap_attach(struct dma_buf *dmabuf,
+			   struct dma_buf_attachment *attachment)
+{
+	struct dma_heaps_attachment *a;
+	struct dma_heap_buffer *heap_buffer = dmabuf->priv;
+	struct heap_helper_buffer *buffer = to_helper_buffer(heap_buffer);
+	int ret;
+
+	a = kzalloc(sizeof(*a), GFP_KERNEL);
+	if (!a)
+		return -ENOMEM;
+
+	ret = sg_alloc_table_from_pages(&a->table, buffer->pages,
+					buffer->pagecount, 0,
+					buffer->pagecount << PAGE_SHIFT,
+					GFP_KERNEL);
+	if (ret) {
+		kfree(a);
+		return ret;
+	}
+
+	a->dev = attachment->dev;
+	INIT_LIST_HEAD(&a->list);
+
+	attachment->priv = a;
+
+	mutex_lock(&buffer->lock);
+	list_add(&a->list, &buffer->attachments);
+	mutex_unlock(&buffer->lock);
+
+	return 0;
+}
+
+static void dma_heap_detach(struct dma_buf *dmabuf,
+			    struct dma_buf_attachment *attachment)
+{
+	struct dma_heaps_attachment *a = attachment->priv;
+	struct dma_heap_buffer *heap_buffer = dmabuf->priv;
+	struct heap_helper_buffer *buffer = to_helper_buffer(heap_buffer);
+
+	mutex_lock(&buffer->lock);
+	list_del(&a->list);
+	mutex_unlock(&buffer->lock);
+
+	sg_free_table(&a->table);
+	kfree(a);
+}
+
+static struct sg_table *dma_heap_map_dma_buf(
+					struct dma_buf_attachment *attachment,
+					enum dma_data_direction direction)
+{
+	struct dma_heaps_attachment *a = attachment->priv;
+	struct sg_table *table;
+
+	table = &a->table;
+
+	if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
+			direction))
+		table = ERR_PTR(-ENOMEM);
+	return table;
+}
+
+static void dma_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
+				   struct sg_table *table,
+				   enum dma_data_direction direction)
+{
+	dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
+}
+
+static vm_fault_t dma_heap_vm_fault(struct vm_fault *vmf)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	struct heap_helper_buffer *buffer = vma->vm_private_data;
+
+	vmf->page = buffer->pages[vmf->pgoff];
+	get_page(vmf->page);
+
+	return 0;
+}
+
+static const struct vm_operations_struct dma_heap_vm_ops = {
+	.fault = dma_heap_vm_fault,
+};
+
+static int dma_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
+{
+	struct dma_heap_buffer *heap_buffer = dmabuf->priv;
+	struct heap_helper_buffer *buffer = to_helper_buffer(heap_buffer);
+
+	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0)
+		return -EINVAL;
+
+	vma->vm_ops = &dma_heap_vm_ops;
+	vma->vm_private_data = buffer;
+
+	return 0;
+}
+
+static void dma_heap_dma_buf_release(struct dma_buf *dmabuf)
+{
+	struct dma_heap_buffer *buffer = dmabuf->priv;
+
+	dma_heap_buffer_destroy(buffer);
+}
+
+static int dma_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
+					     enum dma_data_direction direction)
+{
+	struct dma_heap_buffer *heap_buffer = dmabuf->priv;
+	struct heap_helper_buffer *buffer = to_helper_buffer(heap_buffer);
+	struct dma_heaps_attachment *a;
+	int ret = 0;
+
+	mutex_lock(&buffer->lock);
+	list_for_each_entry(a, &buffer->attachments, list) {
+		dma_sync_sg_for_cpu(a->dev, a->table.sgl, a->table.nents,
+				    direction);
+	}
+	mutex_unlock(&buffer->lock);
+
+	return ret;
+}
+
+static int dma_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
+					   enum dma_data_direction direction)
+{
+	struct dma_heap_buffer *heap_buffer = dmabuf->priv;
+	struct heap_helper_buffer *buffer = to_helper_buffer(heap_buffer);
+	struct dma_heaps_attachment *a;
+
+	mutex_lock(&buffer->lock);
+	list_for_each_entry(a, &buffer->attachments, list) {
+		dma_sync_sg_for_device(a->dev, a->table.sgl, a->table.nents,
+				       direction);
+	}
+	mutex_unlock(&buffer->lock);
+
+	return 0;
+}
+
+static void *dma_heap_dma_buf_vmap(struct dma_buf *dmabuf)
+{
+	struct dma_heap_buffer *heap_buffer = dmabuf->priv;
+	struct heap_helper_buffer *buffer = to_helper_buffer(heap_buffer);
+	void *vaddr;
+
+	mutex_lock(&buffer->lock);
+	vaddr = dma_heap_buffer_vmap_get(heap_buffer);
+	mutex_unlock(&buffer->lock);
+
+	return vaddr;
+}
+
+static void dma_heap_dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
+{
+	struct dma_heap_buffer *heap_buffer = dmabuf->priv;
+	struct heap_helper_buffer *buffer = to_helper_buffer(heap_buffer);
+
+	mutex_lock(&buffer->lock);
+	dma_heap_buffer_vmap_put(heap_buffer);
+	mutex_unlock(&buffer->lock);
+}
+
+const struct dma_buf_ops heap_helper_ops = {
+	.map_dma_buf = dma_heap_map_dma_buf,
+	.unmap_dma_buf = dma_heap_unmap_dma_buf,
+	.mmap = dma_heap_mmap,
+	.release = dma_heap_dma_buf_release,
+	.attach = dma_heap_attach,
+	.detach = dma_heap_detach,
+	.begin_cpu_access = dma_heap_dma_buf_begin_cpu_access,
+	.end_cpu_access = dma_heap_dma_buf_end_cpu_access,
+	.vmap = dma_heap_dma_buf_vmap,
+	.vunmap = dma_heap_dma_buf_vunmap,
+};
diff --git a/drivers/dma-buf/heaps/heap-helpers.h b/drivers/dma-buf/heaps/heap-helpers.h
new file mode 100644
index 000000000000..7e0a82b11c9e
--- /dev/null
+++ b/drivers/dma-buf/heaps/heap-helpers.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * DMABUF Heaps helper code
+ *
+ * Copyright (C) 2011 Google, Inc.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#ifndef _HEAP_HELPERS_H
+#define _HEAP_HELPERS_H
+
+#include <linux/dma-heap.h>
+#include <linux/list.h>
+
+/**
+ * struct dma_heap_buffer - metadata for a particular buffer
+ * @heap:		back pointer to the heap the buffer came from
+ * @dmabuf:		backing dma-buf for this buffer
+ * @size:		size of the buffer
+ * @flags:		buffer specific flags
+ */
+struct dma_heap_buffer {
+	struct dma_heap *heap;
+	struct dma_buf *dmabuf;
+	size_t size;
+	unsigned long flags;
+};
+
+struct heap_helper_buffer {
+	struct dma_heap_buffer heap_buffer;
+
+	unsigned long private_flags;
+	void *priv_virt;
+	struct mutex lock;
+	int vmap_cnt;
+	void *vaddr;
+	pgoff_t pagecount;
+	struct page **pages;
+	struct list_head attachments;
+
+	void (*free)(struct heap_helper_buffer *buffer);
+};
+
+static inline struct heap_helper_buffer *to_helper_buffer(
+	struct dma_heap_buffer *h)
+{
+	return container_of(h, struct heap_helper_buffer, heap_buffer);
+}
+
+void INIT_HEAP_HELPER_BUFFER(struct heap_helper_buffer *buffer,
+			     void (*free)(struct heap_helper_buffer *));
+extern const struct dma_buf_ops heap_helper_ops;
+
+#endif /* _HEAP_HELPERS_H */
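[Editor's note: to show how the helpers above are meant to be consumed, here is
a rough sketch of a heap implementation satisfying an allocation. Only
INIT_HEAP_HELPER_BUFFER(), struct heap_helper_buffer and heap_helper_ops come
from this patch; the "example" heap name, its allocation entry point and the
error handling are invented for illustration.]

-------------------------------------------------------------------
static void example_heap_free(struct heap_helper_buffer *buffer)
{
	pgoff_t pg;

	for (pg = 0; pg < buffer->pagecount; pg++)
		__free_page(buffer->pages[pg]);
	kfree(buffer->pages);
	kfree(buffer);
}

static struct dma_buf *example_heap_allocate(size_t len)
{
	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
	struct heap_helper_buffer *buffer;
	pgoff_t pg;

	buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
	if (!buffer)
		return ERR_PTR(-ENOMEM);

	/* Hook up the helper state and our per-heap free callback */
	INIT_HEAP_HELPER_BUFFER(buffer, example_heap_free);
	buffer->heap_buffer.size = len;
	buffer->pagecount = len >> PAGE_SHIFT;
	buffer->pages = kcalloc(buffer->pagecount, sizeof(*buffer->pages),
				GFP_KERNEL);
	if (!buffer->pages)
		goto err_buffer;

	for (pg = 0; pg < buffer->pagecount; pg++) {
		buffer->pages[pg] = alloc_page(GFP_KERNEL);
		if (!buffer->pages[pg])
			goto err_pages;
	}

	/* Export through the shared helper dma_buf_ops */
	exp_info.ops = &heap_helper_ops;
	exp_info.size = len;
	exp_info.flags = O_RDWR;
	exp_info.priv = &buffer->heap_buffer;

	buffer->heap_buffer.dmabuf = dma_buf_export(&exp_info);
	if (IS_ERR(buffer->heap_buffer.dmabuf))
		goto err_pages;

	return buffer->heap_buffer.dmabuf;

err_pages:
	while (pg-- > 0)
		__free_page(buffer->pages[pg]);
	kfree(buffer->pages);
err_buffer:
	kfree(buffer);
	return ERR_PTR(-ENOMEM);
}
-------------------------------------------------------------------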