Message ID | 20190716213746.4670-3-robdclark@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v3,1/3] drm/gem: don't force writecombine mmap'ing |
Rob Clark <robdclark@gmail.com> writes:

> From: Rob Clark <robdclark@chromium.org>
>
> Since there is no real device associated with VGEM, it is impossible to
> end up with appropriate dev->dma_ops, meaning that we have no way to
> invalidate the shmem pages allocated by VGEM. So, at least on platforms
> without drm_clflush_pages(), we end up with corruption when cache lines
> from previous usage of VGEM bo pages get evicted to memory.
>
> The only sane option is to use cached mappings.

This may be an improvement, but...

pin/unpin is only on attaching/closing the dma-buf, right? So, great,
you flushed the cached map once after exporting the vgem dma-buf to the
actual GPU device, but from then on you still have no interface for
getting coherent access through VGEM's mapping again, which still
exists.

I feel like this is papering over something that's really just broken,
and we should stop providing VGEM just because someone wants to write
dma-buf test code without driver-specific BO alloc ioctl code.
On Tue, Jul 16, 2019 at 4:39 PM Eric Anholt <eric@anholt.net> wrote:
>
> Rob Clark <robdclark@gmail.com> writes:
>
> > From: Rob Clark <robdclark@chromium.org>
> >
> > Since there is no real device associated with VGEM, it is impossible to
> > end up with appropriate dev->dma_ops, meaning that we have no way to
> > invalidate the shmem pages allocated by VGEM. So, at least on platforms
> > without drm_clflush_pages(), we end up with corruption when cache lines
> > from previous usage of VGEM bo pages get evicted to memory.
> >
> > The only sane option is to use cached mappings.
>
> This may be an improvement, but...
>
> pin/unpin is only on attaching/closing the dma-buf, right? So, great,
> you flushed the cached map once after exporting the vgem dma-buf to the
> actual GPU device, but from then on you still have no interface for
> getting coherent access through VGEM's mapping again, which still
> exists.

In *theory* one would detach before doing further CPU access to the
buffer, and then re-attach when passing it back to the GPU.

Ofc that isn't how actual drivers do things. But maybe it is enough
for vgem to serve its purpose (i.e. test code).

> I feel like this is papering over something that's really just broken,
> and we should stop providing VGEM just because someone wants to write
> dma-buf test code without driver-specific BO alloc ioctl code.

yup, it is vgem that is fundamentally broken (or maybe more
specifically doesn't fit in w/ dma-mapping's view of how to do cache
maint), and I'm just papering over it because people and CI systems
want to be able to use it to do some dma-buf tests ;-)

I'm kinda wondering, at least for arm/dt based systems, if there is a
way (other than in early boot) that we can inject a vgem device node
into the dtb. That isn't a thing drivers should normally do, but (if
possible) since vgem is really just test infrastructure, it could be a
way to make dma-mapping happily think vgem is a real device.

BR,
-R
On Tue, Jul 16, 2019 at 05:13:10PM -0700, Rob Clark wrote:
> On Tue, Jul 16, 2019 at 4:39 PM Eric Anholt <eric@anholt.net> wrote:
> >
> > Rob Clark <robdclark@gmail.com> writes:
> >
> > > From: Rob Clark <robdclark@chromium.org>
> > >
> > > Since there is no real device associated with VGEM, it is impossible to
> > > end up with appropriate dev->dma_ops, meaning that we have no way to
> > > invalidate the shmem pages allocated by VGEM. So, at least on platforms
> > > without drm_clflush_pages(), we end up with corruption when cache lines
> > > from previous usage of VGEM bo pages get evicted to memory.
> > >
> > > The only sane option is to use cached mappings.
> >
> > This may be an improvement, but...
> >
> > pin/unpin is only on attaching/closing the dma-buf, right? So, great,
> > you flushed the cached map once after exporting the vgem dma-buf to the
> > actual GPU device, but from then on you still have no interface for
> > getting coherent access through VGEM's mapping again, which still
> > exists.
>
> In *theory* one would detach before doing further CPU access to
> buffer, and then re-attach when passing back to GPU.
>
> Ofc that isn't how actual drivers do things. But maybe it is enough
> for vgem to serve its purpose (i.e. test code).
>
> > I feel like this is papering over something that's really just broken,
> > and we should stop providing VGEM just because someone wants to write
> > dma-buf test code without driver-specific BO alloc ioctl code.
>
> yup, it is vgem that is fundamentally broken (or maybe more
> specifically doesn't fit in w/ dma-mapping's view of how to do cache
> maint), and I'm just papering over it because people and CI systems
> want to be able to use it to do some dma-buf tests ;-)
>
> I'm kinda wondering, at least for arm/dt based systems, if there is a
> way (other than in early boot) that we can inject a vgem device node
> into the dtb. That isn't a thing drivers should normally do, but (if
> possible) since vgem is really just test infrastructure, it could be a
> way to make dma-mapping happily think vgem is a real device.

Or we just extend drm_clflush_pages with the cache flushing we need (at
least for those ARM platforms where this is possible, let's ignore the
others) and accept for a few more years that dma-api doesn't fit? Note
this would need to be a full copypasta of what the arch code has (since
just exporting the function was shot down before), but I really don't
care about the resulting wailing if we do this.
-Daniel
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index e7d12e93b1f0..84262e2bd7f7 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -259,9 +259,6 @@ static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
 	if (ret)
 		return ret;
 
-	/* Keep the WC mmaping set by drm_gem_mmap() but our pages
-	 * are ordinary and not special.
-	 */
 	vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
 	return 0;
 }
@@ -310,17 +307,17 @@ static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
 static int vgem_prime_pin(struct drm_gem_object *obj, struct device *dev)
 {
 	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-	long n_pages = obj->size >> PAGE_SHIFT;
+	long i, n_pages = obj->size >> PAGE_SHIFT;
 	struct page **pages;
 
 	pages = vgem_pin_pages(bo);
 	if (IS_ERR(pages))
 		return PTR_ERR(pages);
 
-	/* Flush the object from the CPU cache so that importers can rely
-	 * on coherent indirect access via the exported dma-address.
-	 */
-	drm_clflush_pages(pages, n_pages);
+	for (i = 0; i < n_pages; i++) {
+		dma_sync_single_for_device(dev, page_to_phys(pages[i]),
+					   PAGE_SIZE, DMA_BIDIRECTIONAL);
+	}
 
 	return 0;
 }
@@ -328,6 +325,13 @@ static int vgem_prime_pin(struct drm_gem_object *obj, struct device *dev)
 static void vgem_prime_unpin(struct drm_gem_object *obj, struct device *dev)
 {
 	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
+	long i, n_pages = obj->size >> PAGE_SHIFT;
+	struct page **pages = bo->pages;
+
+	for (i = 0; i < n_pages; i++) {
+		dma_sync_single_for_cpu(dev, page_to_phys(pages[i]),
+					PAGE_SIZE, DMA_BIDIRECTIONAL);
+	}
 
 	vgem_unpin_pages(bo);
 }
@@ -382,7 +386,7 @@ static void *vgem_prime_vmap(struct drm_gem_object *obj)
 	if (IS_ERR(pages))
 		return NULL;
 
-	return vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
+	return vmap(pages, n_pages, 0, PAGE_KERNEL);
 }
 
 static void vgem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
@@ -411,7 +415,7 @@ static int vgem_prime_mmap(struct drm_gem_object *obj,
 	fput(vma->vm_file);
 	vma->vm_file = get_file(obj->filp);
 	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
-	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
 
 	return 0;
 }
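
The ownership handoff discussed in the thread (sync for the device on pin, sync for the CPU on unpin, and no coherent CPU access in between) can be pictured with a small userspace model. This is only an illustrative sketch, not kernel code: every `model_*` name is hypothetical, the sync calls are replaced by counters, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this model only: 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE ((size_t)1 << MODEL_PAGE_SHIFT)

/* Who is currently allowed to touch the buffer's pages. */
enum model_owner { MODEL_OWNER_CPU, MODEL_OWNER_DEVICE };

/* Hypothetical stand-in for a vgem buffer object. */
struct model_bo {
	size_t size;                    /* bytes, a multiple of the page size */
	enum model_owner owner;
	unsigned long syncs_for_device; /* counts simulated per-page syncs */
	unsigned long syncs_for_cpu;
};

/* Models the pin path: one sync-for-device per page, then the device owns it. */
static void model_prime_pin(struct model_bo *bo)
{
	size_t i, n_pages = bo->size >> MODEL_PAGE_SHIFT;

	assert(bo->owner == MODEL_OWNER_CPU);
	for (i = 0; i < n_pages; i++)
		bo->syncs_for_device++;
	bo->owner = MODEL_OWNER_DEVICE;
}

/* Models the unpin path: one sync-for-cpu per page, then the CPU owns it. */
static void model_prime_unpin(struct model_bo *bo)
{
	size_t i, n_pages = bo->size >> MODEL_PAGE_SHIFT;

	assert(bo->owner == MODEL_OWNER_DEVICE);
	for (i = 0; i < n_pages; i++)
		bo->syncs_for_cpu++;
	bo->owner = MODEL_OWNER_CPU;
}

/* In this model, CPU access is coherent only while the CPU owns the buffer. */
static int model_cpu_access_ok(const struct model_bo *bo)
{
	return bo->owner == MODEL_OWNER_CPU;
}
```

In this model a four-page object pinned once records four device syncs, and `model_cpu_access_ok()` stays false until the matching unpin. That is the gap raised in the review: the CPU mapping still exists while the buffer is attached, but nothing in between makes access through it coherent.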