Message ID | 20180807104500.31264-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/amdgpu: Transfer fences to dmabuf importer | expand |
On Tue, Aug 07, 2018 at 11:45:00AM +0100, Chris Wilson wrote: > amdgpu only uses shared-fences internally, but dmabuf importers rely on > implicit write hazard tracking via the reservation_object.fence_excl. > For example, the importer use the write hazard for timing a page flip to > only occur after the exporter has finished flushing its write into the > surface. As such, on exporting a dmabuf, we must either flush all > outstanding fences (for we do not know which are writes and should have > been exclusive) or alternatively create a new exclusive fence that is > the composite of all the existing shared fences, and so will only be > signaled when all earlier fences are signaled (ensuring that we can not > be signaled before the completion of any earlier write). > > Testcase: igt/amd_prime/amd-to-i915 > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Alex Deucher <alexander.deucher@amd.com> > Cc: "Christian König" <christian.koenig@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 70 ++++++++++++++++++++--- > 1 file changed, 62 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c > index 1c5d97f4b4dd..47e6ec5510b6 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c > @@ -37,6 +37,7 @@ > #include "amdgpu_display.h" > #include <drm/amdgpu_drm.h> > #include <linux/dma-buf.h> > +#include <linux/dma-fence-array.h> > > static const struct dma_buf_ops amdgpu_dmabuf_ops; > > @@ -188,6 +189,57 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev, > return ERR_PTR(ret); > } > > +static int > +__reservation_object_make_exclusive(struct reservation_object *obj) > +{ Why not you move the helper to reservation.c, and then export symbol for this file? Thanks, Ray > + struct reservation_object_list *fobj; > + struct dma_fence_array *array; > + struct dma_fence **fences; > + unsigned int count, i; > + > + fobj = reservation_object_get_list(obj); > + if (!fobj) > + return 0; > + > + count = !!rcu_access_pointer(obj->fence_excl); > + count += fobj->shared_count; > + > + fences = kmalloc_array(sizeof(*fences), count, GFP_KERNEL); > + if (!fences) > + return -ENOMEM; > + > + for (i = 0; i < fobj->shared_count; i++) { > + struct dma_fence *f = > + rcu_dereference_protected(fobj->shared[i], > + reservation_object_held(obj)); > + > + fences[i] = dma_fence_get(f); > + } > + > + if (rcu_access_pointer(obj->fence_excl)) { > + struct dma_fence *f = > + rcu_dereference_protected(obj->fence_excl, > + reservation_object_held(obj)); > + > + fences[i] = dma_fence_get(f); > + } > + > + array = dma_fence_array_create(count, fences, > + dma_fence_context_alloc(1), 0, > + false); > + if (!array) > + goto err_fences_put; > + > + reservation_object_add_excl_fence(obj, &array->base); > + return 0; > + > +err_fences_put: > + for (i = 0; i < count; i++) > + dma_fence_put(fences[i]); > + kfree(fences); > + return -ENOMEM; > +} > + > /** > * amdgpu_gem_map_attach - &dma_buf_ops.attach implementation > * @dma_buf: shared DMA buffer > @@ -219,16 +271,18 @@ static int amdgpu_gem_map_attach(struct dma_buf *dma_buf, > > if (attach->dev->driver != adev->dev->driver) { > /* > - * Wait for all shared fences to complete before we switch to future > - * use of exclusive fence on this prime shared bo. > + * We only create shared fences for internal use, but importers > + * of the dmabuf rely on exclusive fences for implicitly > + * tracking write hazards. As any of the current fences may > + * correspond to a write, we need to convert all existing > + * fences on the resevation object into a single exclusive > + * fence. > */ > - r = reservation_object_wait_timeout_rcu(bo->tbo.resv, > - true, false, > - MAX_SCHEDULE_TIMEOUT); > - if (unlikely(r < 0)) { > - DRM_DEBUG_PRIME("Fence wait failed: %li\n", r); > + reservation_object_lock(bo->tbo.resv, NULL); > + r = __reservation_object_make_exclusive(bo->tbo.resv); > + reservation_object_unlock(bo->tbo.resv); > + if (r) > goto error_unreserve; > - } > } > > /* pin buffer into GTT */ > -- > 2.18.0 > > _______________________________________________ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Quoting Huang Rui (2018-08-07 11:56:24) > On Tue, Aug 07, 2018 at 11:45:00AM +0100, Chris Wilson wrote: > > amdgpu only uses shared-fences internally, but dmabuf importers rely on > > implicit write hazard tracking via the reservation_object.fence_excl. > > For example, the importer use the write hazard for timing a page flip to > > only occur after the exporter has finished flushing its write into the > > surface. As such, on exporting a dmabuf, we must either flush all > > outstanding fences (for we do not know which are writes and should have > > been exclusive) or alternatively create a new exclusive fence that is > > the composite of all the existing shared fences, and so will only be > > signaled when all earlier fences are signaled (ensuring that we can not > > be signaled before the completion of any earlier write). > > > > Testcase: igt/amd_prime/amd-to-i915 > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Alex Deucher <alexander.deucher@amd.com> > > Cc: "Christian König" <christian.koenig@amd.com> > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 70 ++++++++++++++++++++--- > > 1 file changed, 62 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c > > index 1c5d97f4b4dd..47e6ec5510b6 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c > > @@ -37,6 +37,7 @@ > > #include "amdgpu_display.h" > > #include <drm/amdgpu_drm.h> > > #include <linux/dma-buf.h> > > +#include <linux/dma-fence-array.h> > > > > static const struct dma_buf_ops amdgpu_dmabuf_ops; > > > > @@ -188,6 +189,57 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev, > > return ERR_PTR(ret); > > } > > > > +static int > > +__reservation_object_make_exclusive(struct reservation_object *obj) > > +{ > > Why not you move the helper to reservation.c, and then export symbol for > this file? I have not seen anything else that would wish to use this helper. The first task is to solve this issue here before worrying about generalisation. -Chris
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c index 1c5d97f4b4dd..47e6ec5510b6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c @@ -37,6 +37,7 @@ #include "amdgpu_display.h" #include <drm/amdgpu_drm.h> #include <linux/dma-buf.h> +#include <linux/dma-fence-array.h> static const struct dma_buf_ops amdgpu_dmabuf_ops; @@ -188,6 +189,57 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev, return ERR_PTR(ret); } +static int +__reservation_object_make_exclusive(struct reservation_object *obj) +{ + struct reservation_object_list *fobj; + struct dma_fence_array *array; + struct dma_fence **fences; + unsigned int count, i; + + fobj = reservation_object_get_list(obj); + if (!fobj) + return 0; + + count = !!rcu_access_pointer(obj->fence_excl); + count += fobj->shared_count; + + fences = kmalloc_array(sizeof(*fences), count, GFP_KERNEL); + if (!fences) + return -ENOMEM; + + for (i = 0; i < fobj->shared_count; i++) { + struct dma_fence *f = + rcu_dereference_protected(fobj->shared[i], + reservation_object_held(obj)); + + fences[i] = dma_fence_get(f); + } + + if (rcu_access_pointer(obj->fence_excl)) { + struct dma_fence *f = + rcu_dereference_protected(obj->fence_excl, + reservation_object_held(obj)); + + fences[i] = dma_fence_get(f); + } + + array = dma_fence_array_create(count, fences, + dma_fence_context_alloc(1), 0, + false); + if (!array) + goto err_fences_put; + + reservation_object_add_excl_fence(obj, &array->base); + return 0; + +err_fences_put: + for (i = 0; i < count; i++) + dma_fence_put(fences[i]); + kfree(fences); + return -ENOMEM; +} + /** * amdgpu_gem_map_attach - &dma_buf_ops.attach implementation * @dma_buf: shared DMA buffer @@ -219,16 +271,18 @@ static int amdgpu_gem_map_attach(struct dma_buf *dma_buf, if (attach->dev->driver != adev->dev->driver) { /* - * Wait for all shared fences to complete before we switch to future - * use of exclusive fence on this prime shared bo. + * We only create shared fences for internal use, but importers + * of the dmabuf rely on exclusive fences for implicitly + * tracking write hazards. As any of the current fences may + * correspond to a write, we need to convert all existing + * fences on the resevation object into a single exclusive + * fence. */ - r = reservation_object_wait_timeout_rcu(bo->tbo.resv, - true, false, - MAX_SCHEDULE_TIMEOUT); - if (unlikely(r < 0)) { - DRM_DEBUG_PRIME("Fence wait failed: %li\n", r); + reservation_object_lock(bo->tbo.resv, NULL); + r = __reservation_object_make_exclusive(bo->tbo.resv); + reservation_object_unlock(bo->tbo.resv); + if (r) goto error_unreserve; - } } /* pin buffer into GTT */
amdgpu only uses shared-fences internally, but dmabuf importers rely on implicit write hazard tracking via the reservation_object.fence_excl. For example, the importer use the write hazard for timing a page flip to only occur after the exporter has finished flushing its write into the surface. As such, on exporting a dmabuf, we must either flush all outstanding fences (for we do not know which are writes and should have been exclusive) or alternatively create a new exclusive fence that is the composite of all the existing shared fences, and so will only be signaled when all earlier fences are signaled (ensuring that we can not be signaled before the completion of any earlier write). Testcase: igt/amd_prime/amd-to-i915 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 70 ++++++++++++++++++++--- 1 file changed, 62 insertions(+), 8 deletions(-)