
drm/amdgpu: Attach exclusive fence to prime exported bo's. (v4)

Message ID 1478574925-18162-1-git-send-email-mario.kleiner.de@gmail.com (mailing list archive)
State New, archived

Commit Message

Mario Kleiner Nov. 8, 2016, 3:15 a.m. UTC
External clients which import our BOs wait only on
exclusive dma-buf fences, not on shared ones; the same
applies to BOs which we import from external providers
and write to.

Therefore attach exclusive fences to prime-shared buffers
if our exported buffer gets imported by an external
client, or if we import a buffer from an external
exporter.

See discussion in thread:
https://lists.freedesktop.org/archives/dri-devel/2016-October/122370.html
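
As an illustration of the asymmetry, here is a minimal sketch
against the reservation_object API of this kernel generation
(not part of the patch; the helper name is hypothetical): with
wait_all == false, an importer's wait considers only the
exclusive fence, so writes tracked via shared fences never
synchronize it.

#include <linux/reservation.h>
#include <linux/sched.h>

/*
 * Hypothetical importer-side wait. With wait_all == false only the
 * exclusive fence is waited on; GPU writes that the exporter tracked
 * via shared fences are silently ignored.
 */
static long importer_wait_for_writes(struct reservation_object *resv)
{
	bool wait_all = false;	/* exclusive fence only */
	bool intr = false;	/* uninterruptible wait */

	return reservation_object_wait_timeout_rcu(resv, wait_all, intr,
						   MAX_SCHEDULE_TIMEOUT);
}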

Prime export tested on Intel iGPU + AMD Tonga dGPU as
DRI3/Present Prime render offload, and with the Tonga
standalone as primary GPU.

v2: Add a wait for all shared fences before prime export,
    as suggested by Christian König.

v3: - Mark the buffer prime_exported in amdgpu_gem_prime_pin,
    so we only use the exclusive fence when exporting a
    BO to external clients like a separate iGPU, but not
    when exporting/importing from/to ourselves as part of
    regular DRI3 fd passing.

    - Propagate failure of reservation_object_wait_timeout_rcu
    back to the caller.

v4: - Switch to a prime_shared_count counter instead of a
      flag, which gets incremented/decremented on prime
      pin/unpin, so we can switch back to shared fences
      once all clients detach from our exported BO.

    - Also switch to the exclusive fence for prime imported BOs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95472
Tested-by: Mike Lothian <mike@fireburn.co.uk> (v1)
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h         |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c   | 19 +++++++++++++++++++
 3 files changed, 21 insertions(+), 1 deletion(-)

Comments

Christian König Nov. 8, 2016, 7:48 a.m. UTC | #1
On 08.11.2016 04:15, Mario Kleiner wrote:
> [...]
>
> @@ -81,13 +82,29 @@ int amdgpu_gem_prime_pin(struct drm_gem_object *obj)
>   {
>   	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
>   	int ret = 0;
> +	long lret;

No need for an extra long-sized variable here; just change the
type of the existing one.

The compiler can usually optimize away such size conversions
perfectly fine.

With that fixed the patch is Reviewed-by: Christian König 
<christian.koenig@amd.com>.

Thanks a lot for the help,
Christian.
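
For reference, a rough sketch of what the suggested change could
look like, with ret widened to long (an illustration only, not
the actual v5 follow-up):

int amdgpu_gem_prime_pin(struct drm_gem_object *obj)
{
	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
	long ret = 0;	/* wide enough for the fence-wait result */

	ret = amdgpu_bo_reserve(bo, false);
	if (unlikely(ret != 0))
		return ret;

	/*
	 * Wait for all shared fences to complete before we switch to
	 * future use of exclusive fence on this prime shared bo.
	 */
	ret = reservation_object_wait_timeout_rcu(bo->tbo.resv, true, false,
						  MAX_SCHEDULE_TIMEOUT);
	if (unlikely(ret < 0)) {
		DRM_DEBUG_PRIME("Fence wait failed: %li\n", ret);
		amdgpu_bo_unreserve(bo);
		return ret;
	}

	/*
	 * Pin buffer into GTT. amdgpu_bo_pin overwrites ret with 0 or
	 * -errno, so the positive leftover timeout from the fence wait
	 * never reaches the caller.
	 */
	ret = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT, NULL);
	if (likely(ret == 0))
		bo->prime_shared_count++;

	amdgpu_bo_unreserve(bo);
	return ret;
}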

Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 039b57e..496f72b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -459,6 +459,7 @@ struct amdgpu_bo {
 	u64				metadata_flags;
 	void				*metadata;
 	u32				metadata_size;
+	unsigned			prime_shared_count;
 	/* list of all virtual address to which this bo
 	 * is associated to
 	 */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
index 651115d..c02db01f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
@@ -132,7 +132,7 @@ static int amdgpu_bo_list_set(struct amdgpu_device *adev,
 		entry->priority = min(info[i].bo_priority,
 				      AMDGPU_BO_LIST_MAX_PRIORITY);
 		entry->tv.bo = &entry->robj->tbo;
-		entry->tv.shared = true;
+		entry->tv.shared = !entry->robj->prime_shared_count;
 
 		if (entry->robj->prefered_domains == AMDGPU_GEM_DOMAIN_GDS)
 			gds_obj = entry->robj;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
index 7700dc2..31aac7e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
@@ -74,6 +74,7 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
 	if (ret)
 		return ERR_PTR(ret);
 
+	bo->prime_shared_count = 1;
 	return &bo->gem_base;
 }
 
@@ -81,13 +82,29 @@ int amdgpu_gem_prime_pin(struct drm_gem_object *obj)
 {
 	struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
 	int ret = 0;
+	long lret;
 
 	ret = amdgpu_bo_reserve(bo, false);
 	if (unlikely(ret != 0))
 		return ret;
 
+	/*
+	 * Wait for all shared fences to complete before we switch to future
+	 * use of exclusive fence on this prime shared bo.
+	 */
+	lret = reservation_object_wait_timeout_rcu(bo->tbo.resv, true, false,
+						   MAX_SCHEDULE_TIMEOUT);
+	if (unlikely(lret < 0)) {
+		DRM_DEBUG_PRIME("Fence wait failed: %li\n", lret);
+		amdgpu_bo_unreserve(bo);
+		return lret;
+	}
+
 	/* pin buffer into GTT */
 	ret = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT, NULL);
+	if (likely(ret == 0))
+		bo->prime_shared_count++;
+
 	amdgpu_bo_unreserve(bo);
 	return ret;
 }
@@ -102,6 +119,8 @@ void amdgpu_gem_prime_unpin(struct drm_gem_object *obj)
 		return;
 
 	amdgpu_bo_unpin(bo);
+	if (bo->prime_shared_count)
+		bo->prime_shared_count--;
 	amdgpu_bo_unreserve(bo);
 }