
[2/2] dma-buf: try to replace a signaled fence in reservation_object_add_shared_inplace

Message ID 20171030145904.18584-2-deathsimple@vodafone.de (mailing list archive)
State New, archived

Commit Message

Christian König Oct. 30, 2017, 2:59 p.m. UTC
From: Christian König <christian.koenig@amd.com>

The amdgpu issue of also needing signaled fences in the reservation objects
should be fixed by now.

Optimize the handling by replacing a signaled fence when adding a new
shared one.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/reservation.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)
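
For context: the in-place path touched by this patch is the fast path of
reservation_object_add_shared_fence(). It is taken when
reservation_object_reserve_shared() found spare room in the existing shared
list, so the new fence can be slotted in without reallocating. A rough
caller-side sketch of that flow (the helper name and error handling are
illustrative only, using the pre-dma_resv API names of this era):

#include <linux/reservation.h>

/* Illustrative helper, not part of this series. */
static int example_attach_shared_fence(struct reservation_object *obj,
				       struct dma_fence *fence)
{
	int ret;

	ww_mutex_lock(&obj->lock, NULL);

	/* Make sure there is room for one more shared fence slot. */
	ret = reservation_object_reserve_shared(obj);
	if (ret == 0) {
		/*
		 * With a free slot in the existing list this ends up in
		 * reservation_object_add_shared_inplace(), the function
		 * optimized by this patch; otherwise a freshly allocated,
		 * larger list is installed instead.
		 */
		reservation_object_add_shared_fence(obj, fence);
	}

	ww_mutex_unlock(&obj->lock);
	return ret;
}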

Comments

Chris Wilson Nov. 6, 2017, 4:22 p.m. UTC | #1
Quoting Christian König (2017-10-30 14:59:04)
> From: Christian König <christian.koenig@amd.com>
> 
> The amdgpu issue to also need signaled fences in the reservation objects should
> be fixed by now.
> 
> Optimize the handling by replacing a signaled fence when adding a new
> shared one.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/reservation.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> index 6fc794576997..a3928ce9f311 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
>                                       struct reservation_object_list *fobj,
>                                       struct dma_fence *fence)
>  {
> -       u32 i;
> +       struct dma_fence *signaled = NULL;
> +       u32 i, signaled_idx;
>  
>         dma_fence_get(fence);
>  
> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
>                         dma_fence_put(old_fence);
>                         return;
>                 }
> +
> +               if (!signaled && dma_fence_is_signaled(old_fence)) {
> +                       signaled = old_fence;
> +                       signaled_idx = i;
> +               }

How much do we care about only keeping one fence per-ctx here? You could
rearrange this to break on old_fence->context == fence->context ||
dma_fence_is_signaled(old_fence) then everyone can use the final block.
-Chris
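
For reference, a minimal sketch of the rearrangement Chris describes, as it
would sit in drivers/dma-buf/reservation.c (illustrative only, not part of the
posted series; it gives up the one-fence-per-context behaviour, which is
exactly the trade-off discussed below):

static void
reservation_object_add_shared_inplace(struct reservation_object *obj,
				      struct reservation_object_list *fobj,
				      struct dma_fence *fence)
{
	struct dma_fence *old_fence = NULL;
	u32 i;

	dma_fence_get(fence);

	preempt_disable();
	write_seqcount_begin(&obj->seq);

	for (i = 0; i < fobj->shared_count; ++i) {
		old_fence = rcu_dereference_protected(fobj->shared[i],
						reservation_object_held(obj));

		/* Stop at the first slot we are allowed to recycle. */
		if (old_fence->context == fence->context ||
		    dma_fence_is_signaled(old_fence))
			break;

		old_fence = NULL;
	}

	/* Either overwrite the recycled slot or append at the end. */
	RCU_INIT_POINTER(fobj->shared[i], fence);
	if (i == fobj->shared_count)
		fobj->shared_count++;

	write_seqcount_end(&obj->seq);
	preempt_enable();

	dma_fence_put(old_fence);	/* NULL when we appended */
}
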
Christian König Nov. 14, 2017, 2:24 p.m. UTC | #2
Am 06.11.2017 um 17:22 schrieb Chris Wilson:
> Quoting Christian König (2017-10-30 14:59:04)
>> From: Christian König <christian.koenig@amd.com>
>>
>> The amdgpu issue to also need signaled fences in the reservation objects should
>> be fixed by now.
>>
>> Optimize the handling by replacing a signaled fence when adding a new
>> shared one.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/dma-buf/reservation.c | 18 +++++++++++++++---
>>   1 file changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
>> index 6fc794576997..a3928ce9f311 100644
>> --- a/drivers/dma-buf/reservation.c
>> +++ b/drivers/dma-buf/reservation.c
>> @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
>>                                        struct reservation_object_list *fobj,
>>                                        struct dma_fence *fence)
>>   {
>> -       u32 i;
>> +       struct dma_fence *signaled = NULL;
>> +       u32 i, signaled_idx;
>>   
>>          dma_fence_get(fence);
>>   
>> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
>>                          dma_fence_put(old_fence);
>>                          return;
>>                  }
>> +
>> +               if (!signaled && dma_fence_is_signaled(old_fence)) {
>> +                       signaled = old_fence;
>> +                       signaled_idx = i;
>> +               }
> How much do we care about only keeping one fence per-ctx here? You could
> rearrange this to break on old_fence->context == fence->context ||
> dma_fence_is_signaled(old_fence) then everyone can use the final block.

Yeah, that is what David Zhou suggested as well.

I've rejected this approach for now because I think we still have cases 
where we rely on one fence per ctx (but I'm not 100% sure).

I've changed patch #1 in this series as you suggested and I'm going to send 
that out once more in a minute.

Can we get this upstream as is for now? I won't have much more time 
working on this.

Regards,
Christian.

> -Chris
Chris Wilson Nov. 14, 2017, 2:34 p.m. UTC | #3
Quoting Christian König (2017-11-14 14:24:44)
> Am 06.11.2017 um 17:22 schrieb Chris Wilson:
> > Quoting Christian König (2017-10-30 14:59:04)
> >> From: Christian König <christian.koenig@amd.com>
> >>
> >> The amdgpu issue to also need signaled fences in the reservation objects should
> >> be fixed by now.
> >>
> >> Optimize the handling by replacing a signaled fence when adding a new
> >> shared one.
> >>
> >> Signed-off-by: Christian König <christian.koenig@amd.com>
> >> ---
> >>   drivers/dma-buf/reservation.c | 18 +++++++++++++++---
> >>   1 file changed, 15 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> >> index 6fc794576997..a3928ce9f311 100644
> >> --- a/drivers/dma-buf/reservation.c
> >> +++ b/drivers/dma-buf/reservation.c
> >> @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
> >>                                        struct reservation_object_list *fobj,
> >>                                        struct dma_fence *fence)
> >>   {
> >> -       u32 i;
> >> +       struct dma_fence *signaled = NULL;
> >> +       u32 i, signaled_idx;
> >>   
> >>          dma_fence_get(fence);
> >>   
> >> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
> >>                          dma_fence_put(old_fence);
> >>                          return;
> >>                  }
> >> +
> >> +               if (!signaled && dma_fence_is_signaled(old_fence)) {
> >> +                       signaled = old_fence;
> >> +                       signaled_idx = i;
> >> +               }
> > How much do we care about only keeping one fence per-ctx here? You could
> > rearrange this to break on old_fence->context == fence->context ||
> > dma_fence_is_signaled(old_fence) then everyone can use the final block.
> 
> Yeah, that is what David Zhou suggested as well.
> 
> I've rejected this approach for now cause I think we still have cases 
> where we rely on one fence per ctx (but I'm not 100% sure).
> 
> I changed patch #1 in this series as you suggest and going to send that 
> out once more in a minute.
> 
> Can we get this upstream as is for now? I won't have much more time 
> working on this.

Sure, we are only discussing how we might make it look tidier, pure
micro-optimisation with the caveat of losing the one-fence-per-ctx
guarantee.
-Chris

Patch

diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index 6fc794576997..a3928ce9f311 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -104,7 +104,8 @@  reservation_object_add_shared_inplace(struct reservation_object *obj,
 				      struct reservation_object_list *fobj,
 				      struct dma_fence *fence)
 {
-	u32 i;
+	struct dma_fence *signaled = NULL;
+	u32 i, signaled_idx;
 
 	dma_fence_get(fence);
 
@@ -126,17 +127,28 @@  reservation_object_add_shared_inplace(struct reservation_object *obj,
 			dma_fence_put(old_fence);
 			return;
 		}
+
+		if (!signaled && dma_fence_is_signaled(old_fence)) {
+			signaled = old_fence;
+			signaled_idx = i;
+		}
 	}
 
 	/*
 	 * memory barrier is added by write_seqcount_begin,
 	 * fobj->shared_count is protected by this lock too
 	 */
-	RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
-	fobj->shared_count++;
+	if (signaled) {
+		RCU_INIT_POINTER(fobj->shared[signaled_idx], fence);
+	} else {
+		RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
+		fobj->shared_count++;
+	}
 
 	write_seqcount_end(&obj->seq);
 	preempt_enable();
+
+	dma_fence_put(signaled);
 }
 
 static void