Message ID | 20171030145904.18584-2-deathsimple@vodafone.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Quoting Christian König (2017-10-30 14:59:04) > From: Christian König <christian.koenig@amd.com> > > The amdgpu issue to also need signaled fences in the reservation objects should > be fixed by now. > > Optimize the handling by replacing a signaled fence when adding a new > shared one. > > Signed-off-by: Christian König <christian.koenig@amd.com> > --- > drivers/dma-buf/reservation.c | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) > > diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c > index 6fc794576997..a3928ce9f311 100644 > --- a/drivers/dma-buf/reservation.c > +++ b/drivers/dma-buf/reservation.c > @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > struct reservation_object_list *fobj, > struct dma_fence *fence) > { > - u32 i; > + struct dma_fence *signaled = NULL; > + u32 i, signaled_idx; > > dma_fence_get(fence); > > @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > dma_fence_put(old_fence); > return; > } > + > + if (!signaled && dma_fence_is_signaled(old_fence)) { > + signaled = old_fence; > + signaled_idx = i; > + } How much do we care about only keeping one fence per-ctx here? You could rearrange this to break on old_fence->context == fence->context || dma_fence_is_signaled(old_fence) then everyone can use the final block. -Chris
Am 06.11.2017 um 17:22 schrieb Chris Wilson: > Quoting Christian König (2017-10-30 14:59:04) >> From: Christian König <christian.koenig@amd.com> >> >> The amdgpu issue to also need signaled fences in the reservation objects should >> be fixed by now. >> >> Optimize the handling by replacing a signaled fence when adding a new >> shared one. >> >> Signed-off-by: Christian König <christian.koenig@amd.com> >> --- >> drivers/dma-buf/reservation.c | 18 +++++++++++++++--- >> 1 file changed, 15 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c >> index 6fc794576997..a3928ce9f311 100644 >> --- a/drivers/dma-buf/reservation.c >> +++ b/drivers/dma-buf/reservation.c >> @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, >> struct reservation_object_list *fobj, >> struct dma_fence *fence) >> { >> - u32 i; >> + struct dma_fence *signaled = NULL; >> + u32 i, signaled_idx; >> >> dma_fence_get(fence); >> >> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, >> dma_fence_put(old_fence); >> return; >> } >> + >> + if (!signaled && dma_fence_is_signaled(old_fence)) { >> + signaled = old_fence; >> + signaled_idx = i; >> + } > How much do we care about only keeping one fence per-ctx here? You could > rearrange this to break on old_fence->context == fence->context || > dma_fence_is_signaled(old_fence) then everyone can use the final block. Yeah, that is what David Zhou suggested as well. I've rejected this approach for now cause I think we still have cases where we rely on one fence per ctx (but I'm not 100% sure). I changed patch #1 in this series as you suggest and going to send that out once more in a minute. Can we get this upstream as is for now? I won't have much more time working on this. Regards, Christian. > -Chris
Quoting Christian König (2017-11-14 14:24:44) > Am 06.11.2017 um 17:22 schrieb Chris Wilson: > > Quoting Christian König (2017-10-30 14:59:04) > >> From: Christian König <christian.koenig@amd.com> > >> > >> The amdgpu issue to also need signaled fences in the reservation objects should > >> be fixed by now. > >> > >> Optimize the handling by replacing a signaled fence when adding a new > >> shared one. > >> > >> Signed-off-by: Christian König <christian.koenig@amd.com> > >> --- > >> drivers/dma-buf/reservation.c | 18 +++++++++++++++--- > >> 1 file changed, 15 insertions(+), 3 deletions(-) > >> > >> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c > >> index 6fc794576997..a3928ce9f311 100644 > >> --- a/drivers/dma-buf/reservation.c > >> +++ b/drivers/dma-buf/reservation.c > >> @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > >> struct reservation_object_list *fobj, > >> struct dma_fence *fence) > >> { > >> - u32 i; > >> + struct dma_fence *signaled = NULL; > >> + u32 i, signaled_idx; > >> > >> dma_fence_get(fence); > >> > >> @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, > >> dma_fence_put(old_fence); > >> return; > >> } > >> + > >> + if (!signaled && dma_fence_is_signaled(old_fence)) { > >> + signaled = old_fence; > >> + signaled_idx = i; > >> + } > > How much do we care about only keeping one fence per-ctx here? You could > > rearrange this to break on old_fence->context == fence->context || > > dma_fence_is_signaled(old_fence) then everyone can use the final block. > > Yeah, that is what David Zhou suggested as well. > > I've rejected this approach for now cause I think we still have cases > where we rely on one fence per ctx (but I'm not 100% sure). > > I changed patch #1 in this series as you suggest and going to send that > out once more in a minute. > > Can we get this upstream as is for now? I won't have much more time > working on this. Sure, we are only discussing how we might make it look tidier, pure micro-optimisation with the caveat of losing the one-fence-per-ctx guarantee. -Chris
diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c index 6fc794576997..a3928ce9f311 100644 --- a/drivers/dma-buf/reservation.c +++ b/drivers/dma-buf/reservation.c @@ -104,7 +104,8 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, struct reservation_object_list *fobj, struct dma_fence *fence) { - u32 i; + struct dma_fence *signaled = NULL; + u32 i, signaled_idx; dma_fence_get(fence); @@ -126,17 +127,28 @@ reservation_object_add_shared_inplace(struct reservation_object *obj, dma_fence_put(old_fence); return; } + + if (!signaled && dma_fence_is_signaled(old_fence)) { + signaled = old_fence; + signaled_idx = i; + } } /* * memory barrier is added by write_seqcount_begin, * fobj->shared_count is protected by this lock too */ - RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); - fobj->shared_count++; + if (signaled) { + RCU_INIT_POINTER(fobj->shared[signaled_idx], fence); + } else { + RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence); + fobj->shared_count++; + } write_seqcount_end(&obj->seq); preempt_enable(); + + dma_fence_put(signaled); } static void