diff mbox

dma-buf/sync-file: Avoid enable fence signaling if poll(.timeout=0)

Message ID 20160921072625.GA3473@joana (mailing list archive)
State New, archived
Headers show

Commit Message

Gustavo Padovan Sept. 21, 2016, 7:26 a.m. UTC
Hi Rafael,

2016-09-14 Rafael Antognolli <rafael.antognolli@intel.com>:

> Hi Chris and Gustavo,
> 
> On Mon, Aug 29, 2016 at 07:16:13PM +0100, Chris Wilson wrote:
> > If we being polled with a timeout of zero, a nonblocking busy query,
> > we don't need to install any fence callbacks as we will not be waiting.
> > As we only install the callback once, the overhead comes from the atomic
> > bit test that also causes serialisation between threads.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: Gustavo Padovan <gustavo@padovan.org>
> > Cc: linux-media@vger.kernel.org
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > ---
> >  drivers/dma-buf/sync_file.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> > index 486d29c1a830..abb5fdab75fd 100644
> > --- a/drivers/dma-buf/sync_file.c
> > +++ b/drivers/dma-buf/sync_file.c
> > @@ -306,7 +306,8 @@ static unsigned int sync_file_poll(struct file *file, poll_table *wait)
> >  
> >  	poll_wait(file, &sync_file->wq, wait);
> >  
> > -	if (!test_and_set_bit(POLL_ENABLED, &sync_file->fence->flags)) {
> > +	if (!poll_does_not_wait(wait) &&
> > +	    !test_and_set_bit(POLL_ENABLED, &sync_file->fence->flags)) {
> >  		if (fence_add_callback(sync_file->fence, &sync_file->cb,
> >  				       fence_check_cb_func) < 0)
> >  			wake_up_all(&sync_file->wq);
> 
> This commit is causing an error on one of the tests that Robert Foss
> submitted for i-g-t. The one that does random merge of fences from
> different timelines. A simple version of the test that still triggers
> this is:
> 
> static void test_sync_simple_merge(void)
> {
>         int fence1, fence2, fence_merge, timeline1, timeline2;
>         int ret;
> 
>         timeline1 = sw_sync_timeline_create();
>         timeline2 = sw_sync_timeline_create();
>         fence1 = sw_sync_fence_create(timeline1, 1);
>         fence2 = sw_sync_fence_create(timeline2, 2);
>         fence_merge = sw_sync_merge(fence1, fence2);
>         sw_sync_timeline_inc(timeline1, 5);
>         sw_sync_timeline_inc(timeline2, 5);
> 
>         ret = sw_sync_wait(fence_merge, 0);
>         igt_assert_f(ret > 0, "Failure triggering fence\n");
> 
>         sw_sync_fence_destroy(fence_merge);
>         sw_sync_fence_destroy(fence1);
>         sw_sync_fence_destroy(fence2);
>         sw_sync_timeline_destroy(timeline1);
>         sw_sync_timeline_destroy(timeline2);
> }
> 
> It looks like you cannot trust fence_is_signaled() without a
> fence_add_callback(). I think the fence_array->num_pending won't get
> updated. Although I couldn't figure out why it only happens if you merge
> fences from different timelines.

Yes, num_pending is only updated when signaling is enabled. It only
happens with different timelines because when you merge fences that are
on the same timeline your final sync_file has only one fence and thus 
a fence_array is not created.

If we want to keep the poll_does_not_wait optimization we need a way
to count the pending fences during fence_is_signaled(). I'd propose
something like this:


Author: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Date:   Tue Sep 20 16:43:06 2016 +0200

    dma-buf/fence-array: get signaled state when signaling is disabled
    
    If the fences in the fence_array signal on the fence_array does not have
    signalling enabled num_pending will not be updated accordingly.
    
    So when signaling is disabled check the signal of every fence with
    fence_is_signaled() and then compare with num_pending to learn if the
    fence_array was signalled or not.
    
    If we want to keep the poll_does_not_wait optimization I think we need
    something like this. It keeps the same behaviour if signalling is enabled
    but tries to calculated the state otherwise.
    
    Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>



Gustavo

Comments

Chris Wilson Sept. 21, 2016, 11:08 a.m. UTC | #1
On Wed, Sep 21, 2016 at 10:26:25AM +0300, Gustavo Padovan wrote:
> Hi Rafael,
> 
> 2016-09-14 Rafael Antognolli <rafael.antognolli@intel.com>:
> 
> > Hi Chris and Gustavo,
> > 
> > On Mon, Aug 29, 2016 at 07:16:13PM +0100, Chris Wilson wrote:
> > > If we being polled with a timeout of zero, a nonblocking busy query,
> > > we don't need to install any fence callbacks as we will not be waiting.
> > > As we only install the callback once, the overhead comes from the atomic
> > > bit test that also causes serialisation between threads.
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > > Cc: Gustavo Padovan <gustavo@padovan.org>
> > > Cc: linux-media@vger.kernel.org
> > > Cc: dri-devel@lists.freedesktop.org
> > > Cc: linaro-mm-sig@lists.linaro.org
> > > ---
> > >  drivers/dma-buf/sync_file.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> > > index 486d29c1a830..abb5fdab75fd 100644
> > > --- a/drivers/dma-buf/sync_file.c
> > > +++ b/drivers/dma-buf/sync_file.c
> > > @@ -306,7 +306,8 @@ static unsigned int sync_file_poll(struct file *file, poll_table *wait)
> > >  
> > >  	poll_wait(file, &sync_file->wq, wait);
> > >  
> > > -	if (!test_and_set_bit(POLL_ENABLED, &sync_file->fence->flags)) {
> > > +	if (!poll_does_not_wait(wait) &&
> > > +	    !test_and_set_bit(POLL_ENABLED, &sync_file->fence->flags)) {
> > >  		if (fence_add_callback(sync_file->fence, &sync_file->cb,
> > >  				       fence_check_cb_func) < 0)
> > >  			wake_up_all(&sync_file->wq);
> > 
> > This commit is causing an error on one of the tests that Robert Foss
> > submitted for i-g-t. The one that does random merge of fences from
> > different timelines. A simple version of the test that still triggers
> > this is:
> > 
> > static void test_sync_simple_merge(void)
> > {
> >         int fence1, fence2, fence_merge, timeline1, timeline2;
> >         int ret;
> > 
> >         timeline1 = sw_sync_timeline_create();
> >         timeline2 = sw_sync_timeline_create();
> >         fence1 = sw_sync_fence_create(timeline1, 1);
> >         fence2 = sw_sync_fence_create(timeline2, 2);
> >         fence_merge = sw_sync_merge(fence1, fence2);
> >         sw_sync_timeline_inc(timeline1, 5);
> >         sw_sync_timeline_inc(timeline2, 5);
> > 
> >         ret = sw_sync_wait(fence_merge, 0);
> >         igt_assert_f(ret > 0, "Failure triggering fence\n");
> > 
> >         sw_sync_fence_destroy(fence_merge);
> >         sw_sync_fence_destroy(fence1);
> >         sw_sync_fence_destroy(fence2);
> >         sw_sync_timeline_destroy(timeline1);
> >         sw_sync_timeline_destroy(timeline2);
> > }
> > 
> > It looks like you cannot trust fence_is_signaled() without a
> > fence_add_callback(). I think the fence_array->num_pending won't get
> > updated. Although I couldn't figure out why it only happens if you merge
> > fences from different timelines.
> 
> Yes, num_pending is only updated when signaling is enabled. It only
> happens with different timelines because when you merge fences that are
> on the same timeline your final sync_file has only one fence and thus 
> a fence_array is not created.
> 
> If we want to keep the poll_does_not_wait optimization we need a way
> to count the pending fences during fence_is_signaled(). I'd propose
> something like this:
> 
> 
> Author: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
> Date:   Tue Sep 20 16:43:06 2016 +0200
> 
>     dma-buf/fence-array: get signaled state when signaling is disabled
>     
>     If the fences in the fence_array signal on the fence_array does not have
>     signalling enabled num_pending will not be updated accordingly.
>     
>     So when signaling is disabled check the signal of every fence with
>     fence_is_signaled() and then compare with num_pending to learn if the
>     fence_array was signalled or not.
>     
>     If we want to keep the poll_does_not_wait optimization I think we need
>     something like this. It keeps the same behaviour if signalling is enabled
>     but tries to calculated the state otherwise.
>     
>     Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>

We need this regardless, so yay for uncovering a bug!
> 
> diff --git a/drivers/dma-buf/fence-array.c b/drivers/dma-buf/fence-array.c
> index f1989fc..34c9209 100644
> --- a/drivers/dma-buf/fence-array.c
> +++ b/drivers/dma-buf/fence-array.c
> @@ -75,8 +75,18 @@ static bool fence_array_enable_signaling(struct fence *fence)
>  static bool fence_array_signaled(struct fence *fence)
>  {
>         struct fence_array *array = to_fence_array(fence);
> +       int i, num_pending;
>  
> -       return atomic_read(&array->num_pending) <= 0;
> +       num_pending = atomic_read(&array->num_pending);
> +
> +       if (!test_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) {

Oh, very sneaky. I thought this was FENCE_FLAG_SIGNALED_BIT!

Throw in a comment like:

/* Before signaling is enabled, num_pending is static (set during array
 * construction as a count of *all* fences. To ensure forward progress,
 * i.e. a while (!fence_is_signaled()) ; busy-loop eventually proceeds,
 * we need to check the current status of our fences.
 */

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
diff mbox

Patch

diff --git a/drivers/dma-buf/fence-array.c b/drivers/dma-buf/fence-array.c
index f1989fc..34c9209 100644
--- a/drivers/dma-buf/fence-array.c
+++ b/drivers/dma-buf/fence-array.c
@@ -75,8 +75,18 @@  static bool fence_array_enable_signaling(struct fence *fence)
 static bool fence_array_signaled(struct fence *fence)
 {
        struct fence_array *array = to_fence_array(fence);
+       int i, num_pending;
 
-       return atomic_read(&array->num_pending) <= 0;
+       num_pending = atomic_read(&array->num_pending);
+
+       if (!test_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) {
+               for (i = 0 ; i < array->num_fences; ++i) {
+                       if (fence_is_signaled(array->fences[i]))
+                               num_pending--;
+               }
+       }
+
+       return num_pending <= 0;
 }
 
 static void fence_array_release(struct fence *fence)