Message ID | 1476633833-6352-1-git-send-email-robdclark@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sun, Oct 16, 2016 at 12:03:53PM -0400, Rob Clark wrote: > Currently with fence-array, we have a potential deadlock situation. If we > fence_add_callback() on an array-fence, the array-fence's lock is aquired > first, and in it's ->enable_signaling() callback, it will install cb's on > it's array-member fences, so the array-member's lock is acquired second. > > But in the signal path, the array-member's lock is acquired first, and the > array-fence's lock acquired second. > > One approach to deal with this is avoid holding the fence's lock when > calling the cb. It is a bit intrusive and I haven't fixed up all the > other drivers that call directly or indirectly fence_signal_locked(). > > I guess the other option would be introduce a work-queue for array-fence? > Or?? Not keen on any yet, still hoping for a gold ore. > lockdep splat: > > ====================================================== > [ INFO: possible circular locking dependency detected ] > 4.7.0-rc7+ #489 Not tainted > ------------------------------------------------------- > surfaceflinger/2034 is trying to acquire lock: > (&(&array->lock)->rlock){......}, at: [<ffff00000858cddc>] fence_signal+0x5c/0xf8 > > but task is already holding lock: > (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0 > > which lock already depends on the new lock. > > the existing dependency chain (in reverse order) is: > > -> #1 (&(&obj->child_list_lock)->rlock){......}: > [<ffff000008108924>] __lock_acquire+0x173c/0x18d8 > [<ffff000008108e0c>] lock_acquire+0x4c/0x68 > [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 > [<ffff00000858d05c>] fence_add_callback+0x3c/0x100 > [<ffff00000858f100>] fence_array_enable_signaling+0x80/0x170 > [<ffff00000858d0d8>] fence_add_callback+0xb8/0x100 > [<ffff00000858f504>] sync_file_poll+0xd4/0xf0 > [<ffff0000081fd3a0>] do_sys_poll+0x220/0x438 > [<ffff0000081fd8d0>] SyS_ppoll+0x1b0/0x1d8 > [<ffff000008084f30>] el0_svc_naked+0x24/0x28 > > -> #0 (&(&array->lock)->rlock){......}: > [<ffff000008104768>] print_circular_bug+0x80/0x2e0 > [<ffff0000081089ac>] __lock_acquire+0x17c4/0x18d8 > [<ffff000008108e0c>] lock_acquire+0x4c/0x68 > [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 > [<ffff00000858cddc>] fence_signal+0x5c/0xf8 > [<ffff00000858f268>] fence_array_cb_func+0x78/0x88 > [<ffff00000858cb28>] fence_signal_locked+0x80/0xe0 > [<ffff0000085903c8>] sw_sync_ioctl+0x2f8/0x3b0 > [<ffff0000081faf6c>] do_vfs_ioctl+0xa4/0x790 > [<ffff0000081fb6e4>] SyS_ioctl+0x8c/0xa0 > [<ffff000008084f30>] el0_svc_naked+0x24/0x28 > > other info that might help us debug this: > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(&(&obj->child_list_lock)->rlock); > lock(&(&array->lock)->rlock); > lock(&(&obj->child_list_lock)->rlock); > lock(&(&array->lock)->rlock); > > *** DEADLOCK *** > > 1 lock held by surfaceflinger/2034: > #0: (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0 > --- > drivers/dma-buf/fence.c | 22 ++++++++++++++-------- > drivers/dma-buf/sw_sync.c | 2 +- > drivers/dma-buf/sync_debug.c | 16 +++++++++------- > include/linux/fence.h | 6 +++--- > 4 files changed, 27 insertions(+), 19 deletions(-) > > diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c > index cc05ddd..917f905 100644 > --- a/drivers/dma-buf/fence.c > +++ b/drivers/dma-buf/fence.c > @@ -63,9 +63,9 @@ EXPORT_SYMBOL(fence_context_alloc); > * > * Unlike fence_signal, this function must be called with fence->lock held. > */ > -int fence_signal_locked(struct fence *fence) > +int fence_signal_locked(struct fence *fence, unsigned long flags) > { > - struct fence_cb *cur, *tmp; > + struct fence_cb *cur; > int ret = 0; > > lockdep_assert_held(fence->lock); > @@ -88,9 +88,12 @@ int fence_signal_locked(struct fence *fence) > } else > trace_fence_signaled(fence); > > - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { > + while (!list_empty(&fence->cb_list)) { > + cur = list_first_entry(&fence->cb_list, struct fence_cb, node); > list_del_init(&cur->node); > + spin_unlock_irqrestore(fence->lock, flags); We don't need to pass down flags here, as we don't have to change the context of the lock (i.e. we just leave irqs disabled if they have been by the caller). That spares us the issue of the flags being changed here, and the changed flags not being propagated back. We need only grab the list under the lock once, as once signaled no new elements will be added to the list (after add_callback takes the lock it checks for the signal). i.e. struct list_head cb_list; list_splice_init(&fence->cb_list, &cb_list; spin_unlock(fence->lock); list_for_each_entry_safe(cur, tmp, &cb_list, node) cur->func(fence, cur); spin_lock(fence->lock); Though I'm not convinced that dropping the lock is going to be safe for everyone, since that lock may be communal and guarding move than just the cb_list, e.g. virtio_gpu_fence_event_process(). -Chris
On Sun, Oct 16, 2016 at 1:38 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote: > On Sun, Oct 16, 2016 at 12:03:53PM -0400, Rob Clark wrote: >> Currently with fence-array, we have a potential deadlock situation. If we >> fence_add_callback() on an array-fence, the array-fence's lock is aquired >> first, and in it's ->enable_signaling() callback, it will install cb's on >> it's array-member fences, so the array-member's lock is acquired second. >> >> But in the signal path, the array-member's lock is acquired first, and the >> array-fence's lock acquired second. >> >> One approach to deal with this is avoid holding the fence's lock when >> calling the cb. It is a bit intrusive and I haven't fixed up all the >> other drivers that call directly or indirectly fence_signal_locked(). >> >> I guess the other option would be introduce a work-queue for array-fence? >> Or?? > > Not keen on any yet, still hoping for a gold ore. yeah, that is why I sent these two approaches as RFC.. hopefully if there is a better way, someone will speak up ;-) I do think that, unless someone has a better idea, I like how the wq in fence-array worked out better.. >> lockdep splat: >> >> ====================================================== >> [ INFO: possible circular locking dependency detected ] >> 4.7.0-rc7+ #489 Not tainted >> ------------------------------------------------------- >> surfaceflinger/2034 is trying to acquire lock: >> (&(&array->lock)->rlock){......}, at: [<ffff00000858cddc>] fence_signal+0x5c/0xf8 >> >> but task is already holding lock: >> (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0 >> >> which lock already depends on the new lock. >> >> the existing dependency chain (in reverse order) is: >> >> -> #1 (&(&obj->child_list_lock)->rlock){......}: >> [<ffff000008108924>] __lock_acquire+0x173c/0x18d8 >> [<ffff000008108e0c>] lock_acquire+0x4c/0x68 >> [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 >> [<ffff00000858d05c>] fence_add_callback+0x3c/0x100 >> [<ffff00000858f100>] fence_array_enable_signaling+0x80/0x170 >> [<ffff00000858d0d8>] fence_add_callback+0xb8/0x100 >> [<ffff00000858f504>] sync_file_poll+0xd4/0xf0 >> [<ffff0000081fd3a0>] do_sys_poll+0x220/0x438 >> [<ffff0000081fd8d0>] SyS_ppoll+0x1b0/0x1d8 >> [<ffff000008084f30>] el0_svc_naked+0x24/0x28 >> >> -> #0 (&(&array->lock)->rlock){......}: >> [<ffff000008104768>] print_circular_bug+0x80/0x2e0 >> [<ffff0000081089ac>] __lock_acquire+0x17c4/0x18d8 >> [<ffff000008108e0c>] lock_acquire+0x4c/0x68 >> [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 >> [<ffff00000858cddc>] fence_signal+0x5c/0xf8 >> [<ffff00000858f268>] fence_array_cb_func+0x78/0x88 >> [<ffff00000858cb28>] fence_signal_locked+0x80/0xe0 >> [<ffff0000085903c8>] sw_sync_ioctl+0x2f8/0x3b0 >> [<ffff0000081faf6c>] do_vfs_ioctl+0xa4/0x790 >> [<ffff0000081fb6e4>] SyS_ioctl+0x8c/0xa0 >> [<ffff000008084f30>] el0_svc_naked+0x24/0x28 >> >> other info that might help us debug this: >> >> Possible unsafe locking scenario: >> >> CPU0 CPU1 >> ---- ---- >> lock(&(&obj->child_list_lock)->rlock); >> lock(&(&array->lock)->rlock); >> lock(&(&obj->child_list_lock)->rlock); >> lock(&(&array->lock)->rlock); >> >> *** DEADLOCK *** >> >> 1 lock held by surfaceflinger/2034: >> #0: (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0 >> --- >> drivers/dma-buf/fence.c | 22 ++++++++++++++-------- >> drivers/dma-buf/sw_sync.c | 2 +- >> drivers/dma-buf/sync_debug.c | 16 +++++++++------- >> include/linux/fence.h | 6 +++--- >> 4 files changed, 27 insertions(+), 19 deletions(-) >> >> diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c >> index cc05ddd..917f905 100644 >> --- a/drivers/dma-buf/fence.c >> +++ b/drivers/dma-buf/fence.c >> @@ -63,9 +63,9 @@ EXPORT_SYMBOL(fence_context_alloc); >> * >> * Unlike fence_signal, this function must be called with fence->lock held. >> */ >> -int fence_signal_locked(struct fence *fence) >> +int fence_signal_locked(struct fence *fence, unsigned long flags) >> { >> - struct fence_cb *cur, *tmp; >> + struct fence_cb *cur; >> int ret = 0; >> >> lockdep_assert_held(fence->lock); >> @@ -88,9 +88,12 @@ int fence_signal_locked(struct fence *fence) >> } else >> trace_fence_signaled(fence); >> >> - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { >> + while (!list_empty(&fence->cb_list)) { >> + cur = list_first_entry(&fence->cb_list, struct fence_cb, node); >> list_del_init(&cur->node); >> + spin_unlock_irqrestore(fence->lock, flags); > > We don't need to pass down flags here, as we don't have to change the > context of the lock (i.e. we just leave irqs disabled if they have been > by the caller). That spares us the issue of the flags being changed > here, and the changed flags not being propagated back. > > We need only grab the list under the lock once, as once signaled no new > elements will be added to the list (after add_callback takes the lock it > checks for the signal). > > i.e. > > struct list_head cb_list; > > list_splice_init(&fence->cb_list, &cb_list; > spin_unlock(fence->lock); > > list_for_each_entry_safe(cur, tmp, &cb_list, node) > cur->func(fence, cur); > > spin_lock(fence->lock); hmm, yeah, not having to pass flags everywhere simplifies things.. although wasn't sure how much that would be abusing the spin_lock interface.. > Though I'm not convinced that dropping the lock is going to be safe for > everyone, since that lock may be communal and guarding move than just > the cb_list, e.g. virtio_gpu_fence_event_process(). yeah, that is part of why I liked the wq in fence-array better.. if we go with this first approach instead, I'll need to audit more closely the other callers, which I haven't done yet, since what they are doing might not be safe if we start dropping locks temporarily.. and it adds an unexpected twist for future users of the API, not really following the principle of least surprise. BR, -R > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre
Op 16-10-16 om 18:03 schreef Rob Clark: > Currently with fence-array, we have a potential deadlock situation. If we > fence_add_callback() on an array-fence, the array-fence's lock is aquired > first, and in it's ->enable_signaling() callback, it will install cb's on > it's array-member fences, so the array-member's lock is acquired second. > > But in the signal path, the array-member's lock is acquired first, and the > array-fence's lock acquired second. > > One approach to deal with this is avoid holding the fence's lock when > calling the cb. It is a bit intrusive and I haven't fixed up all the > other drivers that call directly or indirectly fence_signal_locked(). > > I guess the other option would be introduce a work-queue for array-fence? > Or?? Enable signaling when creating the fence array is an option. As an optimization we don't enable signaling when creating a single fence, but when you merge fences you're probably interested in the result anyway. The real problem is fence_add_callback in .enable_signaling, not the signaling itself. > lockdep splat: > > ====================================================== > [ INFO: possible circular locking dependency detected ] > 4.7.0-rc7+ #489 Not tainted > ------------------------------------------------------- > surfaceflinger/2034 is trying to acquire lock: > (&(&array->lock)->rlock){......}, at: [<ffff00000858cddc>] fence_signal+0x5c/0xf8 > > but task is already holding lock: > (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0 > > which lock already depends on the new lock. > > the existing dependency chain (in reverse order) is: > > -> #1 (&(&obj->child_list_lock)->rlock){......}: > [<ffff000008108924>] __lock_acquire+0x173c/0x18d8 > [<ffff000008108e0c>] lock_acquire+0x4c/0x68 > [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 > [<ffff00000858d05c>] fence_add_callback+0x3c/0x100 > [<ffff00000858f100>] fence_array_enable_signaling+0x80/0x170 > [<ffff00000858d0d8>] fence_add_callback+0xb8/0x100 > [<ffff00000858f504>] sync_file_poll+0xd4/0xf0 > [<ffff0000081fd3a0>] do_sys_poll+0x220/0x438 > [<ffff0000081fd8d0>] SyS_ppoll+0x1b0/0x1d8 > [<ffff000008084f30>] el0_svc_naked+0x24/0x28 > > -> #0 (&(&array->lock)->rlock){......}: > [<ffff000008104768>] print_circular_bug+0x80/0x2e0 > [<ffff0000081089ac>] __lock_acquire+0x17c4/0x18d8 > [<ffff000008108e0c>] lock_acquire+0x4c/0x68 > [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 > [<ffff00000858cddc>] fence_signal+0x5c/0xf8 > [<ffff00000858f268>] fence_array_cb_func+0x78/0x88 > [<ffff00000858cb28>] fence_signal_locked+0x80/0xe0 > [<ffff0000085903c8>] sw_sync_ioctl+0x2f8/0x3b0 > [<ffff0000081faf6c>] do_vfs_ioctl+0xa4/0x790 > [<ffff0000081fb6e4>] SyS_ioctl+0x8c/0xa0 > [<ffff000008084f30>] el0_svc_naked+0x24/0x28 > > other info that might help us debug this: > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(&(&obj->child_list_lock)->rlock); > lock(&(&array->lock)->rlock); > lock(&(&obj->child_list_lock)->rlock); > lock(&(&array->lock)->rlock); > > *** DEADLOCK *** > > 1 lock held by surfaceflinger/2034: > #0: (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0 > --- > drivers/dma-buf/fence.c | 22 ++++++++++++++-------- > drivers/dma-buf/sw_sync.c | 2 +- > drivers/dma-buf/sync_debug.c | 16 +++++++++------- > include/linux/fence.h | 6 +++--- > 4 files changed, 27 insertions(+), 19 deletions(-) > > diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c > index cc05ddd..917f905 100644 > --- a/drivers/dma-buf/fence.c > +++ b/drivers/dma-buf/fence.c > @@ -63,9 +63,9 @@ EXPORT_SYMBOL(fence_context_alloc); > * > * Unlike fence_signal, this function must be called with fence->lock held. > */ > -int fence_signal_locked(struct fence *fence) > +int fence_signal_locked(struct fence *fence, unsigned long flags) > { > - struct fence_cb *cur, *tmp; > + struct fence_cb *cur; > int ret = 0; > > lockdep_assert_held(fence->lock); > @@ -88,9 +88,12 @@ int fence_signal_locked(struct fence *fence) > } else > trace_fence_signaled(fence); > > - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { > + while (!list_empty(&fence->cb_list)) { > + cur = list_first_entry(&fence->cb_list, struct fence_cb, node); > list_del_init(&cur->node); > + spin_unlock_irqrestore(fence->lock, flags); > cur->func(fence, cur); > + spin_lock_irqsave(fence->lock, flags); Dropping the lock in this function is a terrible idea. The fence code assumes that signaling is atomic with setting the signaled bit. This could probably introduce some race conditions, like when this function is called inside a driver's interrupt handler which requires the lock to update other bookkeeping. ~Maarten
On Mon, Oct 17, 2016 at 4:25 AM, Maarten Lankhorst <maarten.lankhorst@linux.intel.com> wrote: > Op 16-10-16 om 18:03 schreef Rob Clark: >> Currently with fence-array, we have a potential deadlock situation. If we >> fence_add_callback() on an array-fence, the array-fence's lock is aquired >> first, and in it's ->enable_signaling() callback, it will install cb's on >> it's array-member fences, so the array-member's lock is acquired second. >> >> But in the signal path, the array-member's lock is acquired first, and the >> array-fence's lock acquired second. >> >> One approach to deal with this is avoid holding the fence's lock when >> calling the cb. It is a bit intrusive and I haven't fixed up all the >> other drivers that call directly or indirectly fence_signal_locked(). >> >> I guess the other option would be introduce a work-queue for array-fence? >> Or?? > Enable signaling when creating the fence array is an option. As an optimization we don't enable > signaling when creating a single fence, but when you merge fences you're probably interested > in the result anyway. I think what you mean is to fence_add_callback() on all the member fences up-front, rather from ops->enable_signaling()? I guess that should work. BR, -R
Am 17.10.2016 um 20:02 schrieb Rob Clark: > On Mon, Oct 17, 2016 at 4:25 AM, Maarten Lankhorst > <maarten.lankhorst@linux.intel.com> wrote: >> Op 16-10-16 om 18:03 schreef Rob Clark: >>> Currently with fence-array, we have a potential deadlock situation. If we >>> fence_add_callback() on an array-fence, the array-fence's lock is aquired >>> first, and in it's ->enable_signaling() callback, it will install cb's on >>> it's array-member fences, so the array-member's lock is acquired second. >>> >>> But in the signal path, the array-member's lock is acquired first, and the >>> array-fence's lock acquired second. >>> >>> One approach to deal with this is avoid holding the fence's lock when >>> calling the cb. It is a bit intrusive and I haven't fixed up all the >>> other drivers that call directly or indirectly fence_signal_locked(). >>> >>> I guess the other option would be introduce a work-queue for array-fence? >>> Or?? >> Enable signaling when creating the fence array is an option. As an optimization we don't enable >> signaling when creating a single fence, but when you merge fences you're probably interested >> in the result anyway. > I think what you mean is to fence_add_callback() on all the member > fences up-front, rather from ops->enable_signaling()? I guess that > should work. Yeah, but we should try to avoid that. Enabling signaling all the time is really expensive for some use cases. I would certainly prefer the approach using a work item. Regards, Christian. > > BR, > -R > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel
Am 16.10.2016 um 18:03 schrieb Rob Clark: > Currently with fence-array, we have a potential deadlock situation. If we > fence_add_callback() on an array-fence, the array-fence's lock is aquired > first, and in it's ->enable_signaling() callback, it will install cb's on > it's array-member fences, so the array-member's lock is acquired second. > > But in the signal path, the array-member's lock is acquired first, and the > array-fence's lock acquired second. > > One approach to deal with this is avoid holding the fence's lock when > calling the cb. It is a bit intrusive and I haven't fixed up all the > other drivers that call directly or indirectly fence_signal_locked(). In general: Oh! Yes! Please! We have the same issue in the AMD scheduler when we want to register a new callback on the next fence in the list. > I guess the other option would be introduce a work-queue for array-fence? That's what we do in the GPU scheduler and it adds quite a bunch of extra overhead. So my preferences are clearly to fix calling the cb with any locks held before using another work item for the array fences. But I'm not sure if that is possible with all drivers. Regards, Christian. > Or?? > > lockdep splat: > > ====================================================== > [ INFO: possible circular locking dependency detected ] > 4.7.0-rc7+ #489 Not tainted > ------------------------------------------------------- > surfaceflinger/2034 is trying to acquire lock: > (&(&array->lock)->rlock){......}, at: [<ffff00000858cddc>] fence_signal+0x5c/0xf8 > > but task is already holding lock: > (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0 > > which lock already depends on the new lock. > > the existing dependency chain (in reverse order) is: > > -> #1 (&(&obj->child_list_lock)->rlock){......}: > [<ffff000008108924>] __lock_acquire+0x173c/0x18d8 > [<ffff000008108e0c>] lock_acquire+0x4c/0x68 > [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 > [<ffff00000858d05c>] fence_add_callback+0x3c/0x100 > [<ffff00000858f100>] fence_array_enable_signaling+0x80/0x170 > [<ffff00000858d0d8>] fence_add_callback+0xb8/0x100 > [<ffff00000858f504>] sync_file_poll+0xd4/0xf0 > [<ffff0000081fd3a0>] do_sys_poll+0x220/0x438 > [<ffff0000081fd8d0>] SyS_ppoll+0x1b0/0x1d8 > [<ffff000008084f30>] el0_svc_naked+0x24/0x28 > > -> #0 (&(&array->lock)->rlock){......}: > [<ffff000008104768>] print_circular_bug+0x80/0x2e0 > [<ffff0000081089ac>] __lock_acquire+0x17c4/0x18d8 > [<ffff000008108e0c>] lock_acquire+0x4c/0x68 > [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 > [<ffff00000858cddc>] fence_signal+0x5c/0xf8 > [<ffff00000858f268>] fence_array_cb_func+0x78/0x88 > [<ffff00000858cb28>] fence_signal_locked+0x80/0xe0 > [<ffff0000085903c8>] sw_sync_ioctl+0x2f8/0x3b0 > [<ffff0000081faf6c>] do_vfs_ioctl+0xa4/0x790 > [<ffff0000081fb6e4>] SyS_ioctl+0x8c/0xa0 > [<ffff000008084f30>] el0_svc_naked+0x24/0x28 > > other info that might help us debug this: > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(&(&obj->child_list_lock)->rlock); > lock(&(&array->lock)->rlock); > lock(&(&obj->child_list_lock)->rlock); > lock(&(&array->lock)->rlock); > > *** DEADLOCK *** > > 1 lock held by surfaceflinger/2034: > #0: (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] sw_sync_ioctl+0x228/0x3b0 > --- > drivers/dma-buf/fence.c | 22 ++++++++++++++-------- > drivers/dma-buf/sw_sync.c | 2 +- > drivers/dma-buf/sync_debug.c | 16 +++++++++------- > include/linux/fence.h | 6 +++--- > 4 files changed, 27 insertions(+), 19 deletions(-) > > diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c > index cc05ddd..917f905 100644 > --- a/drivers/dma-buf/fence.c > +++ b/drivers/dma-buf/fence.c > @@ -63,9 +63,9 @@ EXPORT_SYMBOL(fence_context_alloc); > * > * Unlike fence_signal, this function must be called with fence->lock held. > */ > -int fence_signal_locked(struct fence *fence) > +int fence_signal_locked(struct fence *fence, unsigned long flags) > { > - struct fence_cb *cur, *tmp; > + struct fence_cb *cur; > int ret = 0; > > lockdep_assert_held(fence->lock); > @@ -88,9 +88,12 @@ int fence_signal_locked(struct fence *fence) > } else > trace_fence_signaled(fence); > > - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { > + while (!list_empty(&fence->cb_list)) { > + cur = list_first_entry(&fence->cb_list, struct fence_cb, node); > list_del_init(&cur->node); > + spin_unlock_irqrestore(fence->lock, flags); > cur->func(fence, cur); > + spin_lock_irqsave(fence->lock, flags); > } > return ret; > } > @@ -124,12 +127,15 @@ int fence_signal(struct fence *fence) > trace_fence_signaled(fence); > > if (test_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) { > - struct fence_cb *cur, *tmp; > + struct fence_cb *cur; > > spin_lock_irqsave(fence->lock, flags); > - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { > + while (!list_empty(&fence->cb_list)) { > + cur = list_first_entry(&fence->cb_list, struct fence_cb, node); > list_del_init(&cur->node); > + spin_unlock_irqrestore(fence->lock, flags); > cur->func(fence, cur); > + spin_lock_irqsave(fence->lock, flags); > } > spin_unlock_irqrestore(fence->lock, flags); > } > @@ -211,7 +217,7 @@ void fence_enable_sw_signaling(struct fence *fence) > spin_lock_irqsave(fence->lock, flags); > > if (!fence->ops->enable_signaling(fence)) > - fence_signal_locked(fence); > + fence_signal_locked(fence, flags); > > spin_unlock_irqrestore(fence->lock, flags); > } > @@ -266,7 +272,7 @@ int fence_add_callback(struct fence *fence, struct fence_cb *cb, > trace_fence_enable_signal(fence); > > if (!fence->ops->enable_signaling(fence)) { > - fence_signal_locked(fence); > + fence_signal_locked(fence, flags); > ret = -ENOENT; > } > } > @@ -366,7 +372,7 @@ fence_default_wait(struct fence *fence, bool intr, signed long timeout) > trace_fence_enable_signal(fence); > > if (!fence->ops->enable_signaling(fence)) { > - fence_signal_locked(fence); > + fence_signal_locked(fence, flags); > goto out; > } > } > diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c > index 62e8e6d..2271f7f 100644 > --- a/drivers/dma-buf/sw_sync.c > +++ b/drivers/dma-buf/sw_sync.c > @@ -146,7 +146,7 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc) > > list_for_each_entry_safe(pt, next, &obj->active_list_head, > active_list) { > - if (fence_is_signaled_locked(&pt->base)) > + if (fence_is_signaled_locked(&pt->base, flags)) > list_del_init(&pt->active_list); > } > > diff --git a/drivers/dma-buf/sync_debug.c b/drivers/dma-buf/sync_debug.c > index 2dd4c3d..a7556d3 100644 > --- a/drivers/dma-buf/sync_debug.c > +++ b/drivers/dma-buf/sync_debug.c > @@ -71,12 +71,13 @@ static const char *sync_status_str(int status) > return "error"; > } > > -static void sync_print_fence(struct seq_file *s, struct fence *fence, bool show) > +static void sync_print_fence(struct seq_file *s, struct fence *fence, > + bool show, unsigned long flags) > { > int status = 1; > struct sync_timeline *parent = fence_parent(fence); > > - if (fence_is_signaled_locked(fence)) > + if (fence_is_signaled_locked(fence, flags)) > status = fence->status; > > seq_printf(s, " %s%sfence %s", > @@ -124,13 +125,14 @@ static void sync_print_obj(struct seq_file *s, struct sync_timeline *obj) > list_for_each(pos, &obj->child_list_head) { > struct sync_pt *pt = > container_of(pos, struct sync_pt, child_list); > - sync_print_fence(s, &pt->base, false); > + sync_print_fence(s, &pt->base, false, flags); > } > spin_unlock_irqrestore(&obj->child_list_lock, flags); > } > > static void sync_print_sync_file(struct seq_file *s, > - struct sync_file *sync_file) > + struct sync_file *sync_file, > + unsigned long flags) > { > int i; > > @@ -141,9 +143,9 @@ static void sync_print_sync_file(struct seq_file *s, > struct fence_array *array = to_fence_array(sync_file->fence); > > for (i = 0; i < array->num_fences; ++i) > - sync_print_fence(s, array->fences[i], true); > + sync_print_fence(s, array->fences[i], true, flags); > } else { > - sync_print_fence(s, sync_file->fence, true); > + sync_print_fence(s, sync_file->fence, true, flags); > } > } > > @@ -172,7 +174,7 @@ static int sync_debugfs_show(struct seq_file *s, void *unused) > struct sync_file *sync_file = > container_of(pos, struct sync_file, sync_file_list); > > - sync_print_sync_file(s, sync_file); > + sync_print_sync_file(s, sync_file, flags); > seq_puts(s, "\n"); > } > spin_unlock_irqrestore(&sync_file_list_lock, flags); > diff --git a/include/linux/fence.h b/include/linux/fence.h > index 0d76305..073f380 100644 > --- a/include/linux/fence.h > +++ b/include/linux/fence.h > @@ -220,7 +220,7 @@ static inline void fence_put(struct fence *fence) > } > > int fence_signal(struct fence *fence); > -int fence_signal_locked(struct fence *fence); > +int fence_signal_locked(struct fence *fence, unsigned long flags); > signed long fence_default_wait(struct fence *fence, bool intr, signed long timeout); > int fence_add_callback(struct fence *fence, struct fence_cb *cb, > fence_func_t func); > @@ -239,13 +239,13 @@ void fence_enable_sw_signaling(struct fence *fence); > * This function requires fence->lock to be held. > */ > static inline bool > -fence_is_signaled_locked(struct fence *fence) > +fence_is_signaled_locked(struct fence *fence, unsigned long flags) > { > if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags)) > return true; > > if (fence->ops->signaled && fence->ops->signaled(fence)) { > - fence_signal_locked(fence); > + fence_signal_locked(fence, flags); > return true; > } >
On Tue, Oct 18, 2016 at 6:07 AM, Christian König <deathsimple@vodafone.de> wrote: > Am 16.10.2016 um 18:03 schrieb Rob Clark: >> >> Currently with fence-array, we have a potential deadlock situation. If we >> fence_add_callback() on an array-fence, the array-fence's lock is aquired >> first, and in it's ->enable_signaling() callback, it will install cb's on >> it's array-member fences, so the array-member's lock is acquired second. >> >> But in the signal path, the array-member's lock is acquired first, and the >> array-fence's lock acquired second. >> >> One approach to deal with this is avoid holding the fence's lock when >> calling the cb. It is a bit intrusive and I haven't fixed up all the >> other drivers that call directly or indirectly fence_signal_locked(). > > > In general: Oh! Yes! Please! We have the same issue in the AMD scheduler > when we want to register a new callback on the next fence in the list. > >> I guess the other option would be introduce a work-queue for array-fence? > > > That's what we do in the GPU scheduler and it adds quite a bunch of extra > overhead. > > So my preferences are clearly to fix calling the cb with any locks held > before using another work item for the array fences. But I'm not sure if > that is possible with all drivers. I guess it is probably not 100% possible to ensure driver isn't holding other of it's own locks when cb is called.. so I guess wq for cb that needs to take other locks is a safe solution. That said, and maybe I need more coffee, but I think the spinlock for iterating cb_list is probably not needed.. I think we could arrange things so the test_and_set(SIGNALED) is enough to protect iterating the list and calling the cb's. (Maybe protect the test_and_set(SIGNALED) w/ fence->lock just so it doesn't race someone who already got past the test_bit(SIGNALED) in _add_callback()?) Then "all" we have to do is figure out how to kill the fence_signal_locked()/fence_is_signaled_locked() paths.. I think I also wouldn't mind pushing the locking down into the ->enable_signaling() cb too for drivers that need that. Maybe we don't strictly need that. From quick look, seems like half the drivers just 'return true;' and seems a bit silly to grab a lock for that. Maybe wq in fence-array in the short term at least is the best way just to get things working without such an invasive change. But seems like if we could kill the current _locked() paths that there is potential to make the fence->lock situation less annoying. BR, -R > Regards, > Christian. > > >> Or?? >> >> lockdep splat: >> >> ====================================================== >> [ INFO: possible circular locking dependency detected ] >> 4.7.0-rc7+ #489 Not tainted >> ------------------------------------------------------- >> surfaceflinger/2034 is trying to acquire lock: >> (&(&array->lock)->rlock){......}, at: [<ffff00000858cddc>] >> fence_signal+0x5c/0xf8 >> >> but task is already holding lock: >> (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] >> sw_sync_ioctl+0x228/0x3b0 >> >> which lock already depends on the new lock. >> >> the existing dependency chain (in reverse order) is: >> >> -> #1 (&(&obj->child_list_lock)->rlock){......}: >> [<ffff000008108924>] __lock_acquire+0x173c/0x18d8 >> [<ffff000008108e0c>] lock_acquire+0x4c/0x68 >> [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 >> [<ffff00000858d05c>] fence_add_callback+0x3c/0x100 >> [<ffff00000858f100>] fence_array_enable_signaling+0x80/0x170 >> [<ffff00000858d0d8>] fence_add_callback+0xb8/0x100 >> [<ffff00000858f504>] sync_file_poll+0xd4/0xf0 >> [<ffff0000081fd3a0>] do_sys_poll+0x220/0x438 >> [<ffff0000081fd8d0>] SyS_ppoll+0x1b0/0x1d8 >> [<ffff000008084f30>] el0_svc_naked+0x24/0x28 >> >> -> #0 (&(&array->lock)->rlock){......}: >> [<ffff000008104768>] print_circular_bug+0x80/0x2e0 >> [<ffff0000081089ac>] __lock_acquire+0x17c4/0x18d8 >> [<ffff000008108e0c>] lock_acquire+0x4c/0x68 >> [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 >> [<ffff00000858cddc>] fence_signal+0x5c/0xf8 >> [<ffff00000858f268>] fence_array_cb_func+0x78/0x88 >> [<ffff00000858cb28>] fence_signal_locked+0x80/0xe0 >> [<ffff0000085903c8>] sw_sync_ioctl+0x2f8/0x3b0 >> [<ffff0000081faf6c>] do_vfs_ioctl+0xa4/0x790 >> [<ffff0000081fb6e4>] SyS_ioctl+0x8c/0xa0 >> [<ffff000008084f30>] el0_svc_naked+0x24/0x28 >> >> other info that might help us debug this: >> >> Possible unsafe locking scenario: >> >> CPU0 CPU1 >> ---- ---- >> lock(&(&obj->child_list_lock)->rlock); >> lock(&(&array->lock)->rlock); >> lock(&(&obj->child_list_lock)->rlock); >> lock(&(&array->lock)->rlock); >> >> *** DEADLOCK *** >> >> 1 lock held by surfaceflinger/2034: >> #0: (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] >> sw_sync_ioctl+0x228/0x3b0 >> --- >> drivers/dma-buf/fence.c | 22 ++++++++++++++-------- >> drivers/dma-buf/sw_sync.c | 2 +- >> drivers/dma-buf/sync_debug.c | 16 +++++++++------- >> include/linux/fence.h | 6 +++--- >> 4 files changed, 27 insertions(+), 19 deletions(-) >> >> diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c >> index cc05ddd..917f905 100644 >> --- a/drivers/dma-buf/fence.c >> +++ b/drivers/dma-buf/fence.c >> @@ -63,9 +63,9 @@ EXPORT_SYMBOL(fence_context_alloc); >> * >> * Unlike fence_signal, this function must be called with fence->lock >> held. >> */ >> -int fence_signal_locked(struct fence *fence) >> +int fence_signal_locked(struct fence *fence, unsigned long flags) >> { >> - struct fence_cb *cur, *tmp; >> + struct fence_cb *cur; >> int ret = 0; >> lockdep_assert_held(fence->lock); >> @@ -88,9 +88,12 @@ int fence_signal_locked(struct fence *fence) >> } else >> trace_fence_signaled(fence); >> - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { >> + while (!list_empty(&fence->cb_list)) { >> + cur = list_first_entry(&fence->cb_list, struct fence_cb, >> node); >> list_del_init(&cur->node); >> + spin_unlock_irqrestore(fence->lock, flags); >> cur->func(fence, cur); >> + spin_lock_irqsave(fence->lock, flags); >> } >> return ret; >> } >> @@ -124,12 +127,15 @@ int fence_signal(struct fence *fence) >> trace_fence_signaled(fence); >> if (test_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) { >> - struct fence_cb *cur, *tmp; >> + struct fence_cb *cur; >> spin_lock_irqsave(fence->lock, flags); >> - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) >> { >> + while (!list_empty(&fence->cb_list)) { >> + cur = list_first_entry(&fence->cb_list, struct >> fence_cb, node); >> list_del_init(&cur->node); >> + spin_unlock_irqrestore(fence->lock, flags); >> cur->func(fence, cur); >> + spin_lock_irqsave(fence->lock, flags); >> } >> spin_unlock_irqrestore(fence->lock, flags); >> } >> @@ -211,7 +217,7 @@ void fence_enable_sw_signaling(struct fence *fence) >> spin_lock_irqsave(fence->lock, flags); >> if (!fence->ops->enable_signaling(fence)) >> - fence_signal_locked(fence); >> + fence_signal_locked(fence, flags); >> spin_unlock_irqrestore(fence->lock, flags); >> } >> @@ -266,7 +272,7 @@ int fence_add_callback(struct fence *fence, struct >> fence_cb *cb, >> trace_fence_enable_signal(fence); >> if (!fence->ops->enable_signaling(fence)) { >> - fence_signal_locked(fence); >> + fence_signal_locked(fence, flags); >> ret = -ENOENT; >> } >> } >> @@ -366,7 +372,7 @@ fence_default_wait(struct fence *fence, bool intr, >> signed long timeout) >> trace_fence_enable_signal(fence); >> if (!fence->ops->enable_signaling(fence)) { >> - fence_signal_locked(fence); >> + fence_signal_locked(fence, flags); >> goto out; >> } >> } >> diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c >> index 62e8e6d..2271f7f 100644 >> --- a/drivers/dma-buf/sw_sync.c >> +++ b/drivers/dma-buf/sw_sync.c >> @@ -146,7 +146,7 @@ static void sync_timeline_signal(struct sync_timeline >> *obj, unsigned int inc) >> list_for_each_entry_safe(pt, next, &obj->active_list_head, >> active_list) { >> - if (fence_is_signaled_locked(&pt->base)) >> + if (fence_is_signaled_locked(&pt->base, flags)) >> list_del_init(&pt->active_list); >> } >> diff --git a/drivers/dma-buf/sync_debug.c b/drivers/dma-buf/sync_debug.c >> index 2dd4c3d..a7556d3 100644 >> --- a/drivers/dma-buf/sync_debug.c >> +++ b/drivers/dma-buf/sync_debug.c >> @@ -71,12 +71,13 @@ static const char *sync_status_str(int status) >> return "error"; >> } >> -static void sync_print_fence(struct seq_file *s, struct fence *fence, >> bool show) >> +static void sync_print_fence(struct seq_file *s, struct fence *fence, >> + bool show, unsigned long flags) >> { >> int status = 1; >> struct sync_timeline *parent = fence_parent(fence); >> - if (fence_is_signaled_locked(fence)) >> + if (fence_is_signaled_locked(fence, flags)) >> status = fence->status; >> seq_printf(s, " %s%sfence %s", >> @@ -124,13 +125,14 @@ static void sync_print_obj(struct seq_file *s, >> struct sync_timeline *obj) >> list_for_each(pos, &obj->child_list_head) { >> struct sync_pt *pt = >> container_of(pos, struct sync_pt, child_list); >> - sync_print_fence(s, &pt->base, false); >> + sync_print_fence(s, &pt->base, false, flags); >> } >> spin_unlock_irqrestore(&obj->child_list_lock, flags); >> } >> static void sync_print_sync_file(struct seq_file *s, >> - struct sync_file *sync_file) >> + struct sync_file *sync_file, >> + unsigned long flags) >> { >> int i; >> @@ -141,9 +143,9 @@ static void sync_print_sync_file(struct seq_file *s, >> struct fence_array *array = >> to_fence_array(sync_file->fence); >> for (i = 0; i < array->num_fences; ++i) >> - sync_print_fence(s, array->fences[i], true); >> + sync_print_fence(s, array->fences[i], true, >> flags); >> } else { >> - sync_print_fence(s, sync_file->fence, true); >> + sync_print_fence(s, sync_file->fence, true, flags); >> } >> } >> @@ -172,7 +174,7 @@ static int sync_debugfs_show(struct seq_file *s, >> void *unused) >> struct sync_file *sync_file = >> container_of(pos, struct sync_file, >> sync_file_list); >> - sync_print_sync_file(s, sync_file); >> + sync_print_sync_file(s, sync_file, flags); >> seq_puts(s, "\n"); >> } >> spin_unlock_irqrestore(&sync_file_list_lock, flags); >> diff --git a/include/linux/fence.h b/include/linux/fence.h >> index 0d76305..073f380 100644 >> --- a/include/linux/fence.h >> +++ b/include/linux/fence.h >> @@ -220,7 +220,7 @@ static inline void fence_put(struct fence *fence) >> } >> int fence_signal(struct fence *fence); >> -int fence_signal_locked(struct fence *fence); >> +int fence_signal_locked(struct fence *fence, unsigned long flags); >> signed long fence_default_wait(struct fence *fence, bool intr, signed >> long timeout); >> int fence_add_callback(struct fence *fence, struct fence_cb *cb, >> fence_func_t func); >> @@ -239,13 +239,13 @@ void fence_enable_sw_signaling(struct fence *fence); >> * This function requires fence->lock to be held. >> */ >> static inline bool >> -fence_is_signaled_locked(struct fence *fence) >> +fence_is_signaled_locked(struct fence *fence, unsigned long flags) >> { >> if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags)) >> return true; >> if (fence->ops->signaled && fence->ops->signaled(fence)) { >> - fence_signal_locked(fence); >> + fence_signal_locked(fence, flags); >> return true; >> } >> > > >
Am 18.10.2016 um 16:23 schrieb Rob Clark: > On Tue, Oct 18, 2016 at 6:07 AM, Christian König > <deathsimple@vodafone.de> wrote: >> Am 16.10.2016 um 18:03 schrieb Rob Clark: >>> Currently with fence-array, we have a potential deadlock situation. If we >>> fence_add_callback() on an array-fence, the array-fence's lock is aquired >>> first, and in it's ->enable_signaling() callback, it will install cb's on >>> it's array-member fences, so the array-member's lock is acquired second. >>> >>> But in the signal path, the array-member's lock is acquired first, and the >>> array-fence's lock acquired second. >>> >>> One approach to deal with this is avoid holding the fence's lock when >>> calling the cb. It is a bit intrusive and I haven't fixed up all the >>> other drivers that call directly or indirectly fence_signal_locked(). >> >> In general: Oh! Yes! Please! We have the same issue in the AMD scheduler >> when we want to register a new callback on the next fence in the list. >> >>> I guess the other option would be introduce a work-queue for array-fence? >> >> That's what we do in the GPU scheduler and it adds quite a bunch of extra >> overhead. >> >> So my preferences are clearly to fix calling the cb with any locks held >> before using another work item for the array fences. But I'm not sure if >> that is possible with all drivers. > I guess it is probably not 100% possible to ensure driver isn't > holding other of it's own locks when cb is called.. so I guess wq for > cb that needs to take other locks is a safe solution. > > That said, and maybe I need more coffee, but I think the spinlock for > iterating cb_list is probably not needed.. I think we could arrange > things so the test_and_set(SIGNALED) is enough to protect iterating > the list and calling the cb's. (Maybe protect the > test_and_set(SIGNALED) w/ fence->lock just so it doesn't race someone > who already got past the test_bit(SIGNALED) in _add_callback()?) > > Then "all" we have to do is figure out how to kill the > fence_signal_locked()/fence_is_signaled_locked() paths.. > > I think I also wouldn't mind pushing the locking down into the > ->enable_signaling() cb too for drivers that need that. Maybe we > don't strictly need that. From quick look, seems like half the > drivers just 'return true;' and seems a bit silly to grab a lock for > that. > > Maybe wq in fence-array in the short term at least is the best way > just to get things working without such an invasive change. But seems > like if we could kill the current _locked() paths that there is > potential to make the fence->lock situation less annoying. Maybe a heretic question, but do we really need the lock after all ? I mean the only use case for a double linked list I can see is removing the callbacks in the case of a timed out wait and that should be a rather rare operation. So wouldn't a single linked list with atomic swaps do as well? Christian. > > BR, > -R > >> Regards, >> Christian. >> >> >>> Or?? >>> >>> lockdep splat: >>> >>> ====================================================== >>> [ INFO: possible circular locking dependency detected ] >>> 4.7.0-rc7+ #489 Not tainted >>> ------------------------------------------------------- >>> surfaceflinger/2034 is trying to acquire lock: >>> (&(&array->lock)->rlock){......}, at: [<ffff00000858cddc>] >>> fence_signal+0x5c/0xf8 >>> >>> but task is already holding lock: >>> (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] >>> sw_sync_ioctl+0x228/0x3b0 >>> >>> which lock already depends on the new lock. >>> >>> the existing dependency chain (in reverse order) is: >>> >>> -> #1 (&(&obj->child_list_lock)->rlock){......}: >>> [<ffff000008108924>] __lock_acquire+0x173c/0x18d8 >>> [<ffff000008108e0c>] lock_acquire+0x4c/0x68 >>> [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 >>> [<ffff00000858d05c>] fence_add_callback+0x3c/0x100 >>> [<ffff00000858f100>] fence_array_enable_signaling+0x80/0x170 >>> [<ffff00000858d0d8>] fence_add_callback+0xb8/0x100 >>> [<ffff00000858f504>] sync_file_poll+0xd4/0xf0 >>> [<ffff0000081fd3a0>] do_sys_poll+0x220/0x438 >>> [<ffff0000081fd8d0>] SyS_ppoll+0x1b0/0x1d8 >>> [<ffff000008084f30>] el0_svc_naked+0x24/0x28 >>> >>> -> #0 (&(&array->lock)->rlock){......}: >>> [<ffff000008104768>] print_circular_bug+0x80/0x2e0 >>> [<ffff0000081089ac>] __lock_acquire+0x17c4/0x18d8 >>> [<ffff000008108e0c>] lock_acquire+0x4c/0x68 >>> [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 >>> [<ffff00000858cddc>] fence_signal+0x5c/0xf8 >>> [<ffff00000858f268>] fence_array_cb_func+0x78/0x88 >>> [<ffff00000858cb28>] fence_signal_locked+0x80/0xe0 >>> [<ffff0000085903c8>] sw_sync_ioctl+0x2f8/0x3b0 >>> [<ffff0000081faf6c>] do_vfs_ioctl+0xa4/0x790 >>> [<ffff0000081fb6e4>] SyS_ioctl+0x8c/0xa0 >>> [<ffff000008084f30>] el0_svc_naked+0x24/0x28 >>> >>> other info that might help us debug this: >>> >>> Possible unsafe locking scenario: >>> >>> CPU0 CPU1 >>> ---- ---- >>> lock(&(&obj->child_list_lock)->rlock); >>> lock(&(&array->lock)->rlock); >>> lock(&(&obj->child_list_lock)->rlock); >>> lock(&(&array->lock)->rlock); >>> >>> *** DEADLOCK *** >>> >>> 1 lock held by surfaceflinger/2034: >>> #0: (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] >>> sw_sync_ioctl+0x228/0x3b0 >>> --- >>> drivers/dma-buf/fence.c | 22 ++++++++++++++-------- >>> drivers/dma-buf/sw_sync.c | 2 +- >>> drivers/dma-buf/sync_debug.c | 16 +++++++++------- >>> include/linux/fence.h | 6 +++--- >>> 4 files changed, 27 insertions(+), 19 deletions(-) >>> >>> diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c >>> index cc05ddd..917f905 100644 >>> --- a/drivers/dma-buf/fence.c >>> +++ b/drivers/dma-buf/fence.c >>> @@ -63,9 +63,9 @@ EXPORT_SYMBOL(fence_context_alloc); >>> * >>> * Unlike fence_signal, this function must be called with fence->lock >>> held. >>> */ >>> -int fence_signal_locked(struct fence *fence) >>> +int fence_signal_locked(struct fence *fence, unsigned long flags) >>> { >>> - struct fence_cb *cur, *tmp; >>> + struct fence_cb *cur; >>> int ret = 0; >>> lockdep_assert_held(fence->lock); >>> @@ -88,9 +88,12 @@ int fence_signal_locked(struct fence *fence) >>> } else >>> trace_fence_signaled(fence); >>> - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { >>> + while (!list_empty(&fence->cb_list)) { >>> + cur = list_first_entry(&fence->cb_list, struct fence_cb, >>> node); >>> list_del_init(&cur->node); >>> + spin_unlock_irqrestore(fence->lock, flags); >>> cur->func(fence, cur); >>> + spin_lock_irqsave(fence->lock, flags); >>> } >>> return ret; >>> } >>> @@ -124,12 +127,15 @@ int fence_signal(struct fence *fence) >>> trace_fence_signaled(fence); >>> if (test_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) { >>> - struct fence_cb *cur, *tmp; >>> + struct fence_cb *cur; >>> spin_lock_irqsave(fence->lock, flags); >>> - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) >>> { >>> + while (!list_empty(&fence->cb_list)) { >>> + cur = list_first_entry(&fence->cb_list, struct >>> fence_cb, node); >>> list_del_init(&cur->node); >>> + spin_unlock_irqrestore(fence->lock, flags); >>> cur->func(fence, cur); >>> + spin_lock_irqsave(fence->lock, flags); >>> } >>> spin_unlock_irqrestore(fence->lock, flags); >>> } >>> @@ -211,7 +217,7 @@ void fence_enable_sw_signaling(struct fence *fence) >>> spin_lock_irqsave(fence->lock, flags); >>> if (!fence->ops->enable_signaling(fence)) >>> - fence_signal_locked(fence); >>> + fence_signal_locked(fence, flags); >>> spin_unlock_irqrestore(fence->lock, flags); >>> } >>> @@ -266,7 +272,7 @@ int fence_add_callback(struct fence *fence, struct >>> fence_cb *cb, >>> trace_fence_enable_signal(fence); >>> if (!fence->ops->enable_signaling(fence)) { >>> - fence_signal_locked(fence); >>> + fence_signal_locked(fence, flags); >>> ret = -ENOENT; >>> } >>> } >>> @@ -366,7 +372,7 @@ fence_default_wait(struct fence *fence, bool intr, >>> signed long timeout) >>> trace_fence_enable_signal(fence); >>> if (!fence->ops->enable_signaling(fence)) { >>> - fence_signal_locked(fence); >>> + fence_signal_locked(fence, flags); >>> goto out; >>> } >>> } >>> diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c >>> index 62e8e6d..2271f7f 100644 >>> --- a/drivers/dma-buf/sw_sync.c >>> +++ b/drivers/dma-buf/sw_sync.c >>> @@ -146,7 +146,7 @@ static void sync_timeline_signal(struct sync_timeline >>> *obj, unsigned int inc) >>> list_for_each_entry_safe(pt, next, &obj->active_list_head, >>> active_list) { >>> - if (fence_is_signaled_locked(&pt->base)) >>> + if (fence_is_signaled_locked(&pt->base, flags)) >>> list_del_init(&pt->active_list); >>> } >>> diff --git a/drivers/dma-buf/sync_debug.c b/drivers/dma-buf/sync_debug.c >>> index 2dd4c3d..a7556d3 100644 >>> --- a/drivers/dma-buf/sync_debug.c >>> +++ b/drivers/dma-buf/sync_debug.c >>> @@ -71,12 +71,13 @@ static const char *sync_status_str(int status) >>> return "error"; >>> } >>> -static void sync_print_fence(struct seq_file *s, struct fence *fence, >>> bool show) >>> +static void sync_print_fence(struct seq_file *s, struct fence *fence, >>> + bool show, unsigned long flags) >>> { >>> int status = 1; >>> struct sync_timeline *parent = fence_parent(fence); >>> - if (fence_is_signaled_locked(fence)) >>> + if (fence_is_signaled_locked(fence, flags)) >>> status = fence->status; >>> seq_printf(s, " %s%sfence %s", >>> @@ -124,13 +125,14 @@ static void sync_print_obj(struct seq_file *s, >>> struct sync_timeline *obj) >>> list_for_each(pos, &obj->child_list_head) { >>> struct sync_pt *pt = >>> container_of(pos, struct sync_pt, child_list); >>> - sync_print_fence(s, &pt->base, false); >>> + sync_print_fence(s, &pt->base, false, flags); >>> } >>> spin_unlock_irqrestore(&obj->child_list_lock, flags); >>> } >>> static void sync_print_sync_file(struct seq_file *s, >>> - struct sync_file *sync_file) >>> + struct sync_file *sync_file, >>> + unsigned long flags) >>> { >>> int i; >>> @@ -141,9 +143,9 @@ static void sync_print_sync_file(struct seq_file *s, >>> struct fence_array *array = >>> to_fence_array(sync_file->fence); >>> for (i = 0; i < array->num_fences; ++i) >>> - sync_print_fence(s, array->fences[i], true); >>> + sync_print_fence(s, array->fences[i], true, >>> flags); >>> } else { >>> - sync_print_fence(s, sync_file->fence, true); >>> + sync_print_fence(s, sync_file->fence, true, flags); >>> } >>> } >>> @@ -172,7 +174,7 @@ static int sync_debugfs_show(struct seq_file *s, >>> void *unused) >>> struct sync_file *sync_file = >>> container_of(pos, struct sync_file, >>> sync_file_list); >>> - sync_print_sync_file(s, sync_file); >>> + sync_print_sync_file(s, sync_file, flags); >>> seq_puts(s, "\n"); >>> } >>> spin_unlock_irqrestore(&sync_file_list_lock, flags); >>> diff --git a/include/linux/fence.h b/include/linux/fence.h >>> index 0d76305..073f380 100644 >>> --- a/include/linux/fence.h >>> +++ b/include/linux/fence.h >>> @@ -220,7 +220,7 @@ static inline void fence_put(struct fence *fence) >>> } >>> int fence_signal(struct fence *fence); >>> -int fence_signal_locked(struct fence *fence); >>> +int fence_signal_locked(struct fence *fence, unsigned long flags); >>> signed long fence_default_wait(struct fence *fence, bool intr, signed >>> long timeout); >>> int fence_add_callback(struct fence *fence, struct fence_cb *cb, >>> fence_func_t func); >>> @@ -239,13 +239,13 @@ void fence_enable_sw_signaling(struct fence *fence); >>> * This function requires fence->lock to be held. >>> */ >>> static inline bool >>> -fence_is_signaled_locked(struct fence *fence) >>> +fence_is_signaled_locked(struct fence *fence, unsigned long flags) >>> { >>> if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags)) >>> return true; >>> if (fence->ops->signaled && fence->ops->signaled(fence)) { >>> - fence_signal_locked(fence); >>> + fence_signal_locked(fence, flags); >>> return true; >>> } >>> >> >>
On Tue, Oct 18, 2016 at 11:27 AM, Christian König <deathsimple@vodafone.de> wrote: > Am 18.10.2016 um 16:23 schrieb Rob Clark: >> >> On Tue, Oct 18, 2016 at 6:07 AM, Christian König >> <deathsimple@vodafone.de> wrote: >>> >>> Am 16.10.2016 um 18:03 schrieb Rob Clark: >>>> >>>> Currently with fence-array, we have a potential deadlock situation. If >>>> we >>>> fence_add_callback() on an array-fence, the array-fence's lock is >>>> aquired >>>> first, and in it's ->enable_signaling() callback, it will install cb's >>>> on >>>> it's array-member fences, so the array-member's lock is acquired second. >>>> >>>> But in the signal path, the array-member's lock is acquired first, and >>>> the >>>> array-fence's lock acquired second. >>>> >>>> One approach to deal with this is avoid holding the fence's lock when >>>> calling the cb. It is a bit intrusive and I haven't fixed up all the >>>> other drivers that call directly or indirectly fence_signal_locked(). >>> >>> >>> In general: Oh! Yes! Please! We have the same issue in the AMD scheduler >>> when we want to register a new callback on the next fence in the list. >>> >>>> I guess the other option would be introduce a work-queue for >>>> array-fence? >>> >>> >>> That's what we do in the GPU scheduler and it adds quite a bunch of extra >>> overhead. >>> >>> So my preferences are clearly to fix calling the cb with any locks held >>> before using another work item for the array fences. But I'm not sure if >>> that is possible with all drivers. >> >> I guess it is probably not 100% possible to ensure driver isn't >> holding other of it's own locks when cb is called.. so I guess wq for >> cb that needs to take other locks is a safe solution. >> >> That said, and maybe I need more coffee, but I think the spinlock for >> iterating cb_list is probably not needed.. I think we could arrange >> things so the test_and_set(SIGNALED) is enough to protect iterating >> the list and calling the cb's. (Maybe protect the >> test_and_set(SIGNALED) w/ fence->lock just so it doesn't race someone >> who already got past the test_bit(SIGNALED) in _add_callback()?) >> >> Then "all" we have to do is figure out how to kill the >> fence_signal_locked()/fence_is_signaled_locked() paths.. >> >> I think I also wouldn't mind pushing the locking down into the >> ->enable_signaling() cb too for drivers that need that. Maybe we >> don't strictly need that. From quick look, seems like half the >> drivers just 'return true;' and seems a bit silly to grab a lock for >> that. >> >> Maybe wq in fence-array in the short term at least is the best way >> just to get things working without such an invasive change. But seems >> like if we could kill the current _locked() paths that there is >> potential to make the fence->lock situation less annoying. > > > Maybe a heretic question, but do we really need the lock after all ? > > I mean the only use case for a double linked list I can see is removing the > callbacks in the case of a timed out wait and that should be a rather rare > operation. > > So wouldn't a single linked list with atomic swaps do as well? Possibly, although I guess keeping the lock as long as it isn't held over callbacks would be sufficient. Either way, I guess this approach starts by untangling the fence_signal_locked()/fence_is_signaled_locked() paths.. that is the part I'm less sure how to do yet. BR, -R > Christian. > > >> >> BR, >> -R >> >>> Regards, >>> Christian. >>> >>> >>>> Or?? >>>> >>>> lockdep splat: >>>> >>>> ====================================================== >>>> [ INFO: possible circular locking dependency detected ] >>>> 4.7.0-rc7+ #489 Not tainted >>>> ------------------------------------------------------- >>>> surfaceflinger/2034 is trying to acquire lock: >>>> (&(&array->lock)->rlock){......}, at: [<ffff00000858cddc>] >>>> fence_signal+0x5c/0xf8 >>>> >>>> but task is already holding lock: >>>> (&(&obj->child_list_lock)->rlock){......}, at: [<ffff0000085902f8>] >>>> sw_sync_ioctl+0x228/0x3b0 >>>> >>>> which lock already depends on the new lock. >>>> >>>> the existing dependency chain (in reverse order) is: >>>> >>>> -> #1 (&(&obj->child_list_lock)->rlock){......}: >>>> [<ffff000008108924>] __lock_acquire+0x173c/0x18d8 >>>> [<ffff000008108e0c>] lock_acquire+0x4c/0x68 >>>> [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 >>>> [<ffff00000858d05c>] fence_add_callback+0x3c/0x100 >>>> [<ffff00000858f100>] fence_array_enable_signaling+0x80/0x170 >>>> [<ffff00000858d0d8>] fence_add_callback+0xb8/0x100 >>>> [<ffff00000858f504>] sync_file_poll+0xd4/0xf0 >>>> [<ffff0000081fd3a0>] do_sys_poll+0x220/0x438 >>>> [<ffff0000081fd8d0>] SyS_ppoll+0x1b0/0x1d8 >>>> [<ffff000008084f30>] el0_svc_naked+0x24/0x28 >>>> >>>> -> #0 (&(&array->lock)->rlock){......}: >>>> [<ffff000008104768>] print_circular_bug+0x80/0x2e0 >>>> [<ffff0000081089ac>] __lock_acquire+0x17c4/0x18d8 >>>> [<ffff000008108e0c>] lock_acquire+0x4c/0x68 >>>> [<ffff000008ac6a6c>] _raw_spin_lock_irqsave+0x54/0x70 >>>> [<ffff00000858cddc>] fence_signal+0x5c/0xf8 >>>> [<ffff00000858f268>] fence_array_cb_func+0x78/0x88 >>>> [<ffff00000858cb28>] fence_signal_locked+0x80/0xe0 >>>> [<ffff0000085903c8>] sw_sync_ioctl+0x2f8/0x3b0 >>>> [<ffff0000081faf6c>] do_vfs_ioctl+0xa4/0x790 >>>> [<ffff0000081fb6e4>] SyS_ioctl+0x8c/0xa0 >>>> [<ffff000008084f30>] el0_svc_naked+0x24/0x28 >>>> >>>> other info that might help us debug this: >>>> >>>> Possible unsafe locking scenario: >>>> >>>> CPU0 CPU1 >>>> ---- ---- >>>> lock(&(&obj->child_list_lock)->rlock); >>>> lock(&(&array->lock)->rlock); >>>> lock(&(&obj->child_list_lock)->rlock); >>>> lock(&(&array->lock)->rlock); >>>> >>>> *** DEADLOCK *** >>>> >>>> 1 lock held by surfaceflinger/2034: >>>> #0: (&(&obj->child_list_lock)->rlock){......}, at: >>>> [<ffff0000085902f8>] >>>> sw_sync_ioctl+0x228/0x3b0 >>>> --- >>>> drivers/dma-buf/fence.c | 22 ++++++++++++++-------- >>>> drivers/dma-buf/sw_sync.c | 2 +- >>>> drivers/dma-buf/sync_debug.c | 16 +++++++++------- >>>> include/linux/fence.h | 6 +++--- >>>> 4 files changed, 27 insertions(+), 19 deletions(-) >>>> >>>> diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c >>>> index cc05ddd..917f905 100644 >>>> --- a/drivers/dma-buf/fence.c >>>> +++ b/drivers/dma-buf/fence.c >>>> @@ -63,9 +63,9 @@ EXPORT_SYMBOL(fence_context_alloc); >>>> * >>>> * Unlike fence_signal, this function must be called with fence->lock >>>> held. >>>> */ >>>> -int fence_signal_locked(struct fence *fence) >>>> +int fence_signal_locked(struct fence *fence, unsigned long flags) >>>> { >>>> - struct fence_cb *cur, *tmp; >>>> + struct fence_cb *cur; >>>> int ret = 0; >>>> lockdep_assert_held(fence->lock); >>>> @@ -88,9 +88,12 @@ int fence_signal_locked(struct fence *fence) >>>> } else >>>> trace_fence_signaled(fence); >>>> - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { >>>> + while (!list_empty(&fence->cb_list)) { >>>> + cur = list_first_entry(&fence->cb_list, struct fence_cb, >>>> node); >>>> list_del_init(&cur->node); >>>> + spin_unlock_irqrestore(fence->lock, flags); >>>> cur->func(fence, cur); >>>> + spin_lock_irqsave(fence->lock, flags); >>>> } >>>> return ret; >>>> } >>>> @@ -124,12 +127,15 @@ int fence_signal(struct fence *fence) >>>> trace_fence_signaled(fence); >>>> if (test_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) { >>>> - struct fence_cb *cur, *tmp; >>>> + struct fence_cb *cur; >>>> spin_lock_irqsave(fence->lock, flags); >>>> - list_for_each_entry_safe(cur, tmp, &fence->cb_list, >>>> node) >>>> { >>>> + while (!list_empty(&fence->cb_list)) { >>>> + cur = list_first_entry(&fence->cb_list, struct >>>> fence_cb, node); >>>> list_del_init(&cur->node); >>>> + spin_unlock_irqrestore(fence->lock, flags); >>>> cur->func(fence, cur); >>>> + spin_lock_irqsave(fence->lock, flags); >>>> } >>>> spin_unlock_irqrestore(fence->lock, flags); >>>> } >>>> @@ -211,7 +217,7 @@ void fence_enable_sw_signaling(struct fence *fence) >>>> spin_lock_irqsave(fence->lock, flags); >>>> if (!fence->ops->enable_signaling(fence)) >>>> - fence_signal_locked(fence); >>>> + fence_signal_locked(fence, flags); >>>> spin_unlock_irqrestore(fence->lock, flags); >>>> } >>>> @@ -266,7 +272,7 @@ int fence_add_callback(struct fence *fence, struct >>>> fence_cb *cb, >>>> trace_fence_enable_signal(fence); >>>> if (!fence->ops->enable_signaling(fence)) { >>>> - fence_signal_locked(fence); >>>> + fence_signal_locked(fence, flags); >>>> ret = -ENOENT; >>>> } >>>> } >>>> @@ -366,7 +372,7 @@ fence_default_wait(struct fence *fence, bool intr, >>>> signed long timeout) >>>> trace_fence_enable_signal(fence); >>>> if (!fence->ops->enable_signaling(fence)) { >>>> - fence_signal_locked(fence); >>>> + fence_signal_locked(fence, flags); >>>> goto out; >>>> } >>>> } >>>> diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c >>>> index 62e8e6d..2271f7f 100644 >>>> --- a/drivers/dma-buf/sw_sync.c >>>> +++ b/drivers/dma-buf/sw_sync.c >>>> @@ -146,7 +146,7 @@ static void sync_timeline_signal(struct >>>> sync_timeline >>>> *obj, unsigned int inc) >>>> list_for_each_entry_safe(pt, next, &obj->active_list_head, >>>> active_list) { >>>> - if (fence_is_signaled_locked(&pt->base)) >>>> + if (fence_is_signaled_locked(&pt->base, flags)) >>>> list_del_init(&pt->active_list); >>>> } >>>> diff --git a/drivers/dma-buf/sync_debug.c >>>> b/drivers/dma-buf/sync_debug.c >>>> index 2dd4c3d..a7556d3 100644 >>>> --- a/drivers/dma-buf/sync_debug.c >>>> +++ b/drivers/dma-buf/sync_debug.c >>>> @@ -71,12 +71,13 @@ static const char *sync_status_str(int status) >>>> return "error"; >>>> } >>>> -static void sync_print_fence(struct seq_file *s, struct fence >>>> *fence, >>>> bool show) >>>> +static void sync_print_fence(struct seq_file *s, struct fence *fence, >>>> + bool show, unsigned long flags) >>>> { >>>> int status = 1; >>>> struct sync_timeline *parent = fence_parent(fence); >>>> - if (fence_is_signaled_locked(fence)) >>>> + if (fence_is_signaled_locked(fence, flags)) >>>> status = fence->status; >>>> seq_printf(s, " %s%sfence %s", >>>> @@ -124,13 +125,14 @@ static void sync_print_obj(struct seq_file *s, >>>> struct sync_timeline *obj) >>>> list_for_each(pos, &obj->child_list_head) { >>>> struct sync_pt *pt = >>>> container_of(pos, struct sync_pt, child_list); >>>> - sync_print_fence(s, &pt->base, false); >>>> + sync_print_fence(s, &pt->base, false, flags); >>>> } >>>> spin_unlock_irqrestore(&obj->child_list_lock, flags); >>>> } >>>> static void sync_print_sync_file(struct seq_file *s, >>>> - struct sync_file *sync_file) >>>> + struct sync_file *sync_file, >>>> + unsigned long flags) >>>> { >>>> int i; >>>> @@ -141,9 +143,9 @@ static void sync_print_sync_file(struct seq_file >>>> *s, >>>> struct fence_array *array = >>>> to_fence_array(sync_file->fence); >>>> for (i = 0; i < array->num_fences; ++i) >>>> - sync_print_fence(s, array->fences[i], true); >>>> + sync_print_fence(s, array->fences[i], true, >>>> flags); >>>> } else { >>>> - sync_print_fence(s, sync_file->fence, true); >>>> + sync_print_fence(s, sync_file->fence, true, flags); >>>> } >>>> } >>>> @@ -172,7 +174,7 @@ static int sync_debugfs_show(struct seq_file *s, >>>> void *unused) >>>> struct sync_file *sync_file = >>>> container_of(pos, struct sync_file, >>>> sync_file_list); >>>> - sync_print_sync_file(s, sync_file); >>>> + sync_print_sync_file(s, sync_file, flags); >>>> seq_puts(s, "\n"); >>>> } >>>> spin_unlock_irqrestore(&sync_file_list_lock, flags); >>>> diff --git a/include/linux/fence.h b/include/linux/fence.h >>>> index 0d76305..073f380 100644 >>>> --- a/include/linux/fence.h >>>> +++ b/include/linux/fence.h >>>> @@ -220,7 +220,7 @@ static inline void fence_put(struct fence *fence) >>>> } >>>> int fence_signal(struct fence *fence); >>>> -int fence_signal_locked(struct fence *fence); >>>> +int fence_signal_locked(struct fence *fence, unsigned long flags); >>>> signed long fence_default_wait(struct fence *fence, bool intr, signed >>>> long timeout); >>>> int fence_add_callback(struct fence *fence, struct fence_cb *cb, >>>> fence_func_t func); >>>> @@ -239,13 +239,13 @@ void fence_enable_sw_signaling(struct fence >>>> *fence); >>>> * This function requires fence->lock to be held. >>>> */ >>>> static inline bool >>>> -fence_is_signaled_locked(struct fence *fence) >>>> +fence_is_signaled_locked(struct fence *fence, unsigned long flags) >>>> { >>>> if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags)) >>>> return true; >>>> if (fence->ops->signaled && fence->ops->signaled(fence)) { >>>> - fence_signal_locked(fence); >>>> + fence_signal_locked(fence, flags); >>>> return true; >>>> } >>>> >>> >>> >
diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c index cc05ddd..917f905 100644 --- a/drivers/dma-buf/fence.c +++ b/drivers/dma-buf/fence.c @@ -63,9 +63,9 @@ EXPORT_SYMBOL(fence_context_alloc); * * Unlike fence_signal, this function must be called with fence->lock held. */ -int fence_signal_locked(struct fence *fence) +int fence_signal_locked(struct fence *fence, unsigned long flags) { - struct fence_cb *cur, *tmp; + struct fence_cb *cur; int ret = 0; lockdep_assert_held(fence->lock); @@ -88,9 +88,12 @@ int fence_signal_locked(struct fence *fence) } else trace_fence_signaled(fence); - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { + while (!list_empty(&fence->cb_list)) { + cur = list_first_entry(&fence->cb_list, struct fence_cb, node); list_del_init(&cur->node); + spin_unlock_irqrestore(fence->lock, flags); cur->func(fence, cur); + spin_lock_irqsave(fence->lock, flags); } return ret; } @@ -124,12 +127,15 @@ int fence_signal(struct fence *fence) trace_fence_signaled(fence); if (test_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) { - struct fence_cb *cur, *tmp; + struct fence_cb *cur; spin_lock_irqsave(fence->lock, flags); - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { + while (!list_empty(&fence->cb_list)) { + cur = list_first_entry(&fence->cb_list, struct fence_cb, node); list_del_init(&cur->node); + spin_unlock_irqrestore(fence->lock, flags); cur->func(fence, cur); + spin_lock_irqsave(fence->lock, flags); } spin_unlock_irqrestore(fence->lock, flags); } @@ -211,7 +217,7 @@ void fence_enable_sw_signaling(struct fence *fence) spin_lock_irqsave(fence->lock, flags); if (!fence->ops->enable_signaling(fence)) - fence_signal_locked(fence); + fence_signal_locked(fence, flags); spin_unlock_irqrestore(fence->lock, flags); } @@ -266,7 +272,7 @@ int fence_add_callback(struct fence *fence, struct fence_cb *cb, trace_fence_enable_signal(fence); if (!fence->ops->enable_signaling(fence)) { - fence_signal_locked(fence); + fence_signal_locked(fence, flags); ret = -ENOENT; } } @@ -366,7 +372,7 @@ fence_default_wait(struct fence *fence, bool intr, signed long timeout) trace_fence_enable_signal(fence); if (!fence->ops->enable_signaling(fence)) { - fence_signal_locked(fence); + fence_signal_locked(fence, flags); goto out; } } diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c index 62e8e6d..2271f7f 100644 --- a/drivers/dma-buf/sw_sync.c +++ b/drivers/dma-buf/sw_sync.c @@ -146,7 +146,7 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc) list_for_each_entry_safe(pt, next, &obj->active_list_head, active_list) { - if (fence_is_signaled_locked(&pt->base)) + if (fence_is_signaled_locked(&pt->base, flags)) list_del_init(&pt->active_list); } diff --git a/drivers/dma-buf/sync_debug.c b/drivers/dma-buf/sync_debug.c index 2dd4c3d..a7556d3 100644 --- a/drivers/dma-buf/sync_debug.c +++ b/drivers/dma-buf/sync_debug.c @@ -71,12 +71,13 @@ static const char *sync_status_str(int status) return "error"; } -static void sync_print_fence(struct seq_file *s, struct fence *fence, bool show) +static void sync_print_fence(struct seq_file *s, struct fence *fence, + bool show, unsigned long flags) { int status = 1; struct sync_timeline *parent = fence_parent(fence); - if (fence_is_signaled_locked(fence)) + if (fence_is_signaled_locked(fence, flags)) status = fence->status; seq_printf(s, " %s%sfence %s", @@ -124,13 +125,14 @@ static void sync_print_obj(struct seq_file *s, struct sync_timeline *obj) list_for_each(pos, &obj->child_list_head) { struct sync_pt *pt = container_of(pos, struct sync_pt, child_list); - sync_print_fence(s, &pt->base, false); + sync_print_fence(s, &pt->base, false, flags); } spin_unlock_irqrestore(&obj->child_list_lock, flags); } static void sync_print_sync_file(struct seq_file *s, - struct sync_file *sync_file) + struct sync_file *sync_file, + unsigned long flags) { int i; @@ -141,9 +143,9 @@ static void sync_print_sync_file(struct seq_file *s, struct fence_array *array = to_fence_array(sync_file->fence); for (i = 0; i < array->num_fences; ++i) - sync_print_fence(s, array->fences[i], true); + sync_print_fence(s, array->fences[i], true, flags); } else { - sync_print_fence(s, sync_file->fence, true); + sync_print_fence(s, sync_file->fence, true, flags); } } @@ -172,7 +174,7 @@ static int sync_debugfs_show(struct seq_file *s, void *unused) struct sync_file *sync_file = container_of(pos, struct sync_file, sync_file_list); - sync_print_sync_file(s, sync_file); + sync_print_sync_file(s, sync_file, flags); seq_puts(s, "\n"); } spin_unlock_irqrestore(&sync_file_list_lock, flags); diff --git a/include/linux/fence.h b/include/linux/fence.h index 0d76305..073f380 100644 --- a/include/linux/fence.h +++ b/include/linux/fence.h @@ -220,7 +220,7 @@ static inline void fence_put(struct fence *fence) } int fence_signal(struct fence *fence); -int fence_signal_locked(struct fence *fence); +int fence_signal_locked(struct fence *fence, unsigned long flags); signed long fence_default_wait(struct fence *fence, bool intr, signed long timeout); int fence_add_callback(struct fence *fence, struct fence_cb *cb, fence_func_t func); @@ -239,13 +239,13 @@ void fence_enable_sw_signaling(struct fence *fence); * This function requires fence->lock to be held. */ static inline bool -fence_is_signaled_locked(struct fence *fence) +fence_is_signaled_locked(struct fence *fence, unsigned long flags) { if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags)) return true; if (fence->ops->signaled && fence->ops->signaled(fence)) { - fence_signal_locked(fence); + fence_signal_locked(fence, flags); return true; }