
[2/2] dma-buf/fence: add fence_array fences v4

Message ID 1463752571-28688-2-git-send-email-deathsimple@vodafone.de (mailing list archive)
State New, archived

Commit Message

Christian König May 20, 2016, 1:56 p.m. UTC
From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>

struct fence_collection inherits from struct fence and carries a
collection of fences that need to be waited on together.

It is useful for translating a sync_file into a fence, which removes the
complexity of dealing with sync_files in DRM drivers. Even if the sync_file
contains many fences that need to be waited on before a commit can happen,
they all get added to the fence_collection and passed to DRM as
a standard struct fence.

That means no driver changes are needed beyond supporting fences.

A fence_collection's fence doesn't belong to any timeline context, so
fence_is_later() and fence_later() are not meant to be called with
fence_collection fences.

v2: Comments by Daniel Vetter:
	- merge fence_collection_init() and fence_collection_add()
	- only add callbacks at ->enable_signaling()
	- remove fence_collection_put()
	- check for type on to_fence_collection()
	- adjust fence_is_later() and fence_later() to WARN_ON() if they
	are used with collection fences.

v3: - Initialize fence_cb.node at fence init.

    Comments by Chris Wilson:
	- return "unbound" on fence_collection_get_timeline_name()
	- don't stop adding callbacks if one fails
	- remove redundant !! on fence_collection_enable_signaling()
	- remove redundant () on fence_collection_signaled
	- use fence_default_wait() instead

v4 (chk): Rework, simplification and cleanup:
	- Drop FENCE_NO_CONTEXT handling, always allocate a context.
	- Rename to fence_array.
	- Return fixed driver name.
	- Register only one callback at a time.
	- Document that create function takes ownership of array.

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/Makefile      |   2 +-
 drivers/dma-buf/fence-array.c | 132 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/fence-array.h   |  62 ++++++++++++++++++++
 3 files changed, 195 insertions(+), 1 deletion(-)
 create mode 100644 drivers/dma-buf/fence-array.c
 create mode 100644 include/linux/fence-array.h

Comments

Chris Wilson May 20, 2016, 2:42 p.m. UTC | #1
On Fri, May 20, 2016 at 03:56:11PM +0200, Christian König wrote:
> From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
> 
> struct fence_collection inherits from struct fence and carries a
> collection of fences that needs to be waited together.
> 
> It is useful to translate a sync_file to a fence to remove the complexity
> of dealing with sync_files on DRM drivers. So even if there are many
> fences in the sync_file that needs to waited for a commit to happen,
> they all get added to the fence_collection and passed for DRM use as
> a standard struct fence.
> 
> That means that no changes needed to any driver besides supporting fences.
> 
> fence_collection's fence doesn't belong to any timeline context, so
> fence_is_later() and fence_later() are not meant to be called with
> fence_collections fences.
> 
> v2: Comments by Daniel Vetter:
> 	- merge fence_collection_init() and fence_collection_add()
> 	- only add callbacks at ->enable_signalling()
> 	- remove fence_collection_put()
> 	- check for type on to_fence_collection()
> 	- adjust fence_is_later() and fence_later() to WARN_ON() if they
> 	are used with collection fences.
> 
> v3: - Initialize fence_cb.node at fence init.
> 
>     Comments by Chris Wilson:
> 	- return "unbound" on fence_collection_get_timeline_name()
> 	- don't stop adding callbacks if one fails
> 	- remove redundant !! on fence_collection_enable_signaling()
> 	- remove redundant () on fence_collection_signaled
> 	- use fence_default_wait() instead
> 
> v4 (chk): Rework, simplification and cleanup:
> 	- Drop FENCE_NO_CONTEXT handling, always allocate a context.
> 	- Rename to fence_array.
> 	- Return fixed driver name.
> 	- Register only one callback at a time.

Why? Even within a driver I expected there to be some amortization of
the signaling costs for handling multiple fences at once (at least in the
driver I'm familiar with!).

So this is more just curiosity as to what experience of yours favours
sequential enabling.

> +static bool fence_array_add_next_callback(struct fence_array *array)
> +{
> +	while (array->num_signaled < array->num_fences) {
> +		struct fence *next = array->fences[array->num_signaled];
> +
> +		if (!fence_add_callback(next, &array->cb, fence_array_cb_func))
> +			return true;
> +
> +		++array->num_signaled;
> +	}
> +
> +	return false;
> +}
> +
> +static void fence_array_cb_func(struct fence *f, struct fence_cb *cb)
> +{
> +	struct fence_array *array = container_of(cb, struct fence_array, cb);

Some chasing around would have been saved by a

	assert_spin_locked(&array->lock);

here.

> +
> +	++array->num_signaled;
> +	if (!fence_array_add_next_callback(array))
> +		fence_signal(&array->base);
> +}
> +
> +static bool fence_array_enable_signaling(struct fence *fence)
> +{
> +	struct fence_array *array = to_fence_array(fence);
> +
> +	return fence_array_add_next_callback(array);
> +}
> +
> +static bool fence_array_signaled(struct fence *fence)
> +{
> +	struct fence_array *array = to_fence_array(fence);
> +
> +	return ACCESS_ONCE(array->num_signaled) == array->num_fences;

Can just be READ_ONCE()
-Chris
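
The READ_ONCE() remark above can be illustrated with a small userspace sketch. TOY_READ_ONCE and the ro_array type are hypothetical stand-ins, not the kernel's definitions; the macro is only a rough approximation (a volatile-qualified read through the GNU __typeof__ extension) of forcing a single, non-cached load of the counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical approximation: read x exactly once through a volatile lvalue. */
#define TOY_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, trimmed-down stand-in for the counters in the patch. */
struct ro_array {
	unsigned num_signaled;	/* may be updated concurrently by callbacks */
	unsigned num_fences;	/* fixed after creation */
};

/* Lockless "are we done yet?" check in the style of the ->signaled hook. */
static bool ro_array_signaled(struct ro_array *array)
{
	unsigned done = TOY_READ_ONCE(array->num_signaled);

	assert(done <= array->num_fences);
	return done == array->num_fences;
}
```

Only the counter that can change underneath the reader goes through the macro; num_fences is written once at creation and can be read normally.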
Gustavo Padovan May 20, 2016, 2:47 p.m. UTC | #2
2016-05-20 Christian König <deathsimple@vodafone.de>:

> From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
> 
> struct fence_collection inherits from struct fence and carries a
> collection of fences that needs to be waited together.
> 
> It is useful to translate a sync_file to a fence to remove the complexity
> of dealing with sync_files on DRM drivers. So even if there are many
> fences in the sync_file that needs to waited for a commit to happen,
> they all get added to the fence_collection and passed for DRM use as
> a standard struct fence.
> 
> That means that no changes needed to any driver besides supporting fences.
> 
> fence_collection's fence doesn't belong to any timeline context, so
> fence_is_later() and fence_later() are not meant to be called with
> fence_collections fences.
> 
> v2: Comments by Daniel Vetter:
> 	- merge fence_collection_init() and fence_collection_add()
> 	- only add callbacks at ->enable_signalling()
> 	- remove fence_collection_put()
> 	- check for type on to_fence_collection()
> 	- adjust fence_is_later() and fence_later() to WARN_ON() if they
> 	are used with collection fences.
> 
> v3: - Initialize fence_cb.node at fence init.
> 
>     Comments by Chris Wilson:
> 	- return "unbound" on fence_collection_get_timeline_name()
> 	- don't stop adding callbacks if one fails
> 	- remove redundant !! on fence_collection_enable_signaling()
> 	- remove redundant () on fence_collection_signaled
> 	- use fence_default_wait() instead
> 
> v4 (chk): Rework, simplification and cleanup:
> 	- Drop FENCE_NO_CONTEXT handling, always allocate a context.
> 	- Rename to fence_array.
> 	- Return fixed driver name.
> 	- Register only one callback at a time.
> 	- Document that create function takes ownership of array.

This looks good to me. Dropping NO_CONTEXT was a good idea, and
registering only one callback makes it look better.

	Gustavo
Christian König May 20, 2016, 5:53 p.m. UTC | #3
Am 20.05.2016 um 16:47 schrieb Gustavo Padovan:
> 2016-05-20 Christian König <deathsimple@vodafone.de>:
>
>> From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
>>
>> struct fence_collection inherits from struct fence and carries a
>> collection of fences that needs to be waited together.
>>
>> It is useful to translate a sync_file to a fence to remove the complexity
>> of dealing with sync_files on DRM drivers. So even if there are many
>> fences in the sync_file that needs to waited for a commit to happen,
>> they all get added to the fence_collection and passed for DRM use as
>> a standard struct fence.
>>
>> That means that no changes needed to any driver besides supporting fences.
>>
>> fence_collection's fence doesn't belong to any timeline context, so
>> fence_is_later() and fence_later() are not meant to be called with
>> fence_collections fences.
>>
>> v2: Comments by Daniel Vetter:
>> 	- merge fence_collection_init() and fence_collection_add()
>> 	- only add callbacks at ->enable_signalling()
>> 	- remove fence_collection_put()
>> 	- check for type on to_fence_collection()
>> 	- adjust fence_is_later() and fence_later() to WARN_ON() if they
>> 	are used with collection fences.
>>
>> v3: - Initialize fence_cb.node at fence init.
>>
>>      Comments by Chris Wilson:
>> 	- return "unbound" on fence_collection_get_timeline_name()
>> 	- don't stop adding callbacks if one fails
>> 	- remove redundant !! on fence_collection_enable_signaling()
>> 	- remove redundant () on fence_collection_signaled
>> 	- use fence_default_wait() instead
>>
>> v4 (chk): Rework, simplification and cleanup:
>> 	- Drop FENCE_NO_CONTEXT handling, always allocate a context.
>> 	- Rename to fence_array.
>> 	- Return fixed driver name.
>> 	- Register only one callback at a time.
>> 	- Document that create function takes ownership of array.
> This looks good to me. Dropping NO_CONTEXT was a good idea, also
> registering only one callback makes it looks better.

Thinking about it a bit more, I think we also need to avoid removing the
callback when the fence is released.

That stuff is just a bit too racy (see the comment on the
fence_remove_callback function as well).

I will just grab a reference to the fence while there is any callback 
registered.
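
The idea of pinning the fence while a callback is outstanding can be sketched in plain userspace C. Everything here (ref_array and its helpers) is hypothetical and single-threaded; a kernel version would presumably use the fence's own reference counting rather than a bare counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an aggregate fence with one pending callback. */
struct ref_array {
	unsigned refcount;	/* not atomic: single-threaded illustration only */
	bool cb_registered;
	bool released;		/* stand-in for "memory has been freed" */
};

static void ref_array_get(struct ref_array *a)
{
	assert(a->refcount > 0);
	++a->refcount;
}

static void ref_array_put(struct ref_array *a)
{
	assert(a->refcount > 0);
	if (--a->refcount == 0) {
		/* No callback can be pending here, so nothing needs removing. */
		assert(!a->cb_registered);
		a->released = true;
	}
}

/* Registering the callback takes an extra reference ... */
static void ref_array_enable_signaling(struct ref_array *a)
{
	ref_array_get(a);
	a->cb_registered = true;
}

/* ... which the callback itself drops once it has run. */
static void ref_array_cb(struct ref_array *a)
{
	a->cb_registered = false;
	ref_array_put(a);
}
```

With this scheme the release path never races with a pending callback, because release cannot run until the callback has dropped its reference.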

Also please note that this is only compile tested at the moment. I'm 
still working on integrating it into my code.

Regards,
Christian.

>
> 	Gustavo
Christian König May 23, 2016, 7:32 a.m. UTC | #4
Am 20.05.2016 um 16:42 schrieb Chris Wilson:
> On Fri, May 20, 2016 at 03:56:11PM +0200, Christian König wrote:
>> From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
>>
>> struct fence_collection inherits from struct fence and carries a
>> collection of fences that needs to be waited together.
>>
>> It is useful to translate a sync_file to a fence to remove the complexity
>> of dealing with sync_files on DRM drivers. So even if there are many
>> fences in the sync_file that needs to waited for a commit to happen,
>> they all get added to the fence_collection and passed for DRM use as
>> a standard struct fence.
>>
>> That means that no changes needed to any driver besides supporting fences.
>>
>> fence_collection's fence doesn't belong to any timeline context, so
>> fence_is_later() and fence_later() are not meant to be called with
>> fence_collections fences.
>>
>> v2: Comments by Daniel Vetter:
>> 	- merge fence_collection_init() and fence_collection_add()
>> 	- only add callbacks at ->enable_signalling()
>> 	- remove fence_collection_put()
>> 	- check for type on to_fence_collection()
>> 	- adjust fence_is_later() and fence_later() to WARN_ON() if they
>> 	are used with collection fences.
>>
>> v3: - Initialize fence_cb.node at fence init.
>>
>>      Comments by Chris Wilson:
>> 	- return "unbound" on fence_collection_get_timeline_name()
>> 	- don't stop adding callbacks if one fails
>> 	- remove redundant !! on fence_collection_enable_signaling()
>> 	- remove redundant () on fence_collection_signaled
>> 	- use fence_default_wait() instead
>>
>> v4 (chk): Rework, simplification and cleanup:
>> 	- Drop FENCE_NO_CONTEXT handling, always allocate a context.
>> 	- Rename to fence_array.
>> 	- Return fixed driver name.
>> 	- Register only one callback at a time.
> Why? Even within a driver I expected there to be some amoritization of
> the signaling costs for handling multiple fences at once (at least the
> driver I'm familar with!).
>
> So more just curiousity as to your experience that favours sequential
> enabling.

Just the mundane reason that I want to save the memory for all the
callbacks.

But thinking about it you are probably right that we should enable the 
signaling for all fences immediately. Going to fix this in the next 
version of the patch.
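
A minimal userspace sketch of the "enable everything at once" alternative, assuming a pending counter that the last child fence to signal drives to zero. All toy_* names are hypothetical and none of this is kernel API. Note that the toy stores the callback slot inside each child fence; an array built on struct fence_cb would need one callback structure per child, which is the memory cost mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct toy_fence {
	bool signaled;
	void (*cb)(void *data);	/* at most one callback, for brevity */
	void *cb_data;
};

struct toy_array {
	struct toy_fence base;		/* the aggregate fence */
	atomic_uint num_pending;	/* children that have not signaled yet */
	size_t num_fences;
	struct toy_fence **fences;
};

/* Signal a fence once and run its callback, if one was registered. */
static void toy_signal(struct toy_fence *f)
{
	if (f->signaled)
		return;
	f->signaled = true;
	if (f->cb)
		f->cb(f->cb_data);
}

/* Returns false if the fence has already signaled (nothing installed). */
static bool toy_add_callback(struct toy_fence *f, void (*cb)(void *), void *data)
{
	if (f->signaled)
		return false;
	assert(!f->cb);
	f->cb = cb;
	f->cb_data = data;
	return true;
}

/* Runs once per child; the last one completes the aggregate fence. */
static void toy_array_cb(void *data)
{
	struct toy_array *array = data;

	if (atomic_fetch_sub(&array->num_pending, 1) == 1)
		toy_signal(&array->base);
}

/* Returns false if every child had already signaled. */
static bool toy_array_enable_signaling(struct toy_array *array)
{
	size_t i;

	if (array->num_fences == 0)
		return false;

	atomic_store(&array->num_pending, array->num_fences);
	for (i = 0; i < array->num_fences; ++i) {
		if (toy_add_callback(array->fences[i], toy_array_cb, array))
			continue;
		/* Child already signaled: account for it right away. */
		if (atomic_fetch_sub(&array->num_pending, 1) == 1)
			return false;
	}
	return true;
}
```

Because every child is armed up front, the children can signal in any order and no callback ever has to register another callback.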

>
>> +static bool fence_array_add_next_callback(struct fence_array *array)
>> +{
>> +	while (array->num_signaled < array->num_fences) {
>> +		struct fence *next = array->fences[array->num_signaled];
>> +
>> +		if (!fence_add_callback(next, &array->cb, fence_array_cb_func))
>> +			return true;
>> +
>> +		++array->num_signaled;
>> +	}
>> +
>> +	return false;
>> +}
>> +
>> +static void fence_array_cb_func(struct fence *f, struct fence_cb *cb)
>> +{
>> +	struct fence_array *array = container_of(cb, struct fence_array, cb);
> Some chasing around would have been saved by a
>
> 	assert_spin_locked(&array->lock);
>
> here.

Mhm, actually the array lock isn't held here. Thinking more about it,
adding a new callback from a fence callback can deadlock badly in
certain situations.

I need to double check why the callback is called with the fence lock
held here.

>
>> +
>> +	++array->num_signaled;
>> +	if (!fence_array_add_next_callback(array))
>> +		fence_signal(&array->base);
>> +}
>> +
>> +static bool fence_array_enable_signaling(struct fence *fence)
>> +{
>> +	struct fence_array *array = to_fence_array(fence);
>> +
>> +	return fence_array_add_next_callback(array);
>> +}
>> +
>> +static bool fence_array_signaled(struct fence *fence)
>> +{
>> +	struct fence_array *array = to_fence_array(fence);
>> +
>> +	return ACCESS_ONCE(array->num_signaled) == array->num_fences;
> Can just be READ_ONCE()

Good point, going to fix that.

Christian.

> -Chris
>
Daniel Vetter May 23, 2016, 7:41 a.m. UTC | #5
On Fri, May 20, 2016 at 11:47:28AM -0300, Gustavo Padovan wrote:
> 2016-05-20 Christian König <deathsimple@vodafone.de>:
> 
> > From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
> > 
> > struct fence_collection inherits from struct fence and carries a
> > collection of fences that needs to be waited together.
> > 
> > It is useful to translate a sync_file to a fence to remove the complexity
> > of dealing with sync_files on DRM drivers. So even if there are many
> > fences in the sync_file that needs to waited for a commit to happen,
> > they all get added to the fence_collection and passed for DRM use as
> > a standard struct fence.
> > 
> > That means that no changes needed to any driver besides supporting fences.
> > 
> > fence_collection's fence doesn't belong to any timeline context, so
> > fence_is_later() and fence_later() are not meant to be called with
> > fence_collections fences.
> > 
> > v2: Comments by Daniel Vetter:
> > 	- merge fence_collection_init() and fence_collection_add()
> > 	- only add callbacks at ->enable_signalling()
> > 	- remove fence_collection_put()
> > 	- check for type on to_fence_collection()
> > 	- adjust fence_is_later() and fence_later() to WARN_ON() if they
> > 	are used with collection fences.
> > 
> > v3: - Initialize fence_cb.node at fence init.
> > 
> >     Comments by Chris Wilson:
> > 	- return "unbound" on fence_collection_get_timeline_name()
> > 	- don't stop adding callbacks if one fails
> > 	- remove redundant !! on fence_collection_enable_signaling()
> > 	- remove redundant () on fence_collection_signaled
> > 	- use fence_default_wait() instead
> > 
> > v4 (chk): Rework, simplification and cleanup:
> > 	- Drop FENCE_NO_CONTEXT handling, always allocate a context.
> > 	- Rename to fence_array.
> > 	- Return fixed driver name.
> > 	- Register only one callback at a time.
> > 	- Document that create function takes ownership of array.
> 
> This looks good to me. Dropping NO_CONTEXT was a good idea, also
> registering only one callback makes it looks better.

This will make it even harder to eventually add a real fence_context
structure for tracking and debugging. I know you don't care for amdgpu
since you have amdgpu-specific debug files, and there's some lifetime fun
that makes it not immediately obvious how to resolve it. But on "lots of
shitty little drivers" systems aka SoCs generic debugging information is
crucial I think. Not liking too much where this is going.
-Daniel
Christian König May 23, 2016, 11:29 a.m. UTC | #6
Am 23.05.2016 um 09:41 schrieb Daniel Vetter:
> On Fri, May 20, 2016 at 11:47:28AM -0300, Gustavo Padovan wrote:
>> 2016-05-20 Christian König <deathsimple@vodafone.de>:
>>
>>> From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
>>>
>>> struct fence_collection inherits from struct fence and carries a
>>> collection of fences that needs to be waited together.
>>>
>>> It is useful to translate a sync_file to a fence to remove the complexity
>>> of dealing with sync_files on DRM drivers. So even if there are many
>>> fences in the sync_file that needs to waited for a commit to happen,
>>> they all get added to the fence_collection and passed for DRM use as
>>> a standard struct fence.
>>>
>>> That means that no changes needed to any driver besides supporting fences.
>>>
>>> fence_collection's fence doesn't belong to any timeline context, so
>>> fence_is_later() and fence_later() are not meant to be called with
>>> fence_collections fences.
>>>
>>> v2: Comments by Daniel Vetter:
>>> 	- merge fence_collection_init() and fence_collection_add()
>>> 	- only add callbacks at ->enable_signalling()
>>> 	- remove fence_collection_put()
>>> 	- check for type on to_fence_collection()
>>> 	- adjust fence_is_later() and fence_later() to WARN_ON() if they
>>> 	are used with collection fences.
>>>
>>> v3: - Initialize fence_cb.node at fence init.
>>>
>>>      Comments by Chris Wilson:
>>> 	- return "unbound" on fence_collection_get_timeline_name()
>>> 	- don't stop adding callbacks if one fails
>>> 	- remove redundant !! on fence_collection_enable_signaling()
>>> 	- remove redundant () on fence_collection_signaled
>>> 	- use fence_default_wait() instead
>>>
>>> v4 (chk): Rework, simplification and cleanup:
>>> 	- Drop FENCE_NO_CONTEXT handling, always allocate a context.
>>> 	- Rename to fence_array.
>>> 	- Return fixed driver name.
>>> 	- Register only one callback at a time.
>>> 	- Document that create function takes ownership of array.
>> This looks good to me. Dropping NO_CONTEXT was a good idea, also
>> registering only one callback makes it looks better.
> This will make it even harder to eventually add a real fence_context
> structure for tracking and debugging. I know you don't care for amdgpu
> since you have amdgpu-specific debug files, and there's some lifetime fun
> that makes it not immediately obvious how to resolve it.

Completely independent of my work on amdgpu I still think that it's not 
such a good idea to use a complex structure for the fence context.

Especially on SoCs and small embedded systems you probably don't want the
overhead associated with that only for debugging purposes in a
production environment.

> But on "lots of
> shitty little drivers" systems aka SoCs generic debugging information is
> crucial I think. Not liking too much where this is going.

Yeah, I agree that generic debugging information is usually crucial, but
the lifetime issues indeed can't be solved without reference counting
and a whole bunch of overhead.

How about V5 of the patch, which I've just sent out? Apart from fixing a
few issues I've made the context and sequence number parameters of the
fence_array object.

This way you don't need to always allocate a new context for each
object, but just enough to keep your timelines straight.

That is, you don't end up with a lot of contexts that are each used only
once. This is at least sufficient for my amdgpu use case.

Regards,
Christian.

> -Daniel
Daniel Vetter May 23, 2016, 2 p.m. UTC | #7
On Mon, May 23, 2016 at 01:29:11PM +0200, Christian König wrote:
> Am 23.05.2016 um 09:41 schrieb Daniel Vetter:
> >On Fri, May 20, 2016 at 11:47:28AM -0300, Gustavo Padovan wrote:
> >>2016-05-20 Christian König <deathsimple@vodafone.de>:
> >>
> >>>From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
> >>>
> >>>struct fence_collection inherits from struct fence and carries a
> >>>collection of fences that needs to be waited together.
> >>>
> >>>It is useful to translate a sync_file to a fence to remove the complexity
> >>>of dealing with sync_files on DRM drivers. So even if there are many
> >>>fences in the sync_file that needs to waited for a commit to happen,
> >>>they all get added to the fence_collection and passed for DRM use as
> >>>a standard struct fence.
> >>>
> >>>That means that no changes needed to any driver besides supporting fences.
> >>>
> >>>fence_collection's fence doesn't belong to any timeline context, so
> >>>fence_is_later() and fence_later() are not meant to be called with
> >>>fence_collections fences.
> >>>
> >>>v2: Comments by Daniel Vetter:
> >>>	- merge fence_collection_init() and fence_collection_add()
> >>>	- only add callbacks at ->enable_signalling()
> >>>	- remove fence_collection_put()
> >>>	- check for type on to_fence_collection()
> >>>	- adjust fence_is_later() and fence_later() to WARN_ON() if they
> >>>	are used with collection fences.
> >>>
> >>>v3: - Initialize fence_cb.node at fence init.
> >>>
> >>>     Comments by Chris Wilson:
> >>>	- return "unbound" on fence_collection_get_timeline_name()
> >>>	- don't stop adding callbacks if one fails
> >>>	- remove redundant !! on fence_collection_enable_signaling()
> >>>	- remove redundant () on fence_collection_signaled
> >>>	- use fence_default_wait() instead
> >>>
> >>>v4 (chk): Rework, simplification and cleanup:
> >>>	- Drop FENCE_NO_CONTEXT handling, always allocate a context.
> >>>	- Rename to fence_array.
> >>>	- Return fixed driver name.
> >>>	- Register only one callback at a time.
> >>>	- Document that create function takes ownership of array.
> >>This looks good to me. Dropping NO_CONTEXT was a good idea, also
> >>registering only one callback makes it looks better.
> >This will make it even harder to eventually add a real fence_context
> >structure for tracking and debugging. I know you don't care for amdgpu
> >since you have amdgpu-specific debug files, and there's some lifetime fun
> >that makes it not immediately obvious how to resolve it.
> 
> Completely independent of my work on amdgpu I still think that it's not such
> a good idea to use a complex structure for the fence context.
> 
> Especially on SoCs and small embedded systems you probably don't want to
> overhead associated with that only for debugging purposes in a production
> environment.

At least in all the drivers I've seen you have to allocate a little bit of
stuff _anyway_ to store that context id, plus a bunch of hw state. So the
allocation itself shouldn't be a problem at all, since that can be handled
by embedding.

If it's the atomic inc/dec for refcounting you're worried about, then that
could be made dependent on CONFIG_FENCE_DEBUGGING. Android didn't even have
that Kconfig knob, and people seemingly didn't care about the
overhead on ARM SoCs.

> >But on "lots of
> >shitty little drivers" systems aka SoCs generic debugging information is
> >crucial I think. Not liking too much where this is going.
> 
> Yeah I agree that generic debugging information is usually crucial, but the
> lifetime issues indeed can't be solved without reference counting and a hole
> bunch of overhead.
> 
> How about V5 of the patch I've just send out? Apart from fixing a few issues
> I've made the context and sequence number parameters of the fence_array
> object.
> 
> This way you don't need to always allocate a new context for each object,
> but just enough to keep your timelines straight.
> 
> E.g. you don't get a lot of contexts only used once. This is at least
> sufficient for my amdgpu use case.

Well my idea behind NO_CONTEXT was that this way there'd be only one
special context for merged fences, which could have a linked list of all
fences ever (with debugging stuff). That way there's no need to allocate
one context per fence_array. It has the downside of a slightly leaky
abstraction, as in you must use the provided interface functions to figure
out whether a fence is on the same timeline, and if so, which one is
later.
-Daniel
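
The "use the provided interface functions" point can be sketched as follows. The tl_* names are hypothetical and this is not the kernel's fence_is_later()/fence_later(); it only illustrates that ordering is defined per timeline, so a comparison must first check that both fences share a context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the (context, seqno) pair carried by a fence. */
struct tl_fence {
	unsigned context;	/* timeline the fence belongs to */
	uint32_t seqno;		/* position on that timeline, may wrap around */
};

static bool tl_same_timeline(const struct tl_fence *a, const struct tl_fence *b)
{
	return a->context == b->context;
}

/*
 * True if a comes after b. Only meaningful on a shared timeline. The signed
 * difference keeps the comparison correct across seqno wrap-around (this
 * assumes the usual two's-complement conversion).
 */
static bool tl_is_later(const struct tl_fence *a, const struct tl_fence *b)
{
	assert(tl_same_timeline(a, b));
	return (int32_t)(a->seqno - b->seqno) > 0;
}

/* Returns the later fence, or NULL if the two cannot be ordered. */
static const struct tl_fence *tl_later(const struct tl_fence *a,
				       const struct tl_fence *b)
{
	if (!tl_same_timeline(a, b))
		return NULL;
	return tl_is_later(a, b) ? a : b;
}
```

A caller that receives NULL has to keep both fences, which is exactly the situation a merged fence without its own timeline would create.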

Patch

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 57a675f..85f6928 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1 +1 @@ 
-obj-y := dma-buf.o fence.o reservation.o seqno-fence.o
+obj-y := dma-buf.o fence.o reservation.o seqno-fence.o fence-array.o
diff --git a/drivers/dma-buf/fence-array.c b/drivers/dma-buf/fence-array.c
new file mode 100644
index 0000000..a700c6e
--- /dev/null
+++ b/drivers/dma-buf/fence-array.c
@@ -0,0 +1,132 @@ 
+/*
+ * fence-array: aggregate fences to be waited together
+ *
+ * Copyright (C) 2016 Collabora Ltd
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ * Authors:
+ *	Gustavo Padovan <gustavo@padovan.org>
+ *	Christian König <christian.koenig@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/fence-array.h>
+
+static void fence_array_cb_func(struct fence *f, struct fence_cb *cb);
+
+static const char *fence_array_get_driver_name(struct fence *fence)
+{
+	return "fence_array";
+}
+
+static const char *fence_array_get_timeline_name(struct fence *fence)
+{
+	return "unbound";
+}
+
+static bool fence_array_add_next_callback(struct fence_array *array)
+{
+	while (array->num_signaled < array->num_fences) {
+		struct fence *next = array->fences[array->num_signaled];
+
+		if (!fence_add_callback(next, &array->cb, fence_array_cb_func))
+			return true;
+
+		++array->num_signaled;
+	}
+
+	return false;
+}
+
+static void fence_array_cb_func(struct fence *f, struct fence_cb *cb)
+{
+	struct fence_array *array = container_of(cb, struct fence_array, cb);
+
+	++array->num_signaled;
+	if (!fence_array_add_next_callback(array))
+		fence_signal(&array->base);
+}
+
+static bool fence_array_enable_signaling(struct fence *fence)
+{
+	struct fence_array *array = to_fence_array(fence);
+
+	return fence_array_add_next_callback(array);
+}
+
+static bool fence_array_signaled(struct fence *fence)
+{
+	struct fence_array *array = to_fence_array(fence);
+
+	return ACCESS_ONCE(array->num_signaled) == array->num_fences;
+}
+
+static void fence_array_release(struct fence *fence)
+{
+	struct fence_array *array = to_fence_array(fence);
+	unsigned i;
+
+	i = ACCESS_ONCE(array->num_signaled);
+	if (i < array->num_fences) {
+		struct fence *last = array->fences[i];
+
+		fence_remove_callback(last, &array->cb);
+	}
+
+	for (i = 0; i < array->num_fences; ++i)
+		fence_put(array->fences[i]);
+
+	kfree(array->fences);
+	fence_free(fence);
+}
+
+const struct fence_ops fence_array_ops = {
+	.get_driver_name = fence_array_get_driver_name,
+	.get_timeline_name = fence_array_get_timeline_name,
+	.enable_signaling = fence_array_enable_signaling,
+	.signaled = fence_array_signaled,
+	.wait = fence_default_wait,
+	.release = fence_array_release,
+};
+
+/**
+ * fence_array_create - Create a custom fence array
+ * @num_fences:	[in]	number of fences to add in the array
+ * @fences:	[in]	array containing the fences
+ *
+ * Allocate a fence_array object and initialize the base fence with fence_init().
+ * In case of error it returns NULL.
+ *
+ * The caller should allocate the fences array with num_fences size
+ * and fill it with the fences it wants to add to the object. Ownership of this
+ * array is taken and fence_put() is used on each fence on release.
+ */
+struct fence_array *fence_array_create(int num_fences, struct fence **fences)
+{
+	struct fence_array *array;
+	int i;
+
+	array = kzalloc(sizeof(*array), GFP_KERNEL);
+	if (!array)
+		return NULL;
+
+	spin_lock_init(&array->lock);
+	fence_init(&array->base, &fence_array_ops, &array->lock,
+		   fence_context_alloc(1), 0);
+
+	array->num_signaled = 0;
+	array->num_fences = num_fences;
+	array->fences = fences;
+
+	return array;
+}
+EXPORT_SYMBOL(fence_array_create);
diff --git a/include/linux/fence-array.h b/include/linux/fence-array.h
new file mode 100644
index 0000000..f1daeb3
--- /dev/null
+++ b/include/linux/fence-array.h
@@ -0,0 +1,62 @@ 
+/*
+ * fence-array: aggregate fences to be waited together
+ *
+ * Copyright (C) 2016 Collabora Ltd
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ * Authors:
+ *	Gustavo Padovan <gustavo@padovan.org>
+ *	Christian König <christian.koenig@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#ifndef __LINUX_FENCE_ARRAY_H
+#define __LINUX_FENCE_ARRAY_H
+
+#include <linux/fence.h>
+
+/**
+ * struct fence_array - fence to represent an array of fences
+ * @base: fence base class
+ * @lock: spinlock for fence handling
+ * @cb: fence callback structure for signaling
+ * @num_signaled: fences in the array already signaled
+ * @num_fences: number of fences in the array
+ * @fences: array of the fences
+ */
+struct fence_array {
+	struct fence base;
+
+	spinlock_t lock;
+	struct fence_cb cb;
+	unsigned num_signaled;
+	unsigned num_fences;
+	struct fence **fences;
+};
+
+extern const struct fence_ops fence_array_ops;
+
+/**
+ * to_fence_array - cast a fence to a fence_array
+ * @fence: fence to cast to a fence_array
+ *
+ * Returns NULL if the fence is not a fence_array,
+ * or the fence_array otherwise.
+ */
+static inline struct fence_array *to_fence_array(struct fence *fence)
+{
+	if (fence->ops != &fence_array_ops)
+		return NULL;
+	return container_of(fence, struct fence_array, base);
+}
+
+struct fence_array *fence_array_create(int num_fences, struct fence **fences);
+
+#endif /* __LINUX_FENCE_ARRAY_H */