Message ID: 1489662310.2303.21.camel@pengutronix.de (mailing list archive)
State: New, archived
2017-03-16 Philipp Zabel <p.zabel@pengutronix.de>:

> Hi Gustavo,
>
> On Mon, 2017-03-13 at 14:37 -0300, Gustavo Padovan wrote:
> [...]
> > I was thinking on some function that would iterate over all fences in the fence_array and check their context. Then if we find our own gpu context in there we fail the submit.
>
> Why would we have to fail if somebody feeds us our own fences? Wouldn't it be enough to just wait if there are foreign fences in the array?

You don't necessarily need to fail. I had in mind the use case that some driver could deadlock waiting for its own fence. What you do with the information that the array contains a fence from the driver's own context is entirely up to the driver to decide.

> How about something like this:
>
> ----------8<----------
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 364fe50d020de..0b0bdaf4406d4 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -296,6 +296,22 @@ static void submit_cleanup(struct etnaviv_gem_submit *submit)
>  	kfree(submit);
>  }
>
> +static bool dma_fence_all_in_context(struct dma_fence *fence, u64 context)
> +{
> +	if (dma_fence_is_array(fence)) {
> +		struct dma_fence_array *array = to_dma_fence_array(fence);
> +		int i;
> +
> +		for (i = 0; i < array->num_fences; i++) {
> +			if (array->fences[i]->context != context)
> +				return false;
> +		}
> +		return true;
> +	}
> +
> +	return fence->context == context;
> +}

If we don't mind having fences with our own context, why should we check them?

Gustavo
On Fri, 2017-03-17 at 11:00 -0300, Gustavo Padovan wrote:
> 2017-03-16 Philipp Zabel <p.zabel@pengutronix.de>:
>
> > Hi Gustavo,
> >
> > On Mon, 2017-03-13 at 14:37 -0300, Gustavo Padovan wrote:
> > [...]
> > > I was thinking on some function that would iterate over all fences in the fence_array and check their context. Then if we find our own gpu context in there we fail the submit.
> >
> > Why would we have to fail if somebody feeds us our own fences? Wouldn't it be enough to just wait if there are foreign fences in the array?
>
> You don't necessarily need to fail. I had in mind the use case that some driver could deadlock waiting for its own fence. What you do with the information that the array contains a fence from the driver's own context is entirely up to the driver to decide.

[...]

> If we don't mind having fences with our own context, why should we check them?

My understanding was that for foreign fences we have to wait, for example before resolving into a framebuffer that is still being scanned out. But if we are fed fences from our own context, we can just ignore them: etnaviv has only a single command queue, so anything waiting for a fence from our own context will be queued after that fence has signaled anyway.

regards
Philipp
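The check being debated here can be sketched outside the kernel with stand-in structs. This is a simplified userspace model of the proposed dma_fence_all_in_context() helper, not the real kernel types (the real struct dma_fence carries much more state):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's fence structures. */
struct fence {
	uint64_t context;      /* timeline the fence belongs to */
	struct fence **fences; /* non-NULL if this models a merged fence array */
	size_t num_fences;
};

/* Mirrors the proposed helper: true iff the fence, or every backing
 * fence of an array, belongs to the given context. */
static bool fence_all_in_context(const struct fence *f, uint64_t context)
{
	if (f->fences) {
		for (size_t i = 0; i < f->num_fences; i++) {
			if (f->fences[i]->context != context)
				return false;
		}
		return true;
	}
	return f->context == context;
}
```

A driver with a single hardware queue can skip the CPU wait when this returns true, since its own fences are ordered by the queue anyway; any foreign fence in the array forces a wait.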
On Thursday, 2017-03-16 at 12:05 +0100, Philipp Zabel wrote:
> Hi Gustavo,
>
> On Mon, 2017-03-13 at 14:37 -0300, Gustavo Padovan wrote:
> [...]
> > I was thinking on some function that would iterate over all fences in the fence_array and check their context. Then if we find our own gpu context in there we fail the submit.
>
> Why would we have to fail if somebody feeds us our own fences? Wouldn't it be enough to just wait if there are foreign fences in the array?

Yes, skipping the wait if all fences are from our own context is an optimization, and it's certainly not an issue if someone feeds us our own fences.

> How about something like this:
>
> ----------8<----------
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 364fe50d020de..0b0bdaf4406d4 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -296,6 +296,22 @@ static void submit_cleanup(struct etnaviv_gem_submit *submit)
>  	kfree(submit);
>  }
>
> +static bool dma_fence_all_in_context(struct dma_fence *fence, u64 context)

dma_fence_match_context?

> +{
> +	if (dma_fence_is_array(fence)) {
> +		struct dma_fence_array *array = to_dma_fence_array(fence);
> +		int i;
> +
> +		for (i = 0; i < array->num_fences; i++) {
> +			if (array->fences[i]->context != context)
> +				return false;
> +		}
> +		return true;
> +	}
> +
> +	return fence->context == context;
> +}
> +

This looks good to me. Please add this to a common location in the next submission.

> int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
>  		struct drm_file *file)
> {
> @@ -413,12 +429,11 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
>  		goto err_submit_objects;
>  	}
>
> -	/* TODO if we get an array-fence due to userspace merging
> -	 * multiple fences, we need a way to determine if all the
> -	 * backing fences are from our own context..
> +	/*
> +	 * Wait if the fence is from a foreign context, or if the fence
> +	 * array contains any fence from a foreign context.
>  	 */
> -
> -	if (in_fence->context != gpu->fence_context) {
> +	if (!dma_fence_all_in_context(in_fence, gpu->fence_context)) {
>  		ret = dma_fence_wait(in_fence, true);
>  		if (ret)
>  			goto err_submit_objects;
> ---------->8----------
>
> [...]
> > > > > @@ -154,6 +154,10 @@ struct drm_etnaviv_gem_submit_bo {
> > > > >  * one or more cmdstream buffers. This allows for conditional execution
> > > > >  * (context-restore), and IB buffers needed for per tile/bin draw cmds.
> > > > >  */
> > > > > +#define ETNA_SUBMIT_NO_IMPLICIT 0x0001
> > > > > +#define ETNA_SUBMIT_FENCE_FD_IN 0x0002
> > > >
> > > > ETNA_SUBMIT_NO_IMPLICIT and ETNA_SUBMIT_FENCE_FD_IN are the same: when you send a fence_fd to the kernel you are requesting explicit sync, thus I think ETNA_SUBMIT_NO_IMPLICIT can be dropped and ETNA_SUBMIT_FENCE_FD_IN would be the one to look at.
> > >
> > > I just followed Rob's lead. I'm not sure if there may be a reason to submit an in fence and still keep implicit fencing enabled at the same time, or to allow a submit without any fencing whatsoever. Maybe for testing purposes? I'm happy to drop the NO_IMPLICIT flag if there is no need for it.
> >
> > Right. My understanding is that the flags would mean the same thing, but I'm no expert on the GPU side of things. Maybe Rob can provide us some insight on why he added NO_IMPLICIT.
>
> Yes, that would be welcome.
On Fri, Mar 17, 2017 at 03:10:21PM +0100, Lucas Stach wrote:
> On Thursday, 2017-03-16 at 12:05 +0100, Philipp Zabel wrote:
> > Hi Gustavo,
> >
> > On Mon, 2017-03-13 at 14:37 -0300, Gustavo Padovan wrote:
> > [...]
> > > I was thinking on some function that would iterate over all fences in the fence_array and check their context. Then if we find our own gpu context in there we fail the submit.
> >
> > Why would we have to fail if somebody feeds us our own fences? Wouldn't it be enough to just wait if there are foreign fences in the array?
>
> Yes, skipping the wait if all fences are from our own context is an optimization, and it's certainly not an issue if someone feeds us our own fences.

Are you sure about that? What if we have two GPUs, a 2D and a 3D GPU, and we're fed an etnaviv fence for one GPU when submitting to the other GPU?

So we do end up being fed our own fences, and we have to respect them, otherwise we lose inter-GPU synchronisation, and that will break existing userspace.
On Friday, 2017-03-17 at 14:42 +0000, Russell King - ARM Linux wrote:
> On Fri, Mar 17, 2017 at 03:10:21PM +0100, Lucas Stach wrote:
> > On Thursday, 2017-03-16 at 12:05 +0100, Philipp Zabel wrote:
> > > Hi Gustavo,
> > >
> > > On Mon, 2017-03-13 at 14:37 -0300, Gustavo Padovan wrote:
> > > [...]
> > > > I was thinking on some function that would iterate over all fences in the fence_array and check their context. Then if we find our own gpu context in there we fail the submit.
> > >
> > > Why would we have to fail if somebody feeds us our own fences? Wouldn't it be enough to just wait if there are foreign fences in the array?
> >
> > Yes, skipping the wait if all fences are from our own context is an optimization, and it's certainly not an issue if someone feeds us our own fences.
>
> Are you sure about that? What if we have two GPUs, a 2D and a 3D GPU, and we're fed an etnaviv fence for one GPU when submitting to the other GPU?
>
> So we do end up being fed our own fences, and we have to respect them, otherwise we lose inter-GPU synchronisation, and that will break existing userspace.

The etnaviv GPUs, while being on the same DRM device, have distinct fence contexts. So the 3D GPU will consider a fence from the 2D GPU as foreign and properly wait on it.

It's only when we get an in fence that has been generated as an out fence by one (or multiple) submits to the same GPU that we are able to skip the wait and enqueue the command without waiting for the fence to signal.

Regards,
Lucas
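The point about distinct per-GPU contexts can be modelled with a monotonic context allocator, loosely following the kernel's dma_fence_context_alloc(). This is a hedged userspace sketch with invented simplified types, not the etnaviv driver's actual structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Global monotonic counter, in the spirit of the kernel's
 * dma_fence_context_alloc(). */
static uint64_t next_fence_context = 1;

static uint64_t fence_context_alloc(void)
{
	return next_fence_context++;
}

/* Each GPU core gets its own fence context at init time, even though
 * both cores (e.g. the 2D and the 3D GPU) sit on the same DRM device. */
struct gpu_core {
	uint64_t fence_context;
};

static void gpu_core_init(struct gpu_core *core)
{
	core->fence_context = fence_context_alloc();
}

/* A fence whose context differs from ours is foreign and must be
 * waited on; only fences from our own context may skip the wait. */
static bool fence_is_foreign(const struct gpu_core *core, uint64_t fence_ctx)
{
	return fence_ctx != core->fence_context;
}
```

Under this model a fence emitted by the 2D core is automatically foreign to the 3D core, which is why the skip-own-context optimization does not break 2D/3D synchronisation.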
On Fri, Mar 17, 2017 at 03:58:27PM +0100, Lucas Stach wrote:
> On Friday, 2017-03-17 at 14:42 +0000, Russell King - ARM Linux wrote:
> > On Fri, Mar 17, 2017 at 03:10:21PM +0100, Lucas Stach wrote:
> > > On Thursday, 2017-03-16 at 12:05 +0100, Philipp Zabel wrote:
> > > > Hi Gustavo,
> > > >
> > > > On Mon, 2017-03-13 at 14:37 -0300, Gustavo Padovan wrote:
> > > > [...]
> > > > > I was thinking on some function that would iterate over all fences in the fence_array and check their context. Then if we find our own gpu context in there we fail the submit.
> > > >
> > > > Why would we have to fail if somebody feeds us our own fences? Wouldn't it be enough to just wait if there are foreign fences in the array?
> > >
> > > Yes, skipping the wait if all fences are from our own context is an optimization, and it's certainly not an issue if someone feeds us our own fences.
> >
> > Are you sure about that? What if we have two GPUs, a 2D and a 3D GPU, and we're fed an etnaviv fence for one GPU when submitting to the other GPU?
> >
> > So we do end up being fed our own fences, and we have to respect them, otherwise we lose inter-GPU synchronisation, and that will break existing userspace.
>
> The etnaviv GPUs, while being on the same DRM device, have distinct fence contexts. So the 3D GPU will consider a fence from the 2D GPU as foreign and properly wait on it.
>
> It's only when we get an in fence that has been generated as an out fence by one (or multiple) submits to the same GPU that we are able to skip the wait and enqueue the command without waiting for the fence to signal.

Sounds fine then. Thanks.
On Fri, Mar 17, 2017 at 7:58 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
> On Friday, 2017-03-17 at 14:42 +0000, Russell King - ARM Linux wrote:
>> On Fri, Mar 17, 2017 at 03:10:21PM +0100, Lucas Stach wrote:
>>> On Thursday, 2017-03-16 at 12:05 +0100, Philipp Zabel wrote:
>>>> Hi Gustavo,
>>>>
>>>> On Mon, 2017-03-13 at 14:37 -0300, Gustavo Padovan wrote:
>>>> [...]
>>>>> I was thinking on some function that would iterate over all fences in the fence_array and check their context. Then if we find our own gpu context in there we fail the submit.
>>>>
>>>> Why would we have to fail if somebody feeds us our own fences? Wouldn't it be enough to just wait if there are foreign fences in the array?
>>>
>>> Yes, skipping the wait if all fences are from our own context is an optimization, and it's certainly not an issue if someone feeds us our own fences.
>>
>> Are you sure about that? What if we have two GPUs, a 2D and a 3D GPU, and we're fed an etnaviv fence for one GPU when submitting to the other GPU?
>>
>> So we do end up being fed our own fences, and we have to respect them, otherwise we lose inter-GPU synchronisation, and that will break existing userspace.
>
> The etnaviv GPUs, while being on the same DRM device, have distinct fence contexts. So the 3D GPU will consider a fence from the 2D GPU as foreign and properly wait on it.
>
> It's only when we get an in fence that has been generated as an out fence by one (or multiple) submits to the same GPU that we are able to skip the wait and enqueue the command without waiting for the fence to signal.

With regard to the 2D and 3D GPU case, it seems to me that a good example use case would be where the 3D GPU is used in Android for all the surface generation using OpenGL, and the 2D GPU then gets used to composite all those surfaces together, leaving the 3D GPU free to work on other things. As I understand it, the 2D GPU is much faster at 2D compositing than the 3D GPU would be, not to mention less power hungry.

> Regards,
> Lucas
>
> _______________________________________________
> etnaviv mailing list
> etnaviv@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/etnaviv
On 17.03.2017 at 15:58, Lucas Stach wrote:
> On Friday, 2017-03-17 at 14:42 +0000, Russell King - ARM Linux wrote:
>> On Fri, Mar 17, 2017 at 03:10:21PM +0100, Lucas Stach wrote:
>>> On Thursday, 2017-03-16 at 12:05 +0100, Philipp Zabel wrote:
>>>> Hi Gustavo,
>>>>
>>>> On Mon, 2017-03-13 at 14:37 -0300, Gustavo Padovan wrote:
>>>> [...]
>>>>> I was thinking on some function that would iterate over all fences in the fence_array and check their context. Then if we find our own gpu context in there we fail the submit.
>>>> Why would we have to fail if somebody feeds us our own fences? Wouldn't it be enough to just wait if there are foreign fences in the array?
>>> Yes, skipping the wait if all fences are from our own context is an optimization, and it's certainly not an issue if someone feeds us our own fences.
>> Are you sure about that? What if we have two GPUs, a 2D and a 3D GPU, and we're fed an etnaviv fence for one GPU when submitting to the other GPU?
>>
>> So we do end up being fed our own fences, and we have to respect them, otherwise we lose inter-GPU synchronisation, and that will break existing userspace.
>
> The etnaviv GPUs, while being on the same DRM device, have distinct fence contexts. So the 3D GPU will consider a fence from the 2D GPU as foreign and properly wait on it.
>
> It's only when we get an in fence that has been generated as an out fence by one (or multiple) submits to the same GPU that we are able to skip the wait and enqueue the command without waiting for the fence to signal.

BTW: Do you still have a need for a GPU scheduler?

The scheduler amdgpu uses is hopefully still hardware agnostic and has all that handling already included.

Using it can avoid blocking for foreign fences during your command submission, and I won't mind seeing that moved into common drm code.

Regards,
Christian.

> Regards,
> Lucas
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Saturday, 2017-03-18 at 15:19 +0100, Christian König wrote:
> On 17.03.2017 at 15:58, Lucas Stach wrote:
> > On Friday, 2017-03-17 at 14:42 +0000, Russell King - ARM Linux wrote:
> > > On Fri, Mar 17, 2017 at 03:10:21PM +0100, Lucas Stach wrote:
> > > > On Thursday, 2017-03-16 at 12:05 +0100, Philipp Zabel wrote:
> > > > > Hi Gustavo,
> > > > >
> > > > > On Mon, 2017-03-13 at 14:37 -0300, Gustavo Padovan wrote:
> > > > > [...]
> > > > > > I was thinking on some function that would iterate over all fences in the fence_array and check their context. Then if we find our own gpu context in there we fail the submit.
> > > > >
> > > > > Why would we have to fail if somebody feeds us our own fences? Wouldn't it be enough to just wait if there are foreign fences in the array?
> > > >
> > > > Yes, skipping the wait if all fences are from our own context is an optimization, and it's certainly not an issue if someone feeds us our own fences.
> > >
> > > Are you sure about that? What if we have two GPUs, a 2D and a 3D GPU, and we're fed an etnaviv fence for one GPU when submitting to the other GPU?
> > >
> > > So we do end up being fed our own fences, and we have to respect them, otherwise we lose inter-GPU synchronisation, and that will break existing userspace.
> >
> > The etnaviv GPUs, while being on the same DRM device, have distinct fence contexts. So the 3D GPU will consider a fence from the 2D GPU as foreign and properly wait on it.
> >
> > It's only when we get an in fence that has been generated as an out fence by one (or multiple) submits to the same GPU that we are able to skip the wait and enqueue the command without waiting for the fence to signal.
>
> BTW: Do you still have a need for a GPU scheduler?
>
> The scheduler amdgpu uses is hopefully still hardware agnostic and has all that handling already included.
>
> Using it can avoid blocking for foreign fences during your command submission, and I won't mind seeing that moved into common drm code.

Yes, it's still on my list of features to enable for etnaviv. It's just that other things, like enabling more hardware and getting performance up, had priority over this.

I've looked at the amdgpu scheduler and I agree that it's probably the right thing to move this out into common code and make use of it in etnaviv.

Regards,
Lucas
On Fri, Mar 17, 2017 at 11:00:44AM -0300, Gustavo Padovan wrote:
> 2017-03-16 Philipp Zabel <p.zabel@pengutronix.de>:
>
> > Hi Gustavo,
> >
> > On Mon, 2017-03-13 at 14:37 -0300, Gustavo Padovan wrote:
> > [...]
> > > I was thinking on some function that would iterate over all fences in the fence_array and check their context. Then if we find our own gpu context in there we fail the submit.
> >
> > Why would we have to fail if somebody feeds us our own fences? Wouldn't it be enough to just wait if there are foreign fences in the array?
>
> You don't necessarily need to fail. I had in mind the use case that some driver could deadlock waiting for its own fence. What you do with the information that the array contains a fence from the driver's own context is entirely up to the driver to decide.

With the current infrastructure you can't have future fences (i.e. a fence for something that might happen at some point in the future, but which no driver is yet committed to execute, i.e. it's not yet queued). And without future fences you can't ever have deadlocks.

The "are these all my own fences" check is more useful to import the fences into your own internal objects and use your driver-internal dependency handling (which usually means more hw waiting, less cpu waiting). But that's entirely orthogonal to "can we deadlock", which at the moment is "no"[1].

-Daniel

[1]: You can deadlock when e.g. one driver holds a lock while waiting for a fence, and another driver needs that lock before it is willing to signal said fence. But that's not any different from waiting for anything else in the kernel, and will be checked by the cross-release lockdep stuff real soon now.

> > How about something like this:
> >
> > ----------8<----------
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > index 364fe50d020de..0b0bdaf4406d4 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> > @@ -296,6 +296,22 @@ static void submit_cleanup(struct etnaviv_gem_submit *submit)
> >  	kfree(submit);
> >  }
> >
> > +static bool dma_fence_all_in_context(struct dma_fence *fence, u64 context)
> > +{
> > +	if (dma_fence_is_array(fence)) {
> > +		struct dma_fence_array *array = to_dma_fence_array(fence);
> > +		int i;
> > +
> > +		for (i = 0; i < array->num_fences; i++) {
> > +			if (array->fences[i]->context != context)
> > +				return false;
> > +		}
> > +		return true;
> > +	}
> > +
> > +	return fence->context == context;
> > +}
>
> If we don't mind having fences with our own context, why should we check them?
>
> Gustavo
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 364fe50d020de..0b0bdaf4406d4 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -296,6 +296,22 @@ static void submit_cleanup(struct etnaviv_gem_submit *submit)
 	kfree(submit);
 }
 
+static bool dma_fence_all_in_context(struct dma_fence *fence, u64 context)
+{
+	if (dma_fence_is_array(fence)) {
+		struct dma_fence_array *array = to_dma_fence_array(fence);
+		int i;
+
+		for (i = 0; i < array->num_fences; i++) {
+			if (array->fences[i]->context != context)
+				return false;
+		}
+		return true;
+	}
+
+	return fence->context == context;
+}
+
 int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
 		struct drm_file *file)
 {
@@ -413,12 +429,11 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
 		goto err_submit_objects;
 	}
 
-	/* TODO if we get an array-fence due to userspace merging
-	 * multiple fences, we need a way to determine if all the
-	 * backing fences are from our own context..
+	/*
+	 * Wait if the fence is from a foreign context, or if the fence
+	 * array contains any fence from a foreign context.
 	 */
-
-	if (in_fence->context != gpu->fence_context) {
+	if (!dma_fence_all_in_context(in_fence, gpu->fence_context)) {
 		ret = dma_fence_wait(in_fence, true);
 		if (ret)
 			goto err_submit_objects;
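The submit-path decision the patch implements — block on the CPU only when the in-fence (or any backing fence of a merged array) comes from a foreign context — can be modelled outside the kernel as follows. This is a simplified sketch with stand-in types, not the real etnaviv structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in fence type; a merged sync_file is modelled as an array. */
struct fence {
	uint64_t context;
	struct fence **fences; /* non-NULL for a merged fence array */
	size_t num_fences;
};

static bool fence_all_in_context(const struct fence *f, uint64_t context)
{
	if (f->fences) {
		for (size_t i = 0; i < f->num_fences; i++) {
			if (f->fences[i]->context != context)
				return false;
		}
		return true;
	}
	return f->context == context;
}

/* Models the patched submit logic: fences from this GPU's own context
 * are ordered by its single hardware queue anyway, so only a foreign
 * fence (anywhere in the array) requires a blocking wait before the
 * command is queued. */
static bool submit_needs_cpu_wait(const struct fence *in_fence,
				  uint64_t gpu_fence_context)
{
	return !fence_all_in_context(in_fence, gpu_fence_context);
}
```

Note how a merged array containing even one foreign fence triggers the wait, which is exactly the case the removed TODO comment was worried about.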