[RFC,7/9] drm/i915: Interrupt driven fences

Message ID 1437143483-6234-8-git-send-email-John.C.Harrison@Intel.com (mailing list archive)
State New, archived

Commit Message

John Harrison July 17, 2015, 2:31 p.m. UTC
From: John Harrison <John.C.Harrison@Intel.com>

The intended usage model for struct fence is that the signalled status should be
set on demand rather than polled. That is, there should not be a need for a
'signaled' function to be called every time the status is queried. Instead,
'something' should be done to enable a signal callback from the hardware which
will update the state directly. In the case of requests, this is the seqno
update interrupt. The idea is that this callback will only be enabled on demand
when something actually tries to wait on the fence.

This change removes the polling test and replaces it with the callback scheme.
Each fence is added to a 'please poke me' list at the start of
i915_add_request(). The interrupt handler then scans through the 'poke me' list
when a new seqno pops out and signals any matching fence/request. The fence is
then removed from the list so the entire request stack does not need to be
scanned every time. Note that the fence is added to the list before the commands
to generate the seqno interrupt are added to the ring. Thus the sequence is
guaranteed to be race free if the interrupt is already enabled.

Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
called). Thus there is still a potential race when enabling the interrupt as the
request may already have completed. However, this is simply solved by calling
the interrupt processing code immediately after enabling the interrupt and
thereby checking for already completed requests.

Lastly, the ring clean-up code may need to cancel outstanding requests (e.g.
because TDR has reset the ring). These requests will never get signalled and so
must be removed from the signal list manually. This is done by setting a
'cancelled' flag and then calling the regular notify/retire code path rather
than attempting to duplicate the list manipulation and clean-up code in
multiple places. This also avoids any race condition where the cancellation
might occur after or during the arrival of the completion interrupt.
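
As a rough illustration, the overall flow looks like this (a simplified
sketch; the function names match the patch below):

	/* __i915_add_request(): add to the signal list, then emit the seqno */
	i915_gem_request_submit(request);	/* fence_enable_sw_signaling() */
	ring->emit_request(request);		/* commands that raise the seqno
						 * interrupt (add_request in legacy mode) */

	/* __i915_wait_request(): enable the interrupt on demand, then check
	 * for a request that completed before the interrupt was enabled */
	i915_gem_request_enable_interrupt(req);
	if (i915_gem_request_completed(req))
		return 0;

	/* interrupt handler: scan the signal list and signal completed fences */
	i915_gem_request_notify(ring);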

v2: Updated to take advantage of the request unreference no longer requiring the
mutex lock.

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |   8 ++
 drivers/gpu/drm/i915/i915_gem.c         | 132 +++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_irq.c         |   2 +
 drivers/gpu/drm/i915/intel_lrc.c        |   1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |   1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
 6 files changed, 136 insertions(+), 9 deletions(-)

Comments

Maarten Lankhorst July 20, 2015, 9:09 a.m. UTC | #1
On 17-07-15 at 16:31, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The intended usage model for struct fence is that the signalled status should be
> set on demand rather than polled. That is, there should not be a need for a
> 'signaled' function to be called everytime the status is queried. Instead,
> 'something' should be done to enable a signal callback from the hardware which
> will update the state directly. In the case of requests, this is the seqno
> update interrupt. The idea is that this callback will only be enabled on demand
> when something actually tries to wait on the fence.
>
> This change removes the polling test and replaces it with the callback scheme.
> Each fence is added to a 'please poke me' list at the start of
> i915_add_request(). The interrupt handler then scans through the 'poke me' list
> when a new seqno pops out and signals any matching fence/request. The fence is
> then removed from the list so the entire request stack does not need to be
> scanned every time. Note that the fence is added to the list before the commands
> to generate the seqno interrupt are added to the ring. Thus the sequence is
> guaranteed to be race free if the interrupt is already enabled.
>
> Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
> called). Thus there is still a potential race when enabling the interrupt as the
> request may already have completed. However, this is simply solved by calling
> the interrupt processing code immediately after enabling the interrupt and
> thereby checking for already completed requests.
This race will happen on any enable_signaling implementation, just something to be aware of.
> Lastly, the ring clean up code has the possibility to cancel outstanding
> requests (e.g. because TDR has reset the ring). These requests will never get
> signalled and so must be removed from the signal list manually. This is done by
> setting a 'cancelled' flag and then calling the regular notify/retire code path
> rather than attempting to duplicate the list manipulatation and clean up code in
> multiple places. This also avoid any race condition where the cancellation
> request might occur after/during the completion interrupt actually arriving.

I notice in this commit you only clean up requests when the refcount drops to 0.
What resources need to be kept after the fence is signaled? Can't you queue the
delayed work from the function that signals the fence and make sure it holds a
refcount?

I'm curious because if userspace can grab a reference to a fence it might lead
to resources not being freed for a long time.
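
(For illustration, the suggestion might look roughly like this - a
hypothetical sketch, not code from the patch; 'free_work' and the worker
function are made-up names, and 'free_work' would need to be INIT_WORK()'d
at request allocation:)

	static void i915_gem_request_free_worker(struct work_struct *work)
	{
		struct drm_i915_gem_request *req =
			container_of(work, struct drm_i915_gem_request, free_work);

		/* drop the reference taken in enable_signaling */
		i915_gem_request_unreference(req);
	}

	/* in the signalling path, instead of collecting a local free_list: */
	fence_signal_locked(&req->fence);
	schedule_work(&req->free_work);		/* unreference outside fence_lock */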

~Maarten
Daniel Vetter July 21, 2015, 7:19 a.m. UTC | #2
On Fri, Jul 17, 2015 at 03:31:21PM +0100, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> The intended usage model for struct fence is that the signalled status should be
> set on demand rather than polled. That is, there should not be a need for a
> 'signaled' function to be called everytime the status is queried. Instead,
> 'something' should be done to enable a signal callback from the hardware which
> will update the state directly. In the case of requests, this is the seqno
> update interrupt. The idea is that this callback will only be enabled on demand
> when something actually tries to wait on the fence.
> 
> This change removes the polling test and replaces it with the callback scheme.
> Each fence is added to a 'please poke me' list at the start of
> i915_add_request(). The interrupt handler then scans through the 'poke me' list
> when a new seqno pops out and signals any matching fence/request. The fence is
> then removed from the list so the entire request stack does not need to be
> scanned every time. Note that the fence is added to the list before the commands
> to generate the seqno interrupt are added to the ring. Thus the sequence is
> guaranteed to be race free if the interrupt is already enabled.
> 
> Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
> called). Thus there is still a potential race when enabling the interrupt as the
> request may already have completed. However, this is simply solved by calling
> the interrupt processing code immediately after enabling the interrupt and
> thereby checking for already completed requests.
> 
> Lastly, the ring clean up code has the possibility to cancel outstanding
> requests (e.g. because TDR has reset the ring). These requests will never get
> signalled and so must be removed from the signal list manually. This is done by
> setting a 'cancelled' flag and then calling the regular notify/retire code path
> rather than attempting to duplicate the list manipulatation and clean up code in
> multiple places. This also avoid any race condition where the cancellation
> request might occur after/during the completion interrupt actually arriving.
> 
> v2: Updated to take advantage of the request unreference no longer requiring the
> mutex lock.
> 
> For: VIZ-5190
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |   8 ++
>  drivers/gpu/drm/i915/i915_gem.c         | 132 +++++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/i915_irq.c         |   2 +
>  drivers/gpu/drm/i915/intel_lrc.c        |   1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.c |   1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
>  6 files changed, 136 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 61c3db2..d7f1aa5 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2163,7 +2163,11 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
>  struct drm_i915_gem_request {
>  	/** Underlying object for implementing the signal/wait stuff. */
>  	struct fence fence;
> +	struct list_head signal_list;
> +	struct list_head unsignal_list;
>  	struct list_head delay_free_list;

From a very quick look a request can only ever be on one of these lists at a
time. It would be clearer to just use one list and list_move to make the
reassignment and change of ownership explicit.
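
(Roughly, with a single list_head - a sketch of the suggestion, assuming the
field is renamed to something like 'signal_link':)

	list_for_each_entry_safe(req, req_next, &ring->fence_signal_list, signal_link) {
		if (!req->cancelled && !i915_seqno_passed(seqno, req->seqno))
			continue;

		if (!req->cancelled)
			fence_signal_locked(&req->fence);

		/* hand the request over to the local free list in one step */
		list_move_tail(&req->signal_link, &free_list);
	}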

> +	bool cancelled;
> +	bool irq_enabled;
>  
>  	/** On Which ring this request was generated */
>  	struct drm_i915_private *i915;
> @@ -2241,6 +2245,10 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  			   struct drm_i915_gem_request **req_out);
>  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>  
> +void i915_gem_request_submit(struct drm_i915_gem_request *req);
> +void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req);
> +void i915_gem_request_notify(struct intel_engine_cs *ring);
> +
>  int i915_create_fence_timeline(struct drm_device *dev,
>  			       struct intel_context *ctx,
>  			       struct intel_engine_cs *ring);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 482835a..7c589a9 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1222,6 +1222,11 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  	if (list_empty(&req->list))
>  		return 0;
>  
> +	/*
> +	 * Enable interrupt completion of the request.
> +	 */
> +	i915_gem_request_enable_interrupt(req);
> +
>  	if (i915_gem_request_completed(req))
>  		return 0;
>  
> @@ -1382,6 +1387,10 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
>  	list_del_init(&request->list);
>  	i915_gem_request_remove_from_client(request);
>  
> +	/* In case the request is still in the signal pending list */
> +	if (!list_empty(&request->signal_list))
> +		request->cancelled = true;
> +
>  	i915_gem_request_unreference(request);
>  }
>  
> @@ -2534,6 +2543,12 @@ void __i915_add_request(struct drm_i915_gem_request *request,
>  	 */
>  	request->postfix = intel_ring_get_tail(ringbuf);
>  
> +	/*
> +	 * Add the fence to the pending list before emitting the commands to
> +	 * generate a seqno notification interrupt.
> +	 */
> +	i915_gem_request_submit(request);
> +
>  	if (i915.enable_execlists)
>  		ret = ring->emit_request(request);
>  	else {
> @@ -2653,6 +2668,9 @@ static void i915_gem_request_free(struct drm_i915_gem_request *req)
>  		i915_gem_context_unreference(ctx);
>  	}
>  
> +	if (req->irq_enabled)
> +		req->ring->irq_put(req->ring);
> +
>  	kmem_cache_free(req->i915->requests, req);
>  }
>  
> @@ -2668,24 +2686,105 @@ static const char *i915_gem_request_get_timeline_name(struct fence *req_fence)
>  	return req->ring->name;
>  }
>  
> -static bool i915_gem_request_enable_signaling(struct fence *req_fence)
> +/*
> + * The request has been submitted to the hardware so add the fence to the
> + * list of signalable fences.
> + *
> + * NB: This does not enable interrupts yet. That only occurs on demand when
> + * the request is actually waited on. However, adding it to the list early
> + * ensures that there is no race condition where the interrupt could pop
> + * out prematurely and thus be completely lost. The race is merely that the
> + * interrupt must be manually checked for after being enabled.
> + */
> +void i915_gem_request_submit(struct drm_i915_gem_request *req)
>  {
> -	/* Interrupt driven fences are not implemented yet.*/
> -	WARN(true, "This should not be called!");
> -	return true;
> +	fence_enable_sw_signaling(&req->fence);
>  }
>  
> -static bool i915_gem_request_is_completed(struct fence *req_fence)
> +/*
> + * The request is being actively waited on, so enable interrupt based
> + * completion signalling.
> + */
> +void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req)
> +{
> +	if (req->irq_enabled)
> +		return;
> +
> +	WARN_ON(!req->ring->irq_get(req->ring));
> +	req->irq_enabled = true;
> +
> +	/*
> +	 * Because the interrupt is only enabled on demand, there is a race
> +	 * where the interrupt can fire before anyone is looking for it. So
> +	 * do an explicit check for missed interrupts.
> +	 */
> +	i915_gem_request_notify(req->ring);
> +}
> +
> +static bool i915_gem_request_enable_signaling(struct fence *req_fence)
>  {
>  	struct drm_i915_gem_request *req = container_of(req_fence,
>  						 typeof(*req), fence);
> +
> +	i915_gem_request_reference(req);
> +	WARN_ON(!list_empty(&req->signal_list));
> +	list_add_tail(&req->signal_list, &req->ring->fence_signal_list);
> +
> +	/*
> +	 * Note that signalling is always enabled for every request before
> +	 * that request is submitted to the hardware. Therefore there is
> +	 * no race condition whereby the signal could pop out before the
> +	 * request has been added to the list. Hence no need to check
> +	 * for completion, undo the list add and return false.
> +	 *
> +	 * NB: Interrupts are only enabled on demand. Thus there is still a
> +	 * race where the request could complete before the interrupt has
> +	 * been enabled. Thus care must be taken at that point.
> +	 */
> +
> +	return true;

fence->enable_signalling is the part that should enable timely signalling,
i.e. interrupts. Adding the fence to completion lists for eventual
signalling should be done unconditionally.

Note that struct fence supports irq context callbacks and not just
waiting, so just enabling interrupts in the wait callback is not enough.

fence->wait callback is optional and really just for some additional
tricks when waiting with a process context, like busy-spinning (which we
do). The current i915_wait_request should be moved into that callback.
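
(A rough sketch of such a callback - hypothetical; the exact
__i915_wait_request arguments, the reset-counter handling and the
error/timeout mapping depend on the tree at the time:)

	static signed long i915_fence_wait(struct fence *fence, bool intr,
					   signed long timeout)
	{
		struct drm_i915_gem_request *req =
			container_of(fence, typeof(*req), fence);
		s64 timeout_ns = jiffies_to_nsecs(timeout);
		int ret;

		/* reuse the existing wait, including its busy-spin optimisation */
		ret = __i915_wait_request(req,
					  atomic_read(&req->i915->gpu_error.reset_counter),
					  intr, &timeout_ns, NULL);
		if (ret == -ETIME)
			return 0;	/* timed out */
		if (ret)
			return ret;

		return nsecs_to_jiffies(timeout_ns);
	}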

Also I'd like to move i915 internally over to calling fence functions so
that we can test this code without the android syncpts.
-Daniel

> +}
> +
> +void i915_gem_request_notify(struct intel_engine_cs *ring)
> +{
> +	struct drm_i915_gem_request *req, *req_next;
> +	unsigned long flags;
>  	u32 seqno;
> +	LIST_HEAD(free_list);
>  
> -	BUG_ON(req == NULL);
> +	if (list_empty(&ring->fence_signal_list))
> +		return;
> +
> +	seqno = ring->get_seqno(ring, false);
> +
> +	spin_lock_irqsave(&ring->fence_lock, flags);
> +	list_for_each_entry_safe(req, req_next, &ring->fence_signal_list, signal_list) {
> +		if (!req->cancelled) {
> +			if (!i915_seqno_passed(seqno, req->seqno))
> +				continue;
>  
> -	seqno = req->ring->get_seqno(req->ring, false/*lazy_coherency*/);
> +			fence_signal_locked(&req->fence);
> +		}
> +
> +		list_del_init(&req->signal_list);
> +		if (req->irq_enabled) {
> +			req->ring->irq_put(req->ring);
> +			req->irq_enabled = false;
> +		}
>  
> -	return i915_seqno_passed(seqno, req->seqno);
> +		/* Can't unreference here because that might grab fence_lock */
> +		list_add_tail(&req->unsignal_list, &free_list);
> +	}
> +	spin_unlock_irqrestore(&ring->fence_lock, flags);
> +
> +	/* It should now be safe to actually free the requests */
> +	while (!list_empty(&free_list)) {
> +		req = list_first_entry(&free_list,
> +				       struct drm_i915_gem_request, unsignal_list);
> +		list_del(&req->unsignal_list);
> +
> +		i915_gem_request_unreference(req);
> +	}
>  }
>  
>  static void i915_fence_timeline_value_str(struct fence *fence, char *str, int size)
> @@ -2711,7 +2810,6 @@ static const struct fence_ops i915_gem_request_fops = {
>  	.get_driver_name	= i915_gem_request_get_driver_name,
>  	.get_timeline_name	= i915_gem_request_get_timeline_name,
>  	.enable_signaling	= i915_gem_request_enable_signaling,
> -	.signaled		= i915_gem_request_is_completed,
>  	.wait			= fence_default_wait,
>  	.release		= i915_gem_request_release,
>  	.fence_value_str	= i915_fence_value_str,
> @@ -2791,6 +2889,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  		goto err;
>  	}
>  
> +	INIT_LIST_HEAD(&req->signal_list);
>  	fence_init(&req->fence, &i915_gem_request_fops, &ring->fence_lock,
>  		   ctx->engine[ring->id].fence_timeline.fence_context,
>  		   i915_fence_timeline_get_next_seqno(&ctx->engine[ring->id].fence_timeline));
> @@ -2913,6 +3012,13 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
>  
>  		i915_gem_request_retire(request);
>  	}
> +
> +	/*
> +	 * Make sure any requests that were on the signal pending list get
> +	 * cleaned up.
> +	 */
> +	i915_gem_request_notify(ring);
> +	i915_gem_retire_requests_ring(ring);
>  }
>  
>  void i915_gem_restore_fences(struct drm_device *dev)
> @@ -2968,6 +3074,13 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
>  {
>  	WARN_ON(i915_verify_lists(ring->dev));
>  
> +	/*
> +	 * If no-one has waited on a request recently then interrupts will
> +	 * not have been enabled and thus no requests will ever be marked as
> +	 * completed. So do an interrupt check now.
> +	 */
> +	i915_gem_request_notify(ring);
> +
>  	/* Retire requests first as we use it above for the early return.
>  	 * If we retire requests last, we may use a later seqno and so clear
>  	 * the requests lists without clearing the active list, leading to
> @@ -5345,6 +5458,7 @@ init_ring_lists(struct intel_engine_cs *ring)
>  {
>  	INIT_LIST_HEAD(&ring->active_list);
>  	INIT_LIST_HEAD(&ring->request_list);
> +	INIT_LIST_HEAD(&ring->fence_signal_list);
>  	INIT_LIST_HEAD(&ring->delayed_free_list);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index d87f173..e446509 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -853,6 +853,8 @@ static void notify_ring(struct intel_engine_cs *ring)
>  
>  	trace_i915_gem_request_notify(ring);
>  
> +	i915_gem_request_notify(ring);
> +
>  	wake_up_all(&ring->irq_queue);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 9ee80f5..18dbd5c 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1808,6 +1808,7 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
>  	ring->dev = dev;
>  	INIT_LIST_HEAD(&ring->active_list);
>  	INIT_LIST_HEAD(&ring->request_list);
> +	INIT_LIST_HEAD(&ring->fence_signal_list);
>  	INIT_LIST_HEAD(&ring->delayed_free_list);
>  	spin_lock_init(&ring->fence_lock);
>  	spin_lock_init(&ring->delayed_free_lock);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 11494a3..83a5254 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2040,6 +2040,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>  	INIT_LIST_HEAD(&ring->active_list);
>  	INIT_LIST_HEAD(&ring->request_list);
>  	INIT_LIST_HEAD(&ring->execlist_queue);
> +	INIT_LIST_HEAD(&ring->fence_signal_list);
>  	INIT_LIST_HEAD(&ring->delayed_free_list);
>  	spin_lock_init(&ring->fence_lock);
>  	spin_lock_init(&ring->delayed_free_lock);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 68173a3..2e68b73 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -352,6 +352,7 @@ struct  intel_engine_cs {
>  	u32 (*get_cmd_length_mask)(u32 cmd_header);
>  
>  	spinlock_t fence_lock;
> +	struct list_head fence_signal_list;
>  };
>  
>  bool intel_ring_initialized(struct intel_engine_cs *ring);
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Tvrtko Ursulin July 27, 2015, 11:33 a.m. UTC | #3
Hi,

On 07/17/2015 03:31 PM, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The intended usage model for struct fence is that the signalled status should be
> set on demand rather than polled. That is, there should not be a need for a
> 'signaled' function to be called everytime the status is queried. Instead,
> 'something' should be done to enable a signal callback from the hardware which
> will update the state directly. In the case of requests, this is the seqno
> update interrupt. The idea is that this callback will only be enabled on demand
> when something actually tries to wait on the fence.
>
> This change removes the polling test and replaces it with the callback scheme.
> Each fence is added to a 'please poke me' list at the start of
> i915_add_request(). The interrupt handler then scans through the 'poke me' list
> when a new seqno pops out and signals any matching fence/request. The fence is
> then removed from the list so the entire request stack does not need to be
> scanned every time. Note that the fence is added to the list before the commands
> to generate the seqno interrupt are added to the ring. Thus the sequence is
> guaranteed to be race free if the interrupt is already enabled.
>
> Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
> called). Thus there is still a potential race when enabling the interrupt as the
> request may already have completed. However, this is simply solved by calling
> the interrupt processing code immediately after enabling the interrupt and
> thereby checking for already completed requests.
>
> Lastly, the ring clean up code has the possibility to cancel outstanding
> requests (e.g. because TDR has reset the ring). These requests will never get
> signalled and so must be removed from the signal list manually. This is done by
> setting a 'cancelled' flag and then calling the regular notify/retire code path
> rather than attempting to duplicate the list manipulatation and clean up code in
> multiple places. This also avoid any race condition where the cancellation
> request might occur after/during the completion interrupt actually arriving.
>
> v2: Updated to take advantage of the request unreference no longer requiring the
> mutex lock.
>
> For: VIZ-5190
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h         |   8 ++
>   drivers/gpu/drm/i915/i915_gem.c         | 132 +++++++++++++++++++++++++++++---
>   drivers/gpu/drm/i915/i915_irq.c         |   2 +
>   drivers/gpu/drm/i915/intel_lrc.c        |   1 +
>   drivers/gpu/drm/i915/intel_ringbuffer.c |   1 +
>   drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
>   6 files changed, 136 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 61c3db2..d7f1aa5 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2163,7 +2163,11 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
>   struct drm_i915_gem_request {
>   	/** Underlying object for implementing the signal/wait stuff. */
>   	struct fence fence;
> +	struct list_head signal_list;
> +	struct list_head unsignal_list;

In addition to what Daniel said (one list_head looks enough) it is 
customary to call it _link.

>   	struct list_head delay_free_list;
> +	bool cancelled;
> +	bool irq_enabled;
>
>   	/** On Which ring this request was generated */
>   	struct drm_i915_private *i915;
> @@ -2241,6 +2245,10 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>   			   struct drm_i915_gem_request **req_out);
>   void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>
> +void i915_gem_request_submit(struct drm_i915_gem_request *req);
> +void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req);
> +void i915_gem_request_notify(struct intel_engine_cs *ring);
> +
>   int i915_create_fence_timeline(struct drm_device *dev,
>   			       struct intel_context *ctx,
>   			       struct intel_engine_cs *ring);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 482835a..7c589a9 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1222,6 +1222,11 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   	if (list_empty(&req->list))
>   		return 0;
>
> +	/*
> +	 * Enable interrupt completion of the request.
> +	 */
> +	i915_gem_request_enable_interrupt(req);
> +
>   	if (i915_gem_request_completed(req))
>   		return 0;
>
> @@ -1382,6 +1387,10 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
>   	list_del_init(&request->list);
>   	i915_gem_request_remove_from_client(request);
>
> +	/* In case the request is still in the signal pending list */
> +	if (!list_empty(&request->signal_list))
> +		request->cancelled = true;
> +
>   	i915_gem_request_unreference(request);
>   }
>
> @@ -2534,6 +2543,12 @@ void __i915_add_request(struct drm_i915_gem_request *request,
>   	 */
>   	request->postfix = intel_ring_get_tail(ringbuf);
>
> +	/*
> +	 * Add the fence to the pending list before emitting the commands to
> +	 * generate a seqno notification interrupt.
> +	 */
> +	i915_gem_request_submit(request);
> +
>   	if (i915.enable_execlists)
>   		ret = ring->emit_request(request);
>   	else {
> @@ -2653,6 +2668,9 @@ static void i915_gem_request_free(struct drm_i915_gem_request *req)
>   		i915_gem_context_unreference(ctx);
>   	}
>
> +	if (req->irq_enabled)
> +		req->ring->irq_put(req->ring);
> +

We get here with the interrupt still enabled only if userspace is 
abandoning a wait on an unsignaled fence - did I get that right?

>   	kmem_cache_free(req->i915->requests, req);
>   }
>
> @@ -2668,24 +2686,105 @@ static const char *i915_gem_request_get_timeline_name(struct fence *req_fence)
>   	return req->ring->name;
>   }
>
> -static bool i915_gem_request_enable_signaling(struct fence *req_fence)
> +/*
> + * The request has been submitted to the hardware so add the fence to the
> + * list of signalable fences.
> + *
> + * NB: This does not enable interrupts yet. That only occurs on demand when
> + * the request is actually waited on. However, adding it to the list early
> + * ensures that there is no race condition where the interrupt could pop
> + * out prematurely and thus be completely lost. The race is merely that the
> + * interrupt must be manually checked for after being enabled.
> + */
> +void i915_gem_request_submit(struct drm_i915_gem_request *req)
>   {
> -	/* Interrupt driven fences are not implemented yet.*/
> -	WARN(true, "This should not be called!");
> -	return true;
> +	fence_enable_sw_signaling(&req->fence);
>   }
>
> -static bool i915_gem_request_is_completed(struct fence *req_fence)
> +/*
> + * The request is being actively waited on, so enable interrupt based
> + * completion signalling.
> + */
> +void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req)
> +{
> +	if (req->irq_enabled)
> +		return;
> +
> +	WARN_ON(!req->ring->irq_get(req->ring));
> +	req->irq_enabled = true;

The req->irq_enabled manipulations look racy. Here and in request free they 
are protected by struct_mutex, but that is not held in 
i915_gem_request_notify. My initial feeling is you should take 
ring->fence_lock everywhere you query or manipulate req->irq_enabled.
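
(Something along these lines - a sketch of that suggestion, not code from
the patch:)

	void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req)
	{
		struct intel_engine_cs *ring = req->ring;
		unsigned long flags;

		spin_lock_irqsave(&ring->fence_lock, flags);
		if (!req->irq_enabled) {
			WARN_ON(!ring->irq_get(ring));
			req->irq_enabled = true;
		}
		spin_unlock_irqrestore(&ring->fence_lock, flags);

		/* the explicit check for an already-completed request is still needed */
		i915_gem_request_notify(ring);
	}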

> +
> +	/*
> +	 * Because the interrupt is only enabled on demand, there is a race
> +	 * where the interrupt can fire before anyone is looking for it. So
> +	 * do an explicit check for missed interrupts.
> +	 */
> +	i915_gem_request_notify(req->ring);
> +}
> +
> +static bool i915_gem_request_enable_signaling(struct fence *req_fence)
>   {
>   	struct drm_i915_gem_request *req = container_of(req_fence,
>   						 typeof(*req), fence);
> +
> +	i915_gem_request_reference(req);
> +	WARN_ON(!list_empty(&req->signal_list));

It looks very unsafe to proceed normally after this WARN_ON. It should 
probably return false here to preserve data structure sanity.
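
(i.e. roughly, checking before the reference is taken:)

	if (WARN_ON(!list_empty(&req->signal_list)))
		return false;

	i915_gem_request_reference(req);
	list_add_tail(&req->signal_list, &req->ring->fence_signal_list);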

> +	list_add_tail(&req->signal_list, &req->ring->fence_signal_list);
> +
> +	/*
> +	 * Note that signalling is always enabled for every request before
> +	 * that request is submitted to the hardware. Therefore there is
> +	 * no race condition whereby the signal could pop out before the
> +	 * request has been added to the list. Hence no need to check
> +	 * for completion, undo the list add and return false.
> +	 *
> +	 * NB: Interrupts are only enabled on demand. Thus there is still a
> +	 * race where the request could complete before the interrupt has
> +	 * been enabled. Thus care must be taken at that point.
> +	 */
> +
> +	return true;
> +}
> +
> +void i915_gem_request_notify(struct intel_engine_cs *ring)
> +{
> +	struct drm_i915_gem_request *req, *req_next;
> +	unsigned long flags;
>   	u32 seqno;
> +	LIST_HEAD(free_list);
>
> -	BUG_ON(req == NULL);
> +	if (list_empty(&ring->fence_signal_list))
> +		return;
> +
> +	seqno = ring->get_seqno(ring, false);
> +
> +	spin_lock_irqsave(&ring->fence_lock, flags);
> +	list_for_each_entry_safe(req, req_next, &ring->fence_signal_list, signal_list) {
> +		if (!req->cancelled) {
> +			if (!i915_seqno_passed(seqno, req->seqno))
> +				continue;
>
> -	seqno = req->ring->get_seqno(req->ring, false/*lazy_coherency*/);
> +			fence_signal_locked(&req->fence);
> +		}
> +
> +		list_del_init(&req->signal_list);

I haven't managed to figure out why this is apparently removing requests 
which have not been signalled from the signal_list.

Shouldn't they be moved to free_list only if i915_seqno_passed?

> +		if (req->irq_enabled) {
> +			req->ring->irq_put(req->ring);
> +			req->irq_enabled = false;
> +		}
>
> -	return i915_seqno_passed(seqno, req->seqno);
> +		/* Can't unreference here because that might grab fence_lock */
> +		list_add_tail(&req->unsignal_list, &free_list);
> +	}
> +	spin_unlock_irqrestore(&ring->fence_lock, flags);
> +
> +	/* It should now be safe to actually free the requests */
> +	while (!list_empty(&free_list)) {
> +		req = list_first_entry(&free_list,
> +				       struct drm_i915_gem_request, unsignal_list);
> +		list_del(&req->unsignal_list);
> +
> +		i915_gem_request_unreference(req);
> +	}
>   }
>
>   static void i915_fence_timeline_value_str(struct fence *fence, char *str, int size)
> @@ -2711,7 +2810,6 @@ static const struct fence_ops i915_gem_request_fops = {
>   	.get_driver_name	= i915_gem_request_get_driver_name,
>   	.get_timeline_name	= i915_gem_request_get_timeline_name,
>   	.enable_signaling	= i915_gem_request_enable_signaling,
> -	.signaled		= i915_gem_request_is_completed,
>   	.wait			= fence_default_wait,
>   	.release		= i915_gem_request_release,
>   	.fence_value_str	= i915_fence_value_str,
> @@ -2791,6 +2889,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>   		goto err;
>   	}
>
> +	INIT_LIST_HEAD(&req->signal_list);
>   	fence_init(&req->fence, &i915_gem_request_fops, &ring->fence_lock,
>   		   ctx->engine[ring->id].fence_timeline.fence_context,
>   		   i915_fence_timeline_get_next_seqno(&ctx->engine[ring->id].fence_timeline));
> @@ -2913,6 +3012,13 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
>
>   		i915_gem_request_retire(request);
>   	}
> +
> +	/*
> +	 * Make sure any requests that were on the signal pending list get
> +	 * cleaned up.
> +	 */
> +	i915_gem_request_notify(ring);
> +	i915_gem_retire_requests_ring(ring);

Would i915_gem_retire_requests_ring be enough given how it calls 
i915_gem_request_notify itself as the first thing below?

>   }
>
>   void i915_gem_restore_fences(struct drm_device *dev)
> @@ -2968,6 +3074,13 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
>   {
>   	WARN_ON(i915_verify_lists(ring->dev));
>
> +	/*
> +	 * If no-one has waited on a request recently then interrupts will
> +	 * not have been enabled and thus no requests will ever be marked as
> +	 * completed. So do an interrupt check now.
> +	 */
> +	i915_gem_request_notify(ring);
> +
>   	/* Retire requests first as we use it above for the early return.
>   	 * If we retire requests last, we may use a later seqno and so clear
>   	 * the requests lists without clearing the active list, leading to
> @@ -5345,6 +5458,7 @@ init_ring_lists(struct intel_engine_cs *ring)
>   {
>   	INIT_LIST_HEAD(&ring->active_list);
>   	INIT_LIST_HEAD(&ring->request_list);
> +	INIT_LIST_HEAD(&ring->fence_signal_list);
>   	INIT_LIST_HEAD(&ring->delayed_free_list);
>   }
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index d87f173..e446509 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -853,6 +853,8 @@ static void notify_ring(struct intel_engine_cs *ring)
>
>   	trace_i915_gem_request_notify(ring);
>
> +	i915_gem_request_notify(ring);
> +

How many requests are typically on the signal_list under real workloads? 
This could be a significant performance change since on every user 
interrupt it would walk the whole list, potentially only removing one 
request at a time.

These are just review comments on this particular patch, without yet 
thinking about the bigger design questions Daniel has raised.

Regards,

Tvrtko
Tvrtko Ursulin July 27, 2015, 1:20 p.m. UTC | #4
On 07/17/2015 03:31 PM, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The intended usage model for struct fence is that the signalled status should be
> set on demand rather than polled. That is, there should not be a need for a
> 'signaled' function to be called everytime the status is queried. Instead,
> 'something' should be done to enable a signal callback from the hardware which
> will update the state directly. In the case of requests, this is the seqno
> update interrupt. The idea is that this callback will only be enabled on demand
> when something actually tries to wait on the fence.
>
> This change removes the polling test and replaces it with the callback scheme.
> Each fence is added to a 'please poke me' list at the start of
> i915_add_request(). The interrupt handler then scans through the 'poke me' list
> when a new seqno pops out and signals any matching fence/request. The fence is
> then removed from the list so the entire request stack does not need to be
> scanned every time. Note that the fence is added to the list before the commands
> to generate the seqno interrupt are added to the ring. Thus the sequence is
> guaranteed to be race free if the interrupt is already enabled.
>
> Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
> called). Thus there is still a potential race when enabling the interrupt as the
> request may already have completed. However, this is simply solved by calling
> the interrupt processing code immediately after enabling the interrupt and
> thereby checking for already completed requests.
>
> Lastly, the ring clean up code has the possibility to cancel outstanding
> requests (e.g. because TDR has reset the ring). These requests will never get
> signalled and so must be removed from the signal list manually. This is done by
> setting a 'cancelled' flag and then calling the regular notify/retire code path
> rather than attempting to duplicate the list manipulatation and clean up code in
> multiple places. This also avoid any race condition where the cancellation
> request might occur after/during the completion interrupt actually arriving.
>
> v2: Updated to take advantage of the request unreference no longer requiring the
> mutex lock.
>
> For: VIZ-5190
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---

[snip]

> @@ -1382,6 +1387,10 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
>   	list_del_init(&request->list);
>   	i915_gem_request_remove_from_client(request);
>
> +	/* In case the request is still in the signal pending list */
> +	if (!list_empty(&request->signal_list))
> +		request->cancelled = true;
> +

Another thing I did not see implemented is the sync_fence error state.

This is more about the Android part, but related to this 'cancelled' flag 
so I am commenting here.

I thought when TDR kicks in and we set request->cancelled to true, there 
should be a code path which somehow makes sync_fence->status negative.

As it is, because fence_signal will not be called on cancelled requests, I 
think waiters will wait until timeout rather than being woken up and 
returning an error status.

For this to work you would somehow need to make sync_fence->status go 
negative. With normal fence completion it goes from 1 -> 0, via the 
completion callback. I did not immediately see how to make it go 
negative using the existing API.

Regards,

Tvrtko
Daniel Vetter July 27, 2015, 2 p.m. UTC | #5
On Mon, Jul 27, 2015 at 02:20:43PM +0100, Tvrtko Ursulin wrote:
> 
> On 07/17/2015 03:31 PM, John.C.Harrison@Intel.com wrote:
> >From: John Harrison <John.C.Harrison@Intel.com>
> >
> >The intended usage model for struct fence is that the signalled status should be
> >set on demand rather than polled. That is, there should not be a need for a
> >'signaled' function to be called everytime the status is queried. Instead,
> >'something' should be done to enable a signal callback from the hardware which
> >will update the state directly. In the case of requests, this is the seqno
> >update interrupt. The idea is that this callback will only be enabled on demand
> >when something actually tries to wait on the fence.
> >
> >This change removes the polling test and replaces it with the callback scheme.
> >Each fence is added to a 'please poke me' list at the start of
> >i915_add_request(). The interrupt handler then scans through the 'poke me' list
> >when a new seqno pops out and signals any matching fence/request. The fence is
> >then removed from the list so the entire request stack does not need to be
> >scanned every time. Note that the fence is added to the list before the commands
> >to generate the seqno interrupt are added to the ring. Thus the sequence is
> >guaranteed to be race free if the interrupt is already enabled.
> >
> >Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
> >called). Thus there is still a potential race when enabling the interrupt as the
> >request may already have completed. However, this is simply solved by calling
> >the interrupt processing code immediately after enabling the interrupt and
> >thereby checking for already completed requests.
> >
> >Lastly, the ring clean up code has the possibility to cancel outstanding
> >requests (e.g. because TDR has reset the ring). These requests will never get
> >signalled and so must be removed from the signal list manually. This is done by
> >setting a 'cancelled' flag and then calling the regular notify/retire code path
> >rather than attempting to duplicate the list manipulatation and clean up code in
> >multiple places. This also avoid any race condition where the cancellation
> >request might occur after/during the completion interrupt actually arriving.
> >
> >v2: Updated to take advantage of the request unreference no longer requiring the
> >mutex lock.
> >
> >For: VIZ-5190
> >Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> >---
> 
> [snip]
> 
> >@@ -1382,6 +1387,10 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
> >  	list_del_init(&request->list);
> >  	i915_gem_request_remove_from_client(request);
> >
> >+	/* In case the request is still in the signal pending list */
> >+	if (!list_empty(&request->signal_list))
> >+		request->cancelled = true;
> >+
> 
> Another thing I did not see implemented is the sync_fence error state.
> 
> This is more about the Android part, but related to this canceled flag so I
> am commenting here.
> 
> I thought when TDR kicks in and we set request->cancelled to true, there
> should be a code path which somehow makes sync_fence->status negative.
> 
> As it is, because fence_signal will not be called on canceled, I thought
> waiters will wait until timeout, rather than being woken up and return error
> status.
> 
> For this to work you would somehow need to make sync_fence->status go
> negative. With normal fence completion it goes from 1 -> 0, via the
> completion callback. I did not immediately see how to make it go negative
> using the existing API.

I think back when we did struct fence we decided that we wouldn't care yet
about forwarding error state, since doing that across drivers when you have a
chain of fences looked complicated. And no one had any clear idea about
what kind of semantics we really want. If we want this we'd need to add
it, but it's probably better to do that as a follow-up (the usual caveat about
open-source userspace and demonstration vehicles applies and all that).
-Daniel
Tvrtko Ursulin Aug. 3, 2015, 9:20 a.m. UTC | #6
On 07/27/2015 03:00 PM, Daniel Vetter wrote:
> On Mon, Jul 27, 2015 at 02:20:43PM +0100, Tvrtko Ursulin wrote:
>>
>> On 07/17/2015 03:31 PM, John.C.Harrison@Intel.com wrote:
>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>
>>> The intended usage model for struct fence is that the signalled status should be
>>> set on demand rather than polled. That is, there should not be a need for a
>>> 'signaled' function to be called everytime the status is queried. Instead,
>>> 'something' should be done to enable a signal callback from the hardware which
>>> will update the state directly. In the case of requests, this is the seqno
>>> update interrupt. The idea is that this callback will only be enabled on demand
>>> when something actually tries to wait on the fence.
>>>
>>> This change removes the polling test and replaces it with the callback scheme.
>>> Each fence is added to a 'please poke me' list at the start of
>>> i915_add_request(). The interrupt handler then scans through the 'poke me' list
>>> when a new seqno pops out and signals any matching fence/request. The fence is
>>> then removed from the list so the entire request stack does not need to be
>>> scanned every time. Note that the fence is added to the list before the commands
>>> to generate the seqno interrupt are added to the ring. Thus the sequence is
>>> guaranteed to be race free if the interrupt is already enabled.
>>>
>>> Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
>>> called). Thus there is still a potential race when enabling the interrupt as the
>>> request may already have completed. However, this is simply solved by calling
>>> the interrupt processing code immediately after enabling the interrupt and
>>> thereby checking for already completed requests.
>>>
>>> Lastly, the ring clean up code has the possibility to cancel outstanding
>>> requests (e.g. because TDR has reset the ring). These requests will never get
>>> signalled and so must be removed from the signal list manually. This is done by
>>> setting a 'cancelled' flag and then calling the regular notify/retire code path
>>> rather than attempting to duplicate the list manipulatation and clean up code in
>>> multiple places. This also avoid any race condition where the cancellation
>>> request might occur after/during the completion interrupt actually arriving.
>>>
>>> v2: Updated to take advantage of the request unreference no longer requiring the
>>> mutex lock.
>>>
>>> For: VIZ-5190
>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>> ---
>>
>> [snip]
>>
>>> @@ -1382,6 +1387,10 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
>>>   	list_del_init(&request->list);
>>>   	i915_gem_request_remove_from_client(request);
>>>
>>> +	/* In case the request is still in the signal pending list */
>>> +	if (!list_empty(&request->signal_list))
>>> +		request->cancelled = true;
>>> +
>>
>> Another thing I did not see implemented is the sync_fence error state.
>>
>> This is more about the Android part, but related to this canceled flag so I
>> am commenting here.
>>
>> I thought when TDR kicks in and we set request->cancelled to true, there
>> should be a code path which somehow makes sync_fence->status negative.
>>
>> As it is, because fence_signal will not be called on canceled, I thought
>> waiters will wait until timeout, rather than being woken up and return error
>> status.
>>
>> For this to work you would somehow need to make sync_fence->status go
>> negative. With normal fence completion it goes from 1 -> 0, via the
>> completion callback. I did not immediately see how to make it go negative
>> using the existing API.
>
> I think back when we did struct fence we decided that we won't care yet
> about forwarding error state since doing that across drivers if you have a
> chain of fences looked complicated. And no one had any clear idea about
> what kind of semantics we really want. If we want this we'd need to add
> it, but probably better to do that as a follow-up (usual caveat about
> open-source userspace and demonstration vehicles apply and all that).

Hm, I am not sure, but it feels to me that not having an error state is a 
problem. Without it userspace can just keep waiting and waiting on a 
fence even though the driver has discarded that workload and never plans 
to resubmit it. Am I missing something?

Tvrtko
Daniel Vetter Aug. 5, 2015, 8:05 a.m. UTC | #7
On Mon, Aug 03, 2015 at 10:20:29AM +0100, Tvrtko Ursulin wrote:
> 
> On 07/27/2015 03:00 PM, Daniel Vetter wrote:
> >On Mon, Jul 27, 2015 at 02:20:43PM +0100, Tvrtko Ursulin wrote:
> >>
> >>On 07/17/2015 03:31 PM, John.C.Harrison@Intel.com wrote:
> >>>From: John Harrison <John.C.Harrison@Intel.com>
> >>>
> >>>The intended usage model for struct fence is that the signalled status should be
> >>>set on demand rather than polled. That is, there should not be a need for a
> >>>'signaled' function to be called everytime the status is queried. Instead,
> >>>'something' should be done to enable a signal callback from the hardware which
> >>>will update the state directly. In the case of requests, this is the seqno
> >>>update interrupt. The idea is that this callback will only be enabled on demand
> >>>when something actually tries to wait on the fence.
> >>>
> >>>This change removes the polling test and replaces it with the callback scheme.
> >>>Each fence is added to a 'please poke me' list at the start of
> >>>i915_add_request(). The interrupt handler then scans through the 'poke me' list
> >>>when a new seqno pops out and signals any matching fence/request. The fence is
> >>>then removed from the list so the entire request stack does not need to be
> >>>scanned every time. Note that the fence is added to the list before the commands
> >>>to generate the seqno interrupt are added to the ring. Thus the sequence is
> >>>guaranteed to be race free if the interrupt is already enabled.
> >>>
> >>>Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
> >>>called). Thus there is still a potential race when enabling the interrupt as the
> >>>request may already have completed. However, this is simply solved by calling
> >>>the interrupt processing code immediately after enabling the interrupt and
> >>>thereby checking for already completed requests.
> >>>
> >>>Lastly, the ring clean up code has the possibility to cancel outstanding
> >>>requests (e.g. because TDR has reset the ring). These requests will never get
> >>>signalled and so must be removed from the signal list manually. This is done by
> >>>setting a 'cancelled' flag and then calling the regular notify/retire code path
> >>>rather than attempting to duplicate the list manipulatation and clean up code in
> >>>multiple places. This also avoid any race condition where the cancellation
> >>>request might occur after/during the completion interrupt actually arriving.
> >>>
> >>>v2: Updated to take advantage of the request unreference no longer requiring the
> >>>mutex lock.
> >>>
> >>>For: VIZ-5190
> >>>Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> >>>---
> >>
> >>[snip]
> >>
> >>>@@ -1382,6 +1387,10 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
> >>>  	list_del_init(&request->list);
> >>>  	i915_gem_request_remove_from_client(request);
> >>>
> >>>+	/* In case the request is still in the signal pending list */
> >>>+	if (!list_empty(&request->signal_list))
> >>>+		request->cancelled = true;
> >>>+
> >>
> >>Another thing I did not see implemented is the sync_fence error state.
> >>
> >>This is more about the Android part, but related to this canceled flag so I
> >>am commenting here.
> >>
> >>I thought when TDR kicks in and we set request->cancelled to true, there
> >>should be a code path which somehow makes sync_fence->status negative.
> >>
> >>As it is, because fence_signal will not be called on canceled, I thought
> >>waiters will wait until timeout, rather than being woken up and return error
> >>status.
> >>
> >>For this to work you would somehow need to make sync_fence->status go
> >>negative. With normal fence completion it goes from 1 -> 0, via the
> >>completion callback. I did not immediately see how to make it go negative
> >>using the existing API.
> >
> >I think back when we did struct fence we decided that we won't care yet
> >about forwarding error state since doing that across drivers if you have a
> >chain of fences looked complicated. And no one had any clear idea about
> >what kind of semantics we really want. If we want this we'd need to add
> >it, but probably better to do that as a follow-up (usual caveat about
> >open-source userspace and demonstration vehicles apply and all that).
> 
> Hm, I am not sure but it feels to me not having an error state is a problem.
> Without it userspace can just keep waiting and waiting upon a fence even
> though the driver has discarded that workload and never plans to resubmit
> it. Am I missing something?

Fences still must complete eventually; they simply won't tell you whether
the gpu died before completing the fence or not. That is in line with the gl
spec for fences and arb_robustness - inquiring about gpu hangs is done
through sideband apis.
-Daniel
Maarten Lankhorst Aug. 5, 2015, 11:05 a.m. UTC | #8
On 05-08-15 at 10:05, Daniel Vetter wrote:
> On Mon, Aug 03, 2015 at 10:20:29AM +0100, Tvrtko Ursulin wrote:
>> On 07/27/2015 03:00 PM, Daniel Vetter wrote:
>>> On Mon, Jul 27, 2015 at 02:20:43PM +0100, Tvrtko Ursulin wrote:
>>>> On 07/17/2015 03:31 PM, John.C.Harrison@Intel.com wrote:
>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>
>>>>> The intended usage model for struct fence is that the signalled status should be
>>>>> set on demand rather than polled. That is, there should not be a need for a
>>>>> 'signaled' function to be called everytime the status is queried. Instead,
>>>>> 'something' should be done to enable a signal callback from the hardware which
>>>>> will update the state directly. In the case of requests, this is the seqno
>>>>> update interrupt. The idea is that this callback will only be enabled on demand
>>>>> when something actually tries to wait on the fence.
>>>>>
>>>>> This change removes the polling test and replaces it with the callback scheme.
>>>>> Each fence is added to a 'please poke me' list at the start of
>>>>> i915_add_request(). The interrupt handler then scans through the 'poke me' list
>>>>> when a new seqno pops out and signals any matching fence/request. The fence is
>>>>> then removed from the list so the entire request stack does not need to be
>>>>> scanned every time. Note that the fence is added to the list before the commands
>>>>> to generate the seqno interrupt are added to the ring. Thus the sequence is
>>>>> guaranteed to be race free if the interrupt is already enabled.
>>>>>
>>>>> Note that the interrupt is only enabled on demand (i.e. when __wait_request() is
>>>>> called). Thus there is still a potential race when enabling the interrupt as the
>>>>> request may already have completed. However, this is simply solved by calling
>>>>> the interrupt processing code immediately after enabling the interrupt and
>>>>> thereby checking for already completed requests.
>>>>>
>>>>> Lastly, the ring clean up code has the possibility to cancel outstanding
>>>>> requests (e.g. because TDR has reset the ring). These requests will never get
>>>>> signalled and so must be removed from the signal list manually. This is done by
>>>>> setting a 'cancelled' flag and then calling the regular notify/retire code path
>>>>> rather than attempting to duplicate the list manipulatation and clean up code in
>>>>> multiple places. This also avoid any race condition where the cancellation
>>>>> request might occur after/during the completion interrupt actually arriving.
>>>>>
>>>>> v2: Updated to take advantage of the request unreference no longer requiring the
>>>>> mutex lock.
>>>>>
>>>>> For: VIZ-5190
>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>> ---
>>>> [snip]
>>>>
>>>>> @@ -1382,6 +1387,10 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
>>>>>  	list_del_init(&request->list);
>>>>>  	i915_gem_request_remove_from_client(request);
>>>>>
>>>>> +	/* In case the request is still in the signal pending list */
>>>>> +	if (!list_empty(&request->signal_list))
>>>>> +		request->cancelled = true;
>>>>> +
>>>> Another thing I did not see implemented is the sync_fence error state.
>>>>
>>>> This is more about the Android part, but related to this canceled flag so I
>>>> am commenting here.
>>>>
>>>> I thought when TDR kicks in and we set request->cancelled to true, there
>>>> should be a code path which somehow makes sync_fence->status negative.
>>>>
>>>> As it is, because fence_signal will not be called on canceled, I thought
>>>> waiters will wait until timeout, rather than being woken up and return error
>>>> status.
>>>>
>>>> For this to work you would somehow need to make sync_fence->status go
>>>> negative. With normal fence completion it goes from 1 -> 0, via the
>>>> completion callback. I did not immediately see how to make it go negative
>>>> using the existing API.
>>> I think back when we did struct fence we decided that we won't care yet
>>> about forwarding error state since doing that across drivers if you have a
>>> chain of fences looked complicated. And no one had any clear idea about
>>> what kind of semantics we really want. If we want this we'd need to add
>>> it, but probably better to do that as a follow-up (usual caveat about
>>> open-source userspace and demonstration vehicles apply and all that).
>> Hm, I am not sure but it feels to me not having an error state is a problem.
>> Without it userspace can just keep waiting and waiting upon a fence even
>> though the driver has discarded that workload and never plans to resubmit
>> it. Am I missing something?
> fences still must complete eventually, they simply won't tell you whether
> the gpu died before completing the fence or not. Which is in line with gl
> spec for fences and arb_robustness - inquiring about gpu hangs is done
> through sideband apis.
Actually you can tell. Set fence->status before signalling.
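
(For the TDR/cancelled case that could be roughly as follows - a sketch,
assuming cancelled requests are signalled with an error rather than skipped,
and that the Android sync layer reports fence->status through the
sync_fence:)

	if (req->cancelled)
		req->fence.status = -EIO;	/* surfaces as a negative sync_fence status */
	fence_signal_locked(&req->fence);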

~Maarten
John Harrison Oct. 28, 2015, 1 p.m. UTC | #9
On 27/07/2015 12:33, Tvrtko Ursulin wrote:
>
> Hi,
>
> On 07/17/2015 03:31 PM, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> The intended usage model for struct fence is that the signalled 
>> status should be
>> set on demand rather than polled. That is, there should not be a need 
>> for a
>> 'signaled' function to be called everytime the status is queried. 
>> Instead,
>> 'something' should be done to enable a signal callback from the 
>> hardware which
>> will update the state directly. In the case of requests, this is the 
>> seqno
>> update interrupt. The idea is that this callback will only be enabled 
>> on demand
>> when something actually tries to wait on the fence.
>>
>> This change removes the polling test and replaces it with the 
>> callback scheme.
>> Each fence is added to a 'please poke me' list at the start of
>> i915_add_request(). The interrupt handler then scans through the 
>> 'poke me' list
>> when a new seqno pops out and signals any matching fence/request. The 
>> fence is
>> then removed from the list so the entire request stack does not need 
>> to be
>> scanned every time. Note that the fence is added to the list before 
>> the commands
>> to generate the seqno interrupt are added to the ring. Thus the 
>> sequence is
>> guaranteed to be race free if the interrupt is already enabled.
>>
>> Note that the interrupt is only enabled on demand (i.e. when 
>> __wait_request() is
>> called). Thus there is still a potential race when enabling the 
>> interrupt as the
>> request may already have completed. However, this is simply solved by 
>> calling
>> the interrupt processing code immediately after enabling the 
>> interrupt and
>> thereby checking for already completed requests.
>>
>> Lastly, the ring clean up code has the possibility to cancel outstanding
>> requests (e.g. because TDR has reset the ring). These requests will 
>> never get
>> signalled and so must be removed from the signal list manually. This 
>> is done by
>> setting a 'cancelled' flag and then calling the regular notify/retire 
>> code path
>> rather than attempting to duplicate the list manipulatation and clean 
>> up code in
>> multiple places. This also avoid any race condition where the 
>> cancellation
>> request might occur after/during the completion interrupt actually 
>> arriving.
>>
>> v2: Updated to take advantage of the request unreference no longer 
>> requiring the
>> mutex lock.
>>
>> For: VIZ-5190
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h         |   8 ++
>>   drivers/gpu/drm/i915/i915_gem.c         | 132 
>> +++++++++++++++++++++++++++++---
>>   drivers/gpu/drm/i915/i915_irq.c         |   2 +
>>   drivers/gpu/drm/i915/intel_lrc.c        |   1 +
>>   drivers/gpu/drm/i915/intel_ringbuffer.c |   1 +
>>   drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
>>   6 files changed, 136 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 61c3db2..d7f1aa5 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2163,7 +2163,11 @@ void i915_gem_track_fb(struct 
>> drm_i915_gem_object *old,
>>   struct drm_i915_gem_request {
>>       /** Underlying object for implementing the signal/wait stuff. */
>>       struct fence fence;
>> +    struct list_head signal_list;
>> +    struct list_head unsignal_list;
>
> In addition to what Daniel said (one list_head looks enough) it is 
> customary to call it _link.
>
>>       struct list_head delay_free_list;
>> +    bool cancelled;
>> +    bool irq_enabled;
>>
>>       /** On Which ring this request was generated */
>>       struct drm_i915_private *i915;
>> @@ -2241,6 +2245,10 @@ int i915_gem_request_alloc(struct 
>> intel_engine_cs *ring,
>>                  struct drm_i915_gem_request **req_out);
>>   void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>>
>> +void i915_gem_request_submit(struct drm_i915_gem_request *req);
>> +void i915_gem_request_enable_interrupt(struct drm_i915_gem_request 
>> *req);
>> +void i915_gem_request_notify(struct intel_engine_cs *ring);
>> +
>>   int i915_create_fence_timeline(struct drm_device *dev,
>>                      struct intel_context *ctx,
>>                      struct intel_engine_cs *ring);
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c 
>> b/drivers/gpu/drm/i915/i915_gem.c
>> index 482835a..7c589a9 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -1222,6 +1222,11 @@ int __i915_wait_request(struct 
>> drm_i915_gem_request *req,
>>       if (list_empty(&req->list))
>>           return 0;
>>
>> +    /*
>> +     * Enable interrupt completion of the request.
>> +     */
>> +    i915_gem_request_enable_interrupt(req);
>> +
>>       if (i915_gem_request_completed(req))
>>           return 0;
>>
>> @@ -1382,6 +1387,10 @@ static void i915_gem_request_retire(struct 
>> drm_i915_gem_request *request)
>>       list_del_init(&request->list);
>>       i915_gem_request_remove_from_client(request);
>>
>> +    /* In case the request is still in the signal pending list */
>> +    if (!list_empty(&request->signal_list))
>> +        request->cancelled = true;
>> +
>>       i915_gem_request_unreference(request);
>>   }
>>
>> @@ -2534,6 +2543,12 @@ void __i915_add_request(struct 
>> drm_i915_gem_request *request,
>>        */
>>       request->postfix = intel_ring_get_tail(ringbuf);
>>
>> +    /*
>> +     * Add the fence to the pending list before emitting the 
>> commands to
>> +     * generate a seqno notification interrupt.
>> +     */
>> +    i915_gem_request_submit(request);
>> +
>>       if (i915.enable_execlists)
>>           ret = ring->emit_request(request);
>>       else {
>> @@ -2653,6 +2668,9 @@ static void i915_gem_request_free(struct 
>> drm_i915_gem_request *req)
>>           i915_gem_context_unreference(ctx);
>>       }
>>
>> +    if (req->irq_enabled)
>> +        req->ring->irq_put(req->ring);
>> +
>
> We get here with interrupts still enabled only if userspace is 
> abandoning a wait on an unsignaled fence, did I get that right?
It implies the request has been abandoned in some manner, yes. E.g. TDR 
has killed it, user space has given up, ...

>
>> kmem_cache_free(req->i915->requests, req);
>>   }
>>
>> @@ -2668,24 +2686,105 @@ static const char 
>> *i915_gem_request_get_timeline_name(struct fence *req_fence)
>>       return req->ring->name;
>>   }
>>
>> -static bool i915_gem_request_enable_signaling(struct fence *req_fence)
>> +/*
>> + * The request has been submitted to the hardware so add the fence 
>> to the
>> + * list of signalable fences.
>> + *
>> + * NB: This does not enable interrupts yet. That only occurs on 
>> demand when
>> + * the request is actually waited on. However, adding it to the list 
>> early
>> + * ensures that there is no race condition where the interrupt could 
>> pop
>> + * out prematurely and thus be completely lost. The race is merely 
>> that the
>> + * interrupt must be manually checked for after being enabled.
>> + */
>> +void i915_gem_request_submit(struct drm_i915_gem_request *req)
>>   {
>> -    /* Interrupt driven fences are not implemented yet.*/
>> -    WARN(true, "This should not be called!");
>> -    return true;
>> +    fence_enable_sw_signaling(&req->fence);
>>   }
>>
>> -static bool i915_gem_request_is_completed(struct fence *req_fence)
>> +/*
>> + * The request is being actively waited on, so enable interrupt based
>> + * completion signalling.
>> + */
>> +void i915_gem_request_enable_interrupt(struct drm_i915_gem_request 
>> *req)
>> +{
>> +    if (req->irq_enabled)
>> +        return;
>> +
>> +    WARN_ON(!req->ring->irq_get(req->ring));
>> +    req->irq_enabled = true;
>
> The req->irq_enabled manipulations look racy. Here and in request free they 
> are protected by struct_mutex, but that is not held in 
> i915_gem_request_notify. My initial feeling is that you should use 
> ring->fence_lock everywhere you query/manipulate req->irq_enabled.

The only asynchronous access is from _notify(), which disables IRQs if 
the flag is set and then clears it. That cannot race with the enable 
because the enable only sets the flag after turning IRQs on. The worst 
that can happen on a race is that IRQs are enabled and then immediately 
disabled - truly concurrent execution would result in one test or the 
other failing, so only one code path would be taken. The only other 
usage is in _request_free(), but that can only run once the last 
reference has been dropped, which means the request is no longer on any 
list that _notify() can see.
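
To make the ordering explicit, here are the two paths from the patch again,
annotated (nothing new, just a restatement of the code above):

	/* Waiter path: i915_gem_request_enable_interrupt() */
	WARN_ON(!req->ring->irq_get(req->ring));	/* irq reference taken first... */
	req->irq_enabled = true;			/* ...flag published second */

	/*
	 * Interrupt path: i915_gem_request_notify(), under ring->fence_lock.
	 * The flag can only be observed after irq_get() has run, so irq_put()
	 * never runs without a matching reference; if the flag is seen late,
	 * the put simply happens on a later notify or in request free.
	 */
	if (req->irq_enabled) {
		req->ring->irq_put(req->ring);
		req->irq_enabled = false;
	}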


>> +
>> +    /*
>> +     * Because the interrupt is only enabled on demand, there is a race
>> +     * where the interrupt can fire before anyone is looking for it. So
>> +     * do an explicit check for missed interrupts.
>> +     */
>> +    i915_gem_request_notify(req->ring);
>> +}
>> +
>> +static bool i915_gem_request_enable_signaling(struct fence *req_fence)
>>   {
>>       struct drm_i915_gem_request *req = container_of(req_fence,
>>                            typeof(*req), fence);
>> +
>> +    i915_gem_request_reference(req);
>> +    WARN_ON(!list_empty(&req->signal_list));
>
> It looks very unsafe to proceed normally after this WARN_ON. It should 
> probably return false here to preserve data structure sanity.
This really should be a BUG_ON but Daniel doesn't like those. It should 
be an impossible code path and not something that can be hit by the user 
being dumb. Anyway, this code has all been changed in the latest 
incarnation.


>
>> + list_add_tail(&req->signal_list, &req->ring->fence_signal_list);
>> +
>> +    /*
>> +     * Note that signalling is always enabled for every request before
>> +     * that request is submitted to the hardware. Therefore there is
>> +     * no race condition whereby the signal could pop out before the
>> +     * request has been added to the list. Hence no need to check
>> +     * for completion, undo the list add and return false.
>> +     *
>> +     * NB: Interrupts are only enabled on demand. Thus there is still a
>> +     * race where the request could complete before the interrupt has
>> +     * been enabled. Thus care must be taken at that point.
>> +     */
>> +
>> +    return true;
>> +}
>> +
>> +void i915_gem_request_notify(struct intel_engine_cs *ring)
>> +{
>> +    struct drm_i915_gem_request *req, *req_next;
>> +    unsigned long flags;
>>       u32 seqno;
>> +    LIST_HEAD(free_list);
>>
>> -    BUG_ON(req == NULL);
>> +    if (list_empty(&ring->fence_signal_list))
>> +        return;
>> +
>> +    seqno = ring->get_seqno(ring, false);
>> +
>> +    spin_lock_irqsave(&ring->fence_lock, flags);
>> +    list_for_each_entry_safe(req, req_next, 
>> &ring->fence_signal_list, signal_list) {
>> +        if (!req->cancelled) {
>> +            if (!i915_seqno_passed(seqno, req->seqno))
>> +                continue;
>>
>> -    seqno = req->ring->get_seqno(req->ring, false/*lazy_coherency*/);
>> +            fence_signal_locked(&req->fence);
>> +        }
>> +
>> +        list_del_init(&req->signal_list);
>
> I haven't managed to figure out why this is apparently removing 
> requests which have not been signalled from the signal_list?
>
> Shouldn't they be moved to free_list only if i915_seqno_passed?
Requests are only removed from the signal list if either a) the seqno 
has passed or b) the request has been cancelled and thus will never 
actually complete. Not sure what other scenario you are seeing.

>
>> +        if (req->irq_enabled) {
>> +            req->ring->irq_put(req->ring);
>> +            req->irq_enabled = false;
>> +        }
>>
>> -    return i915_seqno_passed(seqno, req->seqno);
>> +        /* Can't unreference here because that might grab fence_lock */
>> +        list_add_tail(&req->unsignal_list, &free_list);
>> +    }
>> +    spin_unlock_irqrestore(&ring->fence_lock, flags);
>> +
>> +    /* It should now be safe to actually free the requests */
>> +    while (!list_empty(&free_list)) {
>> +        req = list_first_entry(&free_list,
>> +                       struct drm_i915_gem_request, unsignal_list);
>> +        list_del(&req->unsignal_list);
>> +
>> +        i915_gem_request_unreference(req);
>> +    }
>>   }
>>
>>   static void i915_fence_timeline_value_str(struct fence *fence, char 
>> *str, int size)
>> @@ -2711,7 +2810,6 @@ static const struct fence_ops 
>> i915_gem_request_fops = {
>>       .get_driver_name    = i915_gem_request_get_driver_name,
>>       .get_timeline_name    = i915_gem_request_get_timeline_name,
>>       .enable_signaling    = i915_gem_request_enable_signaling,
>> -    .signaled        = i915_gem_request_is_completed,
>>       .wait            = fence_default_wait,
>>       .release        = i915_gem_request_release,
>>       .fence_value_str    = i915_fence_value_str,
>> @@ -2791,6 +2889,7 @@ int i915_gem_request_alloc(struct 
>> intel_engine_cs *ring,
>>           goto err;
>>       }
>>
>> +    INIT_LIST_HEAD(&req->signal_list);
>>       fence_init(&req->fence, &i915_gem_request_fops, &ring->fence_lock,
>> ctx->engine[ring->id].fence_timeline.fence_context,
>> i915_fence_timeline_get_next_seqno(&ctx->engine[ring->id].fence_timeline));
>> @@ -2913,6 +3012,13 @@ static void i915_gem_reset_ring_cleanup(struct 
>> drm_i915_private *dev_priv,
>>
>>           i915_gem_request_retire(request);
>>       }
>> +
>> +    /*
>> +     * Make sure any requests that were on the signal pending list get
>> +     * cleaned up.
>> +     */
>> +    i915_gem_request_notify(ring);
>> +    i915_gem_retire_requests_ring(ring);
>
> Would i915_gem_retire_requests_ring be enough given how it calls 
> i915_gem_request_notify itself as the first thing below?
>
Oops, left over from before retire called notify explicitly.

>>   }
>>
>>   void i915_gem_restore_fences(struct drm_device *dev)
>> @@ -2968,6 +3074,13 @@ i915_gem_retire_requests_ring(struct 
>> intel_engine_cs *ring)
>>   {
>>       WARN_ON(i915_verify_lists(ring->dev));
>>
>> +    /*
>> +     * If no-one has waited on a request recently then interrupts will
>> +     * not have been enabled and thus no requests will ever be 
>> marked as
>> +     * completed. So do an interrupt check now.
>> +     */
>> +    i915_gem_request_notify(ring);
>> +
>>       /* Retire requests first as we use it above for the early return.
>>        * If we retire requests last, we may use a later seqno and so 
>> clear
>>        * the requests lists without clearing the active list, leading to
>> @@ -5345,6 +5458,7 @@ init_ring_lists(struct intel_engine_cs *ring)
>>   {
>>       INIT_LIST_HEAD(&ring->active_list);
>>       INIT_LIST_HEAD(&ring->request_list);
>> +    INIT_LIST_HEAD(&ring->fence_signal_list);
>>       INIT_LIST_HEAD(&ring->delayed_free_list);
>>   }
>>
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c 
>> b/drivers/gpu/drm/i915/i915_irq.c
>> index d87f173..e446509 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -853,6 +853,8 @@ static void notify_ring(struct intel_engine_cs 
>> *ring)
>>
>>       trace_i915_gem_request_notify(ring);
>>
>> +    i915_gem_request_notify(ring);
>> +
>
> How many requests are typically on the signal_list for typical 
> workloads? This could be a significant performance change, since on 
> every user interrupt it would walk the whole list, potentially only 
> removing one request at a time.
>
Obviously, some of the IGT tests can produce very large request lists 
(e.g. gem_exec_nop) but running 'normal' stuff rarely seems to generate 
a long list. E.g. running GLBench + GLXGears on an Ubuntu desktop, 95% 
of the time there are ten or fewer requests in the list, and the loop 
only iterates once (49%) or twice (46%) because the first request gets 
signalled and the second (if present) aborts the loop.

The biggest problem seems to be that the hardware is brain dead with 
respect to generating interrupts. If two requests complete in quick 
succession and the ISR only gets to see the second seqno, the ISR still 
gets called a second time. Thus we get the ridiculous figure of 60% of 
ISR calls being no-ops because the seqno has not actually advanced. The 
code now checks for duplicate seqnos and exits early. Not sure how to 
get rid of the call completely.
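
Roughly, the early exit looks like this (a sketch only - 'last_notify_seqno'
is not a field in this patch, it just stands for caching the last seqno the
notify code saw):

	void i915_gem_request_notify(struct intel_engine_cs *ring)
	{
		u32 seqno;

		if (list_empty(&ring->fence_signal_list))
			return;

		seqno = ring->get_seqno(ring, false);
		if (seqno == ring->last_notify_seqno)
			return;		/* interrupt fired but the seqno has not advanced */
		ring->last_notify_seqno = seqno;

		/* ... walk fence_signal_list exactly as in the patch above ... */
	}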


> These are just review comments on this particular patch without 
> thinking yet of the bigger design questions Daniel has raised.
>
> Regards,
>
> Tvrtko
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 61c3db2..d7f1aa5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2163,7 +2163,11 @@  void i915_gem_track_fb(struct drm_i915_gem_object *old,
 struct drm_i915_gem_request {
 	/** Underlying object for implementing the signal/wait stuff. */
 	struct fence fence;
+	struct list_head signal_list;
+	struct list_head unsignal_list;
 	struct list_head delay_free_list;
+	bool cancelled;
+	bool irq_enabled;
 
 	/** On Which ring this request was generated */
 	struct drm_i915_private *i915;
@@ -2241,6 +2245,10 @@  int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct drm_i915_gem_request **req_out);
 void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 
+void i915_gem_request_submit(struct drm_i915_gem_request *req);
+void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req);
+void i915_gem_request_notify(struct intel_engine_cs *ring);
+
 int i915_create_fence_timeline(struct drm_device *dev,
 			       struct intel_context *ctx,
 			       struct intel_engine_cs *ring);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 482835a..7c589a9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1222,6 +1222,11 @@  int __i915_wait_request(struct drm_i915_gem_request *req,
 	if (list_empty(&req->list))
 		return 0;
 
+	/*
+	 * Enable interrupt completion of the request.
+	 */
+	i915_gem_request_enable_interrupt(req);
+
 	if (i915_gem_request_completed(req))
 		return 0;
 
@@ -1382,6 +1387,10 @@  static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 	list_del_init(&request->list);
 	i915_gem_request_remove_from_client(request);
 
+	/* In case the request is still in the signal pending list */
+	if (!list_empty(&request->signal_list))
+		request->cancelled = true;
+
 	i915_gem_request_unreference(request);
 }
 
@@ -2534,6 +2543,12 @@  void __i915_add_request(struct drm_i915_gem_request *request,
 	 */
 	request->postfix = intel_ring_get_tail(ringbuf);
 
+	/*
+	 * Add the fence to the pending list before emitting the commands to
+	 * generate a seqno notification interrupt.
+	 */
+	i915_gem_request_submit(request);
+
 	if (i915.enable_execlists)
 		ret = ring->emit_request(request);
 	else {
@@ -2653,6 +2668,9 @@  static void i915_gem_request_free(struct drm_i915_gem_request *req)
 		i915_gem_context_unreference(ctx);
 	}
 
+	if (req->irq_enabled)
+		req->ring->irq_put(req->ring);
+
 	kmem_cache_free(req->i915->requests, req);
 }
 
@@ -2668,24 +2686,105 @@  static const char *i915_gem_request_get_timeline_name(struct fence *req_fence)
 	return req->ring->name;
 }
 
-static bool i915_gem_request_enable_signaling(struct fence *req_fence)
+/*
+ * The request has been submitted to the hardware so add the fence to the
+ * list of signalable fences.
+ *
+ * NB: This does not enable interrupts yet. That only occurs on demand when
+ * the request is actually waited on. However, adding it to the list early
+ * ensures that there is no race condition where the interrupt could pop
+ * out prematurely and thus be completely lost. The race is merely that the
+ * interrupt must be manually checked for after being enabled.
+ */
+void i915_gem_request_submit(struct drm_i915_gem_request *req)
 {
-	/* Interrupt driven fences are not implemented yet.*/
-	WARN(true, "This should not be called!");
-	return true;
+	fence_enable_sw_signaling(&req->fence);
 }
 
-static bool i915_gem_request_is_completed(struct fence *req_fence)
+/*
+ * The request is being actively waited on, so enable interrupt based
+ * completion signalling.
+ */
+void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req)
+{
+	if (req->irq_enabled)
+		return;
+
+	WARN_ON(!req->ring->irq_get(req->ring));
+	req->irq_enabled = true;
+
+	/*
+	 * Because the interrupt is only enabled on demand, there is a race
+	 * where the interrupt can fire before anyone is looking for it. So
+	 * do an explicit check for missed interrupts.
+	 */
+	i915_gem_request_notify(req->ring);
+}
+
+static bool i915_gem_request_enable_signaling(struct fence *req_fence)
 {
 	struct drm_i915_gem_request *req = container_of(req_fence,
 						 typeof(*req), fence);
+
+	i915_gem_request_reference(req);
+	WARN_ON(!list_empty(&req->signal_list));
+	list_add_tail(&req->signal_list, &req->ring->fence_signal_list);
+
+	/*
+	 * Note that signalling is always enabled for every request before
+	 * that request is submitted to the hardware. Therefore there is
+	 * no race condition whereby the signal could pop out before the
+	 * request has been added to the list. Hence no need to check
+	 * for completion, undo the list add and return false.
+	 *
+	 * NB: Interrupts are only enabled on demand. Thus there is still a
+	 * race where the request could complete before the interrupt has
+	 * been enabled. Thus care must be taken at that point.
+	 */
+
+	return true;
+}
+
+void i915_gem_request_notify(struct intel_engine_cs *ring)
+{
+	struct drm_i915_gem_request *req, *req_next;
+	unsigned long flags;
 	u32 seqno;
+	LIST_HEAD(free_list);
 
-	BUG_ON(req == NULL);
+	if (list_empty(&ring->fence_signal_list))
+		return;
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&ring->fence_lock, flags);
+	list_for_each_entry_safe(req, req_next, &ring->fence_signal_list, signal_list) {
+		if (!req->cancelled) {
+			if (!i915_seqno_passed(seqno, req->seqno))
+				continue;
 
-	seqno = req->ring->get_seqno(req->ring, false/*lazy_coherency*/);
+			fence_signal_locked(&req->fence);
+		}
+
+		list_del_init(&req->signal_list);
+		if (req->irq_enabled) {
+			req->ring->irq_put(req->ring);
+			req->irq_enabled = false;
+		}
 
-	return i915_seqno_passed(seqno, req->seqno);
+		/* Can't unreference here because that might grab fence_lock */
+		list_add_tail(&req->unsignal_list, &free_list);
+	}
+	spin_unlock_irqrestore(&ring->fence_lock, flags);
+
+	/* It should now be safe to actually free the requests */
+	while (!list_empty(&free_list)) {
+		req = list_first_entry(&free_list,
+				       struct drm_i915_gem_request, unsignal_list);
+		list_del(&req->unsignal_list);
+
+		i915_gem_request_unreference(req);
+	}
 }
 
 static void i915_fence_timeline_value_str(struct fence *fence, char *str, int size)
@@ -2711,7 +2810,6 @@  static const struct fence_ops i915_gem_request_fops = {
 	.get_driver_name	= i915_gem_request_get_driver_name,
 	.get_timeline_name	= i915_gem_request_get_timeline_name,
 	.enable_signaling	= i915_gem_request_enable_signaling,
-	.signaled		= i915_gem_request_is_completed,
 	.wait			= fence_default_wait,
 	.release		= i915_gem_request_release,
 	.fence_value_str	= i915_fence_value_str,
@@ -2791,6 +2889,7 @@  int i915_gem_request_alloc(struct intel_engine_cs *ring,
 		goto err;
 	}
 
+	INIT_LIST_HEAD(&req->signal_list);
 	fence_init(&req->fence, &i915_gem_request_fops, &ring->fence_lock,
 		   ctx->engine[ring->id].fence_timeline.fence_context,
 		   i915_fence_timeline_get_next_seqno(&ctx->engine[ring->id].fence_timeline));
@@ -2913,6 +3012,13 @@  static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
 
 		i915_gem_request_retire(request);
 	}
+
+	/*
+	 * Make sure any requests that were on the signal pending list get
+	 * cleaned up.
+	 */
+	i915_gem_request_notify(ring);
+	i915_gem_retire_requests_ring(ring);
 }
 
 void i915_gem_restore_fences(struct drm_device *dev)
@@ -2968,6 +3074,13 @@  i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 {
 	WARN_ON(i915_verify_lists(ring->dev));
 
+	/*
+	 * If no-one has waited on a request recently then interrupts will
+	 * not have been enabled and thus no requests will ever be marked as
+	 * completed. So do an interrupt check now.
+	 */
+	i915_gem_request_notify(ring);
+
 	/* Retire requests first as we use it above for the early return.
 	 * If we retire requests last, we may use a later seqno and so clear
 	 * the requests lists without clearing the active list, leading to
@@ -5345,6 +5458,7 @@  init_ring_lists(struct intel_engine_cs *ring)
 {
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
+	INIT_LIST_HEAD(&ring->fence_signal_list);
 	INIT_LIST_HEAD(&ring->delayed_free_list);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index d87f173..e446509 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -853,6 +853,8 @@  static void notify_ring(struct intel_engine_cs *ring)
 
 	trace_i915_gem_request_notify(ring);
 
+	i915_gem_request_notify(ring);
+
 	wake_up_all(&ring->irq_queue);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9ee80f5..18dbd5c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1808,6 +1808,7 @@  static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
 	ring->dev = dev;
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
+	INIT_LIST_HEAD(&ring->fence_signal_list);
 	INIT_LIST_HEAD(&ring->delayed_free_list);
 	spin_lock_init(&ring->fence_lock);
 	spin_lock_init(&ring->delayed_free_lock);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 11494a3..83a5254 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2040,6 +2040,7 @@  static int intel_init_ring_buffer(struct drm_device *dev,
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 	INIT_LIST_HEAD(&ring->execlist_queue);
+	INIT_LIST_HEAD(&ring->fence_signal_list);
 	INIT_LIST_HEAD(&ring->delayed_free_list);
 	spin_lock_init(&ring->fence_lock);
 	spin_lock_init(&ring->delayed_free_lock);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 68173a3..2e68b73 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -352,6 +352,7 @@  struct  intel_engine_cs {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 
 	spinlock_t fence_lock;
+	struct list_head fence_signal_list;
 };
 
 bool intel_ring_initialized(struct intel_engine_cs *ring);