[1/4] drm/i915/bdw: Clean up execlist queue items in retire_work

Message ID 1414576373-25121-1-git-send-email-thomas.daniel@intel.com (mailing list archive)
State New, archived

Commit Message

Thomas Daniel Oct. 29, 2014, 9:52 a.m. UTC
No longer create a work item to clean each execlist queue item.
Instead, move retired execlist requests to a queue and clean up the
items during retire_requests.

v2: Fix legacy ring path broken during overzealous cleanup

v3: Update idle detection to take execlists queue into account

Issue: VIZ-4274
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         |    4 +++
 drivers/gpu/drm/i915/intel_lrc.c        |   52 ++++++++++++++++++-------------
 drivers/gpu/drm/i915/intel_lrc.h        |    2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h |    1 +
 4 files changed, 36 insertions(+), 23 deletions(-)

Comments

Daniel Vetter Nov. 3, 2014, 3:33 p.m. UTC | #1
On Wed, Oct 29, 2014 at 09:52:50AM +0000, Thomas Daniel wrote:
> No longer create a work item to clean each execlist queue item.
> Instead, move retired execlist requests to a queue and clean up the
> items during retire_requests.
> 
> v2: Fix legacy ring path broken during overzealous cleanup
> 
> v3: Update idle detection to take execlists queue into account
> 
> Issue: VIZ-4274
> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c         |    4 +++
>  drivers/gpu/drm/i915/intel_lrc.c        |   52 ++++++++++++++++++-------------
>  drivers/gpu/drm/i915/intel_lrc.h        |    2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.h |    1 +
>  4 files changed, 36 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 827edb5..df28202 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2718,6 +2718,10 @@ i915_gem_retire_requests(struct drm_device *dev)
>  	for_each_ring(ring, dev_priv, i) {
>  		i915_gem_retire_requests_ring(ring);
>  		idle &= list_empty(&ring->request_list);
> +		if (i915.enable_execlists) {
> +			idle &= list_empty(&ring->execlist_queue);
> +			intel_execlists_retire_requests(ring);

This needs to be the other way round I think - we care about idleness
after all the currently processed stuff is retired, not before. Otherwise
we might notice the busy->idle transition one invocation too late.
-Daniel
Thomas Daniel Nov. 3, 2014, 4:05 p.m. UTC | #2
> -----Original Message-----
> From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Monday, November 03, 2014 3:33 PM
> To: Daniel, Thomas
> Cc: intel-gfx@lists.freedesktop.org; shuang.he@linux.intel.com
> Subject: Re: [Intel-gfx] [PATCH 1/4] drm/i915/bdw: Clean up execlist queue
> items in retire_work
> 
> On Wed, Oct 29, 2014 at 09:52:50AM +0000, Thomas Daniel wrote:
> > No longer create a work item to clean each execlist queue item.
> > Instead, move retired execlist requests to a queue and clean up the
> > items during retire_requests.
> >
> > v2: Fix legacy ring path broken during overzealous cleanup
> >
> > v3: Update idle detection to take execlists queue into account
> >
> > Issue: VIZ-4274
> > Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c         |    4 +++
> >  drivers/gpu/drm/i915/intel_lrc.c        |   52 ++++++++++++++++++-----------
> --
> >  drivers/gpu/drm/i915/intel_lrc.h        |    2 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.h |    1 +
> >  4 files changed, 36 insertions(+), 23 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c
> > b/drivers/gpu/drm/i915/i915_gem.c index 827edb5..df28202 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2718,6 +2718,10 @@ i915_gem_retire_requests(struct drm_device
> *dev)
> >  	for_each_ring(ring, dev_priv, i) {
> >  		i915_gem_retire_requests_ring(ring);
> >  		idle &= list_empty(&ring->request_list);
> > +		if (i915.enable_execlists) {
> > +			idle &= list_empty(&ring->execlist_queue);
> > +			intel_execlists_retire_requests(ring);
> 
> This needs to be the other way round I think - we care about idleness after all
> the currently processed stuff is retired, not before. Otherwise we might
> notice the busy->idle transition one invocation too late.
I thought for a while about this.  The GPU will be idle when the
execlist_queues are empty.
intel_execlists_retire_requests() cleans up requests which have already
finished, so it is more conservative (in terms of CPU idleness) to check
the queue beforehand.  I thought this would be more desirable than
potentially reporting idleness early...
intel_execlists_retire_requests() cannot affect the state of the queue.
And there is no point checking execlist_retired_req_list because
intel_execlists_retire_requests() always empties it.

Thomas.

> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Daniel Vetter Nov. 3, 2014, 4:17 p.m. UTC | #3
On Mon, Nov 03, 2014 at 04:05:03PM +0000, Daniel, Thomas wrote:
> > -----Original Message-----
> > From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel
> > Vetter
> > Sent: Monday, November 03, 2014 3:33 PM
> > To: Daniel, Thomas
> > Cc: intel-gfx@lists.freedesktop.org; shuang.he@linux.intel.com
> > Subject: Re: [Intel-gfx] [PATCH 1/4] drm/i915/bdw: Clean up execlist queue
> > items in retire_work
> > 
> > On Wed, Oct 29, 2014 at 09:52:50AM +0000, Thomas Daniel wrote:
> > > No longer create a work item to clean each execlist queue item.
> > > Instead, move retired execlist requests to a queue and clean up the
> > > items during retire_requests.
> > >
> > > v2: Fix legacy ring path broken during overzealous cleanup
> > >
> > > v3: Update idle detection to take execlists queue into account
> > >
> > > Issue: VIZ-4274
> > > Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c         |    4 +++
> > >  drivers/gpu/drm/i915/intel_lrc.c        |   52 ++++++++++++++++++-----------
> > --
> > >  drivers/gpu/drm/i915/intel_lrc.h        |    2 +-
> > >  drivers/gpu/drm/i915/intel_ringbuffer.h |    1 +
> > >  4 files changed, 36 insertions(+), 23 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c
> > > b/drivers/gpu/drm/i915/i915_gem.c index 827edb5..df28202 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -2718,6 +2718,10 @@ i915_gem_retire_requests(struct drm_device
> > *dev)
> > >  	for_each_ring(ring, dev_priv, i) {
> > >  		i915_gem_retire_requests_ring(ring);
> > >  		idle &= list_empty(&ring->request_list);
> > > +		if (i915.enable_execlists) {
> > > +			idle &= list_empty(&ring->execlist_queue);
> > > +			intel_execlists_retire_requests(ring);
> > 
> > This needs to be the other way round I think - we care about idleness after all
> > the currently processed stuff is retired, not before. Otherwise we might
> > notice the busy->idle transition one invocation too late.
> I thought for a while about this.  The GPU will be idle when the
> execlist_queues are empty.
> intel_execlists_retire_requests() cleans up requests which have already
> finished, so it is more conservative (in terms of CPU idleness) to check
> the queue beforehand.  I thought this would be more desirable than
> potentially reporting idleness early...
> intel_execlists_retire_requests() cannot affect the state of the queue.
> And there is no point checking execlist_retired_req_list because
> intel_execlists_retire_requests() always empties it.

Ok, I mixed things up without looking ;-)

But that means you access the execlist_queue, which is also accessed from
irq code, without holding the required locks? This is all a bit confusing
to poor me ...
-Daniel
Chris Wilson Nov. 4, 2014, 9:11 a.m. UTC | #4
On Wed, Oct 29, 2014 at 09:52:50AM +0000, Thomas Daniel wrote:
> No longer create a work item to clean each execlist queue item.
> Instead, move retired execlist requests to a queue and clean up the
> items during retire_requests.
> 
> v2: Fix legacy ring path broken during overzealous cleanup
> 
> v3: Update idle detection to take execlists queue into account
> 
> Issue: VIZ-4274
> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c         |    4 +++
>  drivers/gpu/drm/i915/intel_lrc.c        |   52 ++++++++++++++++++-------------
>  drivers/gpu/drm/i915/intel_lrc.h        |    2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.h |    1 +
>  4 files changed, 36 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 827edb5..df28202 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2718,6 +2718,10 @@ i915_gem_retire_requests(struct drm_device *dev)
>  	for_each_ring(ring, dev_priv, i) {
>  		i915_gem_retire_requests_ring(ring);
>  		idle &= list_empty(&ring->request_list);
> +		if (i915.enable_execlists) {

Every time you do this, a kitten dies.

If only we had an intel_engine_cs that could abstract the differences
between retirement on the various submission ports and encapsulate that
away from the core GEM buffer/request handling.

If only I hadn't already sent a patch showing exactly how to do that.
-Chris
Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 827edb5..df28202 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2718,6 +2718,10 @@  i915_gem_retire_requests(struct drm_device *dev)
 	for_each_ring(ring, dev_priv, i) {
 		i915_gem_retire_requests_ring(ring);
 		idle &= list_empty(&ring->request_list);
+		if (i915.enable_execlists) {
+			idle &= list_empty(&ring->execlist_queue);
+			intel_execlists_retire_requests(ring);
+		}
 	}
 
 	if (idle)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index cd74e5c..87ce445 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -386,7 +386,6 @@  static void execlists_context_unqueue(struct intel_engine_cs *ring)
 {
 	struct intel_ctx_submit_request *req0 = NULL, *req1 = NULL;
 	struct intel_ctx_submit_request *cursor = NULL, *tmp = NULL;
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 
 	assert_spin_locked(&ring->execlist_lock);
 
@@ -403,7 +402,8 @@  static void execlists_context_unqueue(struct intel_engine_cs *ring)
 			 * will update tail past first request's workload */
 			cursor->elsp_submitted = req0->elsp_submitted;
 			list_del(&req0->execlist_link);
-			queue_work(dev_priv->wq, &req0->work);
+			list_add_tail(&req0->execlist_link,
+				&ring->execlist_retired_req_list);
 			req0 = cursor;
 		} else {
 			req1 = cursor;
@@ -425,7 +425,6 @@  static void execlists_context_unqueue(struct intel_engine_cs *ring)
 static bool execlists_check_remove_request(struct intel_engine_cs *ring,
 					   u32 request_id)
 {
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct intel_ctx_submit_request *head_req;
 
 	assert_spin_locked(&ring->execlist_lock);
@@ -443,7 +442,8 @@  static bool execlists_check_remove_request(struct intel_engine_cs *ring,
 
 			if (--head_req->elsp_submitted <= 0) {
 				list_del(&head_req->execlist_link);
-				queue_work(dev_priv->wq, &head_req->work);
+				list_add_tail(&head_req->execlist_link,
+					&ring->execlist_retired_req_list);
 				return true;
 			}
 		}
@@ -512,22 +512,6 @@  void intel_execlists_handle_ctx_events(struct intel_engine_cs *ring)
 		   ((u32)ring->next_context_status_buffer & 0x07) << 8);
 }
 
-static void execlists_free_request_task(struct work_struct *work)
-{
-	struct intel_ctx_submit_request *req =
-		container_of(work, struct intel_ctx_submit_request, work);
-	struct drm_device *dev = req->ring->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-
-	intel_runtime_pm_put(dev_priv);
-
-	mutex_lock(&dev->struct_mutex);
-	i915_gem_context_unreference(req->ctx);
-	mutex_unlock(&dev->struct_mutex);
-
-	kfree(req);
-}
-
 static int execlists_context_queue(struct intel_engine_cs *ring,
 				   struct intel_context *to,
 				   u32 tail)
@@ -544,7 +528,6 @@  static int execlists_context_queue(struct intel_engine_cs *ring,
 	i915_gem_context_reference(req->ctx);
 	req->ring = ring;
 	req->tail = tail;
-	INIT_WORK(&req->work, execlists_free_request_task);
 
 	intel_runtime_pm_get(dev_priv);
 
@@ -565,7 +548,8 @@  static int execlists_context_queue(struct intel_engine_cs *ring,
 			WARN(tail_req->elsp_submitted != 0,
 			     "More than 2 already-submitted reqs queued\n");
 			list_del(&tail_req->execlist_link);
-			queue_work(dev_priv->wq, &tail_req->work);
+			list_add_tail(&tail_req->execlist_link,
+				&ring->execlist_retired_req_list);
 		}
 	}
 
@@ -733,6 +717,29 @@  int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
 	return 0;
 }
 
+void intel_execlists_retire_requests(struct intel_engine_cs *ring)
+{
+	struct intel_ctx_submit_request *req, *tmp;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	unsigned long flags;
+	struct list_head retired_list;
+
+	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
+	if (list_empty(&ring->execlist_retired_req_list))
+		return;
+
+	INIT_LIST_HEAD(&retired_list);
+	spin_lock_irqsave(&ring->execlist_lock, flags);
+	list_replace_init(&ring->execlist_retired_req_list, &retired_list);
+	spin_unlock_irqrestore(&ring->execlist_lock, flags);
+
+	list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
+		intel_runtime_pm_put(dev_priv);
+		i915_gem_context_unreference(req->ctx);
+		list_del(&req->execlist_link);
+	}
+}
+
 void intel_logical_ring_stop(struct intel_engine_cs *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -1248,6 +1255,7 @@  static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
 	init_waitqueue_head(&ring->irq_queue);
 
 	INIT_LIST_HEAD(&ring->execlist_queue);
+	INIT_LIST_HEAD(&ring->execlist_retired_req_list);
 	spin_lock_init(&ring->execlist_lock);
 	ring->next_context_status_buffer = 0;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 33c3b4b..84bbf19 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -104,11 +104,11 @@  struct intel_ctx_submit_request {
 	u32 tail;
 
 	struct list_head execlist_link;
-	struct work_struct work;
 
 	int elsp_submitted;
 };
 
 void intel_execlists_handle_ctx_events(struct intel_engine_cs *ring);
+void intel_execlists_retire_requests(struct intel_engine_cs *ring);
 
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 96479c8..8c002d2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -235,6 +235,7 @@  struct  intel_engine_cs {
 	/* Execlists */
 	spinlock_t execlist_lock;
 	struct list_head execlist_queue;
+	struct list_head execlist_retired_req_list;
 	u8 next_context_status_buffer;
 	u32             irq_keep_mask; /* bitmask for interrupts that should not be masked */
 	int		(*emit_request)(struct intel_ringbuffer *ringbuf);