
[5/5] i915: fence workqueue optimization

Message ID 20170406232347.988-6-aarcange@redhat.com (mailing list archive)
State New, archived

Commit Message

Andrea Arcangeli April 6, 2017, 11:23 p.m. UTC
Keep running llist_del_all() until the free_list is found empty; this
may avoid having to schedule the work again.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
 drivers/gpu/drm/i915/intel_display.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Chris Wilson April 7, 2017, 9:58 a.m. UTC | #1
On Fri, Apr 07, 2017 at 01:23:47AM +0200, Andrea Arcangeli wrote:
> Keep running llist_del_all() until the free_list is found empty; this
> may avoid having to schedule the work again.

The work will already be scheduled (every time we add the first element,
the work is scheduled, and the scheduled bit is cleared before the work
is executed). So we aren't saving the kworker from having to process
another work item, but we may leave that work with nothing to do. The
question is whether we want to trap the kworker here, and presumably you
will also want to add a cond_resched() between passes.
-Chris
Andrea Arcangeli April 7, 2017, 1:13 p.m. UTC | #2
On Fri, Apr 07, 2017 at 10:58:38AM +0100, Chris Wilson wrote:
> On Fri, Apr 07, 2017 at 01:23:47AM +0200, Andrea Arcangeli wrote:
> > Keep running llist_del_all() until the free_list is found empty; this
> > may avoid having to schedule the work again.
> 
> The work will already be scheduled (every time we add the first element,
> the work is scheduled, and the scheduled bit is cleared before the work
> is executed). So we aren't saving the kworker from having to process
> another work item, but we may leave that work with nothing to do. The
> question is whether we want to trap the kworker here, and presumably you
> will also want to add a cond_resched() between passes.

Yes, it is somewhat dubious in the two-event case, but it will save a
kworker run when more events keep arriving, i.e. during a flood of
llist_add. It looked fast enough to me, but it's up to you: it costs one
extra cmpxchg per intel_atomic_helper_free_state() run. If it's unlikely
that more work is added while the worker runs, it's better to drop it.
Agreed about cond_resched() if we keep it.

The same pattern already exists in __i915_gem_free_work, and there it's
more likely that by the time __i915_gem_free_objects returns the
free_list is no longer empty, because __i915_gem_free_objects has a
longer runtime. But then you may want to re-evaluate that loop too: it
is slower for the case of exactly two llist_adds in a row and only pays
off from the third onward.

	while ((freed = llist_del_all(&i915->mm.free_list)))
		__i915_gem_free_objects(i915, freed);

Patch

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 24f303e..931f0c7 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -14374,9 +14374,9 @@  static void intel_atomic_helper_free_state(struct drm_i915_private *dev_priv)
 	struct intel_atomic_state *state, *next;
 	struct llist_node *freed;
 
-	freed = llist_del_all(&dev_priv->atomic_helper.free_list);
-	llist_for_each_entry_safe(state, next, freed, freed)
-		drm_atomic_state_put(&state->base);
+	while ((freed = llist_del_all(&dev_priv->atomic_helper.free_list)))
+		llist_for_each_entry_safe(state, next, freed, freed)
+			drm_atomic_state_put(&state->base);
 }
 
 static void intel_atomic_helper_free_state_worker(struct work_struct *work)