From patchwork Fri Jun 14 16:46:04 2019
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 10996017
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Fri, 14 Jun 2019 17:46:04 +0100
Message-Id: <20190614164606.15633-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [CI 1/3] drm/i915: Keep contexts pinned until after the next kernel context switch

We need to keep the context image pinned in memory until after the GPU has finished writing into it. Since it continues to write as we signal the final breadcrumb, we need to keep it pinned until the request after it is complete. Currently we rely on knowing the order in which requests execute on each engine; to remove that presumption, we need to identify a request/context-switch that we know must occur after our completion. Any request queued after the signal must imply a context switch; for simplicity, we use a fresh request from the kernel context.

The sequence of operations for keeping the context pinned until saved is:

- On context activation, we preallocate a node for each physical engine the context may operate on. This is to avoid allocations during unpinning, which may be from inside FS_RECLAIM context (aka the shrinker).

- On context deactivation, on retirement of the last active request (which is before we know the context has been saved), we add the preallocated node onto a barrier list on each engine.

- On engine idling, we emit a switch to the kernel context. When this switch completes, we know that all previous contexts must have been saved, and so on retiring this request we can finally unpin all the contexts that were marked as deactivated prior to the switch.

We can enhance this in future by flushing all the idle contexts on a regular heartbeat pulse of a switch to kernel context, which will also be used to check for hung engines.

v2: intel_context_active_acquire/_release
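The three phases above are easier to see in isolation. What follows is a minimal, self-contained userspace C sketch of that barrier lifecycle; every type and function name in it is an illustrative stand-in, not the driver's actual API, and locking, reference counting and error paths are elided:

/*
 * Much-simplified sketch of the barrier lifecycle described above.
 * All names are hypothetical stand-ins for illustration only.
 */
#include <stdio.h>
#include <stdlib.h>

struct barrier_node {
	struct barrier_node *next;
	void (*retire)(struct barrier_node *node); /* final unpin callback */
};

struct engine {
	struct barrier_node *barrier_tasks; /* llist-like, singly linked */
};

struct context {
	struct engine *engine;
	struct barrier_node *preallocated; /* reserved at activation */
};

static void context_retire(struct barrier_node *node)
{
	/* In the driver, this is where the context image is finally unpinned. */
	printf("kernel-context switch retired: unpinning saved context image\n");
	free(node);
}

/* 1. Activation: preallocate so deactivation never allocates (FS_RECLAIM). */
static int context_activate(struct context *ce)
{
	ce->preallocated = malloc(sizeof(*ce->preallocated));
	if (!ce->preallocated)
		return -1;
	ce->preallocated->retire = context_retire;
	return 0;
}

/* 2. Deactivation: queue the preallocated node on the engine's barrier list. */
static void context_deactivate(struct context *ce)
{
	struct barrier_node *node = ce->preallocated;

	node->next = ce->engine->barrier_tasks;
	ce->engine->barrier_tasks = node;
	ce->preallocated = NULL;
}

/* 3. Idling: the switch to the kernel context retires all queued barriers. */
static void engine_park(struct engine *engine)
{
	struct barrier_node *node = engine->barrier_tasks;

	engine->barrier_tasks = NULL;
	while (node) {
		struct barrier_node *next = node->next;

		node->retire(node); /* safe: every prior context is now saved */
		node = next;
	}
}

int main(void)
{
	struct engine rcs = { 0 };
	struct context ce = { .engine = &rcs };

	if (context_activate(&ce))
		return 1;
	context_deactivate(&ce); /* last request retired; image still in use */
	engine_park(&rcs);       /* kernel-context switch has completed */
	return 0;
}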
Signed-off-by: Chris Wilson Cc: Mika Kuoppala Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 24 ++---- drivers/gpu/drm/i915/gem/i915_gem_context.h | 1 - drivers/gpu/drm/i915/gem/i915_gem_pm.c | 20 ++++- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 17 ++-- drivers/gpu/drm/i915/gt/intel_context.c | 80 ++++++++++++++++--- drivers/gpu/drm/i915/gt/intel_context.h | 3 + drivers/gpu/drm/i915/gt/intel_context_types.h | 6 +- drivers/gpu/drm/i915/gt/intel_engine.h | 2 - drivers/gpu/drm/i915/gt/intel_engine_cs.c | 23 +----- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 + drivers/gpu/drm/i915/gt/intel_engine_types.h | 13 +-- drivers/gpu/drm/i915/gt/intel_lrc.c | 62 ++------------ drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 44 +--------- drivers/gpu/drm/i915/gt/mock_engine.c | 11 +-- drivers/gpu/drm/i915/i915_active.c | 78 ++++++++++++++++++ drivers/gpu/drm/i915/i915_active.h | 5 ++ drivers/gpu/drm/i915/i915_active_types.h | 3 + drivers/gpu/drm/i915/i915_gem.c | 4 - drivers/gpu/drm/i915/i915_request.c | 15 ---- .../gpu/drm/i915/selftests/mock_gem_device.c | 1 - 20 files changed, 219 insertions(+), 195 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index c86ca9f21532..6200060aef05 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -692,17 +692,6 @@ int i915_gem_contexts_init(struct drm_i915_private *dev_priv) return 0; } -void i915_gem_contexts_lost(struct drm_i915_private *dev_priv) -{ - struct intel_engine_cs *engine; - enum intel_engine_id id; - - lockdep_assert_held(&dev_priv->drm.struct_mutex); - - for_each_engine(engine, dev_priv, id) - intel_engine_lost_context(engine); -} - void i915_gem_contexts_fini(struct drm_i915_private *i915) { lockdep_assert_held(&i915->drm.struct_mutex); @@ -1203,10 +1192,6 @@ gen8_modify_rpcs(struct intel_context *ce, struct intel_sseu sseu) if (ret) goto out_add; - ret = gen8_emit_rpcs_config(rq, ce, sseu); - if (ret) - goto out_add; - /* * Guarantee context image and the timeline remains pinned until the * modifying request is retired by setting the ce activity tracker. * * But we only need to take one pin on the account of it. Or in other * words transfer the pinned ce object to tracked active request.
*/ - if (!i915_active_request_isset(&ce->active_tracker)) - __intel_context_pin(ce); - __i915_active_request_set(&ce->active_tracker, rq); + GEM_BUG_ON(i915_active_is_idle(&ce->active)); + ret = i915_active_ref(&ce->active, rq->fence.context, rq); + if (ret) + goto out_add; + + ret = gen8_emit_rpcs_config(rq, ce, sseu); out_add: i915_request_add(rq); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h index 630392c77e48..9691dd062f72 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h @@ -134,7 +134,6 @@ static inline bool i915_gem_context_is_kernel(struct i915_gem_context *ctx) /* i915_gem_context.c */ int __must_check i915_gem_contexts_init(struct drm_i915_private *dev_priv); -void i915_gem_contexts_lost(struct drm_i915_private *dev_priv); void i915_gem_contexts_fini(struct drm_i915_private *dev_priv); int i915_gem_context_open(struct drm_i915_private *i915, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c index 6e75702c5671..141f3ea349a4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c @@ -10,6 +10,22 @@ #include "i915_drv.h" #include "i915_globals.h" +static void call_idle_barriers(struct intel_engine_cs *engine) +{ + struct llist_node *node, *next; + + llist_for_each_safe(node, next, llist_del_all(&engine->barrier_tasks)) { + struct i915_active_request *active = + container_of((struct list_head *)node, + typeof(*active), link); + + INIT_LIST_HEAD(&active->link); + RCU_INIT_POINTER(active->request, NULL); + + active->retire(active, NULL); + } +} + static void i915_gem_park(struct drm_i915_private *i915) { struct intel_engine_cs *engine; @@ -17,8 +33,10 @@ static void i915_gem_park(struct drm_i915_private *i915) lockdep_assert_held(&i915->drm.struct_mutex); - for_each_engine(engine, i915, id) + for_each_engine(engine, i915, id) { + call_idle_barriers(engine); /* cleanup after wedging */ i915_gem_batch_pool_fini(&engine->batch_pool); + } i915_timelines_park(i915); i915_vma_parked(i915); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index a521f23c18ad..c851c4029597 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -160,18 +160,13 @@ i915_gem_shrink(struct drm_i915_private *i915, return 0; /* - * When shrinking the active list, also consider active contexts. - * Active contexts are pinned until they are retired, and so can - * not be simply unbound to retire and unpin their pages. To shrink - * the contexts, we must wait until the gpu is idle. - * - * We don't care about errors here; if we cannot wait upon the GPU, - * we will free as much as we can and hope to get a second chance. + * When shrinking the active list, we should also consider active + * contexts. Active contexts are pinned until they are retired, and + * so can not be simply unbound to retire and unpin their pages. To + * shrink the contexts, we must wait until the gpu is idle and + * completed its switch to the kernel context. In short, we do + * not have a good mechanism for idling a specific context. 
*/ - if (shrink & I915_SHRINK_ACTIVE) - i915_gem_wait_for_idle(i915, - I915_WAIT_LOCKED, - MAX_SCHEDULE_TIMEOUT); trace_i915_gem_shrink(i915, target, shrink); i915_retire_requests(i915); diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 285e869713e8..42f45744d859 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -61,7 +61,6 @@ int __intel_context_do_pin(struct intel_context *ce) i915_gem_context_get(ce->gem_context); /* for ctx->ppgtt */ - intel_context_get(ce); smp_mb__before_atomic(); /* flush pin before it is visible */ } @@ -89,20 +88,45 @@ void intel_context_unpin(struct intel_context *ce) ce->ops->unpin(ce); i915_gem_context_put(ce->gem_context); - intel_context_put(ce); + intel_context_active_release(ce); } mutex_unlock(&ce->pin_mutex); intel_context_put(ce); } -static void intel_context_retire(struct i915_active_request *active, - struct i915_request *rq) +static int __context_pin_state(struct i915_vma *vma, unsigned long flags) { - struct intel_context *ce = - container_of(active, typeof(*ce), active_tracker); + int err; - intel_context_unpin(ce); + err = i915_vma_pin(vma, 0, 0, flags | PIN_GLOBAL); + if (err) + return err; + + /* + * And mark it as a globally pinned object to let the shrinker know + * it cannot reclaim the object until we release it. + */ + vma->obj->pin_global++; + vma->obj->mm.dirty = true; + + return 0; +} + +static void __context_unpin_state(struct i915_vma *vma) +{ + vma->obj->pin_global--; + __i915_vma_unpin(vma); +} + +static void intel_context_retire(struct i915_active *active) +{ + struct intel_context *ce = container_of(active, typeof(*ce), active); + + if (ce->state) + __context_unpin_state(ce->state); + + intel_context_put(ce); } void @@ -125,8 +149,46 @@ intel_context_init(struct intel_context *ce, mutex_init(&ce->pin_mutex); - i915_active_request_init(&ce->active_tracker, - NULL, intel_context_retire); + i915_active_init(ctx->i915, &ce->active, intel_context_retire); +} + +int intel_context_active_acquire(struct intel_context *ce, unsigned long flags) +{ + int err; + + if (!i915_active_acquire(&ce->active)) + return 0; + + intel_context_get(ce); + + if (!ce->state) + return 0; + + err = __context_pin_state(ce->state, flags); + if (err) { + i915_active_cancel(&ce->active); + intel_context_put(ce); + return err; + } + + /* Preallocate tracking nodes */ + if (!i915_gem_context_is_kernel(ce->gem_context)) { + err = i915_active_acquire_preallocate_barrier(&ce->active, + ce->engine); + if (err) { + i915_active_release(&ce->active); + return err; + } + } + + return 0; +} + +void intel_context_active_release(struct intel_context *ce) +{ + /* Nodes preallocated in intel_context_active() */ + i915_active_acquire_barrier(&ce->active); + i915_active_release(&ce->active); } static void i915_global_context_shrink(void) diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h index 6d5453ba2c1e..a47275bc4f01 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.h +++ b/drivers/gpu/drm/i915/gt/intel_context.h @@ -102,6 +102,9 @@ static inline void intel_context_exit(struct intel_context *ce) ce->ops->exit(ce); } +int intel_context_active_acquire(struct intel_context *ce, unsigned long flags); +void intel_context_active_release(struct intel_context *ce); + static inline struct intel_context *intel_context_get(struct intel_context *ce) { kref_get(&ce->ref); diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h index 825fcf0ac9c4..e95be4be9612 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -56,10 +56,10 @@ struct intel_context { intel_engine_mask_t saturated; /* submitting semaphores too late? */ /** - * active_tracker: Active tracker for the external rq activity - * on this intel_context object. + * active: Active tracker for the rq activity (inc. external) on this + * intel_context object. */ - struct i915_active_request active_tracker; + struct i915_active active; const struct intel_context_ops *ops; diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index 1439fa4093ac..3b5a6d152997 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -467,8 +467,6 @@ static inline void intel_engine_reset(struct intel_engine_cs *engine, bool intel_engine_is_idle(struct intel_engine_cs *engine); bool intel_engines_are_idle(struct drm_i915_private *dev_priv); -void intel_engine_lost_context(struct intel_engine_cs *engine); - void intel_engines_reset_default_submission(struct drm_i915_private *i915); unsigned int intel_engines_has_context_isolation(struct drm_i915_private *i915); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 39220e16ea26..22242e927baa 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -611,6 +611,8 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine) { int err; + init_llist_head(&engine->barrier_tasks); + err = init_status_page(engine); if (err) return err; @@ -870,6 +872,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine) if (engine->preempt_context) intel_context_unpin(engine->preempt_context); intel_context_unpin(engine->kernel_context); + GEM_BUG_ON(!llist_empty(&engine->barrier_tasks)); i915_timeline_fini(&engine->timeline); @@ -1201,26 +1204,6 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915) engine->set_default_submission(engine); } -/** - * intel_engine_lost_context: called when the GPU is reset into unknown state - * @engine: the engine - * - * We have either reset the GPU or otherwise about to lose state tracking of - * the current GPU logical state (e.g. suspend). On next use, it is therefore - * imperative that we make no presumptions about the current state and load - * from scratch. - */ -void intel_engine_lost_context(struct intel_engine_cs *engine) -{ - struct intel_context *ce; - - lockdep_assert_held(&engine->i915->drm.struct_mutex); - - ce = fetch_and_zero(&engine->last_retired_context); - if (ce) - intel_context_unpin(ce); -} - bool intel_engine_can_store_dword(struct intel_engine_cs *engine) { switch (INTEL_GEN(engine->i915)) { diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 903bee3d6c6d..d14e352b0b17 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -88,6 +88,8 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) /* Check again on the next retirement. 
*/ engine->wakeref_serial = engine->serial + 1; + + i915_request_add_barriers(rq); __i915_request_commit(rq); return false; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 01223864237a..33a31aa2d2ae 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -11,6 +11,7 @@ #include #include #include +#include #include #include "i915_gem.h" @@ -288,6 +289,7 @@ struct intel_engine_cs { struct intel_ring *buffer; struct i915_timeline timeline; + struct llist_head barrier_tasks; struct intel_context *kernel_context; /* pinned */ struct intel_context *preempt_context; /* pinned; optional */ @@ -435,17 +437,6 @@ struct intel_engine_cs { struct intel_engine_execlists execlists; - /* Contexts are pinned whilst they are active on the GPU. The last - * context executed remains active whilst the GPU is idle - the - * switch away and write to the context object only occurs on the - * next execution. Contexts are only unpinned on retirement of the - * following request ensuring that we can always write to the object - * on the context switch even after idling. Across suspend, we switch - * to the kernel context and trash it as the save may not happen - * before the hardware is powered down. - */ - struct intel_context *last_retired_context; - /* status_notifier: list of callbacks for context-switch changes */ struct atomic_notifier_head context_status_notifier; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index b8f5592da18f..d0a51752386f 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1422,60 +1422,11 @@ static void execlists_context_destroy(struct kref *kref) intel_context_free(ce); } -static int __context_pin(struct i915_vma *vma) -{ - unsigned int flags; - int err; - - flags = PIN_GLOBAL | PIN_HIGH; - flags |= PIN_OFFSET_BIAS | i915_ggtt_pin_bias(vma); - - err = i915_vma_pin(vma, 0, 0, flags); - if (err) - return err; - - vma->obj->pin_global++; - vma->obj->mm.dirty = true; - - return 0; -} - -static void __context_unpin(struct i915_vma *vma) -{ - vma->obj->pin_global--; - __i915_vma_unpin(vma); -} - static void execlists_context_unpin(struct intel_context *ce) { - struct intel_engine_cs *engine; - - /* - * The tasklet may still be using a pointer to our state, via an - * old request. However, since we know we only unpin the context - * on retirement of the following request, we know that the last - * request referencing us will have had a completion CS interrupt. - * If we see that it is still active, it means that the tasklet hasn't - * had the chance to run yet; let it run before we teardown the - * reference it may use. 
- */ - engine = READ_ONCE(ce->inflight); - if (unlikely(engine)) { - unsigned long flags; - - spin_lock_irqsave(&engine->timeline.lock, flags); - process_csb(engine); - spin_unlock_irqrestore(&engine->timeline.lock, flags); - - GEM_BUG_ON(READ_ONCE(ce->inflight)); - } - i915_gem_context_unpin_hw_id(ce->gem_context); - - intel_ring_unpin(ce->ring); - i915_gem_object_unpin_map(ce->state->obj); - __context_unpin(ce->state); + intel_ring_unpin(ce->ring); } static void @@ -1512,7 +1463,10 @@ __execlists_context_pin(struct intel_context *ce, goto err; GEM_BUG_ON(!ce->state); - ret = __context_pin(ce->state); + ret = intel_context_active_acquire(ce, + engine->i915->ggtt.pin_bias | + PIN_OFFSET_BIAS | + PIN_HIGH); if (ret) goto err; @@ -1521,7 +1475,7 @@ __execlists_context_pin(struct intel_context *ce, I915_MAP_OVERRIDE); if (IS_ERR(vaddr)) { ret = PTR_ERR(vaddr); - goto unpin_vma; + goto unpin_active; } ret = intel_ring_pin(ce->ring); @@ -1542,8 +1496,8 @@ __execlists_context_pin(struct intel_context *ce, intel_ring_unpin(ce->ring); unpin_map: i915_gem_object_unpin_map(ce->state->obj); -unpin_vma: - __context_unpin(ce->state); +unpin_active: + intel_context_active_release(ce); err: return ret; } diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c index b3bf47e8162f..cc901edec09a 100644 --- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c @@ -1349,45 +1349,9 @@ static void __context_unpin_ppgtt(struct i915_gem_context *ctx) gen6_ppgtt_unpin(i915_vm_to_ppgtt(vm)); } -static int __context_pin(struct intel_context *ce) -{ - struct i915_vma *vma; - int err; - - vma = ce->state; - if (!vma) - return 0; - - err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL | PIN_HIGH); - if (err) - return err; - - /* - * And mark is as a globally pinned object to let the shrinker know - * it cannot reclaim the object until we release it. 
- */ - vma->obj->pin_global++; - vma->obj->mm.dirty = true; - - return 0; -} - -static void __context_unpin(struct intel_context *ce) -{ - struct i915_vma *vma; - - vma = ce->state; - if (!vma) - return; - - vma->obj->pin_global--; - i915_vma_unpin(vma); -} - static void ring_context_unpin(struct intel_context *ce) { __context_unpin_ppgtt(ce->gem_context); - __context_unpin(ce); } static struct i915_vma * @@ -1477,18 +1441,18 @@ static int ring_context_pin(struct intel_context *ce) ce->state = vma; } - err = __context_pin(ce); + err = intel_context_active_acquire(ce, PIN_HIGH); if (err) return err; err = __context_pin_ppgtt(ce->gem_context); if (err) - goto err_unpin; + goto err_active; return 0; -err_unpin: - __context_unpin(ce); +err_active: + intel_context_active_release(ce); return err; } diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 6d7562769eb2..d1ef515bac8d 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -146,12 +146,18 @@ static void mock_context_destroy(struct kref *ref) static int mock_context_pin(struct intel_context *ce) { + int ret; + if (!ce->ring) { ce->ring = mock_ring(ce->engine); if (!ce->ring) return -ENOMEM; } + ret = intel_context_active_acquire(ce, PIN_HIGH); + if (ret) + return ret; + mock_timeline_pin(ce->ring->timeline); return 0; } @@ -328,14 +334,9 @@ void mock_engine_free(struct intel_engine_cs *engine) { struct mock_engine *mock = container_of(engine, typeof(*mock), base); - struct intel_context *ce; GEM_BUG_ON(timer_pending(&mock->hw_delay)); - ce = fetch_and_zero(&engine->last_retired_context); - if (ce) - intel_context_unpin(ce); - intel_context_unpin(engine->kernel_context); intel_engine_fini_breadcrumbs(engine); diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 863ae12707ba..2d019ac6db20 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -157,6 +157,7 @@ void i915_active_init(struct drm_i915_private *i915, ref->retire = retire; ref->tree = RB_ROOT; i915_active_request_init(&ref->last, NULL, last_retire); + init_llist_head(&ref->barriers); ref->count = 0; } @@ -263,6 +264,83 @@ void i915_active_fini(struct i915_active *ref) } #endif +int i915_active_acquire_preallocate_barrier(struct i915_active *ref, + struct intel_engine_cs *engine) +{ + struct drm_i915_private *i915 = engine->i915; + unsigned long tmp; + int err = 0; + + GEM_BUG_ON(!engine->mask); + for_each_engine_masked(engine, i915, engine->mask, tmp) { + struct intel_context *kctx = engine->kernel_context; + struct active_node *node; + + node = kmem_cache_alloc(global.slab_cache, GFP_KERNEL); + if (unlikely(!node)) { + err = -ENOMEM; + break; + } + + i915_active_request_init(&node->base, + (void *)engine, node_retire); + node->timeline = kctx->ring->timeline->fence_context; + node->ref = ref; + ref->count++; + + llist_add((struct llist_node *)&node->base.link, + &ref->barriers); + } + + return err; +} + +void i915_active_acquire_barrier(struct i915_active *ref) +{ + struct llist_node *pos, *next; + + i915_active_acquire(ref); + + llist_for_each_safe(pos, next, llist_del_all(&ref->barriers)) { + struct intel_engine_cs *engine; + struct active_node *node; + struct rb_node **p, *parent; + + node = container_of((struct list_head *)pos, + typeof(*node), base.link); + + engine = (void *)rcu_access_pointer(node->base.request); + RCU_INIT_POINTER(node->base.request, ERR_PTR(-EAGAIN)); + + parent = NULL; + p = &ref->tree.rb_node; 
+ while (*p) { + parent = *p; + if (rb_entry(parent, + struct active_node, + node)->timeline < node->timeline) + p = &parent->rb_right; + else + p = &parent->rb_left; + } + rb_link_node(&node->node, parent, p); + rb_insert_color(&node->node, &ref->tree); + + llist_add((struct llist_node *)&node->base.link, + &engine->barrier_tasks); + } + i915_active_release(ref); +} + +void i915_request_add_barriers(struct i915_request *rq) +{ + struct intel_engine_cs *engine = rq->engine; + struct llist_node *node, *next; + + llist_for_each_safe(node, next, llist_del_all(&engine->barrier_tasks)) + list_add_tail((struct list_head *)node, &rq->active_list); +} + int i915_active_request_set(struct i915_active_request *active, struct i915_request *rq) { diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index 7d758719ce39..d55d37673944 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -406,4 +406,9 @@ void i915_active_fini(struct i915_active *ref); static inline void i915_active_fini(struct i915_active *ref) { } #endif +int i915_active_acquire_preallocate_barrier(struct i915_active *ref, + struct intel_engine_cs *engine); +void i915_active_acquire_barrier(struct i915_active *ref); +void i915_request_add_barriers(struct i915_request *rq); + #endif /* _I915_ACTIVE_H_ */ diff --git a/drivers/gpu/drm/i915/i915_active_types.h b/drivers/gpu/drm/i915/i915_active_types.h index b679253b53a5..c025991b9233 100644 --- a/drivers/gpu/drm/i915/i915_active_types.h +++ b/drivers/gpu/drm/i915/i915_active_types.h @@ -7,6 +7,7 @@ #ifndef _I915_ACTIVE_TYPES_H_ #define _I915_ACTIVE_TYPES_H_ +#include #include #include @@ -31,6 +32,8 @@ struct i915_active { unsigned int count; void (*retire)(struct i915_active *ref); + + struct llist_head barriers; }; #endif /* _I915_ACTIVE_TYPES_H_ */ diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 217236596202..335efeaad4f1 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1199,10 +1199,6 @@ void i915_gem_sanitize(struct drm_i915_private *i915) intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL); intel_runtime_pm_put(&i915->runtime_pm, wakeref); - - mutex_lock(&i915->drm.struct_mutex); - i915_gem_contexts_lost(i915); - mutex_unlock(&i915->drm.struct_mutex); } void i915_gem_init_swizzling(struct drm_i915_private *dev_priv) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index da76e4d1c7f1..38d112d8aba7 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -213,18 +213,6 @@ static void __retire_engine_request(struct intel_engine_cs *engine, spin_unlock(&rq->lock); local_irq_enable(); - - /* - * The backing object for the context is done after switching to the - * *next* context. Therefore we cannot retire the previous context until - * the next context has already started running. However, since we - * cannot take the required locks at i915_request_submit() we - * defer the unpinning of the active context to now, retirement of - * the subsequent request. 
- */ - if (engine->last_retired_context) - intel_context_unpin(engine->last_retired_context); - engine->last_retired_context = rq->hw_context; } static void __retire_engine_upto(struct intel_engine_cs *engine, @@ -759,9 +747,6 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp) rq->infix = rq->ring->emit; /* end of header; start of user payload */ - /* Keep a second pin for the dual retirement along engine and ring */ - __intel_context_pin(ce); - intel_context_mark_active(ce); return rq; diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c index 82f2be6cec5b..64bc51400ae7 100644 --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c @@ -56,7 +56,6 @@ static void mock_device_release(struct drm_device *dev) mutex_lock(&i915->drm.struct_mutex); mock_device_flush(i915); - i915_gem_contexts_lost(i915); mutex_unlock(&i915->drm.struct_mutex); flush_work(&i915->gem.idle_work);

From patchwork Fri Jun 14 16:46:05 2019
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 10996019
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Fri, 14 Jun 2019 17:46:05 +0100
Message-Id: <20190614164606.15633-2-chris@chris-wilson.co.uk>
In-Reply-To: <20190614164606.15633-1-chris@chris-wilson.co.uk>
References: <20190614164606.15633-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [CI 2/3] drm/i915: Stop retiring along engine

We no longer track the execution order along the engine, and so we no longer need to enforce retirement ordering along the engine.
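To illustrate the shape this patch moves retirement towards, here is a small self-contained C sketch (illustrative names only, not the driver code): retirement walks a single per-ring list in submission order and simply stops at the first request that has not yet completed, with no separate retirement pass along the engine:

/* Hypothetical stand-ins; the driver's types and locking are elided. */
#include <stdbool.h>
#include <stdio.h>

struct request {
	unsigned int seqno;
	bool completed;
	struct request *next; /* ring->request_list, oldest first */
};

/* Mirrors the new bool-returning retire: bail out if not yet completed. */
static bool request_retire(struct request *rq)
{
	if (!rq->completed)
		return false;
	printf("retired request %u\n", rq->seqno);
	return true;
}

static void ring_retire_requests(struct request *oldest)
{
	for (struct request *rq = oldest; rq; rq = rq->next)
		if (!request_retire(rq))
			break; /* everything after this is also incomplete */
}

int main(void)
{
	struct request r3 = { 3, false, NULL };
	struct request r2 = { 2, true, &r3 };
	struct request r1 = { 1, true, &r2 };

	ring_retire_requests(&r1); /* retires 1 and 2, stops at 3 */
	return 0;
}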
Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/i915_request.c | 131 +++++++++++----------------- 1 file changed, 53 insertions(+), 78 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 38d112d8aba7..c99136f78af9 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -183,72 +183,23 @@ static void free_capture_list(struct i915_request *request) } } -static void __retire_engine_request(struct intel_engine_cs *engine, - struct i915_request *rq) -{ - GEM_TRACE("%s(%s) fence %llx:%lld, current %d\n", - __func__, engine->name, - rq->fence.context, rq->fence.seqno, - hwsp_seqno(rq)); - - GEM_BUG_ON(!i915_request_completed(rq)); - - local_irq_disable(); - - spin_lock(&engine->timeline.lock); - GEM_BUG_ON(!list_is_first(&rq->link, &engine->timeline.requests)); - list_del_init(&rq->link); - spin_unlock(&engine->timeline.lock); - - spin_lock(&rq->lock); - i915_request_mark_complete(rq); - if (!i915_request_signaled(rq)) - dma_fence_signal_locked(&rq->fence); - if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) - i915_request_cancel_breadcrumb(rq); - if (rq->waitboost) { - GEM_BUG_ON(!atomic_read(&rq->i915->gt_pm.rps.num_waiters)); - atomic_dec(&rq->i915->gt_pm.rps.num_waiters); - } - spin_unlock(&rq->lock); - - local_irq_enable(); -} - -static void __retire_engine_upto(struct intel_engine_cs *engine, - struct i915_request *rq) -{ - struct i915_request *tmp; - - if (list_empty(&rq->link)) - return; - - do { - tmp = list_first_entry(&engine->timeline.requests, - typeof(*tmp), link); - - GEM_BUG_ON(tmp->engine != engine); - __retire_engine_request(engine, tmp); - } while (tmp != rq); -} - -static void i915_request_retire(struct i915_request *request) +static bool i915_request_retire(struct i915_request *rq) { struct i915_active_request *active, *next; - GEM_TRACE("%s fence %llx:%lld, current %d\n", - request->engine->name, - request->fence.context, request->fence.seqno, - hwsp_seqno(request)); + lockdep_assert_held(&rq->i915->drm.struct_mutex); + if (!i915_request_completed(rq)) + return false; - lockdep_assert_held(&request->i915->drm.struct_mutex); - GEM_BUG_ON(!i915_sw_fence_signaled(&request->submit)); - GEM_BUG_ON(!i915_request_completed(request)); + GEM_TRACE("%s fence %llx:%lld, current %d\n", + rq->engine->name, + rq->fence.context, rq->fence.seqno, + hwsp_seqno(rq)); - trace_i915_request_retire(request); + GEM_BUG_ON(!i915_sw_fence_signaled(&rq->submit)); + trace_i915_request_retire(rq); - advance_ring(request); - free_capture_list(request); + advance_ring(rq); /* * Walk through the active list, calling retire on each. This allows @@ -260,7 +211,7 @@ static void i915_request_retire(struct i915_request *request) * pass along the auxiliary information (to avoid dereferencing * the node after the callback).
*/ - list_for_each_entry_safe(active, next, &request->active_list, link) { + list_for_each_entry_safe(active, next, &rq->active_list, link) { /* * In microbenchmarks or focusing upon time inside the kernel, * we may spend an inordinate amount of time simply handling @@ -276,18 +227,39 @@ static void i915_request_retire(struct i915_request *request) INIT_LIST_HEAD(&active->link); RCU_INIT_POINTER(active->request, NULL); - active->retire(active, request); + active->retire(active, rq); + } + + local_irq_disable(); + + spin_lock(&rq->engine->timeline.lock); + list_del(&rq->link); + spin_unlock(&rq->engine->timeline.lock); + + spin_lock(&rq->lock); + i915_request_mark_complete(rq); + if (!i915_request_signaled(rq)) + dma_fence_signal_locked(&rq->fence); + if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) + i915_request_cancel_breadcrumb(rq); + if (rq->waitboost) { + GEM_BUG_ON(!atomic_read(&rq->i915->gt_pm.rps.num_waiters)); + atomic_dec(&rq->i915->gt_pm.rps.num_waiters); } + spin_unlock(&rq->lock); + + local_irq_enable(); - i915_request_remove_from_client(request); + intel_context_exit(rq->hw_context); + intel_context_unpin(rq->hw_context); - __retire_engine_upto(request->engine, request); + i915_request_remove_from_client(rq); - intel_context_exit(request->hw_context); - intel_context_unpin(request->hw_context); + free_capture_list(rq); + i915_sched_node_fini(&rq->sched); + i915_request_put(rq); - i915_sched_node_fini(&request->sched); - i915_request_put(request); + return true; } void i915_request_retire_upto(struct i915_request *rq) @@ -309,9 +281,7 @@ void i915_request_retire_upto(struct i915_request *rq) do { tmp = list_first_entry(&ring->request_list, typeof(*tmp), ring_link); - - i915_request_retire(tmp); - } while (tmp != rq); + } while (i915_request_retire(tmp) && tmp != rq); } static void irq_execute_cb(struct irq_work *wrk) @@ -600,12 +570,9 @@ static void ring_retire_requests(struct intel_ring *ring) { struct i915_request *rq, *rn; - list_for_each_entry_safe(rq, rn, &ring->request_list, ring_link) { - if (!i915_request_completed(rq)) + list_for_each_entry_safe(rq, rn, &ring->request_list, ring_link) + if (!i915_request_retire(rq)) break; - - i915_request_retire(rq); - } } static noinline struct i915_request * @@ -620,6 +587,15 @@ request_alloc_slow(struct intel_context *ce, gfp_t gfp) if (!gfpflags_allow_blocking(gfp)) goto out; + /* Move our oldest request to the slab-cache (if not in use!) */ + rq = list_first_entry(&ring->request_list, typeof(*rq), ring_link); + i915_request_retire(rq); + + rq = kmem_cache_alloc(global.slab_requests, + gfp | __GFP_RETRY_MAYFAIL | __GFP_NOWARN); + if (rq) + return rq; + /* Ratelimit ourselves to prevent oom from malicious clients */ rq = list_last_entry(&ring->request_list, typeof(*rq), ring_link); cond_synchronize_rcu(rq->rcustate); @@ -777,8 +753,7 @@ i915_request_create(struct intel_context *ce) /* Move our oldest request to the slab-cache (if not in use!) 
*/ rq = list_first_entry(&ce->ring->request_list, typeof(*rq), ring_link); - if (!list_is_last(&rq->ring_link, &ce->ring->request_list) && - i915_request_completed(rq)) + if (!list_is_last(&rq->ring_link, &ce->ring->request_list)) i915_request_retire(rq); intel_context_enter(ce);

From patchwork Fri Jun 14 16:46:06 2019
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 10996021
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Fri, 14 Jun 2019 17:46:06 +0100
Message-Id: <20190614164606.15633-3-chris@chris-wilson.co.uk>
In-Reply-To: <20190614164606.15633-1-chris@chris-wilson.co.uk>
References: <20190614164606.15633-1-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [CI 3/3] drm/i915: Replace engine->timeline with a plain list

To continue the onslaught of removing the assumption of a global execution ordering, another casualty is the engine->timeline. Without an actual timeline to track, it is overkill, and we can replace it with a much less grand plain list. We still need a list of the requests in flight, for the simple purpose of finding them (for retiring, resetting, preemption, etc.).
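As a rough illustration of the data-structure change (all names below are hypothetical stand-ins, not the driver's API): the engine keeps only a plain, spinlock-protected list of in-flight requests, which is enough for the find-and-iterate uses described above. The real lookup also checks whether a request has started and matches the current ring; this sketch keeps just the essential skip-completed walk:

#include <pthread.h>
#include <stdio.h>

struct request {
	unsigned int seqno;
	int completed;
	struct request *next;
};

/* Mirrors the new engine->active pair: a lock plus a plain list. */
struct engine_active {
	pthread_spinlock_t lock;
	struct request *requests; /* in-flight requests, oldest first */
};

/* Find the oldest request that has not yet completed. */
static struct request *find_active_request(struct engine_active *active)
{
	struct request *rq, *found = NULL;

	pthread_spin_lock(&active->lock);
	for (rq = active->requests; rq; rq = rq->next) {
		if (rq->completed)
			continue;
		found = rq;
		break;
	}
	pthread_spin_unlock(&active->lock);
	return found;
}

int main(void)
{
	struct request r2 = { 2, 0, NULL };
	struct request r1 = { 1, 1, &r2 };
	struct engine_active active = { .requests = &r1 };

	pthread_spin_init(&active.lock, PTHREAD_PROCESS_PRIVATE);
	printf("active request: %u\n", find_active_request(&active)->seqno);
	pthread_spin_destroy(&active.lock);
	return 0;
}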
Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/intel_engine.h | 6 ++ drivers/gpu/drm/i915/gt/intel_engine_cs.c | 62 ++++++------ drivers/gpu/drm/i915/gt/intel_engine_types.h | 6 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 95 ++++++++++--------- drivers/gpu/drm/i915/gt/intel_reset.c | 10 +- drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 15 ++- drivers/gpu/drm/i915/gt/mock_engine.c | 18 ++-- drivers/gpu/drm/i915/i915_gpu_error.c | 5 +- drivers/gpu/drm/i915/i915_request.c | 43 +++------ drivers/gpu/drm/i915/i915_request.h | 2 +- drivers/gpu/drm/i915/i915_scheduler.c | 38 ++++---- drivers/gpu/drm/i915/i915_timeline.c | 1 - drivers/gpu/drm/i915/i915_timeline.h | 19 ---- drivers/gpu/drm/i915/i915_timeline_types.h | 4 - drivers/gpu/drm/i915/intel_guc_submission.c | 16 ++-- .../gpu/drm/i915/selftests/mock_timeline.c | 1 - 16 files changed, 153 insertions(+), 188 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index 3b5a6d152997..2f1c6871ee95 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -565,4 +565,10 @@ static inline bool inject_preempt_hang(struct intel_engine_execlists *execlists) #endif +void intel_engine_init_active(struct intel_engine_cs *engine, + unsigned int subclass); +#define ENGINE_PHYSICAL 0 +#define ENGINE_MOCK 1 +#define ENGINE_VIRTUAL 2 + #endif /* _INTEL_RINGBUFFER_H_ */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 22242e927baa..898692989313 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -617,14 +617,7 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine) if (err) return err; - err = i915_timeline_init(engine->i915, - &engine->timeline, - engine->status_page.vma); - if (err) - goto err_hwsp; - - i915_timeline_set_subclass(&engine->timeline, TIMELINE_ENGINE); - + intel_engine_init_active(engine, ENGINE_PHYSICAL); intel_engine_init_breadcrumbs(engine); intel_engine_init_execlists(engine); intel_engine_init_hangcheck(engine); @@ -637,10 +630,6 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine) intel_sseu_from_device_info(&RUNTIME_INFO(engine->i915)->sseu); return 0; - -err_hwsp: - cleanup_status_page(engine); - return err; } /** @@ -797,6 +786,27 @@ static int pin_context(struct i915_gem_context *ctx, return 0; } +void +intel_engine_init_active(struct intel_engine_cs *engine, unsigned int subclass) +{ + INIT_LIST_HEAD(&engine->active.requests); + + spin_lock_init(&engine->active.lock); + lockdep_set_subclass(&engine->active.lock, subclass); + + /* + * Due to an interesting quirk in lockdep's internal debug tracking, + * after setting a subclass we must ensure the lock is used. Otherwise, + * nr_unused_locks is incremented once too often. + */ +#ifdef CONFIG_DEBUG_LOCK_ALLOC + local_irq_disable(); + lock_map_acquire(&engine->active.lock.dep_map); + lock_map_release(&engine->active.lock.dep_map); + local_irq_enable(); +#endif +} + /** * intel_engines_init_common - initialize cengine state which might require hw access * @engine: Engine to initialize. 
@@ -860,6 +870,8 @@ int intel_engine_init_common(struct intel_engine_cs *engine) */ void intel_engine_cleanup_common(struct intel_engine_cs *engine) { + GEM_BUG_ON(!list_empty(&engine->active.requests)); + cleanup_status_page(engine); intel_engine_fini_breadcrumbs(engine); @@ -874,8 +886,6 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine) intel_context_unpin(engine->kernel_context); GEM_BUG_ON(!llist_empty(&engine->barrier_tasks)); - i915_timeline_fini(&engine->timeline); - intel_wa_list_free(&engine->ctx_wa_list); intel_wa_list_free(&engine->wa_list); intel_wa_list_free(&engine->whitelist); @@ -1482,16 +1492,6 @@ void intel_engine_dump(struct intel_engine_cs *engine, drm_printf(m, "\tRequests:\n"); - rq = list_first_entry(&engine->timeline.requests, - struct i915_request, link); - if (&rq->link != &engine->timeline.requests) - print_request(m, rq, "\t\tfirst "); - - rq = list_last_entry(&engine->timeline.requests, - struct i915_request, link); - if (&rq->link != &engine->timeline.requests) - print_request(m, rq, "\t\tlast "); - rq = intel_engine_find_active_request(engine); if (rq) { print_request(m, rq, "\t\tactive "); @@ -1572,7 +1572,7 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine) if (!intel_engine_supports_stats(engine)) return -ENODEV; - spin_lock_irqsave(&engine->timeline.lock, flags); + spin_lock_irqsave(&engine->active.lock, flags); write_seqlock(&engine->stats.lock); if (unlikely(engine->stats.enabled == ~0)) { @@ -1598,7 +1598,7 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine) unlock: write_sequnlock(&engine->stats.lock); - spin_unlock_irqrestore(&engine->timeline.lock, flags); + spin_unlock_irqrestore(&engine->active.lock, flags); return err; } @@ -1683,22 +1683,22 @@ intel_engine_find_active_request(struct intel_engine_cs *engine) * At all other times, we must assume the GPU is still running, but * we only care about the snapshot of this moment. */ - spin_lock_irqsave(&engine->timeline.lock, flags); - list_for_each_entry(request, &engine->timeline.requests, link) { + spin_lock_irqsave(&engine->active.lock, flags); + list_for_each_entry(request, &engine->active.requests, sched.link) { if (i915_request_completed(request)) continue; if (!i915_request_started(request)) - break; + continue; /* More than one preemptible request may match! */ if (!match_ring(request)) - break; + continue; active = request; break; } - spin_unlock_irqrestore(&engine->timeline.lock, flags); + spin_unlock_irqrestore(&engine->active.lock, flags); return active; } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 33a31aa2d2ae..b2faca8e5dec 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -288,7 +288,11 @@ struct intel_engine_cs { struct intel_ring *buffer; - struct i915_timeline timeline; + struct { + spinlock_t lock; + struct list_head requests; + } active; + struct llist_head barrier_tasks; struct intel_context *kernel_context; /* pinned */ diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index d0a51752386f..c400c66d0ee5 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -298,8 +298,8 @@ static inline bool need_preempt(const struct intel_engine_cs *engine, * Check against the first request in ELSP[1], it will, thanks to the * power of PI, be the highest priority of that context. 
*/ - if (!list_is_last(&rq->link, &engine->timeline.requests) && - rq_prio(list_next_entry(rq, link)) > last_prio) + if (!list_is_last(&rq->sched.link, &engine->active.requests) && + rq_prio(list_next_entry(rq, sched.link)) > last_prio) return true; if (rb) { @@ -434,11 +434,11 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) struct list_head *uninitialized_var(pl); int prio = I915_PRIORITY_INVALID; - lockdep_assert_held(&engine->timeline.lock); + lockdep_assert_held(&engine->active.lock); list_for_each_entry_safe_reverse(rq, rn, - &engine->timeline.requests, - link) { + &engine->active.requests, + sched.link) { struct intel_engine_cs *owner; if (i915_request_completed(rq)) @@ -465,7 +465,7 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) } GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root)); - list_add(&rq->sched.link, pl); + list_move(&rq->sched.link, pl); active = rq; } else { rq->engine = owner; @@ -933,11 +933,11 @@ static void execlists_dequeue(struct intel_engine_cs *engine) rb_entry(rb, typeof(*ve), nodes[engine->id].rb); struct i915_request *rq; - spin_lock(&ve->base.timeline.lock); + spin_lock(&ve->base.active.lock); rq = ve->request; if (unlikely(!rq)) { /* lost the race to a sibling */ - spin_unlock(&ve->base.timeline.lock); + spin_unlock(&ve->base.active.lock); rb_erase_cached(rb, &execlists->virtual); RB_CLEAR_NODE(rb); rb = rb_first_cached(&execlists->virtual); @@ -950,13 +950,13 @@ static void execlists_dequeue(struct intel_engine_cs *engine) if (rq_prio(rq) >= queue_prio(execlists)) { if (!virtual_matches(ve, rq, engine)) { - spin_unlock(&ve->base.timeline.lock); + spin_unlock(&ve->base.active.lock); rb = rb_next(rb); continue; } if (last && !can_merge_rq(last, rq)) { - spin_unlock(&ve->base.timeline.lock); + spin_unlock(&ve->base.active.lock); return; /* leave this rq for another engine */ } @@ -1011,7 +1011,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) last = rq; } - spin_unlock(&ve->base.timeline.lock); + spin_unlock(&ve->base.active.lock); break; } @@ -1068,8 +1068,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine) GEM_BUG_ON(port_isset(port)); } - list_del_init(&rq->sched.link); - __i915_request_submit(rq); trace_i915_request_in(rq, port_index(port, execlists)); @@ -1170,7 +1168,7 @@ static void process_csb(struct intel_engine_cs *engine) const u8 num_entries = execlists->csb_size; u8 head, tail; - lockdep_assert_held(&engine->timeline.lock); + lockdep_assert_held(&engine->active.lock); /* * Note that csb_write, csb_status may be either in HWSP or mmio. 
@@ -1330,7 +1328,7 @@ static void process_csb(struct intel_engine_cs *engine) static void __execlists_submission_tasklet(struct intel_engine_cs *const engine) { - lockdep_assert_held(&engine->timeline.lock); + lockdep_assert_held(&engine->active.lock); process_csb(engine); if (!execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT)) @@ -1351,15 +1349,16 @@ static void execlists_submission_tasklet(unsigned long data) !!intel_wakeref_active(&engine->wakeref), engine->execlists.active); - spin_lock_irqsave(&engine->timeline.lock, flags); + spin_lock_irqsave(&engine->active.lock, flags); __execlists_submission_tasklet(engine); - spin_unlock_irqrestore(&engine->timeline.lock, flags); + spin_unlock_irqrestore(&engine->active.lock, flags); } static void queue_request(struct intel_engine_cs *engine, struct i915_sched_node *node, int prio) { + GEM_BUG_ON(!list_empty(&node->link)); list_add_tail(&node->link, i915_sched_lookup_priolist(engine, prio)); } @@ -1390,7 +1389,7 @@ static void execlists_submit_request(struct i915_request *request) unsigned long flags; /* Will be called from irq-context when using foreign fences. */ - spin_lock_irqsave(&engine->timeline.lock, flags); + spin_lock_irqsave(&engine->active.lock, flags); queue_request(engine, &request->sched, rq_prio(request)); @@ -1399,7 +1398,7 @@ static void execlists_submit_request(struct i915_request *request) submit_queue(engine, rq_prio(request)); - spin_unlock_irqrestore(&engine->timeline.lock, flags); + spin_unlock_irqrestore(&engine->active.lock, flags); } static void __execlists_context_fini(struct intel_context *ce) @@ -2050,8 +2049,8 @@ static void execlists_reset_prepare(struct intel_engine_cs *engine) intel_engine_stop_cs(engine); /* And flush any current direct submission. */ - spin_lock_irqsave(&engine->timeline.lock, flags); - spin_unlock_irqrestore(&engine->timeline.lock, flags); + spin_lock_irqsave(&engine->active.lock, flags); + spin_unlock_irqrestore(&engine->active.lock, flags); } static bool lrc_regs_ok(const struct i915_request *rq) @@ -2094,11 +2093,11 @@ static void reset_csb_pointers(struct intel_engine_execlists *execlists) static struct i915_request *active_request(struct i915_request *rq) { - const struct list_head * const list = &rq->engine->timeline.requests; + const struct list_head * const list = &rq->engine->active.requests; const struct intel_context * const context = rq->hw_context; struct i915_request *active = NULL; - list_for_each_entry_from_reverse(rq, list, link) { + list_for_each_entry_from_reverse(rq, list, sched.link) { if (i915_request_completed(rq)) break; @@ -2215,11 +2214,11 @@ static void execlists_reset(struct intel_engine_cs *engine, bool stalled) GEM_TRACE("%s\n", engine->name); - spin_lock_irqsave(&engine->timeline.lock, flags); + spin_lock_irqsave(&engine->active.lock, flags); __execlists_reset(engine, stalled); - spin_unlock_irqrestore(&engine->timeline.lock, flags); + spin_unlock_irqrestore(&engine->active.lock, flags); } static void nop_submission_tasklet(unsigned long data) @@ -2250,12 +2249,12 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine) * submission's irq state, we also wish to remind ourselves that * it is irq state.) */ - spin_lock_irqsave(&engine->timeline.lock, flags); + spin_lock_irqsave(&engine->active.lock, flags); __execlists_reset(engine, true); /* Mark all executing requests as skipped. 
*/ - list_for_each_entry(rq, &engine->timeline.requests, link) { + list_for_each_entry(rq, &engine->active.requests, sched.link) { if (!i915_request_signaled(rq)) dma_fence_set_error(&rq->fence, -EIO); @@ -2286,7 +2285,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine) rb_erase_cached(rb, &execlists->virtual); RB_CLEAR_NODE(rb); - spin_lock(&ve->base.timeline.lock); + spin_lock(&ve->base.active.lock); if (ve->request) { ve->request->engine = engine; __i915_request_submit(ve->request); @@ -2295,7 +2294,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine) ve->base.execlists.queue_priority_hint = INT_MIN; ve->request = NULL; } - spin_unlock(&ve->base.timeline.lock); + spin_unlock(&ve->base.active.lock); } /* Remaining _unready_ requests will be nop'ed when submitted */ @@ -2307,7 +2306,7 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine) GEM_BUG_ON(__tasklet_is_enabled(&execlists->tasklet)); execlists->tasklet.func = nop_submission_tasklet; - spin_unlock_irqrestore(&engine->timeline.lock, flags); + spin_unlock_irqrestore(&engine->active.lock, flags); } static void execlists_reset_finish(struct intel_engine_cs *engine) @@ -3010,12 +3009,18 @@ static int execlists_context_deferred_alloc(struct intel_context *ce, return ret; } +static struct list_head *virtual_queue(struct virtual_engine *ve) +{ + return &ve->base.execlists.default_priolist.requests[0]; +} + static void virtual_context_destroy(struct kref *kref) { struct virtual_engine *ve = container_of(kref, typeof(*ve), context.ref); unsigned int n; + GEM_BUG_ON(!list_empty(virtual_queue(ve))); GEM_BUG_ON(ve->request); GEM_BUG_ON(ve->context.inflight); @@ -3026,13 +3031,13 @@ static void virtual_context_destroy(struct kref *kref) if (RB_EMPTY_NODE(node)) continue; - spin_lock_irq(&sibling->timeline.lock); + spin_lock_irq(&sibling->active.lock); /* Detachment is lazily performed in the execlists tasklet */ if (!RB_EMPTY_NODE(node)) rb_erase_cached(node, &sibling->execlists.virtual); - spin_unlock_irq(&sibling->timeline.lock); + spin_unlock_irq(&sibling->active.lock); } GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.execlists.tasklet)); @@ -3040,8 +3045,6 @@ static void virtual_context_destroy(struct kref *kref) __execlists_context_fini(&ve->context); kfree(ve->bonds); - - i915_timeline_fini(&ve->base.timeline); kfree(ve); } @@ -3161,16 +3164,16 @@ static void virtual_submission_tasklet(unsigned long data) if (unlikely(!(mask & sibling->mask))) { if (!RB_EMPTY_NODE(&node->rb)) { - spin_lock(&sibling->timeline.lock); + spin_lock(&sibling->active.lock); rb_erase_cached(&node->rb, &sibling->execlists.virtual); RB_CLEAR_NODE(&node->rb); - spin_unlock(&sibling->timeline.lock); + spin_unlock(&sibling->active.lock); } continue; } - spin_lock(&sibling->timeline.lock); + spin_lock(&sibling->active.lock); if (!RB_EMPTY_NODE(&node->rb)) { /* @@ -3214,7 +3217,7 @@ static void virtual_submission_tasklet(unsigned long data) tasklet_hi_schedule(&sibling->execlists.tasklet); } - spin_unlock(&sibling->timeline.lock); + spin_unlock(&sibling->active.lock); } local_irq_enable(); } @@ -3231,9 +3234,13 @@ static void virtual_submit_request(struct i915_request *rq) GEM_BUG_ON(ve->base.submit_request != virtual_submit_request); GEM_BUG_ON(ve->request); + GEM_BUG_ON(!list_empty(virtual_queue(ve))); + ve->base.execlists.queue_priority_hint = rq_prio(rq); WRITE_ONCE(ve->request, rq); + list_move_tail(&rq->sched.link, virtual_queue(ve)); + tasklet_schedule(&ve->base.execlists.tasklet); } @@ -3297,10 +3304,7 
@@ -3297,10 +3304,7 @@ intel_execlists_create_virtual(struct i915_gem_context *ctx,
 
 	snprintf(ve->base.name, sizeof(ve->base.name), "virtual");
 
-	err = i915_timeline_init(ctx->i915, &ve->base.timeline, NULL);
-	if (err)
-		goto err_put;
-	i915_timeline_set_subclass(&ve->base.timeline, TIMELINE_VIRTUAL);
+	intel_engine_init_active(&ve->base, ENGINE_VIRTUAL);
 
 	intel_engine_init_execlists(&ve->base);
 
@@ -3311,6 +3315,7 @@ intel_execlists_create_virtual(struct i915_gem_context *ctx,
 	ve->base.submit_request = virtual_submit_request;
 	ve->base.bond_execute = virtual_bond_execute;
 
+	INIT_LIST_HEAD(virtual_queue(ve));
 	ve->base.execlists.queue_priority_hint = INT_MIN;
 	tasklet_init(&ve->base.execlists.tasklet,
 		     virtual_submission_tasklet,
@@ -3465,11 +3470,11 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 	unsigned int count;
 	struct rb_node *rb;
 
-	spin_lock_irqsave(&engine->timeline.lock, flags);
+	spin_lock_irqsave(&engine->active.lock, flags);
 
 	last = NULL;
 	count = 0;
-	list_for_each_entry(rq, &engine->timeline.requests, link) {
+	list_for_each_entry(rq, &engine->active.requests, sched.link) {
 		if (count++ < max - 1)
 			show_request(m, rq, "\t\tE ");
 		else
@@ -3532,7 +3537,7 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 			show_request(m, last, "\t\tV ");
 	}
 
-	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
 void intel_lr_context_reset(struct intel_engine_cs *engine,
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 6e6807a3f748..84c670bdb081 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -49,12 +49,12 @@ static void engine_skip_context(struct i915_request *rq)
 	struct intel_engine_cs *engine = rq->engine;
 	struct i915_gem_context *hung_ctx = rq->gem_context;
 
-	lockdep_assert_held(&engine->timeline.lock);
+	lockdep_assert_held(&engine->active.lock);
 
 	if (!i915_request_is_active(rq))
 		return;
 
-	list_for_each_entry_continue(rq, &engine->timeline.requests, link)
+	list_for_each_entry_continue(rq, &engine->active.requests, sched.link)
 		if (rq->gem_context == hung_ctx)
 			i915_request_skip(rq, -EIO);
 }
@@ -130,7 +130,7 @@ void i915_reset_request(struct i915_request *rq, bool guilty)
 		  rq->fence.seqno,
 		  yesno(guilty));
 
-	lockdep_assert_held(&rq->engine->timeline.lock);
+	lockdep_assert_held(&rq->engine->active.lock);
 	GEM_BUG_ON(i915_request_completed(rq));
 
 	if (guilty) {
@@ -785,10 +785,10 @@ static void nop_submit_request(struct i915_request *request)
 		  engine->name, request->fence.context, request->fence.seqno);
 	dma_fence_set_error(&request->fence, -EIO);
 
-	spin_lock_irqsave(&engine->timeline.lock, flags);
+	spin_lock_irqsave(&engine->active.lock, flags);
 	__i915_request_submit(request);
 	i915_request_mark_complete(request);
-	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
 
 	intel_engine_queue_breadcrumbs(engine);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index cc901edec09a..019bf039f616 100644
--- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -730,14 +730,13 @@ static void reset_prepare(struct intel_engine_cs *engine)
 
 static void reset_ring(struct intel_engine_cs *engine, bool stalled)
 {
-	struct i915_timeline *tl = &engine->timeline;
 	struct i915_request *pos, *rq;
 	unsigned long flags;
 	u32 head;
 
 	rq = NULL;
-	spin_lock_irqsave(&tl->lock, flags);
-	list_for_each_entry(pos, &tl->requests, link) {
+	spin_lock_irqsave(&engine->active.lock, flags);
+	list_for_each_entry(pos, &engine->active.requests, sched.link) {
 		if (!i915_request_completed(pos)) {
 			rq = pos;
 			break;
@@ -791,7 +790,7 @@ static void reset_ring(struct intel_engine_cs *engine, bool stalled)
 	}
 
 	engine->buffer->head = intel_ring_wrap(engine->buffer, head);
-	spin_unlock_irqrestore(&tl->lock, flags);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
 static void reset_finish(struct intel_engine_cs *engine)
@@ -877,10 +876,10 @@ static void cancel_requests(struct intel_engine_cs *engine)
 	struct i915_request *request;
 	unsigned long flags;
 
-	spin_lock_irqsave(&engine->timeline.lock, flags);
+	spin_lock_irqsave(&engine->active.lock, flags);
 
 	/* Mark all submitted requests as skipped. */
-	list_for_each_entry(request, &engine->timeline.requests, link) {
+	list_for_each_entry(request, &engine->active.requests, sched.link) {
 		if (!i915_request_signaled(request))
 			dma_fence_set_error(&request->fence, -EIO);
 
@@ -889,7 +888,7 @@ static void cancel_requests(struct intel_engine_cs *engine)
 
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
-	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
 static void i9xx_submit_request(struct i915_request *request)
@@ -1267,8 +1266,6 @@ intel_engine_create_ring(struct intel_engine_cs *engine,
 
 	GEM_BUG_ON(!is_power_of_2(size));
 	GEM_BUG_ON(RING_CTL_SIZE(size) & ~RING_NR_PAGES);
-	GEM_BUG_ON(timeline == &engine->timeline);
-	lockdep_assert_held(&engine->i915->drm.struct_mutex);
 
 	ring = kzalloc(sizeof(*ring), GFP_KERNEL);
 	if (!ring)
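[Aside: cancel_requests() above is the standard shape for wedging an engine -- poison every unsignalled fence with -EIO, then mark it complete so waiters wake with an error instead of hanging. A toy model of that walk, with illustrative types only (not the driver's):

#include <stdio.h>

#define EIO 5

struct fence { int signaled; int error; };

static void mark_complete(struct fence *f) { f->signaled = 1; }

static void cancel_all(struct fence *fences, int count)
{
	for (int i = 0; i < count; i++) {
		if (!fences[i].signaled)
			fences[i].error = -EIO;	/* poison unsignalled fences */
		mark_complete(&fences[i]);	/* wake any waiters */
	}
}

int main(void)
{
	struct fence fences[2] = { { .signaled = 1 }, { .signaled = 0 } };

	cancel_all(fences, 2);
	printf("errors: %d %d\n", fences[0].error, fences[1].error); /* 0 -5 */
	return 0;
}

Already-signalled fences keep their result; only the ones that never ran report the error.]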
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c
index d1ef515bac8d..086801b51441 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -229,17 +229,17 @@ static void mock_cancel_requests(struct intel_engine_cs *engine)
 	struct i915_request *request;
 	unsigned long flags;
 
-	spin_lock_irqsave(&engine->timeline.lock, flags);
+	spin_lock_irqsave(&engine->active.lock, flags);
 
 	/* Mark all submitted requests as skipped. */
-	list_for_each_entry(request, &engine->timeline.requests, sched.link) {
+	list_for_each_entry(request, &engine->active.requests, sched.link) {
 		if (!i915_request_signaled(request))
 			dma_fence_set_error(&request->fence, -EIO);
 
 		i915_request_mark_complete(request);
 	}
 
-	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
 struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
@@ -285,28 +285,23 @@ int mock_engine_init(struct intel_engine_cs *engine)
 	struct drm_i915_private *i915 = engine->i915;
 	int err;
 
+	intel_engine_init_active(engine, ENGINE_MOCK);
 	intel_engine_init_breadcrumbs(engine);
 	intel_engine_init_execlists(engine);
 	intel_engine_init__pm(engine);
 
-	if (i915_timeline_init(i915, &engine->timeline, NULL))
-		goto err_breadcrumbs;
-	i915_timeline_set_subclass(&engine->timeline, TIMELINE_ENGINE);
-
 	engine->kernel_context =
 		i915_gem_context_get_engine(i915->kernel_context, engine->id);
 	if (IS_ERR(engine->kernel_context))
-		goto err_timeline;
+		goto err_breadcrumbs;
 
 	err = intel_context_pin(engine->kernel_context);
 	intel_context_put(engine->kernel_context);
 	if (err)
-		goto err_timeline;
+		goto err_breadcrumbs;
 
 	return 0;
 
-err_timeline:
-	i915_timeline_fini(&engine->timeline);
 err_breadcrumbs:
 	intel_engine_fini_breadcrumbs(engine);
 	return -ENOMEM;
@@ -340,7 +335,6 @@ void mock_engine_free(struct intel_engine_cs *engine)
 	intel_context_unpin(engine->kernel_context);
 
 	intel_engine_fini_breadcrumbs(engine);
-	i915_timeline_fini(&engine->timeline);
 
 	kfree(engine);
 }
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 26c9c0595bdf..f411e3244208 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1275,7 +1275,7 @@ static void engine_record_requests(struct intel_engine_cs *engine,
 
 	count = 0;
 	request = first;
-	list_for_each_entry_from(request, &engine->timeline.requests, link)
+	list_for_each_entry_from(request, &engine->active.requests, sched.link)
 		count++;
 	if (!count)
 		return;
@@ -1288,7 +1288,8 @@ static void engine_record_requests(struct intel_engine_cs *engine,
 
 	count = 0;
 	request = first;
-	list_for_each_entry_from(request, &engine->timeline.requests, link) {
+	list_for_each_entry_from(request,
+				 &engine->active.requests, sched.link) {
 		if (count >= ee->num_requests) {
 			/*
 			 * If the ring request list was changed in
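[Aside: with the engine timeline gone, mock_engine_init() above loses its err_timeline label and unwinds straight to err_breadcrumbs. A minimal sketch of the kernel's goto-unwind idiom, using hypothetical setup/teardown stubs rather than the driver's:

#include <stdio.h>
#include <stdlib.h>

static int init_breadcrumbs(void) { return 0; }
static void fini_breadcrumbs(void) { puts("fini breadcrumbs"); }
static int pin_context(void) { return -1; /* simulate a failure */ }

static int engine_init(void)
{
	int err;

	err = init_breadcrumbs();
	if (err)
		return err;

	err = pin_context();
	if (err)
		goto err_breadcrumbs;	/* unwind in reverse order of setup */

	return 0;

err_breadcrumbs:
	fini_breadcrumbs();
	return err;
}

int main(void) { return engine_init() ? EXIT_FAILURE : EXIT_SUCCESS; }

Each label tears down exactly what was set up before the failing step, so dropping a resource (the timeline) means dropping its label and retargeting the gotos, as the hunk does.]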
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index c99136f78af9..9819483d1b5d 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -232,9 +232,9 @@ static bool i915_request_retire(struct i915_request *rq)
 
 	local_irq_disable();
 
-	spin_lock(&rq->engine->timeline.lock);
-	list_del(&rq->link);
-	spin_unlock(&rq->engine->timeline.lock);
+	spin_lock(&rq->engine->active.lock);
+	list_del(&rq->sched.link);
+	spin_unlock(&rq->engine->active.lock);
 
 	spin_lock(&rq->lock);
 	i915_request_mark_complete(rq);
@@ -254,6 +254,7 @@ static bool i915_request_retire(struct i915_request *rq)
 
 	intel_context_unpin(rq->hw_context);
 
 	i915_request_remove_from_client(rq);
+	list_del(&rq->link);
 
 	free_capture_list(rq);
 	i915_sched_node_fini(&rq->sched);
@@ -373,28 +374,17 @@ __i915_request_await_execution(struct i915_request *rq,
 	return 0;
 }
 
-static void move_to_timeline(struct i915_request *request,
-			     struct i915_timeline *timeline)
-{
-	GEM_BUG_ON(request->timeline == &request->engine->timeline);
-	lockdep_assert_held(&request->engine->timeline.lock);
-
-	spin_lock(&request->timeline->lock);
-	list_move_tail(&request->link, &timeline->requests);
-	spin_unlock(&request->timeline->lock);
-}
-
 void __i915_request_submit(struct i915_request *request)
 {
 	struct intel_engine_cs *engine = request->engine;
 
-	GEM_TRACE("%s fence %llx:%lld -> current %d\n",
+	GEM_TRACE("%s fence %llx:%lld, current %d\n",
		  engine->name,
		  request->fence.context, request->fence.seqno,
		  hwsp_seqno(request));
 
 	GEM_BUG_ON(!irqs_disabled());
-	lockdep_assert_held(&engine->timeline.lock);
+	lockdep_assert_held(&engine->active.lock);
 
 	if (i915_gem_context_is_banned(request->gem_context))
 		i915_request_skip(request, -EIO);
@@ -422,6 +412,8 @@ void __i915_request_submit(struct i915_request *request)
 	/* We may be recursing from the signal callback of another i915 fence */
 	spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
 
+	list_move_tail(&request->sched.link, &engine->active.requests);
+
 	GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
 	set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags);
 
@@ -437,9 +429,6 @@ void __i915_request_submit(struct i915_request *request)
 		engine->emit_fini_breadcrumb(request,
					     request->ring->vaddr + request->postfix);
 
-	/* Transfer from per-context onto the global per-engine timeline */
-	move_to_timeline(request, &engine->timeline);
-
 	engine->serial++;
 
 	trace_i915_request_execute(request);
@@ -451,11 +440,11 @@ void i915_request_submit(struct i915_request *request)
 	unsigned long flags;
 
 	/* Will be called from irq-context when using foreign fences. */
-	spin_lock_irqsave(&engine->timeline.lock, flags);
+	spin_lock_irqsave(&engine->active.lock, flags);
 
 	__i915_request_submit(request);
 
-	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
 void __i915_request_unsubmit(struct i915_request *request)
@@ -468,7 +457,7 @@ void __i915_request_unsubmit(struct i915_request *request)
		  hwsp_seqno(request));
 
 	GEM_BUG_ON(!irqs_disabled());
-	lockdep_assert_held(&engine->timeline.lock);
+	lockdep_assert_held(&engine->active.lock);
 
 	/*
	 * Only unwind in reverse order, required so that the per-context list
@@ -486,9 +475,6 @@ void __i915_request_unsubmit(struct i915_request *request)
 
 	spin_unlock(&request->lock);
 
-	/* Transfer back from the global per-engine timeline to per-context */
-	move_to_timeline(request, request->timeline);
-
 	/* We've already spun, don't charge on resubmitting. */
 	if (request->sched.semaphores && i915_request_started(request)) {
		request->sched.attr.priority |= I915_PRIORITY_NOSEMAPHORE;
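[Aside: after this change a request is linked twice -- rq->link stays on its timeline's list while rq->sched.link joins the engine's active.requests at submit -- so move_to_timeline() and its cross-list shuffle disappear. A standalone sketch of one node on two lists via two embedded list_heads; container_of() recovers the owner from either (illustrative re-implementation, not driver code):

#include <stdio.h>
#include <stddef.h>

struct list_head { struct list_head *prev, *next; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct request {
	int seqno;
	struct list_head link;		/* lives on timeline->requests */
	struct list_head sched_link;	/* lives on engine->active.requests */
};

static void list_add_tail(struct list_head *e, struct list_head *h)
{
	e->prev = h->prev;
	e->next = h;
	h->prev->next = e;
	h->prev = e;
}

int main(void)
{
	struct list_head timeline = { &timeline, &timeline };
	struct list_head active = { &active, &active };
	struct request rq = { .seqno = 1 };

	list_add_tail(&rq.link, &timeline);	/* on construction */
	list_add_tail(&rq.sched_link, &active);	/* on submission */

	printf("%d %d\n",
	       container_of(timeline.next, struct request, link)->seqno,
	       container_of(active.next, struct request, sched_link)->seqno);
	return 0;
}

Each list can then be protected by its own lock (the timeline mutex vs engine->active.lock), which is the point of the split.]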
@@ -510,11 +496,11 @@ void i915_request_unsubmit(struct i915_request *request)
 	unsigned long flags;
 
 	/* Will be called from irq-context when using foreign fences. */
-	spin_lock_irqsave(&engine->timeline.lock, flags);
+	spin_lock_irqsave(&engine->active.lock, flags);
 
 	__i915_request_unsubmit(request);
 
-	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
 static int __i915_sw_fence_call
@@ -669,7 +655,6 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	rq->engine = ce->engine;
 	rq->ring = ce->ring;
 	rq->timeline = tl;
-	GEM_BUG_ON(rq->timeline == &ce->engine->timeline);
 	rq->hwsp_seqno = tl->hwsp_seqno;
 	rq->hwsp_cacheline = tl->hwsp_cacheline;
 	rq->rcustate = get_state_synchronize_rcu(); /* acts as smp_mb() */
@@ -1136,9 +1121,7 @@ __i915_request_add_to_timeline(struct i915_request *rq)
					     0);
 	}
 
-	spin_lock_irq(&timeline->lock);
 	list_add_tail(&rq->link, &timeline->requests);
-	spin_unlock_irq(&timeline->lock);
 
 	/*
	 * Make sure that no request gazumped us - if it was allocated after
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index c9f7d07991c8..edbbdfec24ab 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -217,7 +217,7 @@ struct i915_request {
 
 	bool waitboost;
 
-	/** engine->request_list entry for this request */
+	/** timeline->requests entry for this request */
 	struct list_head link;
 
 	/** ring->request_list entry for this request */
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index 78ceb56d7801..2e9b38bdc33c 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -77,7 +77,7 @@ i915_sched_lookup_priolist(struct intel_engine_cs *engine, int prio)
 	bool first = true;
 	int idx, i;
 
-	lockdep_assert_held(&engine->timeline.lock);
+	lockdep_assert_held(&engine->active.lock);
 	assert_priolists(execlists);
 
 	/* buckets sorted from highest [in slot 0] to lowest priority */
@@ -162,9 +162,9 @@ sched_lock_engine(const struct i915_sched_node *node,
	 * check that the rq still belongs to the newly locked engine.
	 */
 	while (locked != (engine = READ_ONCE(rq->engine))) {
-		spin_unlock(&locked->timeline.lock);
+		spin_unlock(&locked->active.lock);
 		memset(cache, 0, sizeof(*cache));
-		spin_lock(&engine->timeline.lock);
+		spin_lock(&engine->active.lock);
 		locked = engine;
 	}
 
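[Aside: sched_lock_engine() above chases a moving target -- rq->engine may be rewritten (by the virtual engine) while we spin for the lock, so it locks, re-reads and retries until the two agree. A userspace model of that loop with pthreads; the names are hypothetical and a mutex stands in for engine->active.lock:

#include <pthread.h>
#include <stdio.h>

struct engine { pthread_mutex_t lock; };
struct request { struct engine *engine; };	/* may be rewritten concurrently */

/*
 * Caller holds locked->lock. If the request migrated to another engine
 * while we waited, drop the stale lock and chase the new owner.
 */
static struct engine *lock_engine(struct request *rq, struct engine *locked)
{
	struct engine *engine;

	/* __atomic_load_n() plays the role of the kernel's READ_ONCE() */
	while (locked != (engine = __atomic_load_n(&rq->engine,
						   __ATOMIC_RELAXED))) {
		pthread_mutex_unlock(&locked->lock);
		pthread_mutex_lock(&engine->lock);
		locked = engine;
	}

	return engine;
}

int main(void)
{
	struct engine e0 = { PTHREAD_MUTEX_INITIALIZER };
	struct engine e1 = { PTHREAD_MUTEX_INITIALIZER };
	struct request rq = { .engine = &e1 };
	struct engine *locked;

	pthread_mutex_lock(&e0.lock);		/* we guessed e0 ... */
	locked = lock_engine(&rq, &e0);		/* ... but rq lives on e1 */
	printf("locked the right engine: %d\n", locked == &e1);
	pthread_mutex_unlock(&locked->lock);
	return 0;
}

The re-read after taking the new lock is what makes the loop safe: only once the engine pointer is stable under its own lock can the caller trust it.]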
@@ -189,7 +189,7 @@ static void kick_submission(struct intel_engine_cs *engine, int prio)
	 * tasklet, i.e. we have not change the priority queue
	 * sufficiently to oust the running context.
	 */
-	if (inflight && !i915_scheduler_need_preempt(prio, rq_prio(inflight)))
+	if (!inflight || !i915_scheduler_need_preempt(prio, rq_prio(inflight)))
 		return;
 
 	tasklet_hi_schedule(&engine->execlists.tasklet);
@@ -278,7 +278,7 @@ static void __i915_schedule(struct i915_sched_node *node,
 
 	memset(&cache, 0, sizeof(cache));
 	engine = node_to_request(node)->engine;
-	spin_lock(&engine->timeline.lock);
+	spin_lock(&engine->active.lock);
 
 	/* Fifo and depth-first replacement ensure our deps execute before us */
 	engine = sched_lock_engine(node, engine, &cache);
@@ -287,7 +287,7 @@ static void __i915_schedule(struct i915_sched_node *node,
 
 		node = dep->signaler;
 		engine = sched_lock_engine(node, engine, &cache);
-		lockdep_assert_held(&engine->timeline.lock);
+		lockdep_assert_held(&engine->active.lock);
 
 		/* Recheck after acquiring the engine->timeline.lock */
 		if (prio <= node->attr.priority || node_signaled(node))
@@ -296,14 +296,8 @@ static void __i915_schedule(struct i915_sched_node *node,
 		GEM_BUG_ON(node_to_request(node)->engine != engine);
 
 		node->attr.priority = prio;
-		if (!list_empty(&node->link)) {
-			GEM_BUG_ON(intel_engine_is_virtual(engine));
-			if (!cache.priolist)
-				cache.priolist =
-					i915_sched_lookup_priolist(engine,
-								   prio);
-			list_move_tail(&node->link, cache.priolist);
-		} else {
+
+		if (list_empty(&node->link)) {
			/*
			 * If the request is not in the priolist queue because
			 * it is not yet runnable, then it doesn't contribute
@@ -312,8 +306,16 @@ static void __i915_schedule(struct i915_sched_node *node,
			 * queue; but in that case we may still need to reorder
			 * the inflight requests.
			 */
-			if (!i915_sw_fence_done(&node_to_request(node)->submit))
-				continue;
+			continue;
+		}
+
+		if (!intel_engine_is_virtual(engine) &&
+		    !i915_request_is_active(node_to_request(node))) {
+			if (!cache.priolist)
+				cache.priolist =
+					i915_sched_lookup_priolist(engine,
+								   prio);
+			list_move_tail(&node->link, cache.priolist);
 		}
 
 		if (prio <= engine->execlists.queue_priority_hint)
@@ -325,7 +327,7 @@ static void __i915_schedule(struct i915_sched_node *node,
 		kick_submission(engine, prio);
 	}
 
-	spin_unlock(&engine->timeline.lock);
+	spin_unlock(&engine->active.lock);
 }
 
 void i915_schedule(struct i915_request *rq, const struct i915_sched_attr *attr)
@@ -439,8 +441,6 @@ void i915_sched_node_fini(struct i915_sched_node *node)
 {
 	struct i915_dependency *dep, *tmp;
 
-	GEM_BUG_ON(!list_empty(&node->link));
-
 	spin_lock_irq(&schedule_lock);
 
 	/*
diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c
index 000e1a9b6750..c311ce9c6f9d 100644
--- a/drivers/gpu/drm/i915/i915_timeline.c
+++ b/drivers/gpu/drm/i915/i915_timeline.c
@@ -251,7 +251,6 @@ int i915_timeline_init(struct drm_i915_private *i915,
 
 	timeline->fence_context = dma_fence_context_alloc(1);
 
-	spin_lock_init(&timeline->lock);
 	mutex_init(&timeline->mutex);
 
 	INIT_ACTIVE_REQUEST(&timeline->last_request);
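[Aside: the __i915_schedule() rework above reduces to a three-way decision per node: not yet runnable means nothing to reorder, queued means resort into the priolist, inflight means leave it alone and let the tasklet kick decide on preemption. A condensed, hypothetical restatement with stub helpers (not driver code):

#include <stdbool.h>
#include <stdio.h>

struct node { bool queued, inflight, on_virtual; int prio; };

static void move_to_priolist(struct node *n, int prio) { n->prio = prio; }
static void kick_tasklet(void) { /* may trigger preemption */ }

/* Bump @n to @prio following the rules the hunk above adopts. */
static void bump_priority(struct node *n, int prio)
{
	if (!n->queued)
		return;			/* not yet runnable: nothing to reorder */

	if (!n->on_virtual && !n->inflight)
		move_to_priolist(n, prio);	/* still queued: resort */

	kick_tasklet();		/* inflight work is only ever preempted */
}

int main(void)
{
	struct node queued = { .queued = true };
	struct node running = { .queued = true, .inflight = true };

	bump_priority(&queued, 2);
	bump_priority(&running, 2);
	printf("%d %d\n", queued.prio, running.prio);	/* 2 0 */
	return 0;
}

The inflight case is the new one: since requests now stay on node->link after submission, the scheduler must check i915_request_is_active() rather than assume a non-empty link means "still queued".]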
diff --git a/drivers/gpu/drm/i915/i915_timeline.h b/drivers/gpu/drm/i915/i915_timeline.h
index 27668a1a69a3..36e5e5a65155 100644
--- a/drivers/gpu/drm/i915/i915_timeline.h
+++ b/drivers/gpu/drm/i915/i915_timeline.h
@@ -36,25 +36,6 @@ int i915_timeline_init(struct drm_i915_private *i915,
		       struct i915_vma *hwsp);
 void i915_timeline_fini(struct i915_timeline *tl);
 
-static inline void
-i915_timeline_set_subclass(struct i915_timeline *timeline,
-			   unsigned int subclass)
-{
-	lockdep_set_subclass(&timeline->lock, subclass);
-
-	/*
-	 * Due to an interesting quirk in lockdep's internal debug tracking,
-	 * after setting a subclass we must ensure the lock is used. Otherwise,
-	 * nr_unused_locks is incremented once too often.
-	 */
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	local_irq_disable();
-	lock_map_acquire(&timeline->lock.dep_map);
-	lock_map_release(&timeline->lock.dep_map);
-	local_irq_enable();
-#endif
-}
-
 struct i915_timeline *
 i915_timeline_create(struct drm_i915_private *i915,
		     struct i915_vma *global_hwsp);
diff --git a/drivers/gpu/drm/i915/i915_timeline_types.h b/drivers/gpu/drm/i915/i915_timeline_types.h
index 1688705f4a2b..fce5cb4f1090 100644
--- a/drivers/gpu/drm/i915/i915_timeline_types.h
+++ b/drivers/gpu/drm/i915/i915_timeline_types.h
@@ -23,10 +23,6 @@ struct i915_timeline {
 	u64 fence_context;
 	u32 seqno;
 
-	spinlock_t lock;
-#define TIMELINE_CLIENT 0 /* default subclass */
-#define TIMELINE_ENGINE 1
-#define TIMELINE_VIRTUAL 2
 	struct mutex mutex; /* protects the flow of requests */
 
 	unsigned int pin_count;
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 97f6970d8da8..db531ebc7704 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -740,7 +740,7 @@ static bool __guc_dequeue(struct intel_engine_cs *engine)
 	bool submit = false;
 	struct rb_node *rb;
 
-	lockdep_assert_held(&engine->timeline.lock);
+	lockdep_assert_held(&engine->active.lock);
 
 	if (port_isset(port)) {
 		if (intel_engine_has_preemption(engine)) {
@@ -822,7 +822,7 @@ static void guc_submission_tasklet(unsigned long data)
 	struct i915_request *rq;
 	unsigned long flags;
 
-	spin_lock_irqsave(&engine->timeline.lock, flags);
+	spin_lock_irqsave(&engine->active.lock, flags);
 
 	rq = port_request(port);
 	while (rq && i915_request_completed(rq)) {
@@ -847,7 +847,7 @@ static void guc_submission_tasklet(unsigned long data)
 	if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_PREEMPT))
 		guc_dequeue(engine);
 
-	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
 static void guc_reset_prepare(struct intel_engine_cs *engine)
@@ -884,7 +884,7 @@ static void guc_reset(struct intel_engine_cs *engine, bool stalled)
 	struct i915_request *rq;
 	unsigned long flags;
 
-	spin_lock_irqsave(&engine->timeline.lock, flags);
+	spin_lock_irqsave(&engine->active.lock, flags);
 
 	execlists_cancel_port_requests(execlists);
 
@@ -900,7 +900,7 @@ static void guc_reset(struct intel_engine_cs *engine, bool stalled)
 	intel_lr_context_reset(engine, rq->hw_context, rq->head, stalled);
 
 out_unlock:
-	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
 static void guc_cancel_requests(struct intel_engine_cs *engine)
@@ -926,13 +926,13 @@ static void guc_cancel_requests(struct intel_engine_cs *engine)
	 * submission's irq state, we also wish to remind ourselves that
	 * it is irq state.)
	 */
-	spin_lock_irqsave(&engine->timeline.lock, flags);
+	spin_lock_irqsave(&engine->active.lock, flags);
 
 	/* Cancel the requests on the HW and clear the ELSP tracker. */
 	execlists_cancel_port_requests(execlists);
 
 	/* Mark all executing requests as skipped. */
-	list_for_each_entry(rq, &engine->timeline.requests, link) {
+	list_for_each_entry(rq, &engine->active.requests, sched.link) {
 		if (!i915_request_signaled(rq))
 			dma_fence_set_error(&rq->fence, -EIO);
 
@@ -961,7 +961,7 @@ static void guc_cancel_requests(struct intel_engine_cs *engine)
 	execlists->queue = RB_ROOT_CACHED;
 	GEM_BUG_ON(port_isset(execlists->port));
 
-	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+	spin_unlock_irqrestore(&engine->active.lock, flags);
 }
 
 static void guc_reset_finish(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/selftests/mock_timeline.c b/drivers/gpu/drm/i915/selftests/mock_timeline.c
index e084476469ef..65b52be23d42 100644
--- a/drivers/gpu/drm/i915/selftests/mock_timeline.c
+++ b/drivers/gpu/drm/i915/selftests/mock_timeline.c
@@ -13,7 +13,6 @@ void mock_timeline_init(struct i915_timeline *timeline, u64 context)
 	timeline->i915 = NULL;
 	timeline->fence_context = context;
 
-	spin_lock_init(&timeline->lock);
 	mutex_init(&timeline->mutex);
 
 	INIT_ACTIVE_REQUEST(&timeline->last_request);