From patchwork Thu Jul 30 09:37:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692653 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4120914DD for ; Thu, 30 Jul 2020 09:38:42 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2A09B20809 for ; Thu, 30 Jul 2020 09:38:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2A09B20809 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3EEBB6E8BC; Thu, 30 Jul 2020 09:38:41 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id E150B6E8AD for ; Thu, 30 Jul 2020 09:38:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979049-1500050 for multiple; Thu, 30 Jul 2020 10:37:59 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:36 +0100 Message-Id: <20200730093756.16737-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 01/21] drm/i915: Add a couple of missing i915_active_fini() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" We use i915_active_fini() as a debug check on the i915_active state before freeing. If we forget to call it, we may end up angering the debugobjects contained within. Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/display/intel_frontbuffer.c | 2 ++ drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 5 ++++- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/display/intel_frontbuffer.c b/drivers/gpu/drm/i915/display/intel_frontbuffer.c index 2979ed2588eb..d898b370d7a4 100644 --- a/drivers/gpu/drm/i915/display/intel_frontbuffer.c +++ b/drivers/gpu/drm/i915/display/intel_frontbuffer.c @@ -232,6 +232,8 @@ static void frontbuffer_release(struct kref *ref) RCU_INIT_POINTER(obj->frontbuffer, NULL); spin_unlock(&to_i915(obj->base.dev)->fb_tracking.lock); + i915_active_fini(&front->write); + i915_gem_object_put(obj); kfree_rcu(front, rcu); } diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c index 73243ba59c7d..e73854dd2fe0 100644 --- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c @@ -47,7 +47,10 @@ static int pulse_active(struct i915_active *active) static void pulse_free(struct kref *kref) { - kfree(container_of(kref, struct pulse, kref)); + struct pulse *p = container_of(kref, typeof(*p), kref); + + i915_active_fini(&p->active); + kfree(p); } static void pulse_put(struct pulse *p) From patchwork Thu Jul 30 09:37:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692625 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D9844722 for ; Thu, 30 Jul 2020 09:38:14 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C20CA2075F for ; Thu, 30 Jul 2020 09:38:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C20CA2075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 51B176E8B3; Thu, 30 Jul 2020 09:38:11 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id F14EE6E8B2 for ; Thu, 30 Jul 2020 09:38:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979050-1500050 for multiple; Thu, 30 Jul 2020 10:37:59 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:37 +0100 Message-Id: <20200730093756.16737-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 02/21] drm/i915: Skip taking acquire mutex for no ref->active callback X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" If no active callback is defined for i915_active, we do not need to serialise its enabling with the mutex. We still do only want to call the debug activate once, and must still serialise with a concurrent retire. Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/i915_active.c | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index d960d0be5bd2..500537889e66 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -416,6 +416,14 @@ bool i915_active_acquire_if_busy(struct i915_active *ref) return atomic_add_unless(&ref->count, 1, 0); } +static void __i915_active_activate(struct i915_active *ref) +{ + spin_lock_irq(&ref->tree_lock); /* __active_retire() */ + if (!atomic_fetch_inc(&ref->count)) + debug_active_activate(ref); + spin_unlock_irq(&ref->tree_lock); +} + int i915_active_acquire(struct i915_active *ref) { int err; @@ -423,19 +431,19 @@ int i915_active_acquire(struct i915_active *ref) if (i915_active_acquire_if_busy(ref)) return 0; + if (!ref->active) { + __i915_active_activate(ref); + return 0; + } + err = mutex_lock_interruptible(&ref->mutex); if (err) return err; if (likely(!i915_active_acquire_if_busy(ref))) { - if (ref->active) - err = ref->active(ref); - if (!err) { - spin_lock_irq(&ref->tree_lock); /* __active_retire() */ - debug_active_activate(ref); - atomic_inc(&ref->count); - spin_unlock_irq(&ref->tree_lock); - } + err = ref->active(ref); + if (!err) + __i915_active_activate(ref); } mutex_unlock(&ref->mutex); From patchwork Thu Jul 30 09:37:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692649 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 650AC14DD for ; Thu, 30 Jul 2020 09:38:30 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4D8DE2075F for ; Thu, 30 Jul 2020 09:38:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4D8DE2075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 732196E8B9; Thu, 30 Jul 2020 09:38:25 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0C4176E8BA for ; Thu, 30 Jul 2020 09:38:11 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979051-1500050 for multiple; Thu, 30 Jul 2020 10:37:59 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:38 +0100 Message-Id: <20200730093756.16737-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 03/21] drm/i915: Export a preallocate variant of i915_active_acquire() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Sometimes we have to be very careful not to allocate underneath a mutex (or spinlock) and yet still want to track activity. Enter i915_active_acquire_for_context(). This raises the activity counter on i915_active prior to use and ensures that the fence-tree contains a slot for the context. v2: Refactor active_lookup() so it can be called again before/after locking to resolve contention. Since we protect the rbtree until we idle, we can do a lockfree lookup, with the caveat that if another thread performs a concurrent insertion, the rotations from the insert may cause us to not find our target. A second pass holding the treelock will find the target if it exists, or the place to perform our insertion. Signed-off-by: Chris Wilson Reviewed-by: Thomas Hellström --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/gt/intel_timeline.c | 4 +- drivers/gpu/drm/i915/i915_active.c | 150 ++++++++++++++---- drivers/gpu/drm/i915/i915_active.h | 12 +- 4 files changed, 130 insertions(+), 38 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index b7a86cdec9b5..07cb2dd0f795 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1729,7 +1729,7 @@ __parser_mark_active(struct i915_vma *vma, { struct intel_gt_buffer_pool_node *node = vma->private; - return i915_active_ref(&node->active, tl, fence); + return i915_active_ref(&node->active, tl->fence_context, fence); } static int diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 46d20f5f3ddc..acb43aebd669 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -484,7 +484,9 @@ __intel_timeline_get_seqno(struct intel_timeline *tl, * free it after the current request is retired, which ensures that * all writes into the cacheline from previous requests are complete. */ - err = i915_active_ref(&tl->hwsp_cacheline->active, tl, &rq->fence); + err = i915_active_ref(&tl->hwsp_cacheline->active, + tl->fence_context, + &rq->fence); if (err) goto err_cacheline; diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 500537889e66..3a728401c09c 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -28,12 +28,14 @@ static struct i915_global_active { } global; struct active_node { + struct rb_node node; struct i915_active_fence base; struct i915_active *ref; - struct rb_node node; u64 timeline; }; +#define fetch_node(x) rb_entry(READ_ONCE(x), typeof(struct active_node), node) + static inline struct active_node * node_from_active(struct i915_active_fence *active) { @@ -216,12 +218,9 @@ excl_retire(struct dma_fence *fence, struct dma_fence_cb *cb) active_retire(container_of(cb, struct i915_active, excl.cb)); } -static struct i915_active_fence * -active_instance(struct i915_active *ref, struct intel_timeline *tl) +static struct active_node *__active_lookup(struct i915_active *ref, u64 idx) { - struct active_node *node, *prealloc; - struct rb_node **p, *parent; - u64 idx = tl->fence_context; + struct active_node *it; /* * We track the most recently used timeline to skip a rbtree search @@ -230,8 +229,39 @@ active_instance(struct i915_active *ref, struct intel_timeline *tl) * after the previous activity has been retired, or if it matches the * current timeline. */ - node = READ_ONCE(ref->cache); - if (node && node->timeline == idx) + it = READ_ONCE(ref->cache); + if (it && it->timeline == idx) + return it; + + BUILD_BUG_ON(offsetof(typeof(*it), node)); + + /* While active, the tree can only be built; not destroyed */ + GEM_BUG_ON(i915_active_is_idle(ref)); + + it = fetch_node(ref->tree.rb_node); + while (it) { + if (it->timeline < idx) { + it = fetch_node(it->node.rb_right); + } else if (it->timeline > idx) { + it = fetch_node(it->node.rb_left); + } else { + WRITE_ONCE(ref->cache, it); + break; + } + } + + /* NB: If the tree rotated beneath us, we may miss our target. */ + return it; +} + +static struct i915_active_fence * +active_instance(struct i915_active *ref, u64 idx) +{ + struct active_node *node, *prealloc; + struct rb_node **p, *parent; + + node = __active_lookup(ref, idx); + if (likely(node)) return &node->base; /* Preallocate a replacement, just in case */ @@ -268,10 +298,9 @@ active_instance(struct i915_active *ref, struct intel_timeline *tl) rb_insert_color(&node->node, &ref->tree); out: - ref->cache = node; + WRITE_ONCE(ref->cache, node); spin_unlock_irq(&ref->tree_lock); - BUILD_BUG_ON(offsetof(typeof(*node), base)); return &node->base; } @@ -353,63 +382,102 @@ __active_del_barrier(struct i915_active *ref, struct active_node *node) return ____active_del_barrier(ref, node, barrier_to_engine(node)); } -int i915_active_ref(struct i915_active *ref, - struct intel_timeline *tl, - struct dma_fence *fence) +static bool +replace_barrier(struct i915_active *ref, struct i915_active_fence *active) +{ + if (!is_barrier(active)) /* proto-node used by our idle barrier? */ + return false; + + /* + * This request is on the kernel_context timeline, and so + * we can use it to substitute for the pending idle-barrer + * request that we want to emit on the kernel_context. + */ + __active_del_barrier(ref, node_from_active(active)); + return true; +} + +int i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence) { struct i915_active_fence *active; int err; - lockdep_assert_held(&tl->mutex); - /* Prevent reaping in case we malloc/wait while building the tree */ err = i915_active_acquire(ref); if (err) return err; - active = active_instance(ref, tl); + active = active_instance(ref, idx); if (!active) { err = -ENOMEM; goto out; } - if (is_barrier(active)) { /* proto-node used by our idle barrier */ - /* - * This request is on the kernel_context timeline, and so - * we can use it to substitute for the pending idle-barrer - * request that we want to emit on the kernel_context. - */ - __active_del_barrier(ref, node_from_active(active)); + if (replace_barrier(ref, active)) { RCU_INIT_POINTER(active->fence, NULL); atomic_dec(&ref->count); } if (!__i915_active_fence_set(active, fence)) - atomic_inc(&ref->count); + __i915_active_acquire(ref); out: i915_active_release(ref); return err; } -struct dma_fence * -i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f) +static struct dma_fence * +__i915_active_set_fence(struct i915_active *ref, + struct i915_active_fence *active, + struct dma_fence *fence) { struct dma_fence *prev; - /* We expect the caller to manage the exclusive timeline ordering */ - GEM_BUG_ON(i915_active_is_idle(ref)); + if (replace_barrier(ref, active)) { + RCU_INIT_POINTER(active->fence, fence); + return NULL; + } rcu_read_lock(); - prev = __i915_active_fence_set(&ref->excl, f); + prev = __i915_active_fence_set(active, fence); if (prev) prev = dma_fence_get_rcu(prev); else - atomic_inc(&ref->count); + __i915_active_acquire(ref); rcu_read_unlock(); return prev; } +static struct i915_active_fence * +__active_fence(struct i915_active *ref, u64 idx) +{ + struct active_node *it; + + it = __active_lookup(ref, idx); + if (unlikely(!it)) { /* Contention with parallel tree builders! */ + spin_lock_irq(&ref->tree_lock); + it = __active_lookup(ref, idx); + spin_unlock_irq(&ref->tree_lock); + } + GEM_BUG_ON(!it); /* slot must be preallocated */ + + return &it->base; +} + +struct dma_fence * +__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence) +{ + /* Only valid while active, see i915_active_acquire_for_context() */ + return __i915_active_set_fence(ref, __active_fence(ref, idx), fence); +} + +struct dma_fence * +i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f) +{ + /* We expect the caller to manage the exclusive timeline ordering */ + return __i915_active_set_fence(ref, &ref->excl, f); +} + bool i915_active_acquire_if_busy(struct i915_active *ref) { debug_active_assert(ref); @@ -451,6 +519,24 @@ int i915_active_acquire(struct i915_active *ref) return err; } +int i915_active_acquire_for_context(struct i915_active *ref, u64 idx) +{ + struct i915_active_fence *active; + int err; + + err = i915_active_acquire(ref); + if (err) + return err; + + active = active_instance(ref, idx); + if (!active) { + i915_active_release(ref); + return -ENOMEM; + } + + return 0; /* return with active ref */ +} + void i915_active_release(struct i915_active *ref) { debug_active_assert(ref); @@ -754,7 +840,7 @@ static struct active_node *reuse_idle_barrier(struct i915_active *ref, u64 idx) match: rb_erase(p, &ref->tree); /* Hide from waits and sibling allocations */ if (p == &ref->cache->node) - ref->cache = NULL; + WRITE_ONCE(ref->cache, NULL); spin_unlock_irq(&ref->tree_lock); return rb_entry(p, struct active_node, node); @@ -812,7 +898,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, */ RCU_INIT_POINTER(node->base.fence, ERR_PTR(-EAGAIN)); node->base.cb.node.prev = (void *)engine; - atomic_inc(&ref->count); + __i915_active_acquire(ref); } GEM_BUG_ON(rcu_access_pointer(node->base.fence) != ERR_PTR(-EAGAIN)); diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index cf4058150966..73ded3c52a04 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -163,14 +163,16 @@ void __i915_active_init(struct i915_active *ref, __i915_active_init(ref, active, retire, &__mkey, &__wkey); \ } while (0) -int i915_active_ref(struct i915_active *ref, - struct intel_timeline *tl, - struct dma_fence *fence); +struct dma_fence * +__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence); +int i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence); static inline int i915_active_add_request(struct i915_active *ref, struct i915_request *rq) { - return i915_active_ref(ref, i915_request_timeline(rq), &rq->fence); + return i915_active_ref(ref, + i915_request_timeline(rq)->fence_context, + &rq->fence); } struct dma_fence * @@ -198,7 +200,9 @@ int i915_request_await_active(struct i915_request *rq, #define I915_ACTIVE_AWAIT_BARRIER BIT(2) int i915_active_acquire(struct i915_active *ref); +int i915_active_acquire_for_context(struct i915_active *ref, u64 idx); bool i915_active_acquire_if_busy(struct i915_active *ref); + void i915_active_release(struct i915_active *ref); static inline void __i915_active_acquire(struct i915_active *ref) From patchwork Thu Jul 30 09:37:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692643 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 667AB722 for ; Thu, 30 Jul 2020 09:38:28 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4F3B22075F for ; Thu, 30 Jul 2020 09:38:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4F3B22075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 99ABC6E8B0; Thu, 30 Jul 2020 09:38:24 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 281A96E8AF for ; Thu, 30 Jul 2020 09:38:11 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979052-1500050 for multiple; Thu, 30 Jul 2020 10:38:00 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:39 +0100 Message-Id: <20200730093756.16737-5-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 04/21] drm/i915: Keep the most recently used active-fence upon discard X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Whenever an i915_active idles, we prune its tree of old fence slots to prevent a gradual leak should it be used to track many, many timelines. The downside is that we then have to frequently reallocate the rbtree. A compromise is that we keep the most recently used fence slot, and reuse that for the next active reference as that is the most likely timeline to be reused. Signed-off-by: Chris Wilson Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/i915_active.c | 27 ++++++++++++++++++++------- drivers/gpu/drm/i915/i915_active.h | 4 ---- 2 files changed, 20 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 3a728401c09c..b9bd5578ff54 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -130,8 +130,8 @@ static inline void debug_active_assert(struct i915_active *ref) { } static void __active_retire(struct i915_active *ref) { + struct rb_root root = RB_ROOT; struct active_node *it, *n; - struct rb_root root; unsigned long flags; GEM_BUG_ON(i915_active_is_idle(ref)); @@ -143,9 +143,21 @@ __active_retire(struct i915_active *ref) GEM_BUG_ON(rcu_access_pointer(ref->excl.fence)); debug_active_deactivate(ref); - root = ref->tree; - ref->tree = RB_ROOT; - ref->cache = NULL; + /* Even if we have not used the cache, we may still have a barrier */ + if (!ref->cache) + ref->cache = fetch_node(ref->tree.rb_node); + + /* Keep the MRU cached node for reuse */ + if (ref->cache) { + /* Discard all other nodes in the tree */ + rb_erase(&ref->cache->node, &ref->tree); + root = ref->tree; + + /* Rebuild the tree with only the cached node */ + rb_link_node(&ref->cache->node, NULL, &ref->tree.rb_node); + rb_insert_color(&ref->cache->node, &ref->tree); + GEM_BUG_ON(ref->tree.rb_node != &ref->cache->node); + } spin_unlock_irqrestore(&ref->tree_lock, flags); @@ -156,6 +168,7 @@ __active_retire(struct i915_active *ref) /* ... except if you wait on it, you must manage your own references! */ wake_up_var(ref); + /* Finally free the discarded timeline tree */ rbtree_postorder_for_each_entry_safe(it, n, &root, node) { GEM_BUG_ON(i915_active_fence_isset(&it->base)); kmem_cache_free(global.slab_cache, it); @@ -745,16 +758,16 @@ int i915_sw_fence_await_active(struct i915_sw_fence *fence, return await_active(ref, flags, sw_await_fence, fence, fence); } -#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) void i915_active_fini(struct i915_active *ref) { debug_active_fini(ref); GEM_BUG_ON(atomic_read(&ref->count)); GEM_BUG_ON(work_pending(&ref->work)); - GEM_BUG_ON(!RB_EMPTY_ROOT(&ref->tree)); mutex_destroy(&ref->mutex); + + if (ref->cache) + kmem_cache_free(global.slab_cache, ref->cache); } -#endif static inline bool is_idle_barrier(struct active_node *node, u64 idx) { diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index 73ded3c52a04..b9e0394e2975 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -217,11 +217,7 @@ i915_active_is_idle(const struct i915_active *ref) return !atomic_read(&ref->count); } -#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) void i915_active_fini(struct i915_active *ref); -#else -static inline void i915_active_fini(struct i915_active *ref) { } -#endif int i915_active_acquire_preallocate_barrier(struct i915_active *ref, struct intel_engine_cs *engine); From patchwork Thu Jul 30 09:37:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692645 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 11BF214DD for ; Thu, 30 Jul 2020 09:38:29 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EE9412075F for ; Thu, 30 Jul 2020 09:38:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EE9412075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BCF3A6E8BA; Thu, 30 Jul 2020 09:38:24 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1973E6E169 for ; Thu, 30 Jul 2020 09:38:10 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979053-1500050 for multiple; Thu, 30 Jul 2020 10:38:00 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:40 +0100 Message-Id: <20200730093756.16737-6-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 05/21] drm/i915: Make the stale cached active node available for any timeline X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Rather than require the next timeline after idling to match the MRU before idling, reset the index on the node and allow it to match the first request. However, this requires cmpxchg(u64) and so is not trivial on 32b, so for compatibility we just fallback to keeping the cached node pointing to the MRU timeline. Signed-off-by: Chris Wilson Reviewed-by: Thomas Hellström Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_active.c | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index b9bd5578ff54..7b51045c8461 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -157,6 +157,10 @@ __active_retire(struct i915_active *ref) rb_link_node(&ref->cache->node, NULL, &ref->tree.rb_node); rb_insert_color(&ref->cache->node, &ref->tree); GEM_BUG_ON(ref->tree.rb_node != &ref->cache->node); + + /* Make the cached node available for reuse with any timeline */ + if (IS_ENABLED(CONFIG_64BIT)) + ref->cache->timeline = 0; /* needs cmpxchg(u64) */ } spin_unlock_irqrestore(&ref->tree_lock, flags); @@ -235,6 +239,8 @@ static struct active_node *__active_lookup(struct i915_active *ref, u64 idx) { struct active_node *it; + GEM_BUG_ON(idx == 0); /* 0 is the unordered timeline, rsvd for cache */ + /* * We track the most recently used timeline to skip a rbtree search * for the common case, under typical loads we never need the rbtree @@ -243,8 +249,28 @@ static struct active_node *__active_lookup(struct i915_active *ref, u64 idx) * current timeline. */ it = READ_ONCE(ref->cache); - if (it && it->timeline == idx) - return it; + if (it) { + u64 cached = READ_ONCE(it->timeline); + + /* Once claimed, this slot will only belong to this idx */ + if (cached == idx) + return it; + +#ifdef CONFIG_64BIT /* for cmpxchg(u64) */ + /* + * An unclaimed cache [.timeline=0] can only be claimed once. + * + * If the value is already non-zero, some other thread has + * claimed the cache and we know that is does not match our + * idx. If, and only if, the timeline is currently zero is it + * worth competing to claim it atomically for ourselves (for + * only the winner of that race will cmpxchg return the old + * value of 0). + */ + if (!cached && !cmpxchg(&it->timeline, 0, idx)) + return it; +#endif + } BUILD_BUG_ON(offsetof(typeof(*it), node)); From patchwork Thu Jul 30 09:37:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692637 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 41EF5912 for ; Thu, 30 Jul 2020 09:38:26 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 298D62075F for ; Thu, 30 Jul 2020 09:38:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 298D62075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 32EE56E8AB; Thu, 30 Jul 2020 09:38:24 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id EA49E6E091 for ; Thu, 30 Jul 2020 09:38:11 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979054-1500050 for multiple; Thu, 30 Jul 2020 10:38:00 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:41 +0100 Message-Id: <20200730093756.16737-7-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 06/21] drm/i915: Reduce locking around i915_active_acquire_preallocate_barrier() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" As the conversion between idle-barrier and full i915_active_fence is already serialised by explicit memory barriers, we can reduce the spinlock in i915_active_acquire_preallocate_barrier() for finding an idle-barrier to reuse to an RCU read lock to ensure the fence remains valid, only taking the spinlock for the update of the rbtree itself. Signed-off-by: Chris Wilson Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/i915_active.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 7b51045c8461..5dd52bb6d38c 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -807,7 +807,6 @@ static struct active_node *reuse_idle_barrier(struct i915_active *ref, u64 idx) if (RB_EMPTY_ROOT(&ref->tree)) return NULL; - spin_lock_irq(&ref->tree_lock); GEM_BUG_ON(i915_active_is_idle(ref)); /* @@ -872,11 +871,10 @@ static struct active_node *reuse_idle_barrier(struct i915_active *ref, u64 idx) goto match; } - spin_unlock_irq(&ref->tree_lock); - return NULL; match: + spin_lock_irq(&ref->tree_lock); rb_erase(p, &ref->tree); /* Hide from waits and sibling allocations */ if (p == &ref->cache->node) WRITE_ONCE(ref->cache, NULL); @@ -911,7 +909,9 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, struct llist_node *prev = first; struct active_node *node; + rcu_read_lock(); node = reuse_idle_barrier(ref, idx); + rcu_read_unlock(); if (!node) { node = kmem_cache_alloc(global.slab_cache, GFP_KERNEL); if (!node) { From patchwork Thu Jul 30 09:37:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692631 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2E9A2912 for ; Thu, 30 Jul 2020 09:38:18 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 173F72075F for ; Thu, 30 Jul 2020 09:38:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 173F72075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D49646E8B7; Thu, 30 Jul 2020 09:38:11 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2494C6E091 for ; Thu, 30 Jul 2020 09:38:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979055-1500050 for multiple; Thu, 30 Jul 2020 10:38:00 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:42 +0100 Message-Id: <20200730093756.16737-8-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 07/21] drm/i915: Provide a fastpath for waiting on vma bindings X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Before we can execute a request, we must wait for all of its vma to be bound. This is a frequent operation for which we can optimise away a few atomic operations (notably a cmpxchg) in lieu of the RCU protection. Signed-off-by: Chris Wilson Reviewed-by: Thomas Hellström Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_active.h | 15 +++++++++++++++ drivers/gpu/drm/i915/i915_vma.c | 9 +++++++-- 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index b9e0394e2975..fb165d3f01cf 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -231,4 +231,19 @@ struct i915_active *i915_active_create(void); struct i915_active *i915_active_get(struct i915_active *ref); void i915_active_put(struct i915_active *ref); +static inline int __i915_request_await_exclusive(struct i915_request *rq, + struct i915_active *active) +{ + struct dma_fence *fence; + int err = 0; + + fence = i915_active_fence_get(&active->excl); + if (fence) { + err = i915_request_await_dma_fence(rq, fence); + dma_fence_put(fence); + } + + return err; +} + #endif /* _I915_ACTIVE_H_ */ diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index bc64f773dcdb..cd12047c7791 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -1167,6 +1167,12 @@ void i915_vma_revoke_mmap(struct i915_vma *vma) list_del(&vma->obj->userfault_link); } +static int +__i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma) +{ + return __i915_request_await_exclusive(rq, &vma->active); +} + int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request *rq) { int err; @@ -1174,8 +1180,7 @@ int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request *rq) GEM_BUG_ON(!i915_vma_is_pinned(vma)); /* Wait for the vma to be bound before we start! */ - err = i915_request_await_active(rq, &vma->active, - I915_ACTIVE_AWAIT_EXCL); + err = __i915_request_await_bind(rq, vma); if (err) return err; From patchwork Thu Jul 30 09:37:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692633 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 11ED8912 for ; Thu, 30 Jul 2020 09:38:19 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EEDA52075F for ; Thu, 30 Jul 2020 09:38:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EEDA52075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 986FC6E8B1; Thu, 30 Jul 2020 09:38:12 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id C37EB6E8B0 for ; Thu, 30 Jul 2020 09:38:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979056-1500050 for multiple; Thu, 30 Jul 2020 10:38:00 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:43 +0100 Message-Id: <20200730093756.16737-9-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 08/21] drm/i915/gem: Reduce ctx->engine_mutex for reading the clone source X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" When cloning the engines from the source context, we need to ensure that the engines are not freed as we copy them, and that the flags we clone from the source correspond with the engines we copy across. To do this we need only take a reference to the src->engines, rather than hold the src->engine_mutex, so long as we verify that nothing changed under the read. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 24 +++++++++++++-------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index d0bdb6d447ed..b5b179f96d77 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -752,7 +752,8 @@ __create_context(struct drm_i915_private *i915) } static inline struct i915_gem_engines * -__context_engines_await(const struct i915_gem_context *ctx) +__context_engines_await(const struct i915_gem_context *ctx, + bool *user_engines) { struct i915_gem_engines *engines; @@ -761,6 +762,10 @@ __context_engines_await(const struct i915_gem_context *ctx) engines = rcu_dereference(ctx->engines); GEM_BUG_ON(!engines); + if (user_engines) + *user_engines = i915_gem_context_user_engines(ctx); + + /* successful await => strong mb */ if (unlikely(!i915_sw_fence_await(&engines->fence))) continue; @@ -784,7 +789,7 @@ context_apply_all(struct i915_gem_context *ctx, struct intel_context *ce; int err = 0; - e = __context_engines_await(ctx); + e = __context_engines_await(ctx, NULL); for_each_gem_engine(ce, e, it) { err = fn(ce, data); if (err) @@ -1117,7 +1122,7 @@ static int context_barrier_task(struct i915_gem_context *ctx, return err; } - e = __context_engines_await(ctx); + e = __context_engines_await(ctx, NULL); if (!e) { i915_active_release(&cb->base); return -ENOENT; @@ -2114,11 +2119,14 @@ static int copy_ring_size(struct intel_context *dst, static int clone_engines(struct i915_gem_context *dst, struct i915_gem_context *src) { - struct i915_gem_engines *e = i915_gem_context_lock_engines(src); - struct i915_gem_engines *clone; + struct i915_gem_engines *clone, *e; bool user_engines; unsigned long n; + e = __context_engines_await(src, &user_engines); + if (!e) + return -ENOENT; + clone = alloc_engines(e->num_engines); if (!clone) goto err_unlock; @@ -2160,9 +2168,7 @@ static int clone_engines(struct i915_gem_context *dst, } } clone->num_engines = n; - - user_engines = i915_gem_context_user_engines(src); - i915_gem_context_unlock_engines(src); + i915_sw_fence_complete(&e->fence); /* Serialised by constructor */ engines_idle_release(dst, rcu_replace_pointer(dst->engines, clone, 1)); @@ -2173,7 +2179,7 @@ static int clone_engines(struct i915_gem_context *dst, return 0; err_unlock: - i915_gem_context_unlock_engines(src); + i915_sw_fence_complete(&e->fence); return -ENOMEM; } From patchwork Thu Jul 30 09:37:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692655 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BC51E722 for ; Thu, 30 Jul 2020 09:38:42 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A4CCF2075F for ; Thu, 30 Jul 2020 09:38:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A4CCF2075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8B4A96E8BD; Thu, 30 Jul 2020 09:38:41 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 331056E8B1 for ; Thu, 30 Jul 2020 09:38:11 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979057-1500050 for multiple; Thu, 30 Jul 2020 10:38:01 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:44 +0100 Message-Id: <20200730093756.16737-10-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 09/21] drm/i915/gem: Reduce ctx->engines_mutex for get_engines() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Take a snapshot of the ctx->engines, so we can avoid taking the ctx->engines_mutex for a mere read in get_engines(). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 39 +++++---------------- 1 file changed, 8 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index b5b179f96d77..f43f0ca4eec9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1862,27 +1862,6 @@ set_engines(struct i915_gem_context *ctx, return 0; } -static struct i915_gem_engines * -__copy_engines(struct i915_gem_engines *e) -{ - struct i915_gem_engines *copy; - unsigned int n; - - copy = alloc_engines(e->num_engines); - if (!copy) - return ERR_PTR(-ENOMEM); - - for (n = 0; n < e->num_engines; n++) { - if (e->engines[n]) - copy->engines[n] = intel_context_get(e->engines[n]); - else - copy->engines[n] = NULL; - } - copy->num_engines = n; - - return copy; -} - static int get_engines(struct i915_gem_context *ctx, struct drm_i915_gem_context_param *args) @@ -1890,19 +1869,17 @@ get_engines(struct i915_gem_context *ctx, struct i915_context_param_engines __user *user; struct i915_gem_engines *e; size_t n, count, size; + bool user_engines; int err = 0; - err = mutex_lock_interruptible(&ctx->engines_mutex); - if (err) - return err; + e = __context_engines_await(ctx, &user_engines); + if (!e) + return -ENOENT; - e = NULL; - if (i915_gem_context_user_engines(ctx)) - e = __copy_engines(i915_gem_context_engines(ctx)); - mutex_unlock(&ctx->engines_mutex); - if (IS_ERR_OR_NULL(e)) { + if (!user_engines) { + i915_sw_fence_complete(&e->fence); args->size = 0; - return PTR_ERR_OR_ZERO(e); + return 0; } count = e->num_engines; @@ -1953,7 +1930,7 @@ get_engines(struct i915_gem_context *ctx, args->size = size; err_free: - free_engines(e); + i915_sw_fence_complete(&e->fence); return err; } From patchwork Thu Jul 30 09:37:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692635 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3F4A1722 for ; Thu, 30 Jul 2020 09:38:25 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 270C92075F for ; Thu, 30 Jul 2020 09:38:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 270C92075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 313E76E169; Thu, 30 Jul 2020 09:38:24 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 284866E8B0 for ; Thu, 30 Jul 2020 09:38:11 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979058-1500050 for multiple; Thu, 30 Jul 2020 10:38:01 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:45 +0100 Message-Id: <20200730093756.16737-11-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 10/21] drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Since the breadcrumb enabling/cancelling itself is serialised by the breadcrumbs.irq_lock, with a bit of care we can remove the outer serialisation with i915_request.lock for concurrent dma_fence_enable_signaling(). This has the important side-effect of eliminating the nested i915_request.lock within request submission. The challenge in serialisation is around the unsubmission where we take an active request that wants a breadcrumb on the signaling engine and put it to sleep. We do not want a concurrent dma_fence_enable_signaling() to attach a breadcrumb as we unsubmit, so we must mark the request as no longer active before serialising with the concurrent enable-signaling. On retire, we serialise with the concurrent enable-signaling, but instead of clearing ACTIVE, we mark it as SIGNALED. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 130 +++++++++++++------- drivers/gpu/drm/i915/gt/intel_lrc.c | 14 --- drivers/gpu/drm/i915/i915_request.c | 39 +++--- 3 files changed, 100 insertions(+), 83 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index 91786310c114..3d211a0c2b5a 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -220,17 +220,17 @@ static void signal_irq_work(struct irq_work *work) } } -static bool __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b) +static void __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b) { struct intel_engine_cs *engine = container_of(b, struct intel_engine_cs, breadcrumbs); lockdep_assert_held(&b->irq_lock); if (b->irq_armed) - return true; + return; if (!intel_gt_pm_get_if_awake(engine->gt)) - return false; + return; /* * The breadcrumb irq will be disarmed on the interrupt after the @@ -250,8 +250,6 @@ static bool __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b) if (!b->irq_enabled++) irq_enable(engine); - - return true; } void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine) @@ -310,57 +308,99 @@ void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine) { } -bool i915_request_enable_breadcrumb(struct i915_request *rq) +static void insert_breadcrumb(struct i915_request *rq, + struct intel_breadcrumbs *b) { - lockdep_assert_held(&rq->lock); + struct intel_context *ce = rq->context; + struct list_head *pos; - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags)) - return true; + if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)) + return; - if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) { - struct intel_breadcrumbs *b = &rq->engine->breadcrumbs; - struct intel_context *ce = rq->context; - struct list_head *pos; + __intel_breadcrumbs_arm_irq(b); - spin_lock(&b->irq_lock); + /* + * We keep the seqno in retirement order, so we can break + * inside intel_engine_signal_breadcrumbs as soon as we've + * passed the last completed request (or seen a request that + * hasn't event started). We could walk the timeline->requests, + * but keeping a separate signalers_list has the advantage of + * hopefully being much smaller than the full list and so + * provides faster iteration and detection when there are no + * more interrupts required for this context. + * + * We typically expect to add new signalers in order, so we + * start looking for our insertion point from the tail of + * the list. + */ + list_for_each_prev(pos, &ce->signals) { + struct i915_request *it = + list_entry(pos, typeof(*it), signal_link); - if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)) - goto unlock; + if (i915_seqno_passed(rq->fence.seqno, it->fence.seqno)) + break; + } + list_add(&rq->signal_link, pos); + if (pos == &ce->signals) /* catch transitions from empty list */ + list_move_tail(&ce->signal_link, &b->signalers); + GEM_BUG_ON(!check_signal_order(ce, rq)); - if (!__intel_breadcrumbs_arm_irq(b)) - goto unlock; + set_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); +} - /* - * We keep the seqno in retirement order, so we can break - * inside intel_engine_signal_breadcrumbs as soon as we've - * passed the last completed request (or seen a request that - * hasn't event started). We could walk the timeline->requests, - * but keeping a separate signalers_list has the advantage of - * hopefully being much smaller than the full list and so - * provides faster iteration and detection when there are no - * more interrupts required for this context. - * - * We typically expect to add new signalers in order, so we - * start looking for our insertion point from the tail of - * the list. - */ - list_for_each_prev(pos, &ce->signals) { - struct i915_request *it = - list_entry(pos, typeof(*it), signal_link); +bool i915_request_enable_breadcrumb(struct i915_request *rq) +{ + struct intel_breadcrumbs *b; - if (i915_seqno_passed(rq->fence.seqno, it->fence.seqno)) - break; - } - list_add(&rq->signal_link, pos); - if (pos == &ce->signals) /* catch transitions from empty list */ - list_move_tail(&ce->signal_link, &b->signalers); - GEM_BUG_ON(!check_signal_order(ce, rq)); + /* Serialises with i915_request_retire() using rq->lock */ + if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags)) + return true; - set_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); -unlock: + /* + * Peek at i915_request_submit()/i915_request_unsubmit() status. + * + * If the request is not yet active (and not signaled), we will + * attach the breadcrumb later. + */ + if (!test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) + return true; + + /* + * rq->engine is locked by rq->engine->active.lock. That however + * is not known until after rq->engine has been dereferenced and + * the lock acquired. Hence we acquire the lock and then validate + * that rq->engine still matches the lock we hold for it. + * + * Here, we are using the breadcrumb lock as a proxy for the + * rq->engine->active.lock, and we know that since the breadcrumb + * will be serialised within i915_request_submit/i915_request_unsubmit, + * the engine cannot change while active as long as we hold the + * breadcrumb lock on that engine. + * + * From the dma_fence_enable_signaling() path, we are outside of the + * request submit/unsubmit path, and so we must be more careful to + * acquire the right lock. + */ + b = &READ_ONCE(rq->engine)->breadcrumbs; + spin_lock(&b->irq_lock); + while (unlikely(b != &READ_ONCE(rq->engine)->breadcrumbs)) { spin_unlock(&b->irq_lock); + b = &READ_ONCE(rq->engine)->breadcrumbs; + spin_lock(&b->irq_lock); } + /* + * Now that we are finally serialised with request submit/unsubmit, + * [with b->irq_lock] and with i915_request_retire() [via checking + * SIGNALED with rq->lock] confirm the request is indeed active. If + * it is no longer active, the breadcrumb will be attached upon + * i915_request_submit(). + */ + if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) + insert_breadcrumb(rq, b); + + spin_unlock(&b->irq_lock); + return !__request_completed(rq); } @@ -368,8 +408,6 @@ void i915_request_cancel_breadcrumb(struct i915_request *rq) { struct intel_breadcrumbs *b = &rq->engine->breadcrumbs; - lockdep_assert_held(&rq->lock); - /* * We must wait for b->irq_lock so that we know the interrupt handler * has released its reference to the intel_context and has completed diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 353b1717fe84..ae886081a431 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1148,20 +1148,6 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine) } else { struct intel_engine_cs *owner = rq->context->engine; - /* - * Decouple the virtual breadcrumb before moving it - * back to the virtual engine -- we don't want the - * request to complete in the background and try - * and cancel the breadcrumb on the virtual engine - * (instead of the old engine where it is linked)! - */ - if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, - &rq->fence.flags)) { - spin_lock_nested(&rq->lock, - SINGLE_DEPTH_NESTING); - i915_request_cancel_breadcrumb(rq); - spin_unlock(&rq->lock); - } WRITE_ONCE(rq->engine, owner); owner->submit_request(rq); active = NULL; diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index a83641b7aa39..350bf6f158cb 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -304,11 +304,12 @@ bool i915_request_retire(struct i915_request *rq) dma_fence_signal_locked(&rq->fence); if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) i915_request_cancel_breadcrumb(rq); + spin_unlock_irq(&rq->lock); + if (i915_request_has_waitboost(rq)) { GEM_BUG_ON(!atomic_read(&rq->engine->gt->rps.num_waiters)); atomic_dec(&rq->engine->gt->rps.num_waiters); } - spin_unlock_irq(&rq->lock); /* * We only loosely track inflight requests across preemption, @@ -591,17 +592,9 @@ bool __i915_request_submit(struct i915_request *request) */ __notify_execute_cb_irq(request); - /* We may be recursing from the signal callback of another i915 fence */ - if (!i915_request_signaled(request)) { - spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING); - - if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, - &request->fence.flags) && - !i915_request_enable_breadcrumb(request)) - intel_engine_signal_breadcrumbs(engine); - - spin_unlock(&request->lock); - } + if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) && + !i915_request_enable_breadcrumb(request)) + intel_engine_signal_breadcrumbs(engine); return result; } @@ -623,27 +616,27 @@ void __i915_request_unsubmit(struct i915_request *request) { struct intel_engine_cs *engine = request->engine; + /* + * Only unwind in reverse order, required so that the per-context list + * is kept in seqno/ring order. + */ RQ_TRACE(request, "\n"); GEM_BUG_ON(!irqs_disabled()); lockdep_assert_held(&engine->active.lock); /* - * Only unwind in reverse order, required so that the per-context list - * is kept in seqno/ring order. + * Before we remove this breadcrumb from the signal list, we have + * to ensure that a concurrent dma_fence_enable_signaling() does not + * attach itself. We first mark the request as no longer active and + * make sure that is visible to other cores, and then remove the + * breadcrumb if attached. */ - - /* We may be recursing from the signal callback of another i915 fence */ - spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING); - + GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags)); + clear_bit_unlock(I915_FENCE_FLAG_ACTIVE, &request->fence.flags); if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags)) i915_request_cancel_breadcrumb(request); - GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags)); - clear_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags); - - spin_unlock(&request->lock); - /* We've already spun, don't charge on resubmitting. */ if (request->sched.semaphores && i915_request_started(request)) request->sched.semaphores = 0; From patchwork Thu Jul 30 09:37:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692657 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 03F20912 for ; Thu, 30 Jul 2020 09:38:49 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E0AD520809 for ; Thu, 30 Jul 2020 09:38:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E0AD520809 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5D08D6E8BE; Thu, 30 Jul 2020 09:38:48 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id B6A626E8AF for ; Thu, 30 Jul 2020 09:38:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979059-1500050 for multiple; Thu, 30 Jul 2020 10:38:01 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:46 +0100 Message-Id: <20200730093756.16737-12-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 11/21] drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbs X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" After staring at the breadcrumb enabling/cancellation and coming to the conclusion that the cause of the mysterious stale breadcrumbs must the act of submitting a completed requests, we can then redirect those completed requests onto a dedicated signaled_list at the time of construction and so eliminate intel_engine_transfer_stale_breadcrumbs(). Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 50 ++++++++------------- drivers/gpu/drm/i915/gt/intel_engine.h | 3 -- drivers/gpu/drm/i915/gt/intel_lrc.c | 15 ------- drivers/gpu/drm/i915/i915_request.c | 5 +-- 4 files changed, 21 insertions(+), 52 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index 3d211a0c2b5a..fbdc465a5870 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -142,16 +142,16 @@ static void add_retire(struct intel_breadcrumbs *b, struct intel_timeline *tl) intel_engine_add_retire(engine, tl); } -static void __signal_request(struct i915_request *rq, struct list_head *signals) +static bool __signal_request(struct i915_request *rq, struct list_head *signals) { - GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)); clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); if (!__dma_fence_signal(&rq->fence)) - return; + return false; i915_request_get(rq); list_add_tail(&rq->signal_link, signals); + return true; } static void signal_irq_work(struct irq_work *work) @@ -278,32 +278,6 @@ void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine) spin_unlock_irqrestore(&b->irq_lock, flags); } -void intel_engine_transfer_stale_breadcrumbs(struct intel_engine_cs *engine, - struct intel_context *ce) -{ - struct intel_breadcrumbs *b = &engine->breadcrumbs; - unsigned long flags; - - spin_lock_irqsave(&b->irq_lock, flags); - if (!list_empty(&ce->signals)) { - struct i915_request *rq, *next; - - /* Queue for executing the signal callbacks in the irq_work */ - list_for_each_entry_safe(rq, next, &ce->signals, signal_link) { - GEM_BUG_ON(rq->engine != engine); - GEM_BUG_ON(!__request_completed(rq)); - - __signal_request(rq, &b->signaled_requests); - } - - INIT_LIST_HEAD(&ce->signals); - list_del_init(&ce->signal_link); - - irq_work_queue(&b->irq_work); - } - spin_unlock_irqrestore(&b->irq_lock, flags); -} - void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine) { } @@ -317,6 +291,17 @@ static void insert_breadcrumb(struct i915_request *rq, if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)) return; + /* + * If the request is already completed, we can transfer it + * straight onto a signaled list, and queue the irq worker for + * its signal completion. + */ + if (__request_completed(rq)) { + if (__signal_request(rq, &b->signaled_requests)) + irq_work_queue(&b->irq_work); + return; + } + __intel_breadcrumbs_arm_irq(b); /* @@ -344,8 +329,11 @@ static void insert_breadcrumb(struct i915_request *rq, if (pos == &ce->signals) /* catch transitions from empty list */ list_move_tail(&ce->signal_link, &b->signalers); GEM_BUG_ON(!check_signal_order(ce, rq)); - set_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); + + /* Check after attaching to irq, interrupt may have already fired. */ + if (__request_completed(rq)) + irq_work_queue(&b->irq_work); } bool i915_request_enable_breadcrumb(struct i915_request *rq) @@ -401,7 +389,7 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq) spin_unlock(&b->irq_lock); - return !__request_completed(rq); + return true; } void i915_request_cancel_breadcrumb(struct i915_request *rq) diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index a9249a23903a..faf00a353e25 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -237,9 +237,6 @@ intel_engine_signal_breadcrumbs(struct intel_engine_cs *engine) void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine); void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine); -void intel_engine_transfer_stale_breadcrumbs(struct intel_engine_cs *engine, - struct intel_context *ce); - void intel_engine_print_breadcrumbs(struct intel_engine_cs *engine, struct drm_printer *p); diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index ae886081a431..a4959e8229ac 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1805,18 +1805,6 @@ static bool virtual_matches(const struct virtual_engine *ve, return true; } -static void virtual_xfer_breadcrumbs(struct virtual_engine *ve) -{ - /* - * All the outstanding signals on ve->siblings[0] must have - * been completed, just pending the interrupt handler. As those - * signals still refer to the old sibling (via rq->engine), we must - * transfer those to the old irq_worker to keep our locking - * consistent. - */ - intel_engine_transfer_stale_breadcrumbs(ve->siblings[0], &ve->context); -} - #define for_each_waiter(p__, rq__) \ list_for_each_entry_lockless(p__, \ &(rq__)->sched.waiters_list, \ @@ -2275,9 +2263,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine) virtual_update_register_offsets(regs, engine); - if (!list_empty(&ve->context.signals)) - virtual_xfer_breadcrumbs(ve); - /* * Move the bound engine to the top of the list * for future execution. We then kick this diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 350bf6f158cb..0cf2a10a24f1 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -592,9 +592,8 @@ bool __i915_request_submit(struct i915_request *request) */ __notify_execute_cb_irq(request); - if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) && - !i915_request_enable_breadcrumb(request)) - intel_engine_signal_breadcrumbs(engine); + if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags)) + i915_request_enable_breadcrumb(request); return result; } From patchwork Thu Jul 30 09:37:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692627 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2539C912 for ; Thu, 30 Jul 2020 09:38:16 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0DAEE2075F for ; Thu, 30 Jul 2020 09:38:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0DAEE2075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 662CF6E8B5; Thu, 30 Jul 2020 09:38:11 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id AEAB56E169 for ; Thu, 30 Jul 2020 09:38:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979060-1500050 for multiple; Thu, 30 Jul 2020 10:38:01 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:47 +0100 Message-Id: <20200730093756.16737-13-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 12/21] drm/i915/gt: Only transfer the virtual context to the new engine if active X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Nayana, Venkata Ramana" , thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" One more complication of preempt-to-busy with respect to the virtual engine is that we may have retired the last request along the virtual engine at the same time as preparing to submit the completed request to a new engine. That submit will be shortcircuited, but not before we have updated the context with the new register offsets and marked the virtual engine as bound to the new engine (by calling swap on ve->siblings[]). As we may have just retired the completed request, we may also be in the middle of calling virtual_context_exit() to turn off the power management associated with the virtual engine, and that in turn walks the ve->siblings[]. If we happen to call swap() on the array as we walk, we will call intel_engine_pm_put() twice on the same engine. In this patch, we prevent this by only updating the bound engine after a successful submission which weeds out the already completed requests. Alternatively, we could walk a non-volatile array for the pm, such as using the engine->mask. The small advantage to performing the update after the submit is that we then only have to do a swap for active requests. Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") References: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine" Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: "Nayana, Venkata Ramana" Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_lrc.c | 65 ++++++++++++++++++----------- 1 file changed, 40 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index a4959e8229ac..904e9b8bcbf6 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1805,6 +1805,33 @@ static bool virtual_matches(const struct virtual_engine *ve, return true; } +static void virtual_xfer_context(struct virtual_engine *ve, + struct intel_engine_cs *engine) +{ + unsigned int n; + + if (likely(engine == ve->siblings[0])) + return; + + GEM_BUG_ON(READ_ONCE(ve->context.inflight)); + if (!intel_engine_has_relative_mmio(engine)) + virtual_update_register_offsets(ve->context.lrc_reg_state, + engine); + + /* + * Move the bound engine to the top of the list for + * future execution. We then kick this tasklet first + * before checking others, so that we preferentially + * reuse this set of bound registers. + */ + for (n = 1; n < ve->num_siblings; n++) { + if (ve->siblings[n] == engine) { + swap(ve->siblings[n], ve->siblings[0]); + break; + } + } +} + #define for_each_waiter(p__, rq__) \ list_for_each_entry_lockless(p__, \ &(rq__)->sched.waiters_list, \ @@ -2253,35 +2280,23 @@ static void execlists_dequeue(struct intel_engine_cs *engine) GEM_BUG_ON(!(rq->execution_mask & engine->mask)); WRITE_ONCE(rq->engine, engine); - if (engine != ve->siblings[0]) { - u32 *regs = ve->context.lrc_reg_state; - unsigned int n; - - GEM_BUG_ON(READ_ONCE(ve->context.inflight)); - - if (!intel_engine_has_relative_mmio(engine)) - virtual_update_register_offsets(regs, - engine); - + if (__i915_request_submit(rq)) { /* - * Move the bound engine to the top of the list - * for future execution. We then kick this - * tasklet first before checking others, so that - * we preferentially reuse this set of bound - * registers. + * Only after we confirm that we will submit + * this request (i.e. it has not already + * completed), do we want to update the context. + * + * This serves two purposes. It avoids + * unnecessary work if we are resubmitting an + * already completed request after timeslicing. + * But more importantly, it prevents us altering + * ve->siblings[] on an idle context, where + * we may be using ve->siblings[] in + * virtual_context_enter / virtual_context_exit. */ - for (n = 1; n < ve->num_siblings; n++) { - if (ve->siblings[n] == engine) { - swap(ve->siblings[n], - ve->siblings[0]); - break; - } - } - + virtual_xfer_context(ve, engine); GEM_BUG_ON(ve->siblings[0] != engine); - } - if (__i915_request_submit(rq)) { submit = true; last = rq; } From patchwork Thu Jul 30 09:37:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692647 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BB869722 for ; Thu, 30 Jul 2020 09:38:29 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A44912075F for ; Thu, 30 Jul 2020 09:38:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A44912075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C526C6E8BB; Thu, 30 Jul 2020 09:38:24 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 001B06E091 for ; Thu, 30 Jul 2020 09:38:10 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979061-1500050 for multiple; Thu, 30 Jul 2020 10:38:02 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:48 +0100 Message-Id: <20200730093756.16737-14-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 13/21] drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" On the virtual engines, we only use the intel_breadcrumbs for tracking signaling of stale breadcrumbs from the irq_workers. They do not have any associated interrupt handling, active requests are passed to a physical engine and associated breadcrumb interrupt handler. This causes issues for us as we need to ensure that we do not actually try and enable interrupts and the powermanagement required for them on the virtual engine, as they will never be disabled. Instead, let's specify the physical engine used for interrupt handler on a particular breadcrumb. v2: Drop b->irq_armed = true mocking for no interrupt HW Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 76 ++++++++++--------- drivers/gpu/drm/i915/gt/intel_breadcrumbs.h | 36 +++++++++ .../gpu/drm/i915/gt/intel_breadcrumbs_types.h | 47 ++++++++++++ drivers/gpu/drm/i915/gt/intel_engine.h | 17 ----- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 14 +++- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 3 +- drivers/gpu/drm/i915/gt/intel_engine_types.h | 31 +------- drivers/gpu/drm/i915/gt/intel_gt_irq.c | 1 + drivers/gpu/drm/i915/gt/intel_lrc.c | 11 ++- drivers/gpu/drm/i915/gt/intel_reset.c | 1 + .../gpu/drm/i915/gt/intel_ring_submission.c | 3 +- drivers/gpu/drm/i915/gt/intel_rps.c | 1 + drivers/gpu/drm/i915/gt/mock_engine.c | 10 ++- drivers/gpu/drm/i915/i915_irq.c | 1 + drivers/gpu/drm/i915/i915_request.c | 1 + drivers/gpu/drm/i915/i915_request.h | 4 - 16 files changed, 162 insertions(+), 95 deletions(-) create mode 100644 drivers/gpu/drm/i915/gt/intel_breadcrumbs.h create mode 100644 drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index fbdc465a5870..2ffd47a86656 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -28,6 +28,7 @@ #include "i915_drv.h" #include "i915_trace.h" +#include "intel_breadcrumbs.h" #include "intel_gt_pm.h" #include "intel_gt_requests.h" @@ -55,30 +56,28 @@ static void irq_disable(struct intel_engine_cs *engine) static void __intel_breadcrumbs_disarm_irq(struct intel_breadcrumbs *b) { - struct intel_engine_cs *engine = - container_of(b, struct intel_engine_cs, breadcrumbs); - lockdep_assert_held(&b->irq_lock); + if (!b->irq_engine || !b->irq_armed) + return; + GEM_BUG_ON(!b->irq_enabled); if (!--b->irq_enabled) - irq_disable(engine); + irq_disable(b->irq_engine); WRITE_ONCE(b->irq_armed, false); - intel_gt_pm_put_async(engine->gt); + intel_gt_pm_put_async(b->irq_engine->gt); } -void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine) +void intel_breadcrumbs_park(struct intel_breadcrumbs *b) { - struct intel_breadcrumbs *b = &engine->breadcrumbs; unsigned long flags; if (!READ_ONCE(b->irq_armed)) return; spin_lock_irqsave(&b->irq_lock, flags); - if (b->irq_armed) - __intel_breadcrumbs_disarm_irq(b); + __intel_breadcrumbs_disarm_irq(b); spin_unlock_irqrestore(&b->irq_lock, flags); } @@ -133,13 +132,8 @@ __dma_fence_signal__notify(struct dma_fence *fence, static void add_retire(struct intel_breadcrumbs *b, struct intel_timeline *tl) { - struct intel_engine_cs *engine = - container_of(b, struct intel_engine_cs, breadcrumbs); - - if (unlikely(intel_engine_is_virtual(engine))) - engine = intel_virtual_engine_get_sibling(engine, 0); - - intel_engine_add_retire(engine, tl); + if (b->irq_engine) + intel_engine_add_retire(b->irq_engine, tl); } static bool __signal_request(struct i915_request *rq, struct list_head *signals) @@ -164,7 +158,7 @@ static void signal_irq_work(struct irq_work *work) spin_lock(&b->irq_lock); - if (b->irq_armed && list_empty(&b->signalers)) + if (list_empty(&b->signalers)) __intel_breadcrumbs_disarm_irq(b); list_splice_init(&b->signaled_requests, &signal); @@ -222,14 +216,12 @@ static void signal_irq_work(struct irq_work *work) static void __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b) { - struct intel_engine_cs *engine = - container_of(b, struct intel_engine_cs, breadcrumbs); - lockdep_assert_held(&b->irq_lock); - if (b->irq_armed) + + if (!b->irq_engine || b->irq_armed) return; - if (!intel_gt_pm_get_if_awake(engine->gt)) + if (!intel_gt_pm_get_if_awake(b->irq_engine->gt)) return; /* @@ -249,37 +241,49 @@ static void __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b) */ if (!b->irq_enabled++) - irq_enable(engine); + irq_enable(b->irq_engine); } -void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine) +struct intel_breadcrumbs * +intel_breadcrumbs_create(struct intel_engine_cs *irq_engine) { - struct intel_breadcrumbs *b = &engine->breadcrumbs; + struct intel_breadcrumbs *b; + + b = kzalloc(sizeof(*b), GFP_KERNEL); + if (!b) + return NULL; spin_lock_init(&b->irq_lock); INIT_LIST_HEAD(&b->signalers); INIT_LIST_HEAD(&b->signaled_requests); init_irq_work(&b->irq_work, signal_irq_work); + + b->irq_engine = irq_engine; + + return b; } -void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine) +void intel_breadcrumbs_reset(struct intel_breadcrumbs *b) { - struct intel_breadcrumbs *b = &engine->breadcrumbs; unsigned long flags; + if (!b->irq_engine) + return; + spin_lock_irqsave(&b->irq_lock, flags); if (b->irq_enabled) - irq_enable(engine); + irq_enable(b->irq_engine); else - irq_disable(engine); + irq_disable(b->irq_engine); spin_unlock_irqrestore(&b->irq_lock, flags); } -void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine) +void intel_breadcrumbs_free(struct intel_breadcrumbs *b) { + kfree(b); } static void insert_breadcrumb(struct i915_request *rq, @@ -369,11 +373,11 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq) * request submit/unsubmit path, and so we must be more careful to * acquire the right lock. */ - b = &READ_ONCE(rq->engine)->breadcrumbs; + b = READ_ONCE(rq->engine)->breadcrumbs; spin_lock(&b->irq_lock); - while (unlikely(b != &READ_ONCE(rq->engine)->breadcrumbs)) { + while (unlikely(b != READ_ONCE(rq->engine)->breadcrumbs)) { spin_unlock(&b->irq_lock); - b = &READ_ONCE(rq->engine)->breadcrumbs; + b = READ_ONCE(rq->engine)->breadcrumbs; spin_lock(&b->irq_lock); } @@ -394,7 +398,7 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq) void i915_request_cancel_breadcrumb(struct i915_request *rq) { - struct intel_breadcrumbs *b = &rq->engine->breadcrumbs; + struct intel_breadcrumbs *b = rq->engine->breadcrumbs; /* * We must wait for b->irq_lock so that we know the interrupt handler @@ -418,11 +422,11 @@ void i915_request_cancel_breadcrumb(struct i915_request *rq) void intel_engine_print_breadcrumbs(struct intel_engine_cs *engine, struct drm_printer *p) { - struct intel_breadcrumbs *b = &engine->breadcrumbs; + struct intel_breadcrumbs *b = engine->breadcrumbs; struct intel_context *ce; struct i915_request *rq; - if (list_empty(&b->signalers)) + if (!b || list_empty(&b->signalers)) return; drm_printf(p, "Signals:\n"); diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.h b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.h new file mode 100644 index 000000000000..ed3d1deabfbd --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2019 Intel Corporation + */ + +#ifndef __INTEL_BREADCRUMBS__ +#define __INTEL_BREADCRUMBS__ + +#include + +#include "intel_engine_types.h" + +struct drm_printer; +struct i915_request; +struct intel_breadcrumbs; + +struct intel_breadcrumbs * +intel_breadcrumbs_create(struct intel_engine_cs *irq_engine); +void intel_breadcrumbs_free(struct intel_breadcrumbs *b); + +void intel_breadcrumbs_reset(struct intel_breadcrumbs *b); +void intel_breadcrumbs_park(struct intel_breadcrumbs *b); + +static inline void +intel_engine_signal_breadcrumbs(struct intel_engine_cs *engine) +{ + irq_work_queue(&engine->breadcrumbs->irq_work); +} + +void intel_engine_print_breadcrumbs(struct intel_engine_cs *engine, + struct drm_printer *p); + +bool i915_request_enable_breadcrumb(struct i915_request *request); +void i915_request_cancel_breadcrumb(struct i915_request *request); + +#endif /* __INTEL_BREADCRUMBS__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h b/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h new file mode 100644 index 000000000000..8e53b9942695 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h @@ -0,0 +1,47 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2019 Intel Corporation + */ + +#ifndef __INTEL_BREADCRUMBS_TYPES__ +#define __INTEL_BREADCRUMBS_TYPES__ + +#include +#include +#include +#include + +/* + * Rather than have every client wait upon all user interrupts, + * with the herd waking after every interrupt and each doing the + * heavyweight seqno dance, we delegate the task (of being the + * bottom-half of the user interrupt) to the first client. After + * every interrupt, we wake up one client, who does the heavyweight + * coherent seqno read and either goes back to sleep (if incomplete), + * or wakes up all the completed clients in parallel, before then + * transferring the bottom-half status to the next client in the queue. + * + * Compared to walking the entire list of waiters in a single dedicated + * bottom-half, we reduce the latency of the first waiter by avoiding + * a context switch, but incur additional coherent seqno reads when + * following the chain of request breadcrumbs. Since it is most likely + * that we have a single client waiting on each seqno, then reducing + * the overhead of waking that client is much preferred. + */ +struct intel_breadcrumbs { + spinlock_t irq_lock; /* protects the lists used in hardirq context */ + + /* Not all breadcrumbs are attached to physical HW */ + struct intel_engine_cs *irq_engine; + + struct list_head signalers; + struct list_head signaled_requests; + + struct irq_work irq_work; /* for use from inside irq_lock */ + + unsigned int irq_enabled; + + bool irq_armed; +}; + +#endif /* __INTEL_BREADCRUMBS_TYPES__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index faf00a353e25..08e2c000dcc3 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -223,23 +223,6 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, void intel_engine_init_execlists(struct intel_engine_cs *engine); -void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine); -void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine); - -void intel_engine_disarm_breadcrumbs(struct intel_engine_cs *engine); - -static inline void -intel_engine_signal_breadcrumbs(struct intel_engine_cs *engine) -{ - irq_work_queue(&engine->breadcrumbs.irq_work); -} - -void intel_engine_reset_breadcrumbs(struct intel_engine_cs *engine); -void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine); - -void intel_engine_print_breadcrumbs(struct intel_engine_cs *engine, - struct drm_printer *p); - static inline u32 *__gen8_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset) { memset(batch, 0, 6 * sizeof(u32)); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index dd1a42c4d344..c20a91c1318f 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -28,6 +28,7 @@ #include "i915_drv.h" +#include "intel_breadcrumbs.h" #include "intel_context.h" #include "intel_engine.h" #include "intel_engine_pm.h" @@ -700,8 +701,13 @@ static int engine_setup_common(struct intel_engine_cs *engine) if (err) return err; + engine->breadcrumbs = intel_breadcrumbs_create(engine); + if (!engine->breadcrumbs) { + err = -ENOMEM; + goto err_status; + } + intel_engine_init_active(engine, ENGINE_PHYSICAL); - intel_engine_init_breadcrumbs(engine); intel_engine_init_execlists(engine); intel_engine_init_cmd_parser(engine); intel_engine_init__pm(engine); @@ -716,6 +722,10 @@ static int engine_setup_common(struct intel_engine_cs *engine) intel_engine_init_ctx_wa(engine); return 0; + +err_status: + cleanup_status_page(engine); + return err; } struct measure_breadcrumb { @@ -902,9 +912,9 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine) tasklet_kill(&engine->execlists.tasklet); /* flush the callback */ cleanup_status_page(engine); + intel_breadcrumbs_free(engine->breadcrumbs); intel_engine_fini_retire(engine); - intel_engine_fini_breadcrumbs(engine); intel_engine_cleanup_cmd_parser(engine); if (engine->default_state) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 8ec3eecf3e39..f7b2e07e2229 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -6,6 +6,7 @@ #include "i915_drv.h" +#include "intel_breadcrumbs.h" #include "intel_context.h" #include "intel_engine.h" #include "intel_engine_heartbeat.h" @@ -247,7 +248,7 @@ static int __engine_park(struct intel_wakeref *wf) call_idle_barriers(engine); /* cleanup after wedging */ intel_engine_park_heartbeat(engine); - intel_engine_disarm_breadcrumbs(engine); + intel_breadcrumbs_park(engine->breadcrumbs); /* Must be reset upon idling, or we may miss the busy wakeup. */ GEM_BUG_ON(engine->execlists.queue_priority_hint != INT_MIN); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 8de92fd7d392..c400aaa2287b 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -22,6 +22,7 @@ #include "i915_pmu.h" #include "i915_priolist_types.h" #include "i915_selftest.h" +#include "intel_breadcrumbs_types.h" #include "intel_sseu.h" #include "intel_timeline_types.h" #include "intel_uncore.h" @@ -373,34 +374,8 @@ struct intel_engine_cs { */ struct ewma__engine_latency latency; - /* Rather than have every client wait upon all user interrupts, - * with the herd waking after every interrupt and each doing the - * heavyweight seqno dance, we delegate the task (of being the - * bottom-half of the user interrupt) to the first client. After - * every interrupt, we wake up one client, who does the heavyweight - * coherent seqno read and either goes back to sleep (if incomplete), - * or wakes up all the completed clients in parallel, before then - * transferring the bottom-half status to the next client in the queue. - * - * Compared to walking the entire list of waiters in a single dedicated - * bottom-half, we reduce the latency of the first waiter by avoiding - * a context switch, but incur additional coherent seqno reads when - * following the chain of request breadcrumbs. Since it is most likely - * that we have a single client waiting on each seqno, then reducing - * the overhead of waking that client is much preferred. - */ - struct intel_breadcrumbs { - spinlock_t irq_lock; - struct list_head signalers; - - struct list_head signaled_requests; - - struct irq_work irq_work; /* for use from inside irq_lock */ - - unsigned int irq_enabled; - - bool irq_armed; - } breadcrumbs; + /* Keep track of all the seqno used, a trail of breadcrumbs */ + struct intel_breadcrumbs *breadcrumbs; struct intel_engine_pmu { /** diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c index b05da68e52f4..257063a57101 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c @@ -8,6 +8,7 @@ #include "i915_drv.h" #include "i915_irq.h" +#include "intel_breadcrumbs.h" #include "intel_gt.h" #include "intel_gt_irq.h" #include "intel_uncore.h" diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 904e9b8bcbf6..03cac9303dfc 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -137,6 +137,7 @@ #include "i915_perf.h" #include "i915_trace.h" #include "i915_vgpu.h" +#include "intel_breadcrumbs.h" #include "intel_context.h" #include "intel_engine_pm.h" #include "intel_gt.h" @@ -4119,7 +4120,7 @@ static int execlists_resume(struct intel_engine_cs *engine) { intel_mocs_init_engine(engine); - intel_engine_reset_breadcrumbs(engine); + intel_breadcrumbs_reset(engine->breadcrumbs); if (GEM_SHOW_DEBUG() && unexpected_starting_state(engine)) { struct drm_printer p = drm_debug_printer(__func__); @@ -5715,9 +5716,7 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings, snprintf(ve->base.name, sizeof(ve->base.name), "virtual"); intel_engine_init_active(&ve->base, ENGINE_VIRTUAL); - intel_engine_init_breadcrumbs(&ve->base); intel_engine_init_execlists(&ve->base); - ve->base.breadcrumbs.irq_armed = true; /* fake HW, used for irq_work */ ve->base.cops = &virtual_context_ops; ve->base.request_alloc = execlists_request_alloc; @@ -5734,6 +5733,12 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings, intel_context_init(&ve->context, &ve->base); + ve->base.breadcrumbs = intel_breadcrumbs_create(NULL); + if (!ve->base.breadcrumbs) { + err = -ENOMEM; + goto err_put; + } + for (n = 0; n < count; n++) { struct intel_engine_cs *sibling = siblings[n]; diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index 46a5ceffc22f..ac36b67fb46b 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -15,6 +15,7 @@ #include "i915_drv.h" #include "i915_gpu_error.h" #include "i915_irq.h" +#include "intel_breadcrumbs.h" #include "intel_engine_pm.h" #include "intel_gt.h" #include "intel_gt_pm.h" diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c index 94915f668715..186aa2d3a83e 100644 --- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c @@ -32,6 +32,7 @@ #include "gen6_ppgtt.h" #include "gen7_renderclear.h" #include "i915_drv.h" +#include "intel_breadcrumbs.h" #include "intel_context.h" #include "intel_gt.h" #include "intel_reset.h" @@ -255,7 +256,7 @@ static int xcs_resume(struct intel_engine_cs *engine) else ring_setup_status_page(engine); - intel_engine_reset_breadcrumbs(engine); + intel_breadcrumbs_reset(engine->breadcrumbs); /* Enforce ordering by reading HEAD register back */ ENGINE_POSTING_READ(engine, RING_HEAD); diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 97ba14ad52e4..e6a00eea0631 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -7,6 +7,7 @@ #include #include "i915_drv.h" +#include "intel_breadcrumbs.h" #include "intel_gt.h" #include "intel_gt_clock_utils.h" #include "intel_gt_irq.h" diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 06303ba98c19..d1971ffdca42 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -261,11 +261,12 @@ static void mock_engine_release(struct intel_engine_cs *engine) GEM_BUG_ON(timer_pending(&mock->hw_delay)); + intel_breadcrumbs_free(engine->breadcrumbs); + intel_context_unpin(engine->kernel_context); intel_context_put(engine->kernel_context); intel_engine_fini_retire(engine); - intel_engine_fini_breadcrumbs(engine); } struct intel_engine_cs *mock_engine(struct drm_i915_private *i915, @@ -323,11 +324,14 @@ int mock_engine_init(struct intel_engine_cs *engine) struct intel_context *ce; intel_engine_init_active(engine, ENGINE_MOCK); - intel_engine_init_breadcrumbs(engine); intel_engine_init_execlists(engine); intel_engine_init__pm(engine); intel_engine_init_retire(engine); + engine->breadcrumbs = intel_breadcrumbs_create(NULL); + if (!engine->breadcrumbs) + return -ENOMEM; + ce = create_kernel_context(engine); if (IS_ERR(ce)) goto err_breadcrumbs; @@ -339,7 +343,7 @@ int mock_engine_init(struct intel_engine_cs *engine) return 0; err_breadcrumbs: - intel_engine_fini_breadcrumbs(engine); + intel_breadcrumbs_free(engine->breadcrumbs); return -ENOMEM; } diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 1fa67700d8f4..f113fe44572b 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -41,6 +41,7 @@ #include "display/intel_lpe_audio.h" #include "display/intel_psr.h" +#include "gt/intel_breadcrumbs.h" #include "gt/intel_gt.h" #include "gt/intel_gt_irq.h" #include "gt/intel_gt_pm_irq.h" diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 0cf2a10a24f1..3b121fa38c60 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -31,6 +31,7 @@ #include #include "gem/i915_gem_context.h" +#include "gt/intel_breadcrumbs.h" #include "gt/intel_context.h" #include "gt/intel_ring.h" #include "gt/intel_rps.h" diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index fc18378c685d..16b721080195 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -361,10 +361,6 @@ void i915_request_submit(struct i915_request *request); void __i915_request_unsubmit(struct i915_request *request); void i915_request_unsubmit(struct i915_request *request); -/* Note: part of the intel_breadcrumbs family */ -bool i915_request_enable_breadcrumb(struct i915_request *request); -void i915_request_cancel_breadcrumb(struct i915_request *request); - long i915_request_wait(struct i915_request *rq, unsigned int flags, long timeout) From patchwork Thu Jul 30 09:37:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692661 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5EEFF722 for ; Thu, 30 Jul 2020 09:38:54 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 47BCE20809 for ; Thu, 30 Jul 2020 09:38:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 47BCE20809 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BE3A26E8C2; Thu, 30 Jul 2020 09:38:53 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id D60166E091 for ; Thu, 30 Jul 2020 09:38:08 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979062-1500050 for multiple; Thu, 30 Jul 2020 10:38:02 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:49 +0100 Message-Id: <20200730093756.16737-15-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 14/21] drm/i915/gt: Move intel_breadcrumbs_arm_irq earlier X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Move the __intel_breadcrumbs_arm_irq earlier, next to the disarm_irq, so that we can make use of it in the following patch. Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 84 ++++++++++----------- 1 file changed, 42 insertions(+), 42 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index 2ffd47a86656..9dd99969fd07 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -54,6 +54,36 @@ static void irq_disable(struct intel_engine_cs *engine) spin_unlock(&engine->gt->irq_lock); } +static void __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b) +{ + lockdep_assert_held(&b->irq_lock); + + if (!b->irq_engine || b->irq_armed) + return; + + if (!intel_gt_pm_get_if_awake(b->irq_engine->gt)) + return; + + /* + * The breadcrumb irq will be disarmed on the interrupt after the + * waiters are signaled. This gives us a single interrupt window in + * which we can add a new waiter and avoid the cost of re-enabling + * the irq. + */ + WRITE_ONCE(b->irq_armed, true); + + /* + * Since we are waiting on a request, the GPU should be busy + * and should have its own rpm reference. This is tracked + * by i915->gt.awake, we can forgo holding our own wakref + * for the interrupt as before i915->gt.awake is released (when + * the driver is idle) we disarm the breadcrumbs. + */ + + if (!b->irq_enabled++) + irq_enable(b->irq_engine); +} + static void __intel_breadcrumbs_disarm_irq(struct intel_breadcrumbs *b) { lockdep_assert_held(&b->irq_lock); @@ -69,18 +99,6 @@ static void __intel_breadcrumbs_disarm_irq(struct intel_breadcrumbs *b) intel_gt_pm_put_async(b->irq_engine->gt); } -void intel_breadcrumbs_park(struct intel_breadcrumbs *b) -{ - unsigned long flags; - - if (!READ_ONCE(b->irq_armed)) - return; - - spin_lock_irqsave(&b->irq_lock, flags); - __intel_breadcrumbs_disarm_irq(b); - spin_unlock_irqrestore(&b->irq_lock, flags); -} - static inline bool __request_completed(const struct i915_request *rq) { return i915_seqno_passed(__hwsp_seqno(rq), rq->fence.seqno); @@ -214,36 +232,6 @@ static void signal_irq_work(struct irq_work *work) } } -static void __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b) -{ - lockdep_assert_held(&b->irq_lock); - - if (!b->irq_engine || b->irq_armed) - return; - - if (!intel_gt_pm_get_if_awake(b->irq_engine->gt)) - return; - - /* - * The breadcrumb irq will be disarmed on the interrupt after the - * waiters are signaled. This gives us a single interrupt window in - * which we can add a new waiter and avoid the cost of re-enabling - * the irq. - */ - WRITE_ONCE(b->irq_armed, true); - - /* - * Since we are waiting on a request, the GPU should be busy - * and should have its own rpm reference. This is tracked - * by i915->gt.awake, we can forgo holding our own wakref - * for the interrupt as before i915->gt.awake is released (when - * the driver is idle) we disarm the breadcrumbs. - */ - - if (!b->irq_enabled++) - irq_enable(b->irq_engine); -} - struct intel_breadcrumbs * intel_breadcrumbs_create(struct intel_engine_cs *irq_engine) { @@ -281,6 +269,18 @@ void intel_breadcrumbs_reset(struct intel_breadcrumbs *b) spin_unlock_irqrestore(&b->irq_lock, flags); } +void intel_breadcrumbs_park(struct intel_breadcrumbs *b) +{ + unsigned long flags; + + if (!READ_ONCE(b->irq_armed)) + return; + + spin_lock_irqsave(&b->irq_lock, flags); + __intel_breadcrumbs_disarm_irq(b); + spin_unlock_irqrestore(&b->irq_lock, flags); +} + void intel_breadcrumbs_free(struct intel_breadcrumbs *b) { kfree(b); From patchwork Thu Jul 30 09:37:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692641 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8C1FC722 for ; Thu, 30 Jul 2020 09:38:27 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 74B852075F for ; Thu, 30 Jul 2020 09:38:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 74B852075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 721A06E8AF; Thu, 30 Jul 2020 09:38:24 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0B6076E8B3 for ; Thu, 30 Jul 2020 09:38:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979063-1500050 for multiple; Thu, 30 Jul 2020 10:38:02 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:50 +0100 Message-Id: <20200730093756.16737-16-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 15/21] drm/i915/gt: Hold context/request reference while breadcrumbs are active X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Currently we hold no actual reference to the request nor context while they are attached to a breadcrumb. To avoid freeing the request/context too early, we serialise with cancel-breadcrumbs by taking the irq spinlock in i915_request_retire(). The alternative is to take a reference for a new breadcrumb and release it upon signaling; removing the more frequently hit contention point in i915_request_retire(). Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 57 ++++++++++++++++----- drivers/gpu/drm/i915/i915_request.c | 9 ++-- 2 files changed, 47 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index 9dd99969fd07..fc6f0223d2c8 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -29,6 +29,7 @@ #include "i915_drv.h" #include "i915_trace.h" #include "intel_breadcrumbs.h" +#include "intel_context.h" #include "intel_gt_pm.h" #include "intel_gt_requests.h" @@ -99,6 +100,22 @@ static void __intel_breadcrumbs_disarm_irq(struct intel_breadcrumbs *b) intel_gt_pm_put_async(b->irq_engine->gt); } +static void add_signaling_context(struct intel_breadcrumbs *b, + struct intel_context *ce) +{ + intel_context_get(ce); + list_add_tail(&ce->signal_link, &b->signalers); + if (list_is_first(&ce->signal_link, &b->signalers)) + __intel_breadcrumbs_arm_irq(b); +} + +static void remove_signaling_context(struct intel_breadcrumbs *b, + struct intel_context *ce) +{ + list_del(&ce->signal_link); + intel_context_put(ce); +} + static inline bool __request_completed(const struct i915_request *rq) { return i915_seqno_passed(__hwsp_seqno(rq), rq->fence.seqno); @@ -107,6 +124,9 @@ static inline bool __request_completed(const struct i915_request *rq) __maybe_unused static bool check_signal_order(struct intel_context *ce, struct i915_request *rq) { + if (rq->context != ce) + return false; + if (!list_is_last(&rq->signal_link, &ce->signals) && i915_seqno_passed(rq->fence.seqno, list_next_entry(rq, signal_link)->fence.seqno)) @@ -158,10 +178,11 @@ static bool __signal_request(struct i915_request *rq, struct list_head *signals) { clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); - if (!__dma_fence_signal(&rq->fence)) + if (!__dma_fence_signal(&rq->fence)) { + i915_request_put(rq); return false; + } - i915_request_get(rq); list_add_tail(&rq->signal_link, signals); return true; } @@ -209,8 +230,8 @@ static void signal_irq_work(struct irq_work *work) /* Advance the list to the first incomplete request */ __list_del_many(&ce->signals, pos); if (&ce->signals == pos) { /* now empty */ - list_del_init(&ce->signal_link); add_retire(b, ce->timeline); + remove_signaling_context(b, ce); } } } @@ -279,6 +300,9 @@ void intel_breadcrumbs_park(struct intel_breadcrumbs *b) spin_lock_irqsave(&b->irq_lock, flags); __intel_breadcrumbs_disarm_irq(b); spin_unlock_irqrestore(&b->irq_lock, flags); + + if (!list_empty(&b->signalers)) + irq_work_queue(&b->irq_work); } void intel_breadcrumbs_free(struct intel_breadcrumbs *b) @@ -295,6 +319,8 @@ static void insert_breadcrumb(struct i915_request *rq, if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)) return; + i915_request_get(rq); + /* * If the request is already completed, we can transfer it * straight onto a signaled list, and queue the irq worker for @@ -306,8 +332,6 @@ static void insert_breadcrumb(struct i915_request *rq, return; } - __intel_breadcrumbs_arm_irq(b); - /* * We keep the seqno in retirement order, so we can break * inside intel_engine_signal_breadcrumbs as soon as we've @@ -322,16 +346,19 @@ static void insert_breadcrumb(struct i915_request *rq, * start looking for our insertion point from the tail of * the list. */ - list_for_each_prev(pos, &ce->signals) { - struct i915_request *it = - list_entry(pos, typeof(*it), signal_link); - - if (i915_seqno_passed(rq->fence.seqno, it->fence.seqno)) - break; + if (list_empty(&ce->signals)) { + add_signaling_context(b, ce); + pos = &ce->signals; + } else { + list_for_each_prev(pos, &ce->signals) { + struct i915_request *it = + list_entry(pos, typeof(*it), signal_link); + + if (i915_seqno_passed(rq->fence.seqno, it->fence.seqno)) + break; + } } list_add(&rq->signal_link, pos); - if (pos == &ce->signals) /* catch transitions from empty list */ - list_move_tail(&ce->signal_link, &b->signalers); GEM_BUG_ON(!check_signal_order(ce, rq)); set_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); @@ -412,9 +439,10 @@ void i915_request_cancel_breadcrumb(struct i915_request *rq) list_del(&rq->signal_link); if (list_empty(&ce->signals)) - list_del_init(&ce->signal_link); + remove_signaling_context(b, ce); clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); + i915_request_put(rq); } spin_unlock(&b->irq_lock); } @@ -429,6 +457,7 @@ void intel_engine_print_breadcrumbs(struct intel_engine_cs *engine, if (!b || list_empty(&b->signalers)) return; + drm_printf(p, "IRQ: %s\n", enableddisabled(b->irq_armed)); drm_printf(p, "Signals:\n"); spin_lock_irq(&b->irq_lock); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 3b121fa38c60..e36253bb6124 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -300,12 +300,11 @@ bool i915_request_retire(struct i915_request *rq) __i915_request_fill(rq, POISON_FREE); rq->ring->head = rq->postfix; - spin_lock_irq(&rq->lock); - if (!i915_request_signaled(rq)) + if (!i915_request_signaled(rq)) { + spin_lock_irq(&rq->lock); dma_fence_signal_locked(&rq->fence); - if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &rq->fence.flags)) - i915_request_cancel_breadcrumb(rq); - spin_unlock_irq(&rq->lock); + spin_unlock_irq(&rq->lock); + } if (i915_request_has_waitboost(rq)) { GEM_BUG_ON(!atomic_read(&rq->engine->gt->rps.num_waiters)); From patchwork Thu Jul 30 09:37:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692651 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DC08D722 for ; Thu, 30 Jul 2020 09:38:41 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C3E0820809 for ; Thu, 30 Jul 2020 09:38:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C3E0820809 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2FB4F6E8AD; Thu, 30 Jul 2020 09:38:41 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 392326E8AD for ; Thu, 30 Jul 2020 09:38:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979064-1500050 for multiple; Thu, 30 Jul 2020 10:38:02 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:51 +0100 Message-Id: <20200730093756.16737-17-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 16/21] drm/i915/gt: Track signaled breadcrumbs outside of the breadcrumb spinlock X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Make b->signaled_requests a lockless-list so that we can manipulate it outside of the b->irq_lock. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 30 +++++++++++-------- .../gpu/drm/i915/gt/intel_breadcrumbs_types.h | 2 +- drivers/gpu/drm/i915/i915_request.h | 6 +++- 3 files changed, 23 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index fc6f0223d2c8..6a278bf0fc6b 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -174,16 +174,13 @@ static void add_retire(struct intel_breadcrumbs *b, struct intel_timeline *tl) intel_engine_add_retire(b->irq_engine, tl); } -static bool __signal_request(struct i915_request *rq, struct list_head *signals) +static bool __signal_request(struct i915_request *rq) { - clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); - if (!__dma_fence_signal(&rq->fence)) { i915_request_put(rq); return false; } - list_add_tail(&rq->signal_link, signals); return true; } @@ -191,17 +188,19 @@ static void signal_irq_work(struct irq_work *work) { struct intel_breadcrumbs *b = container_of(work, typeof(*b), irq_work); const ktime_t timestamp = ktime_get(); + struct llist_node *signal, *sn; struct intel_context *ce, *cn; struct list_head *pos, *next; - LIST_HEAD(signal); + + signal = NULL; + if (unlikely(!llist_empty(&b->signaled_requests))) + signal = llist_del_all(&b->signaled_requests); spin_lock(&b->irq_lock); - if (list_empty(&b->signalers)) + if (!signal && list_empty(&b->signalers)) __intel_breadcrumbs_disarm_irq(b); - list_splice_init(&b->signaled_requests, &signal); - list_for_each_entry_safe(ce, cn, &b->signalers, signal_link) { GEM_BUG_ON(list_empty(&ce->signals)); @@ -218,7 +217,11 @@ static void signal_irq_work(struct irq_work *work) * spinlock as the callback chain may end up adding * more signalers to the same context or engine. */ - __signal_request(rq, &signal); + clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); + if (__signal_request(rq)) { + rq->signal_node.next = signal; + signal = &rq->signal_node; + } } /* @@ -238,9 +241,9 @@ static void signal_irq_work(struct irq_work *work) spin_unlock(&b->irq_lock); - list_for_each_safe(pos, next, &signal) { + llist_for_each_safe(signal, sn, signal) { struct i915_request *rq = - list_entry(pos, typeof(*rq), signal_link); + llist_entry(signal, typeof(*rq), signal_node); struct list_head cb_list; spin_lock(&rq->lock); @@ -264,7 +267,7 @@ intel_breadcrumbs_create(struct intel_engine_cs *irq_engine) spin_lock_init(&b->irq_lock); INIT_LIST_HEAD(&b->signalers); - INIT_LIST_HEAD(&b->signaled_requests); + init_llist_head(&b->signaled_requests); init_irq_work(&b->irq_work, signal_irq_work); @@ -327,7 +330,8 @@ static void insert_breadcrumb(struct i915_request *rq, * its signal completion. */ if (__request_completed(rq)) { - if (__signal_request(rq, &b->signaled_requests)) + if (__signal_request(rq) && + llist_add(&rq->signal_node, &b->signaled_requests)) irq_work_queue(&b->irq_work); return; } diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h b/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h index 8e53b9942695..3fa19820b37a 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h @@ -35,7 +35,7 @@ struct intel_breadcrumbs { struct intel_engine_cs *irq_engine; struct list_head signalers; - struct list_head signaled_requests; + struct llist_head signaled_requests; struct irq_work irq_work; /* for use from inside irq_lock */ diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index 16b721080195..874af6db6103 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -176,7 +176,11 @@ struct i915_request { struct intel_context *context; struct intel_ring *ring; struct intel_timeline __rcu *timeline; - struct list_head signal_link; + + union { + struct list_head signal_link; + struct llist_node signal_node; + }; /* * The rcu epoch of when this request was allocated. Used to judiciously From patchwork Thu Jul 30 09:37:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692623 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6641E722 for ; Thu, 30 Jul 2020 09:38:13 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4EB3F2075F for ; Thu, 30 Jul 2020 09:38:13 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4EB3F2075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 419EF6E8B2; Thu, 30 Jul 2020 09:38:11 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id E98C26E8B1 for ; Thu, 30 Jul 2020 09:38:09 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979065-1500050 for multiple; Thu, 30 Jul 2020 10:38:02 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:52 +0100 Message-Id: <20200730093756.16737-18-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 17/21] drm/i915/gt: Protect context lifetime with RCU X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Allow a brief period for continued access to a dead intel_context by deferring the release of the struct until after an RCU grace period. As we are using a dedicated slab cache for the contexts, we can defer the release of the slab pages via RCU, with the caveat that individual structs may be reused from the freelist within an RCU grace period. To handle that, we have to avoid clearing members of the zombie struct. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_context.c | 330 +++++++++++++----------- drivers/gpu/drm/i915/i915_active.c | 10 + drivers/gpu/drm/i915/i915_active.h | 2 + drivers/gpu/drm/i915/i915_utils.h | 7 + 4 files changed, 202 insertions(+), 147 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 52db2bde44a3..4e7924640ffa 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -22,7 +22,7 @@ static struct i915_global_context { static struct intel_context *intel_context_alloc(void) { - return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL); + return kmem_cache_alloc(global.slab_ce, GFP_KERNEL); } void intel_context_free(struct intel_context *ce) @@ -30,6 +30,177 @@ void intel_context_free(struct intel_context *ce) kmem_cache_free(global.slab_ce, ce); } +static int __context_pin_state(struct i915_vma *vma) +{ + unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS; + int err; + + err = i915_ggtt_pin(vma, 0, bias | PIN_HIGH); + if (err) + return err; + + err = i915_active_acquire(&vma->active); + if (err) + goto err_unpin; + + /* + * And mark it as a globally pinned object to let the shrinker know + * it cannot reclaim the object until we release it. + */ + i915_vma_make_unshrinkable(vma); + vma->obj->mm.dirty = true; + + return 0; + +err_unpin: + i915_vma_unpin(vma); + return err; +} + +static void __context_unpin_state(struct i915_vma *vma) +{ + i915_vma_make_shrinkable(vma); + i915_active_release(&vma->active); + __i915_vma_unpin(vma); +} + +static int __ring_active(struct intel_ring *ring) +{ + int err; + + err = intel_ring_pin(ring); + if (err) + return err; + + err = i915_active_acquire(&ring->vma->active); + if (err) + goto err_pin; + + return 0; + +err_pin: + intel_ring_unpin(ring); + return err; +} + +static void __ring_retire(struct intel_ring *ring) +{ + i915_active_release(&ring->vma->active); + intel_ring_unpin(ring); +} + +__i915_active_call +static void __intel_context_retire(struct i915_active *active) +{ + struct intel_context *ce = container_of(active, typeof(*ce), active); + + CE_TRACE(ce, "retire runtime: { total:%lluns, avg:%lluns }\n", + intel_context_get_total_runtime_ns(ce), + intel_context_get_avg_runtime_ns(ce)); + + set_bit(CONTEXT_VALID_BIT, &ce->flags); + if (ce->state) + __context_unpin_state(ce->state); + + intel_timeline_unpin(ce->timeline); + __ring_retire(ce->ring); + + intel_context_put(ce); +} + +static int __intel_context_active(struct i915_active *active) +{ + struct intel_context *ce = container_of(active, typeof(*ce), active); + int err; + + CE_TRACE(ce, "active\n"); + + intel_context_get(ce); + + err = __ring_active(ce->ring); + if (err) + goto err_put; + + err = intel_timeline_pin(ce->timeline); + if (err) + goto err_ring; + + if (!ce->state) + return 0; + + err = __context_pin_state(ce->state); + if (err) + goto err_timeline; + + return 0; + +err_timeline: + intel_timeline_unpin(ce->timeline); +err_ring: + __ring_retire(ce->ring); +err_put: + intel_context_put(ce); + return err; +} + +static void __intel_context_ctor(void *arg) +{ + struct intel_context *ce = arg; + + INIT_LIST_HEAD(&ce->signal_link); + INIT_LIST_HEAD(&ce->signals); + + atomic_set(&ce->pin_count, 0); + mutex_init(&ce->pin_mutex); + + ce->active_count = 0; + i915_active_init(&ce->active, + __intel_context_active, __intel_context_retire); + + ce->inflight = NULL; + ce->lrc_reg_state = NULL; + ce->lrc.desc = 0; +} + +static void +__intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine) +{ + GEM_BUG_ON(!engine->cops); + GEM_BUG_ON(!engine->gt->vm); + + kref_init(&ce->ref); + i915_active_reinit(&ce->active); + mutex_reinit(&ce->pin_mutex); + + ce->engine = engine; + ce->ops = engine->cops; + ce->sseu = engine->sseu; + + ce->wa_bb_page = 0; + ce->flags = 0; + ce->tag = 0; + + memset(&ce->runtime, 0, sizeof(ce->runtime)); + + ce->vm = i915_vm_get(engine->gt->vm); + ce->gem_context = NULL; + + ce->ring = __intel_context_ring_size(SZ_4K); + ce->timeline = NULL; + ce->state = NULL; + + GEM_BUG_ON(atomic_read(&ce->pin_count)); + GEM_BUG_ON(ce->active_count); + GEM_BUG_ON(ce->inflight); +} + +void +intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine) +{ + __intel_context_ctor(ce); + __intel_context_init(ce, engine); +} + struct intel_context * intel_context_create(struct intel_engine_cs *engine) { @@ -39,7 +210,7 @@ intel_context_create(struct intel_engine_cs *engine) if (!ce) return ERR_PTR(-ENOMEM); - intel_context_init(ce, engine); + __intel_context_init(ce, engine); return ce; } @@ -158,154 +329,13 @@ void intel_context_unpin(struct intel_context *ce) /* * Once released, we may asynchronously drop the active reference. * As that may be the only reference keeping the context alive, - * take an extra now so that it is not freed before we finish + * hold onto RCU so that it is not freed before we finish * dereferencing it. */ - intel_context_get(ce); + rcu_read_lock(); intel_context_active_release(ce); - intel_context_put(ce); -} - -static int __context_pin_state(struct i915_vma *vma) -{ - unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS; - int err; - - err = i915_ggtt_pin(vma, 0, bias | PIN_HIGH); - if (err) - return err; - - err = i915_active_acquire(&vma->active); - if (err) - goto err_unpin; - - /* - * And mark it as a globally pinned object to let the shrinker know - * it cannot reclaim the object until we release it. - */ - i915_vma_make_unshrinkable(vma); - vma->obj->mm.dirty = true; - - return 0; - -err_unpin: - i915_vma_unpin(vma); - return err; -} - -static void __context_unpin_state(struct i915_vma *vma) -{ - i915_vma_make_shrinkable(vma); - i915_active_release(&vma->active); - __i915_vma_unpin(vma); -} - -static int __ring_active(struct intel_ring *ring) -{ - int err; - - err = intel_ring_pin(ring); - if (err) - return err; - - err = i915_active_acquire(&ring->vma->active); - if (err) - goto err_pin; - - return 0; - -err_pin: - intel_ring_unpin(ring); - return err; -} - -static void __ring_retire(struct intel_ring *ring) -{ - i915_active_release(&ring->vma->active); - intel_ring_unpin(ring); + rcu_read_unlock(); } - -__i915_active_call -static void __intel_context_retire(struct i915_active *active) -{ - struct intel_context *ce = container_of(active, typeof(*ce), active); - - CE_TRACE(ce, "retire runtime: { total:%lluns, avg:%lluns }\n", - intel_context_get_total_runtime_ns(ce), - intel_context_get_avg_runtime_ns(ce)); - - set_bit(CONTEXT_VALID_BIT, &ce->flags); - if (ce->state) - __context_unpin_state(ce->state); - - intel_timeline_unpin(ce->timeline); - __ring_retire(ce->ring); - - intel_context_put(ce); -} - -static int __intel_context_active(struct i915_active *active) -{ - struct intel_context *ce = container_of(active, typeof(*ce), active); - int err; - - CE_TRACE(ce, "active\n"); - - intel_context_get(ce); - - err = __ring_active(ce->ring); - if (err) - goto err_put; - - err = intel_timeline_pin(ce->timeline); - if (err) - goto err_ring; - - if (!ce->state) - return 0; - - err = __context_pin_state(ce->state); - if (err) - goto err_timeline; - - return 0; - -err_timeline: - intel_timeline_unpin(ce->timeline); -err_ring: - __ring_retire(ce->ring); -err_put: - intel_context_put(ce); - return err; -} - -void -intel_context_init(struct intel_context *ce, - struct intel_engine_cs *engine) -{ - GEM_BUG_ON(!engine->cops); - GEM_BUG_ON(!engine->gt->vm); - - kref_init(&ce->ref); - - ce->engine = engine; - ce->ops = engine->cops; - ce->sseu = engine->sseu; - ce->ring = __intel_context_ring_size(SZ_4K); - - ewma_runtime_init(&ce->runtime.avg); - - ce->vm = i915_vm_get(engine->gt->vm); - - INIT_LIST_HEAD(&ce->signal_link); - INIT_LIST_HEAD(&ce->signals); - - mutex_init(&ce->pin_mutex); - - i915_active_init(&ce->active, - __intel_context_active, __intel_context_retire); -} - void intel_context_fini(struct intel_context *ce) { if (ce->timeline) @@ -333,7 +363,13 @@ static struct i915_global_context global = { { int __init i915_global_context_init(void) { - global.slab_ce = KMEM_CACHE(intel_context, SLAB_HWCACHE_ALIGN); + global.slab_ce = + kmem_cache_create("intel_context", + sizeof(struct intel_context), + __alignof__(struct intel_context), + SLAB_HWCACHE_ALIGN | + SLAB_TYPESAFE_BY_RCU, + __intel_context_ctor); if (!global.slab_ce) return -ENOMEM; diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 5dd52bb6d38c..878fe6664f19 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -795,6 +795,16 @@ void i915_active_fini(struct i915_active *ref) kmem_cache_free(global.slab_cache, ref->cache); } +void i915_active_reinit(struct i915_active *ref) +{ + GEM_BUG_ON(!i915_active_is_idle(ref)); + debug_active_init(ref); + mutex_reinit(&ref->mutex); + + ref->cache = NULL; + ref->tree = RB_ROOT; +} + static inline bool is_idle_barrier(struct active_node *node, u64 idx) { return node->timeline == idx && !i915_active_fence_isset(&node->base); diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index fb165d3f01cf..6df7e721616d 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -219,6 +219,8 @@ i915_active_is_idle(const struct i915_active *ref) void i915_active_fini(struct i915_active *ref); +void i915_active_reinit(struct i915_active *ref); + int i915_active_acquire_preallocate_barrier(struct i915_active *ref, struct intel_engine_cs *engine); void i915_active_acquire_barrier(struct i915_active *ref); diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h index 54773371e6bd..ef8db3aa75c7 100644 --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -443,6 +443,13 @@ static inline bool timer_expired(const struct timer_list *t) return READ_ONCE(t->expires) && !timer_pending(t); } +static inline void mutex_reinit(struct mutex *lock) +{ +#if IS_ENABLED(CONFIG_DEBUG_MUTEXES) + lock->magic = lock; +#endif +} + /* * This is a lookalike for IS_ENABLED() that takes a kconfig value, * e.g. CONFIG_DRM_I915_SPIN_REQUEST, and evaluates whether it is non-zero From patchwork Thu Jul 30 09:37:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692639 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CBDD414DD for ; Thu, 30 Jul 2020 09:38:26 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B427C2075F for ; Thu, 30 Jul 2020 09:38:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B427C2075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 23DE36E091; Thu, 30 Jul 2020 09:38:24 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2532C6E8AD for ; Thu, 30 Jul 2020 09:38:10 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979066-1500050 for multiple; Thu, 30 Jul 2020 10:38:03 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:53 +0100 Message-Id: <20200730093756.16737-19-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 18/21] drm/i915/gt: Split the breadcrumb spinlock between global and contexts X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" As we funnel more and more contexts into the breadcrumbs on an engine, the hold time of b->irq_lock grows. As we may then contend with the b->irq_lock during request submission, this increases the burden upon the engine->active.lock and so directly impacts both our execution latency and client latency. If we split the b->irq_lock by introducing a per-context spinlock to manage the signalers within a context, we then only need the b->irq_lock for enabling/disabling the interrupt and can avoid taking the lock for walking the list of contexts within the signal worker. Even with the current setup, this greatly reduces the number of times we have to take and fight for b->irq_lock. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 149 ++++++++++-------- drivers/gpu/drm/i915/gt/intel_context.c | 1 + drivers/gpu/drm/i915/gt/intel_context_types.h | 1 + 3 files changed, 84 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index 6a278bf0fc6b..23fe647b50b3 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -103,17 +103,19 @@ static void __intel_breadcrumbs_disarm_irq(struct intel_breadcrumbs *b) static void add_signaling_context(struct intel_breadcrumbs *b, struct intel_context *ce) { - intel_context_get(ce); - list_add_tail(&ce->signal_link, &b->signalers); - if (list_is_first(&ce->signal_link, &b->signalers)) + lockdep_assert_held(&b->irq_lock); + + list_add_rcu(&ce->signal_link, &b->signalers); + if (list_is_last(&ce->signal_link, &b->signalers)) __intel_breadcrumbs_arm_irq(b); } static void remove_signaling_context(struct intel_breadcrumbs *b, struct intel_context *ce) { - list_del(&ce->signal_link); - intel_context_put(ce); + spin_lock(&b->irq_lock); + list_del_rcu(&ce->signal_link); + spin_unlock(&b->irq_lock); } static inline bool __request_completed(const struct i915_request *rq) @@ -189,20 +191,30 @@ static void signal_irq_work(struct irq_work *work) struct intel_breadcrumbs *b = container_of(work, typeof(*b), irq_work); const ktime_t timestamp = ktime_get(); struct llist_node *signal, *sn; - struct intel_context *ce, *cn; - struct list_head *pos, *next; + struct intel_context *ce; signal = NULL; if (unlikely(!llist_empty(&b->signaled_requests))) signal = llist_del_all(&b->signaled_requests); - spin_lock(&b->irq_lock); + if (!signal && READ_ONCE(b->irq_armed) && list_empty(&b->signalers)) { + if (spin_trylock(&b->irq_lock)) { + if (list_empty(&b->signalers)) + __intel_breadcrumbs_disarm_irq(b); + spin_unlock(&b->irq_lock); + } + } + + rcu_read_lock(); + list_for_each_entry_rcu(ce, &b->signalers, signal_link) { + struct list_head *pos, *next; + bool release = false; - if (!signal && list_empty(&b->signalers)) - __intel_breadcrumbs_disarm_irq(b); + if (!spin_trylock(&ce->signal_lock)) + continue; - list_for_each_entry_safe(ce, cn, &b->signalers, signal_link) { - GEM_BUG_ON(list_empty(&ce->signals)); + if (list_empty(&ce->signals)) + goto unlock; list_for_each_safe(pos, next, &ce->signals) { struct i915_request *rq = @@ -235,11 +247,16 @@ static void signal_irq_work(struct irq_work *work) if (&ce->signals == pos) { /* now empty */ add_retire(b, ce->timeline); remove_signaling_context(b, ce); + release = true; } } - } - spin_unlock(&b->irq_lock); +unlock: + spin_unlock(&ce->signal_lock); + if (release) + intel_context_put(ce); + } + rcu_read_unlock(); llist_for_each_safe(signal, sn, signal) { struct i915_request *rq = @@ -313,9 +330,9 @@ void intel_breadcrumbs_free(struct intel_breadcrumbs *b) kfree(b); } -static void insert_breadcrumb(struct i915_request *rq, - struct intel_breadcrumbs *b) +static void insert_breadcrumb(struct i915_request *rq) { + struct intel_breadcrumbs *b = READ_ONCE(rq->engine)->breadcrumbs; struct intel_context *ce = rq->context; struct list_head *pos; @@ -351,7 +368,33 @@ static void insert_breadcrumb(struct i915_request *rq, * the list. */ if (list_empty(&ce->signals)) { + /* + * rq->engine is locked by rq->engine->active.lock. That + * however is not known until after rq->engine has been + * dereferenced and the lock acquired. Hence we acquire the + * lock and then validate that rq->engine still matches the + * lock we hold for it. + * + * Here, we are using the breadcrumb lock as a proxy for the + * rq->engine->active.lock, and we know that since the + * breadcrumb will be serialised within i915_request_submit + * the engine cannot change while active as long as we hold + * the breadcrumb lock on that engine. + * + * From the dma_fence_enable_signaling() path, we are outside + * of the request submit/unsubmit path, and so we must be more + * careful to acquire the right lock. + */ + intel_context_get(ce); + spin_lock(&b->irq_lock); + while (unlikely(b != READ_ONCE(rq->engine)->breadcrumbs)) { + spin_unlock(&b->irq_lock); + b = READ_ONCE(rq->engine)->breadcrumbs; + spin_lock(&b->irq_lock); + } add_signaling_context(b, ce); + spin_unlock(&b->irq_lock); + pos = &ce->signals; } else { list_for_each_prev(pos, &ce->signals) { @@ -373,7 +416,7 @@ static void insert_breadcrumb(struct i915_request *rq, bool i915_request_enable_breadcrumb(struct i915_request *rq) { - struct intel_breadcrumbs *b; + struct intel_context *ce = rq->context; /* Serialises with i915_request_retire() using rq->lock */ if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags)) @@ -388,67 +431,37 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq) if (!test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) return true; - /* - * rq->engine is locked by rq->engine->active.lock. That however - * is not known until after rq->engine has been dereferenced and - * the lock acquired. Hence we acquire the lock and then validate - * that rq->engine still matches the lock we hold for it. - * - * Here, we are using the breadcrumb lock as a proxy for the - * rq->engine->active.lock, and we know that since the breadcrumb - * will be serialised within i915_request_submit/i915_request_unsubmit, - * the engine cannot change while active as long as we hold the - * breadcrumb lock on that engine. - * - * From the dma_fence_enable_signaling() path, we are outside of the - * request submit/unsubmit path, and so we must be more careful to - * acquire the right lock. - */ - b = READ_ONCE(rq->engine)->breadcrumbs; - spin_lock(&b->irq_lock); - while (unlikely(b != READ_ONCE(rq->engine)->breadcrumbs)) { - spin_unlock(&b->irq_lock); - b = READ_ONCE(rq->engine)->breadcrumbs; - spin_lock(&b->irq_lock); - } - - /* - * Now that we are finally serialised with request submit/unsubmit, - * [with b->irq_lock] and with i915_request_retire() [via checking - * SIGNALED with rq->lock] confirm the request is indeed active. If - * it is no longer active, the breadcrumb will be attached upon - * i915_request_submit(). - */ + spin_lock(&ce->signal_lock); if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) - insert_breadcrumb(rq, b); - - spin_unlock(&b->irq_lock); + insert_breadcrumb(rq); + spin_unlock(&ce->signal_lock); return true; } void i915_request_cancel_breadcrumb(struct i915_request *rq) { - struct intel_breadcrumbs *b = rq->engine->breadcrumbs; + struct intel_context *ce = rq->context; + bool release = false; - /* - * We must wait for b->irq_lock so that we know the interrupt handler - * has released its reference to the intel_context and has completed - * the DMA_FENCE_FLAG_SIGNALED_BIT/I915_FENCE_FLAG_SIGNAL dance (if - * required). - */ - spin_lock(&b->irq_lock); + if (!test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)) + return; + + spin_lock(&ce->signal_lock); if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)) { - struct intel_context *ce = rq->context; + clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); list_del(&rq->signal_link); - if (list_empty(&ce->signals)) - remove_signaling_context(b, ce); + if (list_empty(&ce->signals)) { + remove_signaling_context(rq->engine->breadcrumbs, ce); + release = true; + } - clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); i915_request_put(rq); } - spin_unlock(&b->irq_lock); + spin_unlock(&ce->signal_lock); + if (release) + intel_context_put(ce); } void intel_engine_print_breadcrumbs(struct intel_engine_cs *engine, @@ -464,8 +477,9 @@ void intel_engine_print_breadcrumbs(struct intel_engine_cs *engine, drm_printf(p, "IRQ: %s\n", enableddisabled(b->irq_armed)); drm_printf(p, "Signals:\n"); - spin_lock_irq(&b->irq_lock); - list_for_each_entry(ce, &b->signalers, signal_link) { + rcu_read_lock(); + list_for_each_entry_rcu(ce, &b->signalers, signal_link) { + spin_lock_irq(&ce->signal_lock); list_for_each_entry(rq, &ce->signals, signal_link) { drm_printf(p, "\t[%llx:%llx%s] @ %dms\n", rq->fence.context, rq->fence.seqno, @@ -474,6 +488,7 @@ void intel_engine_print_breadcrumbs(struct intel_engine_cs *engine, "", jiffies_to_msecs(jiffies - rq->emitted_jiffies)); } + spin_unlock_irq(&ce->signal_lock); } - spin_unlock_irq(&b->irq_lock); + rcu_read_unlock(); } diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 4e7924640ffa..cde356c7754d 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -147,6 +147,7 @@ static void __intel_context_ctor(void *arg) { struct intel_context *ce = arg; + spin_lock_init(&ce->signal_lock); INIT_LIST_HEAD(&ce->signal_link); INIT_LIST_HEAD(&ce->signals); diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 4954b0df4864..a78c1c225ce3 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -51,6 +51,7 @@ struct intel_context { struct i915_address_space *vm; struct i915_gem_context __rcu *gem_context; + spinlock_t signal_lock; struct list_head signal_link; struct list_head signals; From patchwork Thu Jul 30 09:37:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692659 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 62D5B14DD for ; Thu, 30 Jul 2020 09:38:49 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4B9B120809 for ; Thu, 30 Jul 2020 09:38:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4B9B120809 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 94F396E8BF; Thu, 30 Jul 2020 09:38:48 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id DCF446E169 for ; Thu, 30 Jul 2020 09:38:08 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979067-1500050 for multiple; Thu, 30 Jul 2020 10:38:03 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:54 +0100 Message-Id: <20200730093756.16737-20-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 19/21] drm/i915: Drop i915_request.lock serialisation around await_start X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Originally, we used the signal->lock as a means of following the previous link in its timeline and peeking at the previous fence. However, we have replaced the explicit serialisation with a series of very careful probes that anticipate the links being deleted and the fences recycled before we are able to acquire a strong reference to it. We do not need the signal->lock crutch anymore, nor want the contention. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_request.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index e36253bb6124..247e021203c2 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -955,9 +955,16 @@ i915_request_await_start(struct i915_request *rq, struct i915_request *signal) if (i915_request_started(signal)) return 0; + /* + * The caller holds a reference on @signal, but we do not serialise + * against it being retired and removed from the lists. + * + * We do not hold a reference to the request before @signal, and + * so must be very careful to ensure that it is not _recycled_ as + * we follow the link backwards. + */ fence = NULL; rcu_read_lock(); - spin_lock_irq(&signal->lock); do { struct list_head *pos = READ_ONCE(signal->link.prev); struct i915_request *prev; @@ -988,7 +995,6 @@ i915_request_await_start(struct i915_request *rq, struct i915_request *signal) fence = &prev->fence; } while (0); - spin_unlock_irq(&signal->lock); rcu_read_unlock(); if (!fence) return 0; From patchwork Thu Jul 30 09:37:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692621 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 432B3912 for ; Thu, 30 Jul 2020 09:38:10 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2AF332075F for ; Thu, 30 Jul 2020 09:38:10 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2AF332075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 40AFC6E8AE; Thu, 30 Jul 2020 09:38:09 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id EC0E588E7E for ; Thu, 30 Jul 2020 09:38:07 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979068-1500050 for multiple; Thu, 30 Jul 2020 10:38:03 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:55 +0100 Message-Id: <20200730093756.16737-21-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 20/21] drm/i915: Drop i915_request.lock requirement for intel_rps_boost() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Since we use a flag within i915_request.flags to indicate when we have boosted the request (so that we only apply the boost) once, this can be used as the serialisation with i915_request_retire() to avoid having to explicitly take the i915_request.lock which is more heavily contended. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_rps.c | 15 ++++++--------- drivers/gpu/drm/i915/i915_request.c | 4 +--- 2 files changed, 7 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index e6a00eea0631..2a43e216e0d4 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -889,17 +889,15 @@ void intel_rps_park(struct intel_rps *rps) void intel_rps_boost(struct i915_request *rq) { - struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps; - unsigned long flags; - - if (i915_request_signaled(rq) || !intel_rps_is_active(rps)) + if (i915_request_signaled(rq) || i915_request_has_waitboost(rq)) return; /* Serializes with i915_request_retire() */ - spin_lock_irqsave(&rq->lock, flags); - if (!i915_request_has_waitboost(rq) && - !dma_fence_is_signaled_locked(&rq->fence)) { - set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags); + if (!test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags)) { + struct intel_rps *rps = &READ_ONCE(rq->engine)->gt->rps; + + if (!intel_rps_is_active(rps)) + return; GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", rq->fence.context, rq->fence.seqno); @@ -910,7 +908,6 @@ void intel_rps_boost(struct i915_request *rq) atomic_inc(&rps->boosts); } - spin_unlock_irqrestore(&rq->lock, flags); } int intel_rps_set(struct intel_rps *rps, u8 val) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 247e021203c2..43614d8fa18d 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -306,10 +306,8 @@ bool i915_request_retire(struct i915_request *rq) spin_unlock_irq(&rq->lock); } - if (i915_request_has_waitboost(rq)) { - GEM_BUG_ON(!atomic_read(&rq->engine->gt->rps.num_waiters)); + if (test_and_set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags)) atomic_dec(&rq->engine->gt->rps.num_waiters); - } /* * We only loosely track inflight requests across preemption, From patchwork Thu Jul 30 09:37:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11692629 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4614A722 for ; Thu, 30 Jul 2020 09:38:17 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2EC9E2075F for ; Thu, 30 Jul 2020 09:38:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2EC9E2075F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7CE586E8B6; Thu, 30 Jul 2020 09:38:11 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id CCA0088E7E for ; Thu, 30 Jul 2020 09:38:08 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21979070-1500050 for multiple; Thu, 30 Jul 2020 10:38:03 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Thu, 30 Jul 2020 10:37:56 +0100 Message-Id: <20200730093756.16737-22-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200730093756.16737-1-chris@chris-wilson.co.uk> References: <20200730093756.16737-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 21/21] drm/i915/gem: Delay tracking the GEM context until it is registered X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: stable@vger.kernel.org, Chris Wilson , thomas.hellstrom@intel.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Avoid exposing a partially constructed context by deferring the list_add() from the initial construction to the end of registration. Otherwise, if we peek into the list of contexts from inside debugfs, we may see the partially constructed context and chase down some dangling incomplete pointers. Reported-by: CQ Tang Fixes: 3aa9945a528e ("drm/i915: Separate GEM context construction and registration to userspace") References: f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs") Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: CQ Tang Cc: # v5.2+ Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index f43f0ca4eec9..d308cefb2b34 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -713,6 +713,7 @@ __create_context(struct drm_i915_private *i915) ctx->i915 = i915; ctx->sched.priority = I915_USER_PRIORITY(I915_PRIORITY_NORMAL); mutex_init(&ctx->mutex); + INIT_LIST_HEAD(&ctx->link); spin_lock_init(&ctx->stale.lock); INIT_LIST_HEAD(&ctx->stale.engines); @@ -740,10 +741,6 @@ __create_context(struct drm_i915_private *i915) for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++) ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES; - spin_lock(&i915->gem.contexts.lock); - list_add_tail(&ctx->link, &i915->gem.contexts.list); - spin_unlock(&i915->gem.contexts.lock); - return ctx; err_free: @@ -936,6 +933,7 @@ static int gem_context_register(struct i915_gem_context *ctx, struct drm_i915_file_private *fpriv, u32 *id) { + struct drm_i915_private *i915 = ctx->i915; struct i915_address_space *vm; int ret; @@ -954,8 +952,16 @@ static int gem_context_register(struct i915_gem_context *ctx, /* And finally expose ourselves to userspace via the idr */ ret = xa_alloc(&fpriv->context_xa, id, ctx, xa_limit_32b, GFP_KERNEL); if (ret) - put_pid(fetch_and_zero(&ctx->pid)); + goto err_pid; + + spin_lock(&i915->gem.contexts.lock); + list_add_tail(&ctx->link, &i915->gem.contexts.list); + spin_unlock(&i915->gem.contexts.lock); + + return 0; +err_pid: + put_pid(fetch_and_zero(&ctx->pid)); return ret; }