From patchwork Tue Jul 28 15:30:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11689409 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CFDA4138C for ; Tue, 28 Jul 2020 15:31:03 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B84DA206F5 for ; Tue, 28 Jul 2020 15:31:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B84DA206F5 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 59C986E348; Tue, 28 Jul 2020 15:31:03 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id BD35D6E348 for ; Tue, 28 Jul 2020 15:31:01 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21959606-1500050 for multiple; Tue, 28 Jul 2020 16:30:48 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Tue, 28 Jul 2020 16:30:43 +0100 Message-Id: <20200728153049.27682-1-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 1/7] drm/i915: Add a couple of missing i915_active_fini() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" We use i915_active_fini() as a debug check on the i915_active state before freeing. If we forget to call it, we may end up angering the debugobjects contained within. Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/display/intel_frontbuffer.c | 2 ++ drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c | 5 ++++- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/display/intel_frontbuffer.c b/drivers/gpu/drm/i915/display/intel_frontbuffer.c index 2979ed2588eb..d898b370d7a4 100644 --- a/drivers/gpu/drm/i915/display/intel_frontbuffer.c +++ b/drivers/gpu/drm/i915/display/intel_frontbuffer.c @@ -232,6 +232,8 @@ static void frontbuffer_release(struct kref *ref) RCU_INIT_POINTER(obj->frontbuffer, NULL); spin_unlock(&to_i915(obj->base.dev)->fb_tracking.lock); + i915_active_fini(&front->write); + i915_gem_object_put(obj); kfree_rcu(front, rcu); } diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c index 73243ba59c7d..e73854dd2fe0 100644 --- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c @@ -47,7 +47,10 @@ static int pulse_active(struct i915_active *active) static void pulse_free(struct kref *kref) { - kfree(container_of(kref, struct pulse, kref)); + struct pulse *p = container_of(kref, typeof(*p), kref); + + i915_active_fini(&p->active); + kfree(p); } static void pulse_put(struct pulse *p) From patchwork Tue Jul 28 15:30:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11689415 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3A439138C for ; Tue, 28 Jul 2020 15:31:07 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 23338206D4 for ; Tue, 28 Jul 2020 15:31:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 23338206D4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E79626E375; Tue, 28 Jul 2020 15:31:04 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id C99DC6E370 for ; Tue, 28 Jul 2020 15:31:01 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21959607-1500050 for multiple; Tue, 28 Jul 2020 16:30:49 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Tue, 28 Jul 2020 16:30:44 +0100 Message-Id: <20200728153049.27682-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200728153049.27682-1-chris@chris-wilson.co.uk> References: <20200728153049.27682-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 2/7] drm/i915: Skip taking acquire mutex for no ref->active callback X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" If no active callback is defined for i915_active, we do not need to serialise its enabling with the mutex. We still do only want to call the debug activate once, and must still serialise with a concurrent retire. Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/i915_active.c | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index d960d0be5bd2..500537889e66 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -416,6 +416,14 @@ bool i915_active_acquire_if_busy(struct i915_active *ref) return atomic_add_unless(&ref->count, 1, 0); } +static void __i915_active_activate(struct i915_active *ref) +{ + spin_lock_irq(&ref->tree_lock); /* __active_retire() */ + if (!atomic_fetch_inc(&ref->count)) + debug_active_activate(ref); + spin_unlock_irq(&ref->tree_lock); +} + int i915_active_acquire(struct i915_active *ref) { int err; @@ -423,19 +431,19 @@ int i915_active_acquire(struct i915_active *ref) if (i915_active_acquire_if_busy(ref)) return 0; + if (!ref->active) { + __i915_active_activate(ref); + return 0; + } + err = mutex_lock_interruptible(&ref->mutex); if (err) return err; if (likely(!i915_active_acquire_if_busy(ref))) { - if (ref->active) - err = ref->active(ref); - if (!err) { - spin_lock_irq(&ref->tree_lock); /* __active_retire() */ - debug_active_activate(ref); - atomic_inc(&ref->count); - spin_unlock_irq(&ref->tree_lock); - } + err = ref->active(ref); + if (!err) + __i915_active_activate(ref); } mutex_unlock(&ref->mutex); From patchwork Tue Jul 28 15:30:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11689413 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EE4AF138C for ; Tue, 28 Jul 2020 15:31:05 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D70B6206D4 for ; Tue, 28 Jul 2020 15:31:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D70B6206D4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 392ED6E372; Tue, 28 Jul 2020 15:31:04 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id D1F026E372 for ; Tue, 28 Jul 2020 15:31:02 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21959608-1500050 for multiple; Tue, 28 Jul 2020 16:30:49 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Tue, 28 Jul 2020 16:30:45 +0100 Message-Id: <20200728153049.27682-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200728153049.27682-1-chris@chris-wilson.co.uk> References: <20200728153049.27682-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 3/7] drm/i915: Export a preallocate variant of i915_active_acquire() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Sometimes we have to be very careful not to allocate underneath a mutex (or spinlock) and yet still want to track activity. Enter i915_active_acquire_for_context(). This raises the activity counter on i915_active prior to use and ensures that the fence-tree contains a slot for the context. v2: Refactor active_lookup() so it can be called again before/after locking to resolve contention. Since we protect the rbtree until we idle, we can do a lockfree lookup, with the caveat that if another thread performs a concurrent insertion, the rotations from the insert may cause us to not find our target. A second pass holding the treelock will find the target if it exists, or the place to perform our insertion. Signed-off-by: Chris Wilson --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/gt/intel_timeline.c | 4 +- drivers/gpu/drm/i915/i915_active.c | 150 ++++++++++++++---- drivers/gpu/drm/i915/i915_active.h | 12 +- 4 files changed, 130 insertions(+), 38 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 6b4ec66cb558..719ba9fe3e85 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1729,7 +1729,7 @@ __parser_mark_active(struct i915_vma *vma, { struct intel_gt_buffer_pool_node *node = vma->private; - return i915_active_ref(&node->active, tl, fence); + return i915_active_ref(&node->active, tl->fence_context, fence); } static int diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 46d20f5f3ddc..acb43aebd669 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -484,7 +484,9 @@ __intel_timeline_get_seqno(struct intel_timeline *tl, * free it after the current request is retired, which ensures that * all writes into the cacheline from previous requests are complete. */ - err = i915_active_ref(&tl->hwsp_cacheline->active, tl, &rq->fence); + err = i915_active_ref(&tl->hwsp_cacheline->active, + tl->fence_context, + &rq->fence); if (err) goto err_cacheline; diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 500537889e66..3a728401c09c 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -28,12 +28,14 @@ static struct i915_global_active { } global; struct active_node { + struct rb_node node; struct i915_active_fence base; struct i915_active *ref; - struct rb_node node; u64 timeline; }; +#define fetch_node(x) rb_entry(READ_ONCE(x), typeof(struct active_node), node) + static inline struct active_node * node_from_active(struct i915_active_fence *active) { @@ -216,12 +218,9 @@ excl_retire(struct dma_fence *fence, struct dma_fence_cb *cb) active_retire(container_of(cb, struct i915_active, excl.cb)); } -static struct i915_active_fence * -active_instance(struct i915_active *ref, struct intel_timeline *tl) +static struct active_node *__active_lookup(struct i915_active *ref, u64 idx) { - struct active_node *node, *prealloc; - struct rb_node **p, *parent; - u64 idx = tl->fence_context; + struct active_node *it; /* * We track the most recently used timeline to skip a rbtree search @@ -230,8 +229,39 @@ active_instance(struct i915_active *ref, struct intel_timeline *tl) * after the previous activity has been retired, or if it matches the * current timeline. */ - node = READ_ONCE(ref->cache); - if (node && node->timeline == idx) + it = READ_ONCE(ref->cache); + if (it && it->timeline == idx) + return it; + + BUILD_BUG_ON(offsetof(typeof(*it), node)); + + /* While active, the tree can only be built; not destroyed */ + GEM_BUG_ON(i915_active_is_idle(ref)); + + it = fetch_node(ref->tree.rb_node); + while (it) { + if (it->timeline < idx) { + it = fetch_node(it->node.rb_right); + } else if (it->timeline > idx) { + it = fetch_node(it->node.rb_left); + } else { + WRITE_ONCE(ref->cache, it); + break; + } + } + + /* NB: If the tree rotated beneath us, we may miss our target. */ + return it; +} + +static struct i915_active_fence * +active_instance(struct i915_active *ref, u64 idx) +{ + struct active_node *node, *prealloc; + struct rb_node **p, *parent; + + node = __active_lookup(ref, idx); + if (likely(node)) return &node->base; /* Preallocate a replacement, just in case */ @@ -268,10 +298,9 @@ active_instance(struct i915_active *ref, struct intel_timeline *tl) rb_insert_color(&node->node, &ref->tree); out: - ref->cache = node; + WRITE_ONCE(ref->cache, node); spin_unlock_irq(&ref->tree_lock); - BUILD_BUG_ON(offsetof(typeof(*node), base)); return &node->base; } @@ -353,63 +382,102 @@ __active_del_barrier(struct i915_active *ref, struct active_node *node) return ____active_del_barrier(ref, node, barrier_to_engine(node)); } -int i915_active_ref(struct i915_active *ref, - struct intel_timeline *tl, - struct dma_fence *fence) +static bool +replace_barrier(struct i915_active *ref, struct i915_active_fence *active) +{ + if (!is_barrier(active)) /* proto-node used by our idle barrier? */ + return false; + + /* + * This request is on the kernel_context timeline, and so + * we can use it to substitute for the pending idle-barrer + * request that we want to emit on the kernel_context. + */ + __active_del_barrier(ref, node_from_active(active)); + return true; +} + +int i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence) { struct i915_active_fence *active; int err; - lockdep_assert_held(&tl->mutex); - /* Prevent reaping in case we malloc/wait while building the tree */ err = i915_active_acquire(ref); if (err) return err; - active = active_instance(ref, tl); + active = active_instance(ref, idx); if (!active) { err = -ENOMEM; goto out; } - if (is_barrier(active)) { /* proto-node used by our idle barrier */ - /* - * This request is on the kernel_context timeline, and so - * we can use it to substitute for the pending idle-barrer - * request that we want to emit on the kernel_context. - */ - __active_del_barrier(ref, node_from_active(active)); + if (replace_barrier(ref, active)) { RCU_INIT_POINTER(active->fence, NULL); atomic_dec(&ref->count); } if (!__i915_active_fence_set(active, fence)) - atomic_inc(&ref->count); + __i915_active_acquire(ref); out: i915_active_release(ref); return err; } -struct dma_fence * -i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f) +static struct dma_fence * +__i915_active_set_fence(struct i915_active *ref, + struct i915_active_fence *active, + struct dma_fence *fence) { struct dma_fence *prev; - /* We expect the caller to manage the exclusive timeline ordering */ - GEM_BUG_ON(i915_active_is_idle(ref)); + if (replace_barrier(ref, active)) { + RCU_INIT_POINTER(active->fence, fence); + return NULL; + } rcu_read_lock(); - prev = __i915_active_fence_set(&ref->excl, f); + prev = __i915_active_fence_set(active, fence); if (prev) prev = dma_fence_get_rcu(prev); else - atomic_inc(&ref->count); + __i915_active_acquire(ref); rcu_read_unlock(); return prev; } +static struct i915_active_fence * +__active_fence(struct i915_active *ref, u64 idx) +{ + struct active_node *it; + + it = __active_lookup(ref, idx); + if (unlikely(!it)) { /* Contention with parallel tree builders! */ + spin_lock_irq(&ref->tree_lock); + it = __active_lookup(ref, idx); + spin_unlock_irq(&ref->tree_lock); + } + GEM_BUG_ON(!it); /* slot must be preallocated */ + + return &it->base; +} + +struct dma_fence * +__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence) +{ + /* Only valid while active, see i915_active_acquire_for_context() */ + return __i915_active_set_fence(ref, __active_fence(ref, idx), fence); +} + +struct dma_fence * +i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f) +{ + /* We expect the caller to manage the exclusive timeline ordering */ + return __i915_active_set_fence(ref, &ref->excl, f); +} + bool i915_active_acquire_if_busy(struct i915_active *ref) { debug_active_assert(ref); @@ -451,6 +519,24 @@ int i915_active_acquire(struct i915_active *ref) return err; } +int i915_active_acquire_for_context(struct i915_active *ref, u64 idx) +{ + struct i915_active_fence *active; + int err; + + err = i915_active_acquire(ref); + if (err) + return err; + + active = active_instance(ref, idx); + if (!active) { + i915_active_release(ref); + return -ENOMEM; + } + + return 0; /* return with active ref */ +} + void i915_active_release(struct i915_active *ref) { debug_active_assert(ref); @@ -754,7 +840,7 @@ static struct active_node *reuse_idle_barrier(struct i915_active *ref, u64 idx) match: rb_erase(p, &ref->tree); /* Hide from waits and sibling allocations */ if (p == &ref->cache->node) - ref->cache = NULL; + WRITE_ONCE(ref->cache, NULL); spin_unlock_irq(&ref->tree_lock); return rb_entry(p, struct active_node, node); @@ -812,7 +898,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, */ RCU_INIT_POINTER(node->base.fence, ERR_PTR(-EAGAIN)); node->base.cb.node.prev = (void *)engine; - atomic_inc(&ref->count); + __i915_active_acquire(ref); } GEM_BUG_ON(rcu_access_pointer(node->base.fence) != ERR_PTR(-EAGAIN)); diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index cf4058150966..73ded3c52a04 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -163,14 +163,16 @@ void __i915_active_init(struct i915_active *ref, __i915_active_init(ref, active, retire, &__mkey, &__wkey); \ } while (0) -int i915_active_ref(struct i915_active *ref, - struct intel_timeline *tl, - struct dma_fence *fence); +struct dma_fence * +__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence); +int i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence); static inline int i915_active_add_request(struct i915_active *ref, struct i915_request *rq) { - return i915_active_ref(ref, i915_request_timeline(rq), &rq->fence); + return i915_active_ref(ref, + i915_request_timeline(rq)->fence_context, + &rq->fence); } struct dma_fence * @@ -198,7 +200,9 @@ int i915_request_await_active(struct i915_request *rq, #define I915_ACTIVE_AWAIT_BARRIER BIT(2) int i915_active_acquire(struct i915_active *ref); +int i915_active_acquire_for_context(struct i915_active *ref, u64 idx); bool i915_active_acquire_if_busy(struct i915_active *ref); + void i915_active_release(struct i915_active *ref); static inline void __i915_active_acquire(struct i915_active *ref) From patchwork Tue Jul 28 15:30:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11689417 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 07A7C913 for ; Tue, 28 Jul 2020 15:31:08 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E44BE206D4 for ; Tue, 28 Jul 2020 15:31:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E44BE206D4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9C7016E373; Tue, 28 Jul 2020 15:31:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id D0D856E370 for ; Tue, 28 Jul 2020 15:31:02 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21959609-1500050 for multiple; Tue, 28 Jul 2020 16:30:49 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Tue, 28 Jul 2020 16:30:46 +0100 Message-Id: <20200728153049.27682-4-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200728153049.27682-1-chris@chris-wilson.co.uk> References: <20200728153049.27682-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 4/7] drm/i915: Keep the most recently used active-fence upon discard X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Whenever an i915_active idles, we prune its tree of old fence slots to prevent a gradual leak should it be used to track many, many timelines. The downside is that we then have to frequently reallocate the rbtree. A compromise is that we keep the most recently used fence slot, and reuse that for the next active reference as that is the most likely timeline to be reused. Signed-off-by: Chris Wilson Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/i915_active.c | 27 ++++++++++++++++++++------- drivers/gpu/drm/i915/i915_active.h | 4 ---- 2 files changed, 20 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 3a728401c09c..b9bd5578ff54 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -130,8 +130,8 @@ static inline void debug_active_assert(struct i915_active *ref) { } static void __active_retire(struct i915_active *ref) { + struct rb_root root = RB_ROOT; struct active_node *it, *n; - struct rb_root root; unsigned long flags; GEM_BUG_ON(i915_active_is_idle(ref)); @@ -143,9 +143,21 @@ __active_retire(struct i915_active *ref) GEM_BUG_ON(rcu_access_pointer(ref->excl.fence)); debug_active_deactivate(ref); - root = ref->tree; - ref->tree = RB_ROOT; - ref->cache = NULL; + /* Even if we have not used the cache, we may still have a barrier */ + if (!ref->cache) + ref->cache = fetch_node(ref->tree.rb_node); + + /* Keep the MRU cached node for reuse */ + if (ref->cache) { + /* Discard all other nodes in the tree */ + rb_erase(&ref->cache->node, &ref->tree); + root = ref->tree; + + /* Rebuild the tree with only the cached node */ + rb_link_node(&ref->cache->node, NULL, &ref->tree.rb_node); + rb_insert_color(&ref->cache->node, &ref->tree); + GEM_BUG_ON(ref->tree.rb_node != &ref->cache->node); + } spin_unlock_irqrestore(&ref->tree_lock, flags); @@ -156,6 +168,7 @@ __active_retire(struct i915_active *ref) /* ... except if you wait on it, you must manage your own references! */ wake_up_var(ref); + /* Finally free the discarded timeline tree */ rbtree_postorder_for_each_entry_safe(it, n, &root, node) { GEM_BUG_ON(i915_active_fence_isset(&it->base)); kmem_cache_free(global.slab_cache, it); @@ -745,16 +758,16 @@ int i915_sw_fence_await_active(struct i915_sw_fence *fence, return await_active(ref, flags, sw_await_fence, fence, fence); } -#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) void i915_active_fini(struct i915_active *ref) { debug_active_fini(ref); GEM_BUG_ON(atomic_read(&ref->count)); GEM_BUG_ON(work_pending(&ref->work)); - GEM_BUG_ON(!RB_EMPTY_ROOT(&ref->tree)); mutex_destroy(&ref->mutex); + + if (ref->cache) + kmem_cache_free(global.slab_cache, ref->cache); } -#endif static inline bool is_idle_barrier(struct active_node *node, u64 idx) { diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index 73ded3c52a04..b9e0394e2975 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -217,11 +217,7 @@ i915_active_is_idle(const struct i915_active *ref) return !atomic_read(&ref->count); } -#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) void i915_active_fini(struct i915_active *ref); -#else -static inline void i915_active_fini(struct i915_active *ref) { } -#endif int i915_active_acquire_preallocate_barrier(struct i915_active *ref, struct intel_engine_cs *engine); From patchwork Tue Jul 28 15:30:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11689419 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E7B59138C for ; Tue, 28 Jul 2020 15:31:08 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D0B34206D4 for ; Tue, 28 Jul 2020 15:31:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D0B34206D4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 854EB6E378; Tue, 28 Jul 2020 15:31:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id DDEC86E375 for ; Tue, 28 Jul 2020 15:31:03 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21959610-1500050 for multiple; Tue, 28 Jul 2020 16:30:49 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Tue, 28 Jul 2020 16:30:47 +0100 Message-Id: <20200728153049.27682-5-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200728153049.27682-1-chris@chris-wilson.co.uk> References: <20200728153049.27682-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 5/7] drm/i915: Make the stale cached active node available for any timeline X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Rather than require the next timeline after idling to match the MRU before idling, reset the index on the node and allow it to match the first request. However, this requires cmpxchg(u64) and so is not trivial on 32b, so for compatibility we just fallback to keeping the cached node pointing to the MRU timeline. Signed-off-by: Chris Wilson Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/i915_active.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index b9bd5578ff54..03b246cb8466 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -157,6 +157,10 @@ __active_retire(struct i915_active *ref) rb_link_node(&ref->cache->node, NULL, &ref->tree.rb_node); rb_insert_color(&ref->cache->node, &ref->tree); GEM_BUG_ON(ref->tree.rb_node != &ref->cache->node); + + /* Make the cached node available for reuse with any timeline */ + if (IS_ENABLED(CONFIG_64BIT)) + ref->cache->timeline = 0; /* needs cmpxchg(u64) */ } spin_unlock_irqrestore(&ref->tree_lock, flags); @@ -235,6 +239,8 @@ static struct active_node *__active_lookup(struct i915_active *ref, u64 idx) { struct active_node *it; + GEM_BUG_ON(idx == 0); /* 0 is the unordered timeline, rsvd for cache */ + /* * We track the most recently used timeline to skip a rbtree search * for the common case, under typical loads we never need the rbtree @@ -243,8 +249,17 @@ static struct active_node *__active_lookup(struct i915_active *ref, u64 idx) * current timeline. */ it = READ_ONCE(ref->cache); - if (it && it->timeline == idx) - return it; + if (it) { + u64 cached = READ_ONCE(it->timeline); + + if (cached == idx) + return it; + +#ifdef CONFIG_64BIT /* for cmpxchg(u64) */ + if (!cached && !cmpxchg(&it->timeline, 0, idx)) + return it; +#endif + } BUILD_BUG_ON(offsetof(typeof(*it), node)); From patchwork Tue Jul 28 15:30:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11689421 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C36D9138C for ; Tue, 28 Jul 2020 15:31:09 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AC51B206D4 for ; Tue, 28 Jul 2020 15:31:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AC51B206D4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E305C6E37F; Tue, 28 Jul 2020 15:31:05 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id E30576E378 for ; Tue, 28 Jul 2020 15:31:03 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21959611-1500050 for multiple; Tue, 28 Jul 2020 16:30:50 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Tue, 28 Jul 2020 16:30:48 +0100 Message-Id: <20200728153049.27682-6-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200728153049.27682-1-chris@chris-wilson.co.uk> References: <20200728153049.27682-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 6/7] drm/i915: Reduce locking around i915_active_acquire_preallocate_barrier() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" As the conversion between idle-barrier and full i915_active_fence is already serialised by explicit memory barriers, we can reduce the spinlock in i915_active_acquire_preallocate_barrier() for finding an idle-barrier to reuse to an RCU read lock to ensure the fence remains valid, only taking the spinlock for the update of the rbtree itself. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_active.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index 03b246cb8466..d22b44f9470a 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -796,7 +796,6 @@ static struct active_node *reuse_idle_barrier(struct i915_active *ref, u64 idx) if (RB_EMPTY_ROOT(&ref->tree)) return NULL; - spin_lock_irq(&ref->tree_lock); GEM_BUG_ON(i915_active_is_idle(ref)); /* @@ -861,11 +860,10 @@ static struct active_node *reuse_idle_barrier(struct i915_active *ref, u64 idx) goto match; } - spin_unlock_irq(&ref->tree_lock); - return NULL; match: + spin_lock_irq(&ref->tree_lock); rb_erase(p, &ref->tree); /* Hide from waits and sibling allocations */ if (p == &ref->cache->node) WRITE_ONCE(ref->cache, NULL); @@ -900,7 +898,9 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref, struct llist_node *prev = first; struct active_node *node; + rcu_read_lock(); node = reuse_idle_barrier(ref, idx); + rcu_read_unlock(); if (!node) { node = kmem_cache_alloc(global.slab_cache, GFP_KERNEL); if (!node) { From patchwork Tue Jul 28 15:30:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11689411 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 92846159A for ; Tue, 28 Jul 2020 15:31:04 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7AC3E206F5 for ; Tue, 28 Jul 2020 15:31:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7AC3E206F5 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9A3876E370; Tue, 28 Jul 2020 15:31:03 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9D12F6E348 for ; Tue, 28 Jul 2020 15:31:02 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21959612-1500050 for multiple; Tue, 28 Jul 2020 16:30:50 +0100 From: Chris Wilson To: intel-gfx@lists.freedesktop.org Date: Tue, 28 Jul 2020 16:30:49 +0100 Message-Id: <20200728153049.27682-7-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200728153049.27682-1-chris@chris-wilson.co.uk> References: <20200728153049.27682-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 7/7] drm/i915: Provide a fastpath for waiting on vma bindings X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.hellstrom@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Before we can execute a request, we must wait for all of its vma to be bound. This is a frequent operation for which we can optimise away a few atomic operations (notably a cmpxchg) in lieu of the RCU protection. Signed-off-by: Chris Wilson Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/i915_active.h | 15 +++++++++++++++ drivers/gpu/drm/i915/i915_vma.c | 9 +++++++-- 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index b9e0394e2975..fb165d3f01cf 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -231,4 +231,19 @@ struct i915_active *i915_active_create(void); struct i915_active *i915_active_get(struct i915_active *ref); void i915_active_put(struct i915_active *ref); +static inline int __i915_request_await_exclusive(struct i915_request *rq, + struct i915_active *active) +{ + struct dma_fence *fence; + int err = 0; + + fence = i915_active_fence_get(&active->excl); + if (fence) { + err = i915_request_await_dma_fence(rq, fence); + dma_fence_put(fence); + } + + return err; +} + #endif /* _I915_ACTIVE_H_ */ diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index bc64f773dcdb..cd12047c7791 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -1167,6 +1167,12 @@ void i915_vma_revoke_mmap(struct i915_vma *vma) list_del(&vma->obj->userfault_link); } +static int +__i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma) +{ + return __i915_request_await_exclusive(rq, &vma->active); +} + int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request *rq) { int err; @@ -1174,8 +1180,7 @@ int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request *rq) GEM_BUG_ON(!i915_vma_is_pinned(vma)); /* Wait for the vma to be bound before we start! */ - err = i915_request_await_active(rq, &vma->active, - I915_ACTIVE_AWAIT_EXCL); + err = __i915_request_await_bind(rq, vma); if (err) return err;